
The First Principal Component of Multifaceted Variables: It's More Than a G Thing

Published online by Cambridge University Press:  02 October 2015

Duncan J. R. Jackson*
Affiliation:
Department of Organizational Psychology, Birkbeck, University of London, and Faculty of Management, University of Johannesburg
Dan J. Putka
Affiliation:
Human Resources Research Organization, Alexandria, Virginia
Kevin R. H. Teoh
Affiliation:
Department of Organizational Psychology, Birkbeck, University of London
*
Correspondence concerning this article should be addressed to Duncan J. R. Jackson, Department of Organizational Psychology, Birkbeck, University of London, Clore Management Centre, Torrington Square, London WC1E 7JL, United Kingdom. E-mail: dj.jackson@bbk.ac.uk

Type
Commentaries
Copyright
Copyright © Society for Industrial and Organizational Psychology 2015 

Ree, Carretta, and Teachout (2015) raise the need for further investigation into dominant general factors (DGFs) and their prevalence in measures used for employee selection, development, and performance measurement. They imply that a method of choice for estimating the contribution of DGFs is principal components analysis (PCA), and they interpret the variance accounted for by the first component of the PCA solution as indicative of the contribution of a general factor. In this response, we illustrate the hazard of equating the first component of a PCA with a general factor and show how this becomes particularly problematic when PCA is applied to multifaceted variables. Rather than simply critique this use of PCA, we offer an alternative approach that helps to address and illustrate the problem we raise.

Partitioning Variance in Multifaceted Variables

For starters, consider item-level data from two types of measures mentioned by Ree et al. (i.e., cognitive ability tests and assessment centers). When decomposing variance in a simple cognitive ability test with PCA (e.g., a basic test of mathematical reasoning), the variables being analyzed typically differ along a single facet of measurement, namely, items (Cronbach, Gleser, Nanda, & Rajaratnam, 1972). In contrast, when decomposing variance in item-level data from other, more complex types of measures, such as assessment centers (ACs), interviews, job performance ratings, multisource feedback instruments, or situational judgment tests, the variables being analyzed are often multifaceted in nature (Putka & Sackett, 2010). For example, in the case of AC data, within-exercise dimension ratings (WEDRs, i.e., ratings on a given dimension within a given exercise) have been framed as item-level variables that reflect combinations of a given dimension (e.g., interpersonal skill) and a given exercise (e.g., role play; Kuncel & Sackett, 2014). As another example, variables in a multisource rating data set might reflect combinations of a given rating dimension and a given rating source (e.g., peers, subordinates, supervisors; Scullen, Mount, & Goff, 2000). As we demonstrate below, applying PCA to multifaceted variables can give a misleading picture of the percentage of variance attributable to a general factor because it ignores the multifaceted nature of the variance underlying the first principal component.
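To make the multifaceted structure concrete, the following minimal sketch lays out WEDR-level data in a long format in which every rating is indexed by person, dimension, and exercise. The code is illustrative only; the dimension and exercise labels and the pandas-based layout are our assumptions, not taken from any particular study.

```python
import pandas as pd

# Each row is one within-exercise dimension rating (WEDR):
# a rating of one person on one dimension within one exercise.
# The dimension and exercise labels are purely illustrative.
wedrs = pd.DataFrame(
    {
        "person":    [1, 1, 1, 1, 2, 2, 2, 2],
        "dimension": ["interpersonal", "problem_solving"] * 4,
        "exercise":  ["role_play", "role_play", "in_basket", "in_basket"] * 2,
        "rating":    [3.0, 4.0, 2.5, 3.5, 4.5, 4.0, 4.0, 3.0],
    }
)

# Pivoting to wide format shows the "variables" a PCA would analyze:
# each column is a dimension-exercise combination, so covariance among
# columns can reflect shared dimensions, shared exercises, or both.
wide = wedrs.pivot_table(
    index="person", columns=["dimension", "exercise"], values="rating"
)
print(wide)
```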

Conceptually, the problem with using PCA to estimate the magnitude of a general factor among multifaceted variables is that variance associated with the first principal component fails to distinguish between what Ree et al. describe as a general factor and group factors. One of the primary distinctions between general and group factors is that a general factor influences every observed variable, whereas a group factor influences only those observed variables that belong to the group in question. In the context of the WEDR-level variables common in AC data sets, we know that there will be group factors that manifest in the covariation among WEDRs that share a given dimension and group factors that manifest in the covariation among WEDRs that share a given exercise (e.g., Arthur, Woehr, & Maldegan, 2000; Bowler & Woehr, 2006; Putka & Hoffman, 2013). More generally, group factors will manifest in the covariation among items that share a level of a measurement facet (e.g., a specific dimension, specific exercise, or specific rating source).

When applying PCA to decompose variance in a set of variables, researchers have long recognized that the variance associated with the first principal component may reflect not only a general source of variance but also group-specific sources of variance and error (e.g., Fabrigar, Wegener, MacCallum, & Strahan, 1999; Floyd, Shands, Rafael, Bergeron, & McGrew, 2009; Widaman, 2007). This happens because PCA is not model-based and makes no attempt to distinguish the distinct sources of variance underlying any given component. It simply attempts to maximally reproduce observed variance, regardless of whether that variance is common (e.g., general, group-specific) or unique (e.g., error). Simply put, the first component of a PCA is a linear combination of observed variables, each of which reflects multiple sources of variance (Fabrigar et al., 1999). The resulting linear combination of those variables will therefore also be a function of multiple sources of variance, only one of which may be a general factor. As such, it is hazardous to interpret variance associated with the first principal component as solely reflecting variance attributable to a general factor. The first principal component and a general factor are two different things, and failure to make this distinction can lead to unwarranted conclusions regarding the magnitude of variance attributable to a general factor based on PCA. As we discuss next, this can be particularly troublesome when analyzing multifaceted variables with PCA, where group-specific variance associated with measurement facets may be particularly strong (e.g., Hoffman, Lance, Bynum, & Gentry, 2010; Lance, 2008).
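The point can be checked with a small simulation. This is a sketch of our own, not a reanalysis of any published data set, and all variance values are arbitrary choices: WEDR-like scores are generated from a modest general factor plus strong exercise-specific group factors plus error, and the share of observed variance captured by the first principal component is compared with the share actually attributable to the general factor.

```python
import numpy as np

rng = np.random.default_rng(2015)

n_people, n_dims, n_exercises = 5000, 4, 3
n_items = n_dims * n_exercises  # 12 WEDR-like variables

# Variance components used to generate the data (illustrative values only).
var_general, var_exercise, var_error = 0.20, 0.45, 0.35

general = rng.normal(0, np.sqrt(var_general), size=(n_people, 1))
exercise_fx = rng.normal(0, np.sqrt(var_exercise), size=(n_people, n_exercises))

scores = np.empty((n_people, n_items))
for e in range(n_exercises):
    for d in range(n_dims):
        item = e * n_dims + d
        scores[:, item] = (
            general[:, 0]
            + exercise_fx[:, e]                             # group factor shared within an exercise
            + rng.normal(0, np.sqrt(var_error), n_people)   # unique/error variance
        )

# PCA via the eigenvalues of the item covariance matrix (descending order).
eigvals = np.linalg.eigvalsh(np.cov(scores, rowvar=False))[::-1]
first_pc_share = eigvals[0] / eigvals.sum()

true_general_share = var_general / (var_general + var_exercise + var_error)
print(f"First principal component: {first_pc_share:.1%} of observed variance")
print(f"True general-factor share: {true_general_share:.1%}")
```

With these illustrative values, the first component accounts for roughly 38% of observed variance even though the general factor contributes only 20%, because exercise-specific and error variance are swept into the same component.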

An Alternative to PCA

Given the observations above, an alternative to PCA would be to use a model that is sensitive to the multifaceted nature of the variables being analyzed and that allows for better differentiation between general and group-specific variance. The random effects model underlying applications of generalizability theory provides such a model and facilitates an empirical illustration of the limitation of PCA outlined above (Cronbach et al., 1972).

From the perspective of generalizability theory, the closest analogues to the effects of a “general factor” are person main effects (Cronbach et al., 1972; Shavelson & Webb, 2005). Person main effects imply that some people generally perform better than others on the measures of interest, regardless of the particular dimension, exercise, rating source, or other design feature involved; by definition, they reflect an effect based on all observed variables. To properly estimate variance attributable to person main effects, the random effects model used to decompose observed variance must appropriately reflect the multifaceted nature of the variables being analyzed. For example, in the context of AC data, if there is variance in observed assessment scores specific to a given dimension or exercise, then terms corresponding to dimension- and exercise-related effects should be included in the model used to decompose variance. Failure to include such terms will lead to artificially inflated estimates of variance attributable to person main effects, because covariation among indicators attributable to those group effects will be partially manifest in person main effect variance. In contrast, including such terms would do little harm if dimension- or exercise-related effects happened to be weak or nonexistent. The problem that this model misspecification creates for estimating variance attributable to person main effects is analogous to the problem with interpreting the first component in a PCA as a general factor: neither the misspecified random effects model nor PCA offers a means of differentiating general and group sources of variance. In the case of PCA, this results in a first component that can reflect more than simply general variance. To date, the problem with applying PCA to multifaceted variables may have gone unappreciated because previous comparisons of PCA with alternative variance partitioning techniques have largely focused on the cognitive ability test domain, where the measures being analyzed often consist of single-facet variables (e.g., simple test items; Floyd et al., 2009; Widaman, 2007).

Empirical Illustration of the Problem

To empirically illustrate the issues with PCA noted above, we reanalyzed AC data from Putka and Hoffman (2013). Specifically, we evaluated the magnitude of variance accounted for by the first principal component from a PCA and the variance accounted for by person main effects from two different random effects models, each applied to the WEDR variables from that study (i.e., variables defined by unique dimension-exercise combinations). The first random effects model was fully specified and included terms indexing dimension- and exercise-related effects (i.e., the structure of the model reflected the multifaceted nature of the variables being analyzed; Woehr, Putka, & Bowler, 2012). That model provided the following decomposition of observed variance:

(1) $$\sigma^2_{\text{observed}} = \sigma^2_{p} + \sigma^2_{pd} + \sigma^2_{pe} + \sigma^2_{\text{residual}}$$

In Equation 1, σ²p reflects variance due to person main effects, σ²pd reflects variance due to dimension-specific (person × dimension) effects, σ²pe reflects variance due to exercise-specific (person × exercise) effects, and σ²residual reflects residual variance.

The second random effects model ignored the multifaceted nature of the data (akin to PCA) and simply treated each WEDR variable as a single-faceted item. This second, misspecified random effects model provided the following decomposition of observed variance:

(2) $$\sigma^2_{\text{observed}} = \sigma^2_{p} + \sigma^2_{\text{residual}}$$
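To show how Equations 1 and 2 play out side by side, the sketch below simulates balanced WEDR-like data with known components and recovers them from a simple method-of-moments reading of the inter-item covariances. That reading is an assumption of this illustration, not necessarily the estimation approach used in our reanalysis, and the variance values are arbitrary. Under this reading, the misspecified person variance of Equation 2 is just the average covariance over all item pairs, so it absorbs covariance that Equation 1 would assign to shared dimensions and exercises.

```python
from itertools import product

import numpy as np

rng = np.random.default_rng(7)

n_people, n_dims, n_exercises = 8000, 4, 3
items = list(product(range(n_dims), range(n_exercises)))  # (dimension, exercise) pairs

# True variance components for Equation 1 (illustrative values only).
true = {"p": 0.20, "pd": 0.10, "pe": 0.40, "residual": 0.30}

p_fx  = rng.normal(0, np.sqrt(true["p"]),  (n_people, 1))
pd_fx = rng.normal(0, np.sqrt(true["pd"]), (n_people, n_dims))
pe_fx = rng.normal(0, np.sqrt(true["pe"]), (n_people, n_exercises))

scores = np.empty((n_people, len(items)))
for j, (d, e) in enumerate(items):
    scores[:, j] = (p_fx[:, 0] + pd_fx[:, d] + pe_fx[:, e]
                    + rng.normal(0, np.sqrt(true["residual"]), n_people))

cov = np.cov(scores, rowvar=False)

def mean_cov(condition):
    """Average covariance over item pairs (j, k), j != k, meeting `condition`."""
    vals = [cov[j, k] for j in range(len(items)) for k in range(len(items))
            if j != k and condition(items[j], items[k])]
    return float(np.mean(vals))

# Fully specified model (Equation 1), via simple method-of-moments contrasts.
p_hat  = mean_cov(lambda a, b: a[0] != b[0] and a[1] != b[1])          # share nothing
pd_hat = mean_cov(lambda a, b: a[0] == b[0] and a[1] != b[1]) - p_hat  # share a dimension
pe_hat = mean_cov(lambda a, b: a[0] != b[0] and a[1] == b[1]) - p_hat  # share an exercise

# Misspecified model (Equation 2): person variance as the average of *all* covariances.
p_hat_misspecified = mean_cov(lambda a, b: True)

total = float(np.mean(np.diag(cov)))
print(f"sigma^2_p  (Equation 1): {p_hat / total:.1%} of observed variance")
print(f"sigma^2_p  (Equation 2): {p_hat_misspecified / total:.1%}  <- inflated by group variance")
print(f"sigma^2_pe (Equation 1): {pe_hat / total:.1%}")
```

With the illustrative values above, the Equation 1 person share comes back at about 20%, whereas the Equation 2 estimate lands near 33%, mirroring the pattern reported in Table 1.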

Next, we examined the results from the PCA and the random effects models to determine what they implied about the magnitude of a general factor underlying the WEDRs. As Table 1 shows, across the three samples of AC data from Putka and Hoffman (2013), the PCA results suggested that the first principal component accounted for between 32% and 36% of the variance in the WEDRs. In contrast, the fully specified random effects model revealed that person main effects accounted for 15% to 20% of the variance in the WEDRs, substantially less than the variance associated with the first principal component. Furthermore, given that the person main effect variance component may itself reflect both the contribution of a general factor and covariance among group factors (Woehr et al., 2012), these estimates can be viewed as upper bounds on the magnitude of variance in WEDRs attributable to a general factor.

Table 1. Comparison of PCA and Random Effects-Based Estimates

Note. Principal components analysis (PCA) = percentage of within-exercise dimension rating (WEDR) variance accounted for by the first principal component from the PCA. σ²p, M1 = percentage of observed WEDR variance accounted for by person main effects based on the fully specified random effects model. σ²p, M2 = percentage of observed WEDR variance accounted for by person main effects based on the misspecified random effects model. Analyses are based on samples from Putka and Hoffman (2013).

Examination of the results for the deliberately misspecified random effects model helps explain the differences in these estimates. On the basis of the misspecified random effects model, person main effects now appear to account for 28% to 32% of the variance in the WEDRs. Note that the estimates from this misspecified model are far closer to the variance accounted for by the first principal component from the PCA, and they demonstrate that failing to account for group-specific variance stemming from specific dimensions and exercises artificially inflates the magnitude of variance attributable to the person main effect (see Footnote 1).
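One way to see why the misspecified estimate lands so close to the PCA figure is to write out what the person "true score" covariance absorbs under Equation 2. The following is a sketch under simplifying assumptions (a balanced, fully crossed design, with the misspecified person variance read as the average covariance among WEDRs; the pair-proportion notation is ours, not from the original analyses). For two distinct WEDRs i and j rated for the same person,

$$\mathrm{Cov}\left(Y_{pi}, Y_{pj}\right) = \sigma^2_{p} + I\left[d_i = d_j\right]\sigma^2_{pd} + I\left[e_i = e_j\right]\sigma^2_{pe},$$

so averaging over all pairs of WEDRs gives

$$\hat{\sigma}^2_{p(\text{misspecified})} \approx \sigma^2_{p} + f_d\,\sigma^2_{pd} + f_e\,\sigma^2_{pe},$$

where I[·] is an indicator function and f_d and f_e are the proportions of WEDR pairs sharing a dimension and an exercise, respectively. Because both added terms are nonnegative, the misspecified estimate can only overstate σ²p, which is the direction of the inflation observed here.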

To further illustrate how results regarding the first principal component can be misleading when it comes to informing the presence and dominance of a general factor, we also examined the percentage of observed variance in the WEDRs attributable to the other components in the fully specified random effects model. This examination revealed that person main effect variance was not even the “dominant” source of variance; rather, group-specific variance attributable to the exercises underlying the WEDRs was. In all samples examined, the percentage of observed variance attributable to exercise-specific effects (σ²pe) was greater than that for person main effects (39.4% vs. 20.9% in Sample 1, 48.5% vs. 15.7% in Sample 2, and 39.6% vs. 20.4% in Sample 3). With a simple application of PCA to these data, one would have had no way of discerning these findings and might have come to a very different conclusion regarding the presence and magnitude of a general factor underlying the AC data.

Summary

The moral of our commentary is that it is hazardous to equate the first component of a PCA with a general factor and that doing so may become particularly problematic when applying PCA to multifaceted variables. Although the first point is evident from the mathematics underlying PCA (e.g., Widaman, 2007), the second has arguably been underappreciated because PCA has typically been evaluated within the cognitive abilities domain, where variables are often defined along a single facet of measurement. The variance associated with the first principal component can reflect multiple sources of variance, only one of which corresponds to a general factor. PCA offers no means of determining the contribution of a general factor to the variance underlying that first component, and interpreting it as if it did can lead to faulty conclusions regarding the dominance of a general factor, particularly when analyzing multifaceted variables. Random effects models underlying generalizability theory may be useful when confronted with the task of partitioning variance in multifaceted variables and can help provide an upper bound estimate of the contribution of a general factor to observed variance in multifaceted assessment data.

Footnotes

1 Given that PCA attempts to extract all observed variance, not just common variance, we hypothesize that the PCA estimates are slightly higher than the estimates of person main effect variance from the misspecified random effects model because error variance also contributes to variance in the first principal component, whereas it does not contribute to the person main effect variance in the misspecified random effects model.
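A worked single-facet case (our illustration; the numbers are arbitrary) shows the mechanism behind this footnote. Suppose k items each load λ on a single general factor and each carries unique variance ψ, so the population covariance matrix is λ²J + ψI, with J a k × k matrix of ones. The largest eigenvalue of that matrix is kλ² + ψ, so the first principal component accounts for

$$\frac{k\lambda^{2} + \psi}{k\left(\lambda^{2} + \psi\right)} = \frac{\lambda^{2}}{\lambda^{2} + \psi} + \frac{\psi}{k\left(\lambda^{2} + \psi\right)}$$

of the observed variance, which exceeds the general factor's true share of λ²/(λ² + ψ) by ψ/[k(λ² + ψ)]. For example, with k = 12, λ² = .4, and ψ = .6, the first component accounts for 45% of observed variance even though the general factor accounts for only 40%.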

References

Arthur, W., Jr., Woehr, D. J., & Maldegan, R. (2000). Convergent and discriminant validity of assessment center dimensions: A conceptual and empirical reexamination of the assessment center construct-related validity paradox. Journal of Management, 26, 813–835.
Bowler, M. C., & Woehr, D. J. (2006). A meta-analytic evaluation of the impact of dimension and exercise factors on assessment center ratings. Journal of Applied Psychology, 91, 1114–1124. doi:10.1037/0021-9010.91.5.1114
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York, NY: Wiley.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272–299. doi:10.1037/1082-989X.4.3.272
Floyd, R. G., Shands, E. I., Rafael, F. A., Bergeron, R., & McGrew, K. S. (2009). The dependability of general-factor loadings: The effects of factor-extraction methods, test battery composition, test battery size, and their interactions. Intelligence, 37, 453–465. doi:10.1016/j.intell.2009.05.003
Hoffman, B. J., Lance, C. E., Bynum, B., & Gentry, W. A. (2010). Rater source effects are alive and well after all. Personnel Psychology, 63, 119–151. doi:10.1111/j.1744-6570.2009.01164.x
Kuncel, N. R., & Sackett, P. R. (2014). Resolving the assessment center construct validity problem (as we know it). Journal of Applied Psychology, 99, 38–47. doi:10.1037/a0034147
Lance, C. E. (2008). Why assessment centers do not work the way they are supposed to. Industrial and Organizational Psychology: Perspectives on Science and Practice, 1, 84–97. doi:10.1111/j.1754-9434.2007.00017.x
Putka, D. J., & Hoffman, B. J. (2013). Clarifying the contribution of assessee-, dimension-, exercise-, and assessor-related effects to reliable and unreliable variance in assessment center ratings. Journal of Applied Psychology, 98, 114–133. doi:10.1037/a0030887
Putka, D. J., & Sackett, P. R. (2010). Reliability and validity. In J. L. Farr & N. T. Tippins (Eds.), Handbook of employee selection (pp. 9–49). New York, NY: Routledge.
Ree, M. J., Carretta, T. R., & Teachout, M. S. (2015). Pervasiveness of dominant general factors in organizational measurement. Industrial and Organizational Psychology: Perspectives on Science and Practice, 8(3), 409–427.
Scullen, S. E., Mount, M. K., & Goff, M. (2000). Understanding the latent structure of job performance ratings. Journal of Applied Psychology, 85, 956–970. doi:10.1037/0021-9010.85.6.956
Shavelson, R. J., & Webb, N. M. (2005). Generalizability theory. In J. L. Green, G. Camilli, & P. B. Elmore (Eds.), Complementary methods for research in education (3rd ed., pp. 599–612). Washington, DC: AERA.
Widaman, K. F. (2007). Common factors versus components: Principals and principles, errors and misconceptions. In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at 100: Historical developments and future directions (pp. 177–204). Mahwah, NJ: Erlbaum.
Woehr, D. J., Putka, D. J., & Bowler, M. C. (2012). An examination of G-theory methods for modeling multitrait–multimethod data: Clarifying links to construct validity and confirmatory factor analysis. Organizational Research Methods, 15, 134–161. doi:10.1177/1094428111408616