How are factor score estimates computed?


       Factor analysts draw a distinction between factor scores and “factor score estimates.” Factor scores fulfill the stipulations of the common factor model (for example, they have unit variance and are perfectly uncorrelated when the factors are orthogonal), but they cannot be computed in practice. Rather, researchers routinely compute and report factor score estimates, which are imperfect approximations of the factors. Factor score estimates will not typically have unit variance, and they will often be intercorrelated even when the factors in the analysis are orthogonal.

       There are two general classes of methods for estimating factor scores. The first class has been referred to as the “exact”, “complex”, or “refined” methods by different authors. These methods yield approximately standardized factor score estimates with different properties. For example, Thurstone’s (1935) regression approach produces factor score estimates that maximize determinacy, whereas Anderson and Rubin’s (1956) approach yields factor score estimates that are perfectly orthogonal (uncorrelated). Each of the refined methods, however, has one or more defects. For example, Thurstone’s estimates will be correlated even when the factors are orthogonal, and Anderson and Rubin’s estimates will not maximize determinacy. The most common methods and their properties are summarized in the table below:
 

Method                                        | Maximizes Validity  | Univocal for Orthogonal Factors  | Correlation Preserving
Thurstone’s (1935) “regression” method        | Yes                 | No                               | No
Berge, Krijnen, Wansbeek, & Shapiro (1999);
  also, Anderson & Rubin (1956)               | No                  | No                               | Yes
Bartlett (1937)                               | No                  | Yes                              | No
Harman (1976) “idealized variables”           | No                  | Yes                              | No

“Maximizes Validity” indicates that the factor score estimates are as highly determinate as possible for a given analysis. “Univocal for Orthogonal Factors” indicates that, unlike the other estimates, the factor score estimates will not be contaminated with variance from the other orthogonal factors in the analysis. “Correlation Preserving” indicates that the correlations among the factor score estimates will match the correlations among the factors themselves. Finally, all of these refined methods are computationally involved, but several are available in popular statistical software. For example, SPSS computes the regression, Anderson-Rubin, and Bartlett factor score estimates (the Anderson-Rubin method is appropriate only for orthogonal factors; Berge, Krijnen, Wansbeek, and Shapiro extended it to correlated factors, but that extension is not available in SPSS). SAS reports only the regression estimates.
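       The matrix algebra behind two of these refined methods can be sketched briefly. The code below is a minimal illustration rather than the implementation used by any particular package, and its inputs are illustrative names: Z, an n x p matrix of standardized item responses; R, the p x p observed item correlation matrix; S, the p x k structure matrix of item-factor correlations; L, the p x k loading matrix; and Psi, the p x p diagonal matrix of uniquenesses.

```python
import numpy as np

def regression_scores(Z, R, S):
    """Thurstone's (1935) regression estimates: weights are R^{-1} S."""
    W = np.linalg.solve(R, S)   # solve R W = S rather than inverting R explicitly
    return Z @ W                # n x k matrix of factor score estimates

def bartlett_scores(Z, L, Psi):
    """Bartlett's (1937) weighted least squares estimates:
    weights are Psi^{-1} L (L' Psi^{-1} L)^{-1}."""
    Psi_inv = np.linalg.inv(Psi)
    W = Psi_inv @ L @ np.linalg.inv(L.T @ Psi_inv @ L)
    return Z @ W
```

Anderson and Rubin’s estimates follow the same weighted least squares logic as Bartlett’s but add a rescaling step that forces the estimates to be exactly uncorrelated with unit variance.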

       The second class of scoring procedures has been referred to as the “inexact”, “unit-weighted”, or “coarse” methods by different authors. Here the factor score estimates are computed by simply summing responses to subsets of the factored items. For example, it is common practice to (1) extract and rotate a number of factors, (2) examine the structure coefficients (the correlations between the items and the factors) for salient items using some conventional cut-point such as .30 or .40, and (3) sum the responses to the salient items on each factor to compute the factor score estimate. If an item yields a negative structure coefficient, it is subtracted rather than added, and items on different scales are first standardized before they are summed. These scores are extremely common in the literature, particularly in scale construction efforts, and may be referred to as total, index, sum, domain, facet, scale, or subscale scores. Coarse factor score estimates will invariably be intercorrelated even when the factors are orthogonal, they will not have unit variance, and they will almost certainly correlate less strongly with the factors themselves than any of the refined estimates.
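       As a rough illustration of this coarse procedure, the sketch below standardizes the items, keeps the items whose structure coefficients meet a salience cut-point, reverses items with negative coefficients, and sums them. The variable names (X for the n x p matrix of raw responses, S for the p x k rotated structure matrix) and the .40 default cut-point are illustrative choices, not prescriptions.

```python
import numpy as np

def coarse_scores(X, S, cut=0.40):
    """Unit-weighted ("coarse") factor score estimates."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize items on different scales
    n_obs, n_factors = X.shape[0], S.shape[1]
    scores = np.zeros((n_obs, n_factors))
    for j in range(n_factors):
        salient = np.abs(S[:, j]) >= cut               # salient items for factor j
        signs = np.sign(S[salient, j])                 # -1 reverses negatively keyed items
        scores[:, j] = Z[:, salient] @ signs           # unit-weighted sum of salient items
    return scores
```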

      Which class of methods is best?  The refined methods can ensure certain statistical properties, such as maximizing determinacy or constraining the factor score estimates to orthogonality. The coarse methods are simple to compute and are generally believed to be more stable across independent samples of observations than the refined methods. Early authors also believed that the coarse and refined methods would correlate highly with one another and that any differences would be inconsequential. A number of studies, however, suggest that the differences between the refined and coarse methods may be substantial in practice. It is also unclear whether the refined methods are appreciably less stable than the coarse methods. Choosing among the different factor scoring procedures is thus not a straightforward affair, and the final choice must be dictated by a survey of the extant studies on the differences among the various estimation methods and by the context of the particular research program. Regardless of which method is chosen, factor score indeterminacy and the resulting factor score estimates must be evaluated.
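       One common way to evaluate indeterminacy is to compute, for each factor, the multiple correlation between the factor and the items; this value is the upper bound on how highly any linear factor score estimate can correlate with that factor. A minimal sketch, again using the illustrative R and S matrices from the examples above:

```python
import numpy as np

def determinacy(R, S):
    """Multiple correlation of each factor with the items: sqrt of diag(S' R^{-1} S)."""
    rho_sq = np.diag(S.T @ np.linalg.solve(R, S))   # squared multiple correlations, one per factor
    return np.sqrt(rho_sq)                          # values near 1 indicate well-determined factors
```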
 
