Why is it important to evaluate factor indeterminacy
and factor score estimates?
Factor score indeterminacy presents
a number of fascinating and difficult measurement problems that were recently
debated and discussed in a special issue of Multivariate
Behavioral Research. Readers who wish to delve into the measurement
problems surrounding factor score indeterminacy should read all fifteen
articles that appeared in that issue. The focus of this web site, however,
is exclusively on the practical nature of the problem (which was also addressed
in the special issue of MBR). Although it may not be immediately apparent,
theoretical – and perhaps even philosophical – measurement issues always
have important practical implications for applied research. Factor score
indeterminacy is no exception, and there are at least four reasons why
researchers should be concerned with this issue in the context of an exploratory
factor analysis:
A highly indeterminate factor is one in which radically different factor
scores can be computed that will all be consistent with the same factor
loadings (pattern coefficients) derived from the factor analysis. As mentioned
above, individuals with high scores according to one set of factor
scores can have low scores according to a competing set of factor scores,
and both sets of scores would be “correct.” One must therefore ask: of what
value is a factor that cannot yield an unambiguous rank-ordering of
the individuals in the analysis? An indeterminate factor is of dubious
scientific value, and researchers should assess the
degree of indeterminacy in their common factors.
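The degree of indeterminacy can be made concrete with Guttman's index, 2ρ² - 1, which gives the minimum possible correlation between two alternative sets of factor scores for a factor whose multiple correlation with the observed variables is ρ. The short Python sketch below is illustrative only; the function name and the example values of ρ are assumptions introduced here, not values taken from any particular analysis.

```python
def min_correlation_between_score_sets(rho: float) -> float:
    """Guttman's index: the lowest possible correlation between two
    equally admissible sets of factor scores, given the multiple
    correlation rho between the factor and the observed variables."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    return 2.0 * rho ** 2 - 1.0


# Hypothetical multiple correlations, chosen only for illustration.
for rho in (0.95, 0.80, 0.70, 0.50):
    r_min = min_correlation_between_score_sets(rho)
    print(f"rho = {rho:.2f} -> minimum correlation = {r_min:+.3f}")
```

When the index falls to zero or below, two sets of scores that fit the same loadings can be uncorrelated or even negatively correlated, which is the situation described above in which high scorers on one set are low scorers on the other.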
A highly indeterminate factor yields factor score estimates that may not
be highly correlated with the factor itself. This issue is essentially
a question of validity. For example, an unwary researcher may label one
of several orthogonal factors as “neuroticism” and fail to realize that
the factor score estimates are saturated with unexplained sources of variance.
The disparity between the labeled factor and the factor score estimates
will also be carried over to subsequent analyses in which the estimates
are employed. Factor score estimates are often used, for instance, as variables
in ANOVA or regression analyses. Interpretations of the results obtained
from these analyses will all be predicated on the erroneous assumption
that the factor score estimates are valid representations of the factors
they are intended to measure. Jum Nunnally summarized this point as follows:
“If the multiple correlation [the square of which is the proportion of
determinacy in the factor] is less than .70, one is in trouble. In that instance the error
variance in estimating the factor would be approximately the same as the
valid variance. At a very minimum, one should be quite suspicious of factor
estimates obtained with a multiple correlation of less than .50, because
in that case less than 25 percent of the variance of factor scores can
be predicted from the variables. Then one could not trust the variables
as actually representing the factor, and it would be of dubious value to
perform further studies supposedly concerning the factor....[and] the factor
should be ‘released’ to other scientists only when good estimates of factor
scores are possible.” (1978, p. 426).
Assessing the degree of indeterminacy in a set of factors is therefore an
important step in any research program that incorporates an exploratory
common factor analysis.
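The multiple correlation that Nunnally refers to can be computed directly from the correlation matrix R of the observed variables and the structure matrix S (the correlations between the variables and the factors): for factor j, ρ² is the j-th diagonal element of S'R⁻¹S. The Python sketch below assumes a single common factor with hypothetical loadings and, for simplicity, uses the model-implied correlation matrix in place of an observed one; the function name is likewise an assumption made for illustration.

```python
import numpy as np


def factor_determinacy(R, S):
    """Multiple correlation between each factor and the observed variables.

    R : (p, p) correlation matrix of the observed variables.
    S : (p, m) structure matrix (variable-factor correlations).
    """
    R = np.asarray(R, dtype=float)
    S = np.asarray(S, dtype=float)
    # diag(S' R^-1 S), computed without forming the explicit inverse.
    rho_sq = np.einsum("ij,ij->j", S, np.linalg.solve(R, S))
    return np.sqrt(rho_sq)


# Hypothetical loadings on one common factor (illustration only).
loadings = np.array([[0.7], [0.6], [0.5], [0.4]])

# Model-implied correlations: loadings * loadings' with a unit diagonal.
R = loadings @ loadings.T
np.fill_diagonal(R, 1.0)

rho = factor_determinacy(R, loadings)
for r in rho:
    print(f"rho = {r:.3f}, determinate variance = {r**2:.3f}, "
          f"indeterminate variance = {1 - r**2:.3f}")
```

The squared value is the share of factor variance that can be predicted from the variables, so a multiple correlation of .70 leaves about half of the variance unaccounted for (.70² = .49), and a value of .50 leaves three quarters (.50² = .25).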
Even for a highly determinate factor, one can choose a poor method of computing
factor score estimates. The extant methods for computing factor score estimates
grew out of the indeterminacy debate. Each method has its strengths and
weaknesses and none offers a solution to indeterminacy. Some methods may
also be severely flawed in the sense that they yield factor score estimates
that are very poor representations of the factors. In other words, even
though the factor may have a small proportion of indeterminacy, the factor
score estimates computed by the researcher may be invalid representations
of the factor. As discussed under the second reason above, the consequent
lack of validity will carry over into analyses based upon the factor score
estimates. The point is that one needs to evaluate not only indeterminacy
but also the statistical properties of the factor score estimates.
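One way to examine the statistical properties of a set of factor score estimates is to compute their correlations with the factors themselves. If W holds the scoring weights applied to the standardized variables, the estimate-by-factor correlations are W'S scaled by the standard deviations of the estimates, where S is the structure matrix. The diagonal of that matrix indicates validity, and the off-diagonal entries show how strongly each set of estimates correlates with factors it was not meant to measure. The Python sketch below is a minimal illustration under stated assumptions: a hypothetical orthogonal two-factor pattern, a model-implied correlation matrix, and an arbitrary .40 cutoff for the unit-weighted scores.

```python
import numpy as np


def score_estimate_properties(R, S, W):
    """Correlations of factor score estimates with the factors and with
    one another, for weights W applied to standardized variables."""
    R, S, W = (np.asarray(a, dtype=float) for a in (R, S, W))
    cov_est = W.T @ R @ W                    # covariances among the estimates
    sd = np.sqrt(np.diag(cov_est))           # SD of each set of estimates
    est_by_factor = (W.T @ S) / sd[:, None]  # rows: estimates, columns: factors
    est_by_est = cov_est / np.outer(sd, sd)  # correlations among the estimates
    return est_by_factor, est_by_est


# Hypothetical orthogonal two-factor pattern (illustration only).
pattern = np.array([[0.8, 0.1], [0.7, 0.2], [0.6, 0.0],
                    [0.1, 0.7], [0.2, 0.6], [0.0, 0.5]])
R = pattern @ pattern.T
np.fill_diagonal(R, 1.0)   # model-implied correlation matrix
S = pattern                # orthogonal factors: structure equals pattern

W_regression = np.linalg.solve(R, S)               # regression weights R^-1 S
W_unit = (np.abs(pattern) >= 0.4).astype(float)    # unit weights, arbitrary cutoff

for label, W in (("regression", W_regression), ("unit-weighted", W_unit)):
    est_by_factor, est_by_est = score_estimate_properties(R, S, W)
    print(label)
    print("  estimate-by-factor correlations:\n", np.round(est_by_factor, 3))
    print("  correlations among estimates:\n", np.round(est_by_est, 3))
```

For orthogonal factors, any nonzero correlation among the estimates also signals a departure from the factor model, so the second matrix is worth inspecting alongside the first.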
Principal components and image common factors are determinate in nature.
Hence, refined component or image scores
will be synonymous with the components or image factors themselves. Coarse
component or image scores, however, should be considered estimates, that is,
imperfect representations of the extracted components or image factors.
A researcher may choose a dubious method for computing coarse component
or image scores; for instance, by summing items selected on the basis of
the rotated structure coefficients rather than the factor score coefficients.
In such an instance the estimates will likely stand as poor representations
of the latent variates. The properties of coarse component or image scores
should therefore be routinely assessed even though the components and image
factors are determinate.
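The contrast between refined and coarse component scores can be checked numerically. In the Python sketch below, the correlation matrix, the .50 salience cutoff, and the helper name are all hypothetical. Only the first unrotated principal component is extracted, in which case the structure coefficients and the score coefficients are proportional to one another, so the example shows only the loss incurred by unit weighting and not the additional problem of choosing items from the wrong coefficient matrix after rotation.

```python
import numpy as np

# Hypothetical correlation matrix for four variables (illustration only).
R = np.array([
    [1.00, 0.60, 0.50, 0.10],
    [0.60, 1.00, 0.40, 0.20],
    [0.50, 0.40, 1.00, 0.30],
    [0.10, 0.20, 0.30, 1.00],
])

# First principal component of R.
eigenvalues, eigenvectors = np.linalg.eigh(R)
top = np.argmax(eigenvalues)
v = eigenvectors[:, top]
if v.sum() < 0:          # eigenvector sign is arbitrary; make it positive
    v = -v
lam = eigenvalues[top]

structure = v * np.sqrt(lam)    # variable-component correlations
refined_w = v / np.sqrt(lam)    # component score coefficients


def score_validity(w, R, structure):
    """Correlation between the composite formed with weights w
    (applied to standardized variables) and the component."""
    return float(w @ structure / np.sqrt(w @ R @ w))


# Coarse scores: sum the items whose coefficient meets an arbitrary cutoff.
coarse_w = (structure >= 0.5).astype(float)

print("refined validity:", round(score_validity(refined_w, R, structure), 3))
print("coarse validity: ", round(score_validity(coarse_w, R, structure), 3))
```

The refined scores reproduce the component exactly, while the coarse scores correlate with it less than perfectly, which is why the latter should be evaluated as estimates.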