This page shows an example of a factor analysis with footnotes
explaining the output. The data used in this example were collected by
Professor James Sidanius, who has generously shared them with us. You can
download the data set here.
Overview: The "what" and "why" of factor analysis
Factor analysis is a method of data reduction. It does this by seeking
underlying unobservable (latent) variables that are reflected in the observed
variables (manifest variables). There are many different methods that can be
used to conduct a factor analysis (such as principal axis factor, maximum
likelihood, generalized least squares, unweighted least squares). There are also
many different types of rotations that can be done after the initial extraction
of factors, including orthogonal rotations, such as varimax and equimax, which
impose the restriction that the factors cannot be correlated, and oblique
rotations, such as promax, which allow the factors to be correlated with one
another. You also need to determine the number of factors that you want to
extract. Given the number of factor analytic techniques and options, it is not
surprising that different analysts could reach very different results analyzing
the same data set. However, all analysts are looking for simple structure.
Simple structure is a pattern of results such that each variable loads highly
onto one and only one factor.
Factor analysis is a technique that requires a large sample size.
Factor analysis is based on the correlation matrix of the variables involved,
and correlations usually need a large sample size before they stabilize.
Tabachnick and Fidell (2001, page 588) cite Comrey and Lee’s (1992) advice
regarding sample size: 50 cases is very poor, 100 is poor, 200 is fair, 300 is
good, 500 is very good, and 1000 or more is excellent. As a rule of thumb,
a bare minimum of 10 observations per variable is necessary to avoid
computational difficulties.
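The two rules of thumb above can be written down as a small helper. This is only a sketch: the function names are made up for illustration, and it assumes that a sample takes the label of the largest Comrey and Lee threshold it reaches, which the guideline itself does not spell out.

```python
# Comrey and Lee's (1992) sample-size labels as quoted above.
# Assumption: a sample gets the label of the largest threshold it reaches.
COMREY_LEE = [
    (1000, "excellent"),
    (500, "very good"),
    (300, "good"),
    (200, "fair"),
    (100, "poor"),
    (50, "very poor"),
]


def rate_sample_size(n_cases):
    """Return the Comrey and Lee label for a given number of cases."""
    for threshold, label in COMREY_LEE:
        if n_cases >= threshold:
            return label
    return "below the scale"  # hypothetical label for fewer than 50 cases


def minimum_cases(n_variables, per_variable=10):
    """Bare minimum number of cases under the 10-per-variable rule of thumb."""
    return n_variables * per_variable


n_items = 12  # ITEM13 through ITEM24
print(minimum_cases(n_items))   # 120
print(rate_sample_size(250))    # fair
```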
For the example below, we are going to do a rather "plain vanilla" factor
analysis. We will use iterated principal axis factor with three factors as our
method of extraction, a varimax rotation, and for comparison, we will also show
the promax oblique solution. The determination of the number of factors to
extract should be guided by theory, but also informed by running the analysis
extracting different numbers of factors and seeing which number of factors
yields the most interpretable results. We have used the priors = smc
option on the proc factor statement so that the squared multiple
correlation is used on the diagonal of the correlation matrix. (If this
option is not used, 1’s are on the diagonal, and you will do a principal
components analysis instead of a principal axis factor analysis.)
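To make the role of the diagonal concrete, the sketch below builds a "reduced" correlation matrix by replacing the 1's on the diagonal with squared multiple correlations, computed as 1 - 1/(i-th diagonal element of the inverse correlation matrix). It is written in plain Python rather than SAS, the function names are illustrative, and it uses only the correlations among ITEM13, ITEM14 and ITEM15 from the output on this page. Because it uses three variables instead of all twelve, the resulting values will not match the prior communality estimates that proc factor reports.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def squared_multiple_correlations(corr):
    """SMC of each variable with all the other variables in the matrix."""
    inverse = invert(corr)
    return [1.0 - 1.0 / inverse[i][i] for i in range(len(corr))]


def reduced_correlation_matrix(corr):
    """Copy of corr with the SMCs placed on the diagonal."""
    smc = squared_multiple_correlations(corr)
    n = len(corr)
    return [[smc[i] if i == j else corr[i][j] for j in range(n)]
            for i in range(n)]


# Correlations among ITEM13, ITEM14 and ITEM15 (illustrative subset only).
corr = [
    [1.00000, 0.66146, 0.59999],
    [0.66146, 1.00000, 0.63460],
    [0.59999, 0.63460, 1.00000],
]
reduced = reduced_correlation_matrix(corr)
for row in reduced:
    print(["%.5f" % value for value in row])
```

Leaving the 1's in place would analyze all of each variable's variance, which is the principal components case described above.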
In this example we have included many options, including the original
correlation matrix, the scree plot and the eigenvectors. While you may not
wish to use all of these options, we have included them here to aid in the
explanation of the analysis. We have also created a page of annotated
output for a principal components analysis that parallels this analysis.
For general information regarding the similarities and differences between
principal components analysis and factor analysis, see Tabachnick and Fidell,
for example.
The FACTOR Procedure
Correlations
ITEM13 ITEM14 ITEM15
ITEM13 INSTRUC WELL PREPARED 1.00000 0.66146 0.59999
ITEM14 INSTRUC SCHOLARLY GRASP 0.66146 1.00000 0.63460
ITEM15 INSTRUCTOR CONFIDENCE 0.59999 0.63460 1.00000
ITEM16 INSTRUCTOR FOCUS LECTURES 0.56626 0.50003 0.50535
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.57687 0.55150 0.58664
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.40898 0.43311 0.45707
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS 0.28632 0.32041 0.35869
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.30418 0.31481 0.35568
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.47553 0.44896 0.50904
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION 0.33255 0.33313 0.36884
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.56399 0.56461 0.58233
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.45360 0.44281 0.43481
Correlations
ITEM16 ITEM17 ITEM18
ITEM13 INSTRUC WELL PREPARED 0.56626 0.57687 0.40898
ITEM14 INSTRUC SCHOLARLY GRASP 0.50003 0.55150 0.43311
ITEM15 INSTRUCTOR CONFIDENCE 0.50535 0.58664 0.45707
ITEM16 INSTRUCTOR FOCUS LECTURES 1.00000 0.58649 0.40479
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.58649 1.00000 0.55474
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.40479 0.55474 1.00000
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS 0.33540 0.44930 0.62660
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.31676 0.41682 0.52055
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.45245 0.59526 0.55417
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION 0.36255 0.44976 0.53609
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.45880 0.61302 0.56950
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.42967 0.52058 0.47382
Correlations
ITEM19 ITEM20 ITEM21
ITEM13 INSTRUC WELL PREPARED 0.28632 0.30418 0.47553
ITEM14 INSTRUC SCHOLARLY GRASP 0.32041 0.31481 0.44896
ITEM15 INSTRUCTOR CONFIDENCE 0.35869 0.35568 0.50904
ITEM16 INSTRUCTOR FOCUS LECTURES 0.33540 0.31676 0.45245
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.44930 0.41682 0.59526
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.62660 0.52055 0.55417
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS 1.00000 0.44647 0.49921
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.44647 1.00000 0.42479
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.49921 0.42479 1.00000
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION 0.48404 0.38297 0.50651
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.44401 0.40962 0.59751
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.37383 0.35722 0.49977
Correlations
ITEM22 ITEM23 ITEM24
ITEM13 INSTRUC WELL PREPARED 0.33255 0.56399 0.45360
ITEM14 INSTRUC SCHOLARLY GRASP 0.33313 0.56461 0.44281
ITEM15 INSTRUCTOR CONFIDENCE 0.36884 0.58233 0.43481
ITEM16 INSTRUCTOR FOCUS LECTURES 0.36255 0.45880 0.42967
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.44976 0.61302 0.52058
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.53609 0.56950 0.47382
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS 0.48404 0.44401 0.37383
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.38297 0.40962 0.35722
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.50651 0.59751 0.49977
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION 1.00000 0.49317 0.44440
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.49317 1.00000 0.70464
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.44440 0.70464 1.00000
The table above was included in the output because we included the corr
option on the proc factor statement. This table gives the correlations
between the original variables (which are specified on the var
statement). Before conducting a factor analysis, you want to
check the correlations between the variables. If any of the correlations are
too high (say above .9), you may need to remove one of the variables from the
analysis, as the two variables seem to be measuring the same thing. Another
alternative would be to combine the variables in some way (perhaps by taking the
average). If the correlations are too low, say below .1, then one or more of
the variables might load only onto one factor (in other words, make its own
factor). This is not helpful, as the whole point of the analysis is to reduce
the number of items (variables).
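The screening described above can be sketched as a short check over the off-diagonal correlations. The cutoffs of .9 and .1 come from the paragraph above; the function name is made up for illustration, and only four of the twelve items are included to keep the example short.

```python
def flag_correlations(names, corr, high=0.9, low=0.1):
    """List variable pairs whose absolute correlation is above `high`
    or below `low`. Returns (too_high, too_low)."""
    too_high, too_low = [], []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = corr[i][j]
            if abs(r) > high:
                too_high.append((names[i], names[j], r))
            elif abs(r) < low:
                too_low.append((names[i], names[j], r))
    return too_high, too_low


# Subset of the correlation table above (four of the twelve items).
names = ["ITEM13", "ITEM14", "ITEM23", "ITEM24"]
corr = [
    [1.00000, 0.66146, 0.56399, 0.45360],
    [0.66146, 1.00000, 0.56461, 0.44281],
    [0.56399, 0.56461, 1.00000, 0.70464],
    [0.45360, 0.44281, 0.70464, 1.00000],
]

too_high, too_low = flag_correlations(names, corr)
print("above .9:", too_high)
print("below .1:", too_low)
```

For these four items neither list contains any pairs, since their correlations range from about .44 to .70.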
a. Prior Communality Estimates: SMC – This gives the communality estimates
prior to the extraction, which here are the squared multiple correlations
(SMC). The communalities (also known as h²) are estimates of the part of each
variable’s variance that is shared with the factors, as opposed to the total
variance of the variable, which also includes measurement error.
b. Eigenvalue – This is the initial eigenvalue. An eigenvalue is
the variance of the factor. Because this is an unrotated solution, the
first factor will account for the most variance, the second will account for the
second highest amount of variance, and so on. Some of the eigenvalues are
negative because the matrix is not of full rank. This means that there are
probably only four dimensions (corresponding to the four factors whose
eigenvalues are greater than zero). Although it is strange to have a
negative variance, this happens because the factor analysis is only analyzing
the common variance, which is less than the total variance. If we were
doing a principal components analysis, we would have had 1’s on the diagonal,
which means that all of the variance is being analyzed (which is another way of
saying that we are assuming that we have no measurement error), and we would not
have negative eigenvalues. In general, it is not uncommon to have negative
eigenvalues.
c. Difference – This column gives the difference between the
eigenvalues. For example, 5.05 = 5.77 – 0.72. This column allows you
to see how quickly the eigenvalues are decreasing.
d. Proportion – This is the proportion of the total variance that each
factor accounts for. For example, 0.9408 = 5.77/6.139.
e. Cumulative – This is the sum of the proportion column. For
example, 1.0584 = 0.9408 + 0.1176.
3 factors will be retained by the NFACTOR criterion.
Initial Factor Method: Iterated Principal Factor Analysis
Scree Plot of Eigenvalues
[Scree plot: eigenvalues (vertical axis, -1 to 6) plotted against factor number
(horizontal axis, 0 to 12). Factor 1 lies near 6, factors 2 and 3 lie between 0
and 1, and factors 4 through 12 lie at or just below 0.]
The scree plot graphs the eigenvalue against the factor number. You can see
these values in the first two columns of the table immediately above. From the
third factor on, you can see that the line is almost flat, meaning that each
successive factor is accounting for smaller and smaller amounts of the total
variance.
f. Iteration – This column lists the number of the iteration. In
this analysis, seven iterations were required before the criterion was met.
g. Change – When the change becomes smaller than the criterion,
the iterating process stops. The numbers in this column are the largest absolute
difference between iterations. For example, the difference between the
first and the second iteration for item23 is 0.0314 = |0.73027 – 0.76168|.
The difference given for the first iteration is the difference
between the values at the first iteration and the squared multiple correlations
(sometimes called iteration 0).
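The stopping rule can be sketched as follows. Only the two item23 communalities quoted above are used as data; the function names are illustrative, and the criterion of 0.001 is an assumption about the default setting rather than something shown in this output.

```python
def largest_change(previous, current):
    """Largest absolute difference between two sets of communalities."""
    return max(abs(c - p) for p, c in zip(previous, current))


def has_converged(previous, current, criterion=0.001):
    """True once the largest change drops below the criterion.
    The value 0.001 is an assumed default, not taken from the output."""
    return largest_change(previous, current) < criterion


# Communality for item23 at the first and second iterations (from the text).
iteration_1 = [0.76168]
iteration_2 = [0.73027]

print(round(largest_change(iteration_1, iteration_2), 4))
print(has_converged(iteration_1, iteration_2))
```

In a full analysis the lists would hold one communality per variable, and the change reported for an iteration would be the largest of the twelve differences.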
h. Communalities – These are the communality estimates at each
iteration. For each iteration, the communality for each variable is
listed. For example, 0.63235 is the communality for the first variable.
Eigenvalues of the Reduced Correlation Matrix: Total = 7.01500876 Average = 0.58458406
Eigenvalue(i) Difference(j) Proportion(k) Cumulative(l)
1 5.85107872 5.04474488 0.8341 0.8341
2 0.80633384 0.44633935 0.1149 0.9490
3 0.35999449 0.22853697 0.0513 1.0003
4 0.13145752 0.07654351 0.0187 1.0191
5 0.05491400 0.02332205 0.0078 1.0269
6 0.03159195 0.03030953 0.0045 1.0314
7 0.00128242 0.00617263 0.0002 1.0316
8 -.00489021 0.01439730 -0.0007 1.0309
9 -.01928750 0.02693109 -0.0027 1.0281
10 -.04621859 0.01408519 -0.0066 1.0216
11 -.06030378 0.03064032 -0.0086 1.0130
12 -.09094410 -0.0130 1.0000
Initial Factor Method: Iterated Principal Factor Analysis
Eigenvectors(m)
1 2 3
ITEM13 INSTRUC WELL PREPARED 0.29486 -0.44338 0.15269
ITEM14 INSTRUC SCHOLARLY GRASP 0.29074 -0.37797 0.16283
ITEM15 INSTRUCTOR CONFIDENCE 0.29819 -0.27308 0.17609
ITEM16 INSTRUCTOR FOCUS LECTURES 0.26782 -0.21061 0.18546
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.32375 -0.08174 0.11086
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.30573 0.38422 0.18861
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS 0.25481 0.46205 0.25759
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.22744 0.26655 0.15574
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.30252 0.13020 0.00062
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION 0.25337 0.29080 -0.03839
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.33865 -0.02903 -0.57488
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.28729 0.02042 -0.64369
i. Eigenvalue – This is the eigenvalue obtained after the
principal axis factoring but before the varimax rotation. An eigenvalue is
the variance of the factor. Because this is an unrotated solution, the
first factor will account for the most variance, the second will account for the
second highest amount of variance, and so on. Some of the eigenvalues are
negative because the matrix is not of full rank. This means that there are
probably only a few dimensions (in the table above, seven eigenvalues are
greater than zero, and only the first four exceed 0.1). Although it is strange
to have a
negative variance, this happens because the factor analysis is only analyzing
the common variance, which is less than the total variance. If we were
doing a principal components analysis, we would have had 1’s on the diagonal,
which means that all of the variance is being analyzed (which is another way of
saying that we are assuming that we have no measurement error), and we would not
have negative eigenvalues. In general, it is not uncommon to have negative
eigenvalues.
j. Difference – This column gives the difference between the
eigenvalues. For example, 5.0447 = 5.85107 – 0.80633. This column
allows you to see how quickly the eigenvalues are decreasing.
k. Proportion – This is the proportion of the total variance
that each factor accounts for. For example, 0.8341 = 5.85107/7.015.
l. Cumulative – This is the sum of the proportion column.
For example, 0.9490 = 0.8341 + 0.1149.
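The Difference, Proportion and Cumulative columns can all be recomputed from the Eigenvalue column alone. The sketch below does this for the table above; the variable names are illustrative.

```python
from itertools import accumulate

# Eigenvalues of the reduced correlation matrix, copied from the table above.
eigenvalues = [
    5.85107872, 0.80633384, 0.35999449, 0.13145752,
    0.05491400, 0.03159195, 0.00128242, -0.00489021,
    -0.01928750, -0.04621859, -0.06030378, -0.09094410,
]

total = sum(eigenvalues)

# Difference: each eigenvalue minus the next one (none for the last row).
differences = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]

# Proportion: each eigenvalue divided by the total of all eigenvalues.
proportions = [ev / total for ev in eigenvalues]

# Cumulative: running sum of the proportions.
cumulative = list(accumulate(proportions))

print("Total = %.8f" % total)
for i, ev in enumerate(eigenvalues):
    diff = "%.8f" % differences[i] if i < len(differences) else ""
    print("%2d %12.8f %12s %8.4f %8.4f"
          % (i + 1, ev, diff, proportions[i], cumulative[i]))
```

Because the negative eigenvalues reduce the total, the cumulative proportion rises above 1 before returning to 1.0000 in the last row.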
m. Eigenvectors – Eigenvectors are linear combinations of the
original variables. They tell you about the strength of the relationship
between the original variables and the (latent) factors.
Factor Pattern(n)
Factor1 Factor2 Factor3
ITEM13 INSTRUC WELL PREPARED 0.71324 -0.39814 0.09162
ITEM14 INSTRUC SCHOLARLY GRASP 0.70328 -0.33941 0.09770
ITEM15 INSTRUCTOR CONFIDENCE 0.72130 -0.24522 0.10565
ITEM16 INSTRUCTOR FOCUS LECTURES 0.64783 -0.18912 0.11128
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.78311 -0.07340 0.06652
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.73953 0.34501 0.11316
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS 0.61635 0.41490 0.15455
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.55015 0.23935 0.09344
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.73178 0.11691 0.00037
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION 0.61288 0.26113 -0.02304
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.81916 -0.02607 -0.34493
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.69493 0.01834 -0.38621
n. Factor Pattern – This table contains the unrotated factor
loadings, which are the correlations between the variable and the factor.
Because these are correlations, possible values range from -1 to +1.
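For this extraction method, each unrotated loading appears to be the corresponding eigenvector element multiplied by the square root of that factor's eigenvalue. The sketch below checks this relationship for two items using values copied from the Eigenvalues, Eigenvectors and Factor Pattern tables; the names are illustrative, and small discrepancies in the fifth decimal place are expected because the printed values are rounded.

```python
from math import sqrt

# First three eigenvalues of the reduced correlation matrix.
eigenvalues = [5.85107872, 0.80633384, 0.35999449]

# Eigenvector elements for two items, copied from the Eigenvectors table.
eigenvectors = {
    "ITEM13": [0.29486, -0.44338, 0.15269],
    "ITEM24": [0.28729, 0.02042, -0.64369],
}

# Unrotated loadings for the same items, copied from the Factor Pattern table.
reported_loadings = {
    "ITEM13": [0.71324, -0.39814, 0.09162],
    "ITEM24": [0.69493, 0.01834, -0.38621],
}


def loadings_from_eigen(vector, values):
    """Scale each eigenvector element by the square root of its eigenvalue."""
    return [v * sqrt(ev) for v, ev in zip(vector, values)]


for item, vector in eigenvectors.items():
    computed = loadings_from_eigen(vector, eigenvalues)
    print(item, ["%.5f" % x for x in computed], reported_loadings[item])
```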
o. Final Communality Estimates – This is the proportion of each
variable’s variance that can be explained by the factors (i.e., the underlying
latent continua). The values here indicate the proportion of each
variable’s variance that can be explained by the retained factors prior to the
rotation. Variables with high values are well represented in the common factor
space, while variables with low values are not well represented. (In this
example, we don’t have any particularly low values.) They are the reproduced
variances from the factors that you have extracted. You can find these values
on the diagonal of the reproduced correlation matrix.
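With uncorrelated factors, a variable's communality is the sum of its squared loadings across the retained factors. The sketch below applies this to two rows of the unrotated Factor Pattern table; the function name is illustrative, and the results are computed from the rounded loadings shown on this page rather than copied from SAS output.

```python
# Unrotated loadings copied from the Factor Pattern table.
loadings = {
    "ITEM13": [0.71324, -0.39814, 0.09162],
    "ITEM20": [0.55015, 0.23935, 0.09344],
}


def communality(row):
    """Sum of squared loadings across the retained factors."""
    return sum(value ** 2 for value in row)


for item, row in loadings.items():
    print(item, "%.5f" % communality(row))
```

ITEM13 comes out at roughly 0.68 and ITEM20 at roughly 0.37, so the three factors account for more of the variance in ITEM13 than in ITEM20.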
p. Total – 7.017407 = 5.8510787 + 0.8063338 +
0.3599945
q. Orthogonal Transformation Matrix – This is the matrix by
which you multiply the unrotated factor matrix to get the rotated factor matrix.
Rotated Factor Pattern(r)
Factor1(s) Factor2(s) Factor3(s)
ITEM13 INSTRUC WELL PREPARED 0.77075 0.17432 0.22622
ITEM14 INSTRUC SCHOLARLY GRASP 0.72592 0.21291 0.21693
ITEM15 INSTRUCTOR CONFIDENCE 0.67583 0.29506 0.21852
ITEM16 INSTRUCTOR FOCUS LECTURES 0.59084 0.29271 0.18181
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.58671 0.44625 0.28233
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.28638 0.73896 0.22512
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS 0.17044 0.72715 0.13462
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.22779 0.53993 0.15899
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.40195 0.53341 0.32106
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION 0.21766 0.55864 0.29137
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.44901 0.37716 0.66845
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.32388 0.32087 0.65159
r. Rotated Factor Pattern – This table contains the rotated
factor loadings, which are the correlations between the variable and the
factor. Because these are correlations, possible values range from -1 to +1.
s. Factor – These columns are the rotated factors that have been
extracted. These are the factors that analysts are most interested in and try
to name. For example, the first factor might be called "instructor competence"
because items like "instructor well prepared" and "instructor confidence" load
highly on it. The second factor might be called "relating to students" because
items like "instructor is sensitive to students" and "instructor allows me to
ask questions" load highly on it. The third factor has to do with comparisons
to other instructors and courses.
t. Final Communality Estimates – This is the proportion of each
variable’s variance that can be explained by the factors (i.e., the underlying
latent continua). The values here indicate the proportion of each
variable’s variance that can be explained by the retained factors after the
rotation. Variables with high values are well represented in the common
factor space, while variables with low values are not well represented.
(In this example, we don’t have any particularly low values.) They are the
reproduced variances from the factors that you have extracted. You can
find these values on the diagonal of the reproduced correlation matrix.
u. Total – 7.017407 = 2.9494952 + 2.6557251
+ 1.4121868
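The three values summed in u can be recovered from the Rotated Factor Pattern table: the variance explained by each rotated factor is the sum of the squared loadings in its column. The sketch below also recomputes the communality of ITEM13 from its rotated loadings, which gives the same result (about 0.6756) as the unrotated loadings do, since an orthogonal rotation redistributes variance among the factors without changing the communalities. Names are illustrative, and results agree with the output only up to rounding of the printed loadings.

```python
# Varimax-rotated loadings copied from the Rotated Factor Pattern table.
rotated = {
    "ITEM13": [0.77075, 0.17432, 0.22622],
    "ITEM14": [0.72592, 0.21291, 0.21693],
    "ITEM15": [0.67583, 0.29506, 0.21852],
    "ITEM16": [0.59084, 0.29271, 0.18181],
    "ITEM17": [0.58671, 0.44625, 0.28233],
    "ITEM18": [0.28638, 0.73896, 0.22512],
    "ITEM19": [0.17044, 0.72715, 0.13462],
    "ITEM20": [0.22779, 0.53993, 0.15899],
    "ITEM21": [0.40195, 0.53341, 0.32106],
    "ITEM22": [0.21766, 0.55864, 0.29137],
    "ITEM23": [0.44901, 0.37716, 0.66845],
    "ITEM24": [0.32388, 0.32087, 0.65159],
}

# Variance explained by each factor: column sums of squared loadings.
variance_explained = [
    sum(row[k] ** 2 for row in rotated.values()) for k in range(3)
]
total_variance = sum(variance_explained)

# Communality of ITEM13 from the rotated loadings (row sum of squares).
item13_communality = sum(value ** 2 for value in rotated["ITEM13"])

print(["%.4f" % v for v in variance_explained])
print("%.4f" % total_variance)
print("%.4f" % item13_communality)
```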
The partial output below shows the solution using a promax rotation. With an
oblique rotation, such as promax, the factors are permitted to be correlated
with one another. With an orthogonal rotation, such as the varimax shown above,
the factors are not permitted to be correlated (they are orthogonal to one
another). Oblique rotations, such as promax, produce both factor pattern and
factor structure matrices. The factor pattern matrix gives the standardized
regression coefficients for predicting each variable from the factors. The
factor structure matrix presents the correlations between the variables and the
factors. To completely interpret an oblique rotation, one needs to take into
account the factor pattern matrix, the factor structure matrix, and the
correlations among the factors. Please note that with orthogonal rotations the
factor pattern and the factor structure matrices are equal.
Inter-Factor Correlations
Factor1 Factor2 Factor3
Factor1 1.00000 0.59249 0.68096
Factor2 0.59249 1.00000 0.64863
Factor3 0.68096 0.64863 1.00000
Rotation Method: Promax (power = 3)
Rotated Factor Pattern (Standardized Regression Coefficients)
Factor1 Factor2 Factor3
ITEM13 INSTRUC WELL PREPARED 0.85071 -0.09207 0.03379
ITEM14 INSTRUC SCHOLARLY GRASP 0.78599 -0.02646 0.02406
ITEM15 INSTRUCTOR CONFIDENCE 0.69724 0.09144 0.01977
ITEM16 INSTRUCTOR FOCUS LECTURES 0.60443 0.12786 -0.00552
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.50870 0.28245 0.09868
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.06335 0.76328 0.03145
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS -0.04152 0.81872 -0.05898
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.07314 0.55467 0.00917
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.22482 0.42982 0.18931
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION -0.00866 0.52669 0.19811
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.16276 0.07474 0.71794
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.02282 0.04596 0.7487
Rotated Factor Structure (Correlations)
Factor1 Factor2 Factor3
ITEM13 INSTRUC WELL PREPARED 0.81917 0.43388 0.55337
ITEM14 INSTRUC SCHOLARLY GRASP 0.78670 0.45484 0.54213
ITEM15 INSTRUCTOR CONFIDENCE 0.76488 0.51738 0.55388
ITEM16 INSTRUCTOR FOCUS LECTURES 0.67643 0.48240 0.48901
ITEM17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES 0.74325 0.64786 0.62829
ITEM18 INSTRUCTOR SENSITIVE TO STUDENTS 0.53700 0.82121 0.56968
ITEM19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS 0.40340 0.75586 0.44379
ITEM20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS 0.40803 0.60396 0.41876
ITEM21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING 0.60841 0.68582 0.62121
ITEM22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION 0.43831 0.65006 0.53384
ITEM23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS 0.69593 0.63685 0.87725
ITEM24 COMPARED TO OTHER COURSES THIS COURSE WAS 0.55993 0.54515 0.79411
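The three promax tables above are tied together by a simple relationship: each row of the factor structure matrix equals the matching row of the factor pattern matrix multiplied by the inter-factor correlation matrix. The sketch below checks this for two items using values copied from the tables; the names are illustrative, and agreement is only up to rounding of the printed values.

```python
# Inter-factor correlations from the promax solution.
phi = [
    [1.00000, 0.59249, 0.68096],
    [0.59249, 1.00000, 0.64863],
    [0.68096, 0.64863, 1.00000],
]

# Rows of the Rotated Factor Pattern (standardized regression coefficients).
pattern = {
    "ITEM13": [0.85071, -0.09207, 0.03379],
    "ITEM18": [0.06335, 0.76328, 0.03145],
}

# Matching rows of the Rotated Factor Structure (correlations).
reported_structure = {
    "ITEM13": [0.81917, 0.43388, 0.55337],
    "ITEM18": [0.53700, 0.82121, 0.56968],
}


def structure_row(pattern_row, factor_corr):
    """Multiply one pattern row by the inter-factor correlation matrix."""
    n = len(pattern_row)
    return [sum(pattern_row[k] * factor_corr[k][j] for k in range(n))
            for j in range(n)]


for item, row in pattern.items():
    computed = structure_row(row, phi)
    print(item, ["%.5f" % x for x in computed], reported_structure[item])
```

If the inter-factor correlation matrix were the identity matrix, as it is for an orthogonal rotation, the multiplication would leave the pattern row unchanged, which is why the pattern and structure matrices coincide in that case.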