!!!!! This page is under construction !!!!!
Examples of Testing Group Invariance with Confirmatory Factor Analysis
Example 1.
At a university you wish to measure satisfaction with the university library among graduate students and faculty members. A four-item scale is constructed to measure satisfaction with library services. You have found that the four items load well on a single factor. But are these factor loadings different for graduate students and faculty members?
Example 2.
Example 3.
Description of the Data
Let’s pursue Example 1 from above. To give full credit, this example is adapted from page 156 of Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications by Bruce Thompson, and we recommend seeing that text for further information. The wording of the four items we focus on is below.
- Willingness to help users
- Giving users individual attention
- Employees who deal with users in a caring fashion
- Employees who are consistently courteous
The data file, https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt (from the Thompson book), has 100 responses from graduate students and 100 responses from faculty members. Each item was rated on a scale from 1 to 9. These four items have been shown to load well on a single factor that Thompson labels ServAff (service affect, i.e., warmth of affect of those providing library services), but we do not know whether the loadings would differ for graduate students and faculty members.
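Before fitting any model it can help to confirm the group sizes and look at the item means. The following is a minimal Python sketch of how one might do that, not part of the original analysis. The column order (id, type, per1 through per12) and the group codes (2 and 3) come from the Mplus `names` and `grouping` statements used later on this page. The assumption that the file is delimited by whitespace or commas is ours, and the function names are hypothetical.

```python
from collections import defaultdict

# Column order from the Mplus `names` statement: id type per1 - per12.
COLUMNS = ["id", "type"] + [f"per{i}" for i in range(1, 13)]
# Group codes from the Mplus `grouping` statement.
GROUP_LABELS = {2: "GradStud", 3: "Faculty"}


def parse_rows(lines):
    """Turn raw text lines into dicts keyed by column name.

    Assumes whitespace- or comma-delimited numeric fields (our assumption).
    """
    rows = []
    for line in lines:
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        rows.append(dict(zip(COLUMNS, (float(f) for f in fields))))
    return rows


def summarize_by_group(rows, items=("per1", "per2", "per3", "per4")):
    """Return {group label: (n, {item: mean})} for the chosen items."""
    grouped = defaultdict(list)
    for row in rows:
        code = int(row["type"])
        grouped[GROUP_LABELS.get(code, f"type {code}")].append(row)
    summary = {}
    for label, members in grouped.items():
        means = {item: sum(r[item] for r in members) / len(members) for item in items}
        summary[label] = (len(members), means)
    return summary
```

With a local copy of the data file saved as thompson.txt, `summarize_by_group(parse_rows(open("thompson.txt")))` should report 100 observations in each group if the file matches the description above.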
Some Strategies You Might Try
Before we show how you can analyze this via tests of group invariance using confirmatory factor analysis, let’s consider some other methods that you might use.
- Exploratory Factor Analysis – You might try running an exploratory factor analysis for the graduate students and a separate exploratory factor analysis for the faculty members. You might "eyeball" the factor loadings and try to persuade others that the factor structure "looks the same" for graduate students and faculty members. The biggest problem with this is that you do not have a statistical test which tells you whether the factor structure is the same.
- Separate Confirmatory Factor Analyses – Similar to the above strategy, you might try running a separate confirmatory factor analysis for each group, one for the graduate students and one for the faculty members. Again, you might try to "eyeball" the results and claim that the factor structure is similar for the two groups. As with the above analysis, you do not have a statistical test to determine if the factor structure is different for the two groups.
Mplus Results Testing for Group Invariance of Factor Structure
The way to test whether the factor structure is the same for the graduate students and faculty members is to run two confirmatory factor analyses. The first analysis assumes that the factor structure is the same for the two groups, and the second assumes that the factor structure is different for the two groups. We can then compare the fit of the two models to assess whether the model which permits a separate factor structure for each group fits significantly better than the model which assumes that the two groups have the same factor structure. Below, we first show the analysis that assumes the factor structure is the same for graduate students and faculty members, and then the analysis that assumes the factor structure is different for the two groups.
Analysis 1: Assuming Group Invariance (the same factor structure for students and faculty)
Here is the Mplus code for running this analysis based on https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt.
    title:
      Similar to Thompson page 156 - Constrained Model;
      Only analyzing one factor, ServAff
    data:
      file is "https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt";
    variable:
      names = id type per1 - per12;
      usevar = per1-per4 type ;
      grouping = type (2 = GradStud 3 = Faculty);
    model:
      ! (1) ;
      ServAff by per1* per2 per3 per4;  ! (2) ;
      ServAff@1;                        ! (3) ;
    output:
      standardized;
    ! Notes;
    ! (1) Estimate factor 1 by fixing variance to 1 and freeing all paths ;
    ! (2) per1* frees first path (default fix to 1);
    ! (3) fix factor variance to 1 (default is free variance);
Note that the grouping statement tells Mplus that the variable type indicates the grouping: a value of 2 indicates that the observation comes from a graduate student, and a value of 3 indicates that the observation comes from a faculty member. The model statement specifies the model for the first and subsequent groups. Because this model assumes group invariance, this one model is applied to the first and subsequent groups, yielding one common model (factor structure) for each group.
Here are excerpts of the Mplus results.
    SUMMARY OF ANALYSIS

    Number of groups                                   2
    Number of observations
       Group GRADSTUD                                100
       Group FACULTY                                 100

    Number of dependent variables                      4
    Number of independent variables                    0
    Number of continuous latent variables              1
This portion of the output tells us that we have two groups in our analysis, and that there are 100 observations in the group that we labeled GradStud and 100 observations in the group that we labeled Faculty. There are four dependent variables corresponding to our four indicators, and there is one continuous latent variable which corresponds to the factor that underlies these indicators.
    TESTS OF MODEL FIT

    Chi-Square Test of Model Fit
          Value                             15.013
          Degrees of Freedom                     8
          P-Value                           0.0588
This section contains information about the fit of the model. Later we will use this information to compare this model to the unconstrained model.
    MODEL RESULTS

                       Estimates     S.E.  Est./S.E.      Std    StdYX

    Group GRADSTUD

     SERVAFF  BY
        PER1               1.473    0.081     18.085    1.473    0.953
        PER2               1.458    0.093     15.645    1.458    0.869
        PER3               1.299    0.103     12.622    1.299    0.768
        PER4               1.417    0.095     14.893    1.417    0.799

     Variances
        SERVAFF            1.000    0.000      0.000    1.000    1.000

     Residual Variances
        PER1               0.217    0.082      2.637    0.217    0.091
        PER2               0.690    0.127      5.444    0.690    0.245
        PER3               1.176    0.184      6.401    1.176    0.411
        PER4               1.135    0.182      6.235    1.135    0.361

    Group FACULTY

     SERVAFF  BY
        PER1               1.473    0.081     18.085    1.473    0.965
        PER2               1.458    0.093     15.645    1.458    0.891
        PER3               1.299    0.103     12.622    1.299    0.763
        PER4               1.417    0.095     14.893    1.417    0.887

     Variances
        SERVAFF            1.000    0.000      0.000    1.000    1.000

     Residual Variances
        PER1               0.160    0.061      2.619    0.160    0.069
        PER2               0.551    0.098      5.611    0.551    0.206
        PER3               1.213    0.183      6.613    1.213    0.418
        PER4               0.541    0.095      5.672    0.541    0.212
The above output includes the loadings of the indicators on the factor we named ServAff (the rows under SERVAFF BY). Note that there is separate output for the graduate students and for the faculty members, and that the unstandardized factor loadings are identical for the two groups; this indicates that we have properly specified the constrained model. The fit of the model will suffer to the extent that this assumption, that the factor loadings are identical for the two groups, is inappropriate. Now we will run the analysis which assumes that the factor loadings for the two groups are different so we can compare the fit of the two models.
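As an aside, the StdYX column differs between the two groups even though the unstandardized loadings are constrained to be equal. For an indicator that loads on a single factor, the standardized loading is the loading times the factor standard deviation, divided by the model-implied standard deviation of the indicator, and the residual variances are free to differ by group. The Python sketch below reproduces the StdYX loadings from the numbers in the output above. It is an illustration under that standard formula, not Mplus code, and the function name is hypothetical.

```python
import math


def stdyx_loading(loading, residual_variance, factor_variance=1.0):
    """Standardized loading for an indicator that loads on one factor."""
    indicator_variance = loading ** 2 * factor_variance + residual_variance
    return loading * math.sqrt(factor_variance) / math.sqrt(indicator_variance)


# (unstandardized loading, residual variance) copied from the Analysis 1 output
grad = {
    "PER1": (1.473, 0.217),
    "PER2": (1.458, 0.690),
    "PER3": (1.299, 1.176),
    "PER4": (1.417, 1.135),
}
faculty = {
    "PER1": (1.473, 0.160),
    "PER2": (1.458, 0.551),
    "PER3": (1.299, 1.213),
    "PER4": (1.417, 0.541),
}

for item in grad:
    g = stdyx_loading(*grad[item])
    f = stdyx_loading(*faculty[item])
    print(f"{item}  GradStud {g:.3f}  Faculty {f:.3f}")
```

The printed values should agree with the StdYX column to about the third decimal place; small discrepancies are expected because the inputs are themselves rounded to three decimals.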
Analysis 2: Not Assuming Group Invariance (different factor structure for students and faculty)
Here is the Mplus code for running this analysis based on https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt
    title:
      Similar to Thompson page 156 - Unconstrained Model;
      Only analyzing one factor, ServAff
    data:
      file is "https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt";
    variable:
      names = id type per1 - per12;
      usevar = per1-per4 type ;
      grouping = type (2 = GradStud 3 = Faculty);
    model:
      ! Overall model;
      ! Estimate factor 1 by fixing variance to 1 and freeing all paths ;
      ServAff by per1* per2 per3 per4;  ! per1* frees first path (default fix to 1);
      ServAff@1;                        ! fix factor variance to 1, default is free;
    model Faculty:
      ! model for Faculty;
      ServAff by per1 per2 per3 per4;   ! Estimate ServAff structure for Faculty;
                                        ! per1 free, by default, in 2nd group;
    output:
      standardized;
Again, the grouping statement is used to specify that the variable type indicates the grouping: a value of 2 indicates that the observation comes from a graduate student, and a value of 3 indicates that the observation comes from a faculty member.
The model statement specifies the model for the first and subsequent groups. Subsequent model statements are used to indicate any differences in the model that should be estimated for the other groups. The model Faculty statement indicates any changes to the model that should be applied for the faculty members, namely that the faculty members should have their own independent estimates of the factor loadings on ServAff.
Here are excerpts of the Mplus results.
    TESTS OF MODEL FIT

    Chi-Square Test of Model Fit
          Value                              6.796
          Degrees of Freedom                     4
          P-Value                           0.1466
The section above contains information about the fit of the model. Later we will use this information to compare this model to the constrained model.
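The degrees of freedom reported for the two models (8 and 4) can be reproduced by counting sample moments and free parameters. The Python sketch below does this under an assumption of ours that is not stated on this page: only the covariance structure is analyzed (no means or intercepts), so each group contributes p(p + 1)/2 variances and covariances. We chose that assumption because it matches the reported values; the function names are hypothetical.

```python
def covariance_moments(n_indicators, n_groups):
    """Number of distinct variances and covariances across all groups."""
    return n_groups * n_indicators * (n_indicators + 1) // 2


def free_parameters(n_indicators, n_groups, loadings_equal):
    """Free parameters in a one-factor model with factor variance fixed at 1.

    Loadings are counted once if held equal across groups, otherwise once
    per group. Residual variances are always free in every group.
    """
    loadings = n_indicators if loadings_equal else n_indicators * n_groups
    residual_variances = n_indicators * n_groups
    return loadings + residual_variances


def model_df(n_indicators, n_groups, loadings_equal):
    """Degrees of freedom = sample moments minus free parameters."""
    return covariance_moments(n_indicators, n_groups) - free_parameters(
        n_indicators, n_groups, loadings_equal
    )


print("constrained df:  ", model_df(4, 2, loadings_equal=True))   # 20 - 12 = 8
print("unconstrained df:", model_df(4, 2, loadings_equal=False))  # 20 - 16 = 4
```

The two models differ only in the four extra loadings estimated for the faculty group, which is why the difference in degrees of freedom is 4.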
    MODEL RESULTS

                       Estimates     S.E.  Est./S.E.      Std    StdYX

    Group GRADSTUD

     SERVAFF  BY
        PER1               1.647    0.127     13.012    1.647    0.968
        PER2               1.544    0.140     10.999    1.544    0.877
        PER3               1.519    0.155      9.818    1.519    0.815
        PER4               1.485    0.153      9.726    1.485    0.810

     Variances
        SERVAFF            1.000    0.000      0.000    1.000    1.000

     Residual Variances
        PER1               0.180    0.088      2.042    0.180    0.062
        PER2               0.713    0.128      5.570    0.713    0.230
        PER3               1.164    0.185      6.291    1.164    0.335
        PER4               1.153    0.182      6.325    1.153    0.343

    Group FACULTY

     SERVAFF  BY
        PER1               1.291    0.103     12.529    1.291    0.950
        PER2               1.358    0.123     11.058    1.358    0.882
        PER3               1.036    0.135      7.674    1.036    0.686
        PER4               1.314    0.120     10.957    1.314    0.877

     Variances
        SERVAFF            1.000    0.000      0.000    1.000    1.000

     Residual Variances
        PER1               0.181    0.062      2.923    0.181    0.098
        PER2               0.529    0.099      5.331    0.529    0.223
        PER3               1.206    0.180      6.684    1.206    0.529
        PER4               0.521    0.096      5.434    0.521    0.232
The above output includes the loadings of the indicators on the factor we named ServAff. As before there is output associated with the graduate students and faculty members, but now the factor loadings are different for the two groups. But are they statistically significantly different? The next section will answer that question.
Comparing Analysis 1 and Analysis 2
We can now compare the two models. The null hypothesis that we are testing is that the four factor loadings for the graduate students are identical to the factor loadings for the faculty members. The alternative hypothesis is that the factor loadings are not all equal. To test this we examine the fit values from the two models, repeated below.
From analysis 1 (the constrained model) we obtained this measure of the fit of the model.
    TESTS OF MODEL FIT

    Chi-Square Test of Model Fit
          Value                             15.013
          Degrees of Freedom                     8
          P-Value                           0.0588
From analysis 2 (the unconstrained model) we obtained this measure of fit.
    TESTS OF MODEL FIT

    Chi-Square Test of Model Fit
          Value                              6.796
          Degrees of Freedom                     4
          P-Value                           0.1466
We can test the null hypothesis by taking the difference in the Chi-Square values,
15.013 – 6.796 = 8.217
and taking the difference in degrees of freedom,
8 – 4 = 4.
We can then consult a chi-square table to see whether a value of 8.217 with 4 degrees of freedom is significant. At the .05 level, the critical value of chi-square with 4 df is 9.49. Since 8.217 does not exceed 9.49, we conclude that the unconstrained model does not fit significantly better than the constrained model. Therefore, it would be appropriate to use the constrained model as a means of estimating this factor, using the same factor loadings for the graduate students and faculty members.
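Instead of a table lookup, the p-value for the chi-square difference can be computed directly. The following Python sketch uses only the standard library. It relies on a closed-form expression for the upper tail of the chi-square distribution that is valid only when the degrees of freedom are even, which covers the values on this page (4 and 8). The function names are our own and hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which holds only for even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def chi2_difference_test(chi2_constrained, df_constrained,
                         chi2_unconstrained, df_unconstrained):
    """Return (chi-square difference, df difference, p-value)."""
    diff = chi2_constrained - chi2_unconstrained
    ddf = df_constrained - df_unconstrained
    return diff, ddf, chi2_sf_even_df(diff, ddf)


# Fit values from Analysis 1 (constrained) and Analysis 2 (unconstrained)
diff, ddf, p = chi2_difference_test(15.013, 8, 6.796, 4)
print(f"chi-square difference = {diff:.3f}, df = {ddf}, p = {p:.3f}")
```

The p-value comes out at about 0.084, which is above .05 and so agrees with the table lookup. For odd degrees of freedom a general routine such as `scipy.stats.chi2.sf` would be needed instead.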
The Mplus Program
Here are links to the program, data, and output.
- The program for analysis 1 is /mplus/dae/CFAGroupInv1.inp
- The output for analysis 1 is /mplus/dae/cfagroupinv1.out
- The program for analysis 2 is /mplus/dae/CFAGroupInv2.inp
- The output for analysis 2 is /mplus/dae/cfagroupinv2.out
- The data file is https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt
Cautions, Flies in the Ointment
We have focused on a very simple example here, just to get you started. Here are some problems to be on the lookout for.
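- The model statement specifies the model for the first group and subsequent groups, so any parameter that should differ between groups must be specified in a group-specific model statement (such as model Faculty in Analysis 2).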
Samples for Writing this Up
Forthcoming.
See Also
- Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications by Bruce Thompson