Confirmatory Factor Analysis: Testing Invariance Across Groups

Examples of Testing Invariance Across Groups

Example 1.

At a university, you wish to measure satisfaction with the university library among graduate students and faculty members. A four-item scale is constructed to measure satisfaction with library services. You have found that the four items load well on a single factor. But are these factor loadings different for graduate students and faculty members?

Example 2.

Example 3.

Description of the Data

Let’s pursue Example 1 from above. This example is adapted from page 156 of Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications by Bruce Thompson, and we recommend that text for further information. The wording of the four items we focus on is given below.

Willingness to help users
Giving users individual attention
Employees who deal with users in a caring fashion
Employees who are consistently courteous

The data file, https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt (from the Thompson book), has 100 responses from graduate students and 100 responses from faculty members. Each item was rated on a scale from 1 to 9. These four items have been shown to load well on a single factor that Thompson labels ServAff (service affect, i.e., warmth of affect of those providing library services), but we do not know whether the loadings would differ for graduate students and faculty members.

Some Strategies You Might Try

Before we show how you can analyze this via tests of group invariance using confirmatory factor analysis, let’s consider some other methods that you might use.

Exploratory Factor Analysis – You might try running an exploratory factor analysis for the graduate students and a separate exploratory factor analysis for the faculty members. You might "eyeball" the factor loadings and try to persuade others that the factor structure "looks the same" for graduate students and faculty members. The biggest problem with this is that you do not have a statistical test which tells you whether the factor structure is the same.
Separate Confirmatory Factor Analyses – Similar to the above strategy, you might try running a separate confirmatory factor analysis for each group, one for the graduate students and one for the faculty members. Again, you might try to "eyeball" the results and claim that the factor structure is similar for the two groups. As with the above analysis, you do not have a statistical test to determine if the factor structure is different for the two groups.

Mplus Results Testing for Group Invariance of Factor Structure

The way to test whether the factor structure is the same for graduate students and faculty members is to run two confirmatory factor analyses. The first analysis assumes that the factor structure is the same for the two groups, and the second assumes that it is different. We can then compare the fit of the two models to assess whether the model that permits a separate factor structure for each group fits significantly better than the model that assumes a common factor structure. Below, we first show the analysis that assumes the factor structure is the same for graduate students and faculty members, and then the analysis that assumes it is different for the two groups.

Analysis 1: Assuming Group Invariance (the same factor structure for students and faculty)

Here is the Mplus code for running this analysis based on https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt.

title: 
  Similar to Thompson page 156 - Constrained Model;
  Only analyzing one factor, ServAff
data: 
  file is "thompson.txt";   ! local copy of the data file linked above;
variable:  
  names  = id type per1 - per12;
  usevar = per1-per4 type ;
  grouping = type (2 = GradStud 3 = Faculty);
model:                                            ! (1) ;                                             
  ServAff by per1* per2 per3 per4;                ! (2) ;
  ServAff@1;                                      ! (3) ;
output: 
  standardized;
  
! Notes;  
! (1) Estimate factor 1 by fixing variance to 1 and freeing all paths ;
! (2) per1* frees first path (default fix to 1);
! (3) fix factor variance to 1 (default is free variance);

Note that the grouping statement tells Mplus that the variable type indicates the grouping: a value of 2 indicates that the observation comes from a graduate student, and a value of 3 indicates that the observation comes from a faculty member. The model statement specifies the model for the first and subsequent groups. Because this model assumes group invariance, this one model is applied to every group, yielding one common model (factor structure) for both groups.

Here are excerpts of the Mplus results.

SUMMARY OF ANALYSIS

Number of groups                                                 2
Number of observations
   Group GRADSTUD                                              100
   Group FACULTY                                               100

Number of dependent variables                                    4
Number of independent variables                                  0
Number of continuous latent variables                            1

This portion of the output tells us that we have two groups in our analysis, and that there are 100 observations in the group that we labeled GradStud and 100 observations in the group that we labeled Faculty. There are four dependent variables corresponding to our four indicators, and there is one continuous latent variable which corresponds to the factor that underlies these indicators.
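Before running the model, it can be worth confirming the group sizes outside Mplus. The sketch below is a minimal Python example and not part of the original analysis. It assumes the data file is whitespace-delimited with columns in the order given by the names statement (id, type, per1 to per12); the helper name count_groups and the three demo rows are made up for illustration.

```python
from collections import Counter

def count_groups(lines, type_column=1):
    """Count rows per value of the grouping column (0-based index; 1 = type)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        counts[int(float(fields[type_column]))] += 1
    return counts

# Made-up rows in the assumed layout: id type per1 ... per12
demo_rows = [
    "1 2 " + " ".join(["5"] * 12),
    "2 2 " + " ".join(["7"] * 12),
    "3 3 " + " ".join(["6"] * 12),
]
demo_counts = count_groups(demo_rows)
print(demo_counts[2], demo_counts[3])

# With a local copy of the data file (assumed name thompson.txt):
# with open("thompson.txt") as f:
#     print(count_groups(f))
```

If the assumed layout holds, running the commented lines on the real file should report 100 rows for type 2 and 100 rows for type 3, matching the summary above.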

TESTS OF MODEL FIT

Chi-Square Test of Model Fit

          Value                             15.013
          Degrees of Freedom                     8
          P-Value                           0.0588

This section contains information about the fit of the model. Later we will use this information to compare this model to the unconstrained model.

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.    Std     StdYX

Group GRADSTUD

 SERVAFF  BY
    PER1               1.473    0.081     18.085    1.473    0.953
    PER2               1.458    0.093     15.645    1.458    0.869
    PER3               1.299    0.103     12.622    1.299    0.768
    PER4               1.417    0.095     14.893    1.417    0.799

 Variances
    SERVAFF            1.000    0.000      0.000    1.000    1.000

 Residual Variances
    PER1               0.217    0.082      2.637    0.217    0.091
    PER2               0.690    0.127      5.444    0.690    0.245
    PER3               1.176    0.184      6.401    1.176    0.411
    PER4               1.135    0.182      6.235    1.135    0.361

Group FACULTY

 SERVAFF  BY
    PER1               1.473    0.081     18.085    1.473    0.965
    PER2               1.458    0.093     15.645    1.458    0.891
    PER3               1.299    0.103     12.622    1.299    0.763
    PER4               1.417    0.095     14.893    1.417    0.887

 Variances
    SERVAFF            1.000    0.000      0.000    1.000    1.000

 Residual Variances
    PER1               0.160    0.061      2.619    0.160    0.069
    PER2               0.551    0.098      5.611    0.551    0.206
    PER3               1.213    0.183      6.613    1.213    0.418
    PER4               0.541    0.095      5.672    0.541    0.212

The above output includes the loadings of the indicators on the factor we named ServAff. Note that there is output for the graduate students and output for the faculty members, and that the unstandardized factor loadings are identical for the two groups. This indicates that we have properly specified the constrained model. The fit of the model will suffer to the extent that this assumption, that the factor loadings are identical for the two groups, is inappropriate. Now we will run the analysis which assumes that the factor loadings for the two groups are different so we can compare the fit of the two models.
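Although the unstandardized loadings are equal across groups, the StdYX values differ because each group has its own residual variances. The Python sketch below uses the usual standardization for a single-factor model with uncorrelated residuals. The function name std_yx is ours, and the formula is an assumption about how the StdYX column is computed; it reproduces the values printed above to within rounding.

```python
import math

def std_yx(loading, residual_variance, factor_variance=1.0):
    """Standardized loading: loading * sd(factor) / model-implied sd(indicator)."""
    implied_variance = loading ** 2 * factor_variance + residual_variance
    return loading * math.sqrt(factor_variance) / math.sqrt(implied_variance)

# PER1 has the same unstandardized loading (1.473) in both groups,
# but different residual variances (0.217 for GRADSTUD, 0.160 for FACULTY).
grad_per1 = std_yx(1.473, 0.217)
faculty_per1 = std_yx(1.473, 0.160)
print(round(grad_per1, 3), round(faculty_per1, 3))
```

The smaller residual variance for faculty gives PER1 a slightly larger standardized loading in that group, even though the unstandardized loading is constrained to be the same.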

Analysis 2: Not Assuming Group Invariance (different factor structure for students and faculty)

Here is the Mplus code for running this analysis based on https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt

title: 
  Similar to Thompson page 156 - Unconstrained Model;
  Only analyzing one factor, ServAff
data: 
  file is "thompson.txt";   ! local copy of the data file linked above;
variable:  
  names  = id type per1 - per12;
  usevar = per1-per4 type ;
  grouping = type (2 = GradStud 3 = Faculty);
model: ! Overall model;
  ! Estimate factor 1 by fixing variance to 1 and freeing all paths ;
  ServAff by per1* per2 per3 per4; ! per1* frees first path (default fix to 1);
  ServAff@1;                       ! fix factor variance to 1, default is free;
model Faculty:                     ! model for Faculty;
  ServAff  by per1 per2 per3 per4; ! Estimate ServAff structure for Faculty;
                                   ! per1 free, by default, in 2nd group;
output: 
  standardized;

Again, the grouping statement is used to specify that the variable type indicates the grouping: a value of 2 indicates that the observation comes from a graduate student, and a value of 3 indicates that the observation comes from a faculty member.

The model statement specifies the model for the first and subsequent groups. Subsequent group-specific model statements are used to indicate any differences in the model that should be estimated for the other groups. The model Faculty statement indicates the changes that should be applied for the faculty members, namely that the faculty members should have their own independent estimates of the factor loadings on ServAff.

Here are excerpts of the Mplus results.

TESTS OF MODEL FIT

Chi-Square Test of Model Fit

          Value                              6.796
          Degrees of Freedom                     4
          P-Value                           0.1466

The section above contains information about the fit of the model. Later we will use this information to compare this model to the constrained model.
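The degrees of freedom reported for the two models (8 for the constrained model, 4 for the unconstrained model) can be reproduced by counting. The Python sketch below assumes that only the variances and covariances of the four indicators are modeled in each group, with no means or intercepts; that assumption is ours, though it is consistent with the degrees of freedom Mplus reports here. The function name cfa_df is hypothetical.

```python
def cfa_df(n_indicators, n_groups, loadings_equal):
    """Degrees of freedom for a one-factor multiple-group CFA with the factor
    variance fixed at 1 and group-specific residual variances
    (covariance structure only; means are assumed not to be modeled)."""
    # Unique variances and covariances available in each group
    stats_per_group = n_indicators * (n_indicators + 1) // 2
    n_stats = stats_per_group * n_groups
    # Loadings: one shared set if constrained, otherwise one set per group
    n_loadings = n_indicators if loadings_equal else n_indicators * n_groups
    n_residuals = n_indicators * n_groups
    return n_stats - (n_loadings + n_residuals)

df_constrained = cfa_df(4, 2, loadings_equal=True)
df_unconstrained = cfa_df(4, 2, loadings_equal=False)
print(df_constrained, df_unconstrained)
```

Freeing the four loadings in the second group uses up four additional parameters, which is why the two models differ by 4 degrees of freedom.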

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.    Std     StdYX

Group GRADSTUD

 SERVAFF  BY
    PER1               1.647    0.127     13.012    1.647    0.968
    PER2               1.544    0.140     10.999    1.544    0.877
    PER3               1.519    0.155      9.818    1.519    0.815
    PER4               1.485    0.153      9.726    1.485    0.810

 Variances
    SERVAFF            1.000    0.000      0.000    1.000    1.000

 Residual Variances
    PER1               0.180    0.088      2.042    0.180    0.062
    PER2               0.713    0.128      5.570    0.713    0.230
    PER3               1.164    0.185      6.291    1.164    0.335
    PER4               1.153    0.182      6.325    1.153    0.343

Group FACULTY

 SERVAFF  BY
    PER1               1.291    0.103     12.529    1.291    0.950
    PER2               1.358    0.123     11.058    1.358    0.882
    PER3               1.036    0.135      7.674    1.036    0.686
    PER4               1.314    0.120     10.957    1.314    0.877

 Variances
    SERVAFF            1.000    0.000      0.000    1.000    1.000

 Residual Variances
    PER1               0.181    0.062      2.923    0.181    0.098
    PER2               0.529    0.099      5.331    0.529    0.223
    PER3               1.206    0.180      6.684    1.206    0.529
    PER4               0.521    0.096      5.434    0.521    0.232

The above output includes the loadings of the indicators on the factor we named ServAff. As before there is output associated with the graduate students and faculty members, but now the factor loadings are different for the two groups. But are they statistically significantly different? The next section will answer that question.

Comparing Analysis 1 and Analysis 2

We can now compare the two models. The null hypothesis that we are testing is that the four factor loadings for the graduate students are identical to the factor loadings for the faculty members. The alternative hypothesis is that the factor loadings are not all equal. To test this we examine the fit values from the two models, repeated below.

From analysis 1 (the constrained model) we obtained this measure of the fit of the model.

TESTS OF MODEL FIT

Chi-Square Test of Model Fit
          Value                             15.013
          Degrees of Freedom                     8
          P-Value                           0.0588

From analysis 2 (the unconstrained model) we obtained this measure of fit.

TESTS OF MODEL FIT

Chi-Square Test of Model Fit
          Value                              6.796
          Degrees of Freedom                     4
          P-Value                           0.1466

We can test the null hypothesis by taking the difference in the Chi-Square values,

15.013 – 6.796 = 8.217

and taking the difference in degrees of freedom,

8 – 4 = 4.

We can then consult a chi-square table to see whether a value of 8.217 with 4 degrees of freedom is significant. At the .05 level, the critical value of chi-square with 4 df is 9.49. Since 8.217 does not exceed 9.49, we fail to reject the null hypothesis and conclude that the unconstrained model does not fit significantly better than the constrained model. Therefore, it would be appropriate to use the constrained model as a means of estimating this factor, using the same factor loadings for the graduate students and faculty members.
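The same comparison can be done without a table by computing the p-value of the chi-square difference directly. The Python sketch below uses only the standard library; the helper chi2_sf_even_df is our own and relies on a closed-form expression for the upper-tail probability that is valid only when the degrees of freedom are even, as they are here. The chi-square values and degrees of freedom are copied from the two outputs above.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate.
    Closed form valid only for positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    total = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * total

# Fit statistics from Analysis 1 (constrained) and Analysis 2 (unconstrained)
chi2_constrained, df_constrained = 15.013, 8
chi2_unconstrained, df_unconstrained = 6.796, 4

chi2_diff = chi2_constrained - chi2_unconstrained
df_diff = df_constrained - df_unconstrained
p_value = chi2_sf_even_df(chi2_diff, df_diff)
print(round(chi2_diff, 3), df_diff, round(p_value, 3))
```

The resulting p-value is about .08, which is above .05 and therefore agrees with the table lookup: the difference in fit between the two models is not statistically significant.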

The Mplus Program

Here are links to the program, data, and output.

The program for analysis 1 is /mplus/dae/CFAGroupInv1.inp
The output for analysis 1 is /mplus/dae/cfagroupinv1.out
The program for analysis 2 is /mplus/dae/CFAGroupInv2.inp
The output for analysis 2 is /mplus/dae/cfagroupinv2.out
The data file is https://stats.idre.ucla.edu/wp-content/uploads/2016/02/thompson.txt

Cautions, Flies in the Ointment

We have focused on a very simple example here, just to get you started. Here are some problems to be on the lookout for.

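The model statement specifies the model for the first group and all subsequent groups, not only the first one. Any difference between groups must be requested in a separate group-specific statement, such as model Faculty.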

Samples for Writing this Up

Forthcoming.
