Mplus Textbook Examples
Applied Latent Class Analysis
Chapter 2 Basic Concepts and Procedures in Single- and Multiple-Group Latent Class Analysis
by
Allan L. McCutcheon
Table 2 on page 60 using data set https://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat.
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(2);
Analysis:
  Type = mixture ;

FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES
BASED ON THE ESTIMATED MODEL

  Latent Classes
      1        155.68287        0.72075
      2         60.31713        0.27925

RESULTS IN PROBABILITY SCALE

Latent Class 1

 A
    Category 1        0.286    0.042     6.841
    Category 2        0.714    0.042    17.045
 B
    Category 1        0.646    0.049    13.175
    Category 2        0.354    0.049     7.220
 C
    Category 1        0.670    0.051    13.140
    Category 2        0.330    0.051     6.461
 D
    Category 1        0.868    0.039    22.325
    Category 2        0.132    0.039     3.406

Latent Class 2

 A
    Category 1        0.007    0.025     0.269
    Category 2        0.993    0.025    39.267
 B
    Category 1        0.073    0.068     1.088
    Category 2        0.927    0.068    13.716
 C
    Category 1        0.060    0.067     0.896
    Category 2        0.940    0.067    13.985
 D
    Category 1        0.231    0.098     2.351
    Category 2        0.769    0.098     7.833
Table 3 on page 62. The output in the book was produced by LEM, whose default coding scheme is effect coding, whereas the only scheme possible in Mplus is dummy coding. The two coding schemes give equivalent results. Here we show how to convert between the dummy-coded results from Mplus and the effect-coded results in the book.
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(2);
Analysis:
  Type = mixture ;

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

Latent Class 1

 Thresholds
    A$1               -0.913    0.205     -4.457
    B$1                0.601    0.214      2.805
    C$1                0.710    0.231      3.075
    D$1                1.880    0.338      5.556

Latent Class 2

 Thresholds
    A$1               -4.983    3.741     -1.332
    B$1               -2.535    0.992     -2.554
    C$1               -2.747    1.187     -2.314
    D$1               -1.203    0.553     -2.176

Categorical Latent Variables

 Means
    X#1                0.948    0.300      3.162
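As a cross-check, the thresholds in this output can be related back to the probability-scale output shown for Table 2. The following is a minimal Python sketch, assuming the usual logit form for a binary item, in which the probability of Category 1 within a class is 1/(1 + exp(-threshold)) and the proportion in class 1 of a two-class model is exp(mean)/(1 + exp(mean)). The function names are illustrative and are not part of Mplus.

```python
import math

def category1_probability(threshold):
    """P(Category 1 | class) for a binary item under a logit link (assumed)."""
    return 1.0 / (1.0 + math.exp(-threshold))

def class1_proportion(mean):
    """Proportion in class 1 of a two-class model; class 2 is the reference."""
    return math.exp(mean) / (1.0 + math.exp(mean))

# Thresholds copied from the MODEL RESULTS output for Table 3.
thresholds = {
    1: {"A": -0.913, "B": 0.601, "C": 0.710, "D": 1.880},
    2: {"A": -4.983, "B": -2.535, "C": -2.747, "D": -1.203},
}

for latent_class, items in thresholds.items():
    for item, tau in items.items():
        p1 = category1_probability(tau)
        print(f"Class {latent_class} {item}: "
              f"P(cat 1) = {p1:.3f}, P(cat 2) = {1 - p1:.3f}")

# Mean of the categorical latent variable (X#1) from the same output.
print(f"Class 1 proportion = {class1_proportion(0.948):.3f}")
```

Up to rounding, the printed probabilities should agree with the RESULTS IN PROBABILITY SCALE section for Table 2.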
The parameter for X in the book is .948/2 = .474. The remaining parameters can be converted as follows. For each item, let S be the single-variable parameter and T the two-variable parameter from the book, and let L1 and L2 be the Mplus thresholds for latent class 1 and latent class 2. The relationship between them is
S + T = L1/2
S – T = L2/2
We did the calculation in Stata:
. list, clean

            s       t
  1.   -1.472   1.016
  2.    -.483    .784
  3.    -.509    .864
  4.     .169    .771

. gen l1 = (s+t)*2
. gen l2 = (s-t)*2
. list, clean

            s       t          l1       l2
  1.   -1.472   1.016   -.9119999   -4.976
  2.    -.483    .784    .6019999   -2.534
  3.    -.509    .864         .71   -2.746
  4.     .169    .771        1.88   -1.204
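The same relationship can be inverted to go from the Mplus thresholds to the book's parameters. Solving S + T = L1/2 and S – T = L2/2 gives S = (L1 + L2)/4 and T = (L1 – L2)/4. The Python sketch below applies this to the thresholds from the Mplus output for Table 3; the function name is illustrative, and small differences from the book's values are due to rounding.

```python
def effect_coded(l1, l2):
    """Solve S + T = L1/2 and S - T = L2/2 for (S, T)."""
    s = (l1 + l2) / 4.0
    t = (l1 - l2) / 4.0
    return s, t

# (class 1 threshold, class 2 threshold) for each item, from the Mplus output.
mplus_thresholds = {
    "A": (-0.913, -4.983),
    "B": (0.601, -2.535),
    "C": (0.710, -2.747),
    "D": (1.880, -1.203),
}

for item, (l1, l2) in mplus_thresholds.items():
    s, t = effect_coded(l1, l2)
    print(f"{item}: S = {s:.3f}, T = {t:.3f}")
```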
Table 4 on page 69 using Ego’s Dilemma Data, https://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat. Notice that Mplus computes AIC and BIC with different formulae from those used in the book. Although the values differ, the difference in AIC between two models is the same whichever formula is used. For example, the difference in AIC between the two models in Table 4 is 59.08-(-9.28) = 68.36 based on the output from the book, and 1095.300 -1026.935 = 68.365 based on the Mplus output. The two agree as far as comparing models is concerned, which is how AIC is used.
More precisely, the formulae for AIC and BIC from the book are
AIC = G2 – 2*df
BIC = G2 – df*[ln(N)],
where G2 is the likelihood ratio chi-square, df is the number of degrees of freedom and N is the sample size.
The formulae for AIC and BIC from Mplus are
AIC = -2*logL + 2*r
BIC = -2*logL + r*[ln(N)],
where r is the number of free model parameters and N is the sample size.
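To see why the two sets of formulae lead to the same model comparison, note that G2 equals -2*logL plus a constant that depends only on the data, and that df equals a constant minus r, so for models fitted to the same table the two versions of AIC (and of BIC) differ by the same amount for every model. The Python sketch below checks this numerically using the log-likelihoods, parameter counts, and likelihood ratio chi-squares reported in the Mplus output for Models I and II in this section. N = 216 is taken as the sum of the final class counts reported for Table 2, and the function names are illustrative.

```python
import math

def aic_mplus(log_l, r):
    return -2.0 * log_l + 2.0 * r

def bic_mplus(log_l, r, n):
    return -2.0 * log_l + r * math.log(n)

def aic_book(g2, df):
    return g2 - 2.0 * df

def bic_book(g2, df, n):
    return g2 - df * math.log(n)

N = 216  # sum of the final class counts (155.68287 + 60.31713)

# Values copied from the Mplus output for each model.
models = {
    "I": {"log_l": -543.650, "r": 4, "g2": 81.084, "df": 11},
    "II": {"log_l": -504.468, "r": 9, "g2": 2.720, "df": 6},
}
m1, m2 = models["I"], models["II"]

aic_diff_mplus = aic_mplus(m1["log_l"], m1["r"]) - aic_mplus(m2["log_l"], m2["r"])
aic_diff_book = aic_book(m1["g2"], m1["df"]) - aic_book(m2["g2"], m2["df"])
bic_diff_mplus = bic_mplus(m1["log_l"], m1["r"], N) - bic_mplus(m2["log_l"], m2["r"], N)
bic_diff_book = bic_book(m1["g2"], m1["df"], N) - bic_book(m2["g2"], m2["df"], N)

print(f"AIC difference: Mplus {aic_diff_mplus:.3f}, book {aic_diff_book:.3f}")
print(f"BIC difference: Mplus {bic_diff_mplus:.3f}, book {bic_diff_book:.3f}")
```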
Model I: Independence
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(1);
Analysis:
  Type = mixture ;

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                        -543.650

Information Criteria
    Number of Free Parameters              4
    Akaike (AIC)                    1095.300
    Bayesian (BIC)                  1108.801
    Sample-Size Adjusted BIC        1096.125
      (n* = (n + 2) / 24)

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                            104.107
    Degrees of Freedom                    11
    P-Value                           0.0000

    Likelihood Ratio Chi-Square
    Value                             81.084
    Degrees of Freedom                    11
    P-Value                           0.0000
Model II: Two-Class LCM
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(2);
Analysis:
  Type = mixture ;

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                        -504.468

Information Criteria
    Number of Free Parameters              9
    Akaike (AIC)                    1026.935
    Bayesian (BIC)                  1057.313
    Sample-Size Adjusted BIC        1028.793
      (n* = (n + 2) / 24)
    Entropy                            0.719

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                              2.720
    Degrees of Freedom                     6
    P-Value                           0.8431

    Likelihood Ratio Chi-Square
    Value                              2.720
    Degrees of Freedom                     6
    P-Value                           0.8431
Table 5 on page 71 using Ego’s Dilemma Data, https://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat.
Model H1: two-class LCM
This is the model above.
Model H2: H1 + B & C parallel indicators
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  [b$1 c$1] (1);
  %x#1%
  [b$1 c$1] (2);

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                        -504.551

Information Criteria
    Number of Free Parameters              7
    Akaike (AIC)                    1023.101
    Bayesian (BIC)                  1046.728
    Sample-Size Adjusted BIC        1024.546
      (n* = (n + 2) / 24)
    Entropy                            0.720

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                              2.838
    Degrees of Freedom                     8
    P-Value                           0.9441

    Likelihood Ratio Chi-Square
    Value                              2.886
    Degrees of Freedom                     8
    P-Value                           0.9413
Model H3: H2 + D equal error rate
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  [b$1 c$1] (1);
  [d$1] (p1);
  %x#1%
  [b$1 c$1] (2);
  [d$1] (q1);
model constraint:
  p1 = -q1;

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                        -504.933

Information Criteria
    Number of Free Parameters              6
    Akaike (AIC)                    1021.866
    Bayesian (BIC)                  1042.117
    Sample-Size Adjusted BIC        1023.104
      (n* = (n + 2) / 24)
    Entropy                            0.759

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                              3.603
    Degrees of Freedom                     9
    P-Value                           0.9356

    Likelihood Ratio Chi-Square
    Value                              3.650
    Degrees of Freedom                     9
    P-Value                           0.9329
Model H4: H3 + A as perfect indicator for class 2
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  [b$1 c$1] (1);
  [d$1] (p1);
  [a$1@-15];
  %x#1%
  [b$1 c$1] (2);
  [d$1] (q1);
  [a$1];
model constraint:
  p1 = -q1;

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                        -504.937

Information Criteria
    Number of Free Parameters              5
    Akaike (AIC)                    1019.874
    Bayesian (BIC)                  1036.750
    Sample-Size Adjusted BIC        1020.906
      (n* = (n + 2) / 24)
    Entropy                            0.763

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                              3.605
    Degrees of Freedom                    10
    P-Value                           0.9634

    Likelihood Ratio Chi-Square
    Value                              3.659
    Degrees of Freedom                    10
    P-Value                           0.9614
Table 6 on page 72 based on Model H4.
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  [b$1 c$1] (1);
  [d$1] (p1);
  [a$1@-15];
  %x#1%
  [b$1 c$1] (2);
  [d$1] (q1);
  [a$1];
model constraint:
  p1 = -q1;

FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES
BASED ON THE ESTIMATED MODEL

  Latent Classes
      1        163.60581        0.75743
      2         52.39419        0.24257

RESULTS IN PROBABILITY SCALE

Latent Class 1

 A
    Category 1        0.275    0.037     7.506
    Category 2        0.725    0.037    19.783
 B
    Category 1        0.636    0.031    20.591
    Category 2        0.364    0.031    11.777
 C
    Category 1        0.636    0.031    20.591
    Category 2        0.364    0.031    11.777
 D
    Category 1        0.852    0.033    25.909
    Category 2        0.148    0.033     4.489

Latent Class 2

 A
    Category 1        0.000    0.000     0.000
    Category 2        1.000    0.000     0.000
 B
    Category 1        0.046    0.046     1.012
    Category 2        0.954    0.046    20.894
 C
    Category 1        0.046    0.046     1.012
    Category 2        0.954    0.046    20.894
 D
    Category 1        0.148    0.033     4.489
    Category 2        0.852    0.033    25.909
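The restrictions in Model H4 can be read off this probability-scale output. The following is a minimal Python sketch, assuming the logit form P(Category 1) = 1/(1 + exp(-threshold)). Fixing the threshold of A at -15 makes Category 1 of A practically impossible in class 2, and the constraint p1 = -q1 makes the Category 1 probability of D in class 2 equal to the Category 2 probability of D in class 1. The class 1 threshold for D is recovered here from the reported probability 0.852 rather than taken from Mplus output, and the function names are illustrative.

```python
import math

def category1_probability(threshold):
    """P(Category 1 | class) for a binary item under a logit link (assumed)."""
    return 1.0 / (1.0 + math.exp(-threshold))

def logit(p):
    """Inverse of category1_probability."""
    return math.log(p / (1.0 - p))

# A: threshold fixed at -15 in class 2 ([a$1@-15] under %overall%).
p_a_class2 = category1_probability(-15.0)

# D: q1 is the class 1 threshold, p1 = -q1 is the class 2 threshold.
q1 = logit(0.852)  # recovered from the reported class 1 probability
p1 = -q1
p_d_class1 = category1_probability(q1)
p_d_class2 = category1_probability(p1)

print(f"A, class 2: P(cat 1) = {p_a_class2:.7f}")
print(f"D, class 1: P(cat 1) = {p_d_class1:.3f}")
print(f"D, class 2: P(cat 1) = {p_d_class2:.3f}")
```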
Table 8 on page 75 using abortion approval data, https://stats.idre.ucla.edu/wp-content/uploads/2016/02/page75.dat.
Model H1: two-class LCM
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page75.dat ;
Variable:
  Names are a b c d freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(2);
Analysis:
  Type = mixture ;

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                       -2773.793

Information Criteria
    Number of Free Parameters              9
    Akaike (AIC)                    5565.586
    Bayesian (BIC)                  5614.323
    Sample-Size Adjusted BIC        5585.731
      (n* = (n + 2) / 24)
    Entropy                            0.925

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                            214.746
    Degrees of Freedom                     6
    P-Value                           0.0000

    Likelihood Ratio Chi-Square
    Value                            179.853
    Degrees of Freedom                     6
    P-Value                           0.0000
Model H2: three-class model with linear restrictions.
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page75.dat;
Variable:
  Names are a b c d freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(3);
Analysis:
  Type = mixture ;
model:
  %overall%
  [a$1*-1] (a11);
  [b$1*-1] (b11);
  [c$1*-1] (c11);
  [d$1*-1] (d11);
  %x#2%
  [a$1*0] (a12);
  [b$1*0] (b12);
  [c$1*0] (c12);
  [d$1*0] (d12);
  %x#3%
  [a$1*1] (a13);
  [b$1*1] (b13);
  [c$1*1] (c13);
  [d$1*1] (d13);
model constraint:
  a13 = 2*a12 - a11;
  b13 = 2*b12 - b11;
  c13 = 2*c12 - c11;
  d13 = 2*d12 - d11;

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                       -2685.032

Information Criteria
    Number of Free Parameters             10
    Akaike (AIC)                    5390.063
    Bayesian (BIC)                  5444.215
    Sample-Size Adjusted BIC        5412.447
      (n* = (n + 2) / 24)
    Entropy                            0.824

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                              2.339
    Degrees of Freedom                     5
    P-Value                           0.8005

    Likelihood Ratio Chi-Square
    Value                              2.331
    Degrees of Freedom                     5
    P-Value                           0.8017
Model H3: H2 + A, B, C restricted to equal association
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page75.dat ;
Variable:
  Names are a b c d freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(3);
Analysis:
  Type = mixture ;
  starts = 50 5;
  miteration = 10000;
  mciterations = 10;
  iterations =10000;
model:
  %overall%   !for x#1
  [a$1] (a11);
  [b$1] (b11);
  [c$1] (c11);
  [d$1] (d11);
  %x#2%
  [a$1] (a12);
  [b$1] (b12);
  [c$1] (c12);
  [d$1] (d12);
  %x#3%
  [a$1] (a13);
  [b$1] (b13);
  [c$1] (c13);
  [d$1] (d13);
model constraint:
  a13 = 2*a12 - a11;
  b12 = b11 + a12 - a11;
  b13 = b11 + 2*(a12-a11);
  c12 = c11 + a12 - a11;
  c13 = c11 + 2*(a12 - a11);
  d13 = 2*d12 - d11;

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                       -2685.653

Information Criteria
    Number of Free Parameters              8
    Akaike (AIC)                    5387.305
    Bayesian (BIC)                  5430.627
    Sample-Size Adjusted BIC        5405.212
      (n* = (n + 2) / 24)
    Entropy                            0.824

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                              3.553
    Degrees of Freedom                     7
    P-Value                           0.8296

    Likelihood Ratio Chi-Square
    Value                              3.573
    Degrees of Freedom                     7
    P-Value                           0.8275
Table 9 on page 77 using model H3 from the example above. As discussed for Table 3, the output produced by Mplus 3 differs from the book because of the difference in coding schemes.
Data:
  File is c:alcahttps://stats.idre.ucla.edu/wp-content/uploads/2016/02/page75.dat ;
Variable:
  Names are a b c d freq;
  Missing are all (-9999) ;
  usev are a b c d freq;
  weight is freq (freq);
  categorical are a b c d;
  classes = x(3);
Analysis:
  Type = mixture ;
  starts = 50 5;
  miteration = 10000;
  mciterations = 10;
  iterations =10000;
model:
  %overall%   !for x#1
  [a$1] (a11);
  [b$1] (b11);
  [c$1] (c11);
  [d$1] (d11);
  %x#2%
  [a$1] (a12);
  [b$1] (b12);
  [c$1] (c12);
  [d$1] (d12);
  %x#3%
  [a$1] (a13);
  [b$1] (b13);
  [c$1] (c13);
  [d$1] (d13);
model constraint:
  a13 = 2*a12 - a11;
  b12 = b11 + a12 - a11;
  b13 = b11 + 2*(a12-a11);
  c12 = c11 + a12 - a11;
  c13 = c11 + 2*(a12 - a11);
  d13 = 2*d12 - d11;

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

Latent Class 1

 Thresholds
    A$1               -4.301    0.288    -14.942
    B$1               -3.847    0.277    -13.870
    C$1               -5.335    0.309    -17.266
    D$1              -10.045    0.383    -26.215

Latent Class 2

 Thresholds
    A$1               -0.189    0.154     -1.224
    B$1                0.265    0.153      1.727
    C$1               -1.223    0.169     -7.244
    D$1               -0.144    0.192     -0.754

Latent Class 3

 Thresholds
    A$1                3.922    0.219     17.934
    B$1                4.376    0.230     18.989
    C$1                2.888    0.211     13.684
    D$1                9.756    0.010    988.995

Categorical Latent Variables

 Means
    X#1                0.162    0.068      2.365
    X#2               -0.581    0.092     -6.346
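The linear restrictions in Model H3 can be checked directly against these threshold estimates. The Python sketch below recomputes each constrained threshold from the free ones using the equations in the model constraint section of the input. The variable names mirror the parameter labels in the input, and agreement is only expected up to the rounding of the printed output.

```python
# Threshold estimates copied from the MODEL RESULTS output (labels as in the input).
a11, b11, c11, d11 = -4.301, -3.847, -5.335, -10.045   # latent class 1
a12, b12, c12, d12 = -0.189, 0.265, -1.223, -0.144     # latent class 2
a13, b13, c13, d13 = 3.922, 4.376, 2.888, 9.756        # latent class 3

# Each entry pairs a reported estimate with the value implied by its constraint.
checks = {
    "a13": (a13, 2 * a12 - a11),
    "b12": (b12, b11 + a12 - a11),
    "b13": (b13, b11 + 2 * (a12 - a11)),
    "c12": (c12, c11 + a12 - a11),
    "c13": (c13, c11 + 2 * (a12 - a11)),
    "d13": (d13, 2 * d12 - d11),
}

for label, (reported, implied) in checks.items():
    print(f"{label}: reported {reported:.3f}, implied {implied:.3f}")
```

Read this way, the constraints place the thresholds of each item at equal spacing across the three classes, with A, B and C sharing the common spacing a12 - a11.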
Table 10 on page 79.
Model H1:
Data:
  File is c:alcapage59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d group freq;
  weight is freq (freq);
  categorical are a b c d group;
  classes = g(2) x(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  x#1 on g#1;
  [x#1];
model g:
  %g#1%
  [group$1@-15];
  %g#2%
  [group$1@15];

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                       -1324.989

Information Criteria
    Number of Free Parameters             19
    Akaike (AIC)                    2687.978
    Bayesian (BIC)                  2765.278
    Sample-Size Adjusted BIC        2704.982
      (n* = (n + 2) / 24)
    Entropy                            0.863

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                              9.063
    Degrees of Freedom                    12
    P-Value                           0.6976

    Likelihood Ratio Chi-Square
    Value                              8.253
    Degrees of Freedom                    12
    P-Value                           0.7650
Model H2:
Data:
  File is c:alcapage59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d group freq;
  weight is freq (freq);
  categorical are a b c d group;
  classes = g(2) x(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  x#1 on g#1;
  [x#1];
model g:
  %g#1%
  [group$1@-15];
  %g#2%
  [group$1@15];
model x:
  %x#1%
  [a$1 b$1 c$1 d$1];
  %x#2%
  [a$1 b$1 c$1 d$1];

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                       -1332.597

Information Criteria
    Number of Free Parameters             11
    Akaike (AIC)                    2687.194
    Bayesian (BIC)                  2731.946
    Sample-Size Adjusted BIC        2697.039
      (n* = (n + 2) / 24)
    Entropy                            0.853

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                             24.774
    Degrees of Freedom                    20
    P-Value                           0.2102

    Likelihood Ratio Chi-Square
    Value                             23.469
    Degrees of Freedom                    20
    P-Value                           0.2663
Model H3:
Data:
  File is c:alca59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d group freq;
  weight is freq (freq);
  categorical are a b c d group;
  classes = g(2) x(2);
Analysis:
  Type = mixture ;
model g:
  %g#1%
  [group$1@-15];
  %g#2%
  [group$1@15];
model x:
  %x#1%
  [a$1 b$1 c$1 d$1];
  %x#2%
  [a$1 b$1 c$1 d$1];

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                       -1332.603

Information Criteria
    Number of Free Parameters             10
    Akaike (AIC)                    2685.205
    Bayesian (BIC)                  2725.889
    Sample-Size Adjusted BIC        2694.155
      (n* = (n + 2) / 24)
    Entropy                            0.853

Chi-Square Test of Model Fit for the Binary and Ordered Categorical
(Ordinal) Outcomes

    Pearson Chi-Square
    Value                             24.815
    Degrees of Freedom                    21
    P-Value                           0.2553

    Likelihood Ratio Chi-Square
    Value                             23.481
    Degrees of Freedom                    21
    P-Value                           0.3189
Table 11 on page 80 using Model H3 from previous example.
Data:
  File is c:alcapage59_a.dat ;
Variable:
  Names are a b c d group freq;
  Missing are all (-9999) ;
  usev are a b c d group freq;
  weight is freq (freq);
  categorical are a b c d group;
  classes = g(2) x(2);
Analysis:
  Type = mixture ;
model g:
  %g#1%
  [group$1@-15];
  %g#2%
  [group$1@15];
model x:
  %x#1%
  [a$1 b$1 c$1 d$1];
  %x#2%
  [a$1 b$1 c$1 d$1];

LATENT TRANSITION PROBABILITIES BASED ON THE ESTIMATED MODEL

  G Classes (Rows) by X Classes (Columns)

             1        2

   1     0.292    0.708
   2     0.292    0.708

RESULTS IN PROBABILITY SCALE

Latent Class Pattern 1 1

 A
    Category 1        0.010    0.023     0.454
    Category 2        0.990    0.023    42.848
 B
    Category 1        0.108    0.057     1.895
    Category 2        0.892    0.057    15.711
 C
    Category 1        0.021    0.058     0.355
    Category 2        0.979    0.058    16.940
 D
    Category 1        0.319    0.072     4.434
    Category 2        0.681    0.072     9.474
 GROUP
    Category 1        0.000    0.000     0.000
    Category 2        1.000    0.000     0.000

Latent Class Pattern 1 2

 A
    Category 1        0.345    0.034    10.072
    Category 2        0.655    0.034    19.096
 B
    Category 1        0.567    0.034    16.808
    Category 2        0.433    0.034    12.850
 C
    Category 1        0.717    0.043    16.491
    Category 2        0.283    0.043     6.511
 D
    Category 1        0.849    0.029    28.966
    Category 2        0.151    0.029     5.150
 GROUP
    Category 1        0.000    0.000     0.000
    Category 2        1.000    0.000     0.000

Latent Class Pattern 2 1

 A
    Category 1        0.010    0.023     0.454
    Category 2        0.990    0.023    42.848
 B
    Category 1        0.108    0.057     1.895
    Category 2        0.892    0.057    15.711
 C
    Category 1        0.021    0.058     0.355
    Category 2        0.979    0.058    16.940
 D
    Category 1        0.319    0.072     4.434
    Category 2        0.681    0.072     9.474
 GROUP
    Category 1        1.000    0.000     0.000
    Category 2        0.000    0.000     0.000

Latent Class Pattern 2 2

 A
    Category 1        0.345    0.034    10.072
    Category 2        0.655    0.034    19.096
 B
    Category 1        0.567    0.034    16.808
    Category 2        0.433    0.034    12.850
 C
    Category 1        0.717    0.043    16.491
    Category 2        0.283    0.043     6.511
 D
    Category 1        0.849    0.029    28.966
    Category 2        0.151    0.029     5.150
 GROUP
    Category 1        1.000    0.000     0.000
    Category 2        0.000    0.000     0.000
Table 12 on page 81 using model H3 from the previous example. Because of the difference in coding schemes, the results from Mplus 3 differ from the results in the book. The two sets of results are equivalent, however, and each can be converted to the other.

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

Latent Class Pattern 1 1

 Thresholds
    A$1               -4.548    2.227     -2.042
    B$1               -2.115    0.591     -3.576
    C$1               -3.865    2.875     -1.344
    D$1               -0.759    0.331     -2.293
    GROUP$1          -15.000    0.000      0.000

Latent Class Pattern 1 2

 Thresholds
    A$1               -0.640    0.152     -4.218
    B$1                0.269    0.137      1.955
    C$1                0.929    0.214      4.338
    D$1                1.727    0.229      7.552
    GROUP$1          -15.000    0.000      0.000

Latent Class Pattern 2 1

 Thresholds
    A$1               -4.548    2.227     -2.042
    B$1               -2.115    0.591     -3.576
    C$1               -3.865    2.875     -1.344
    D$1               -0.759    0.331     -2.293
    GROUP$1           15.000    0.000      0.000

Latent Class Pattern 2 2

 Thresholds
    A$1               -0.640    0.152     -4.218
    B$1                0.269    0.137      1.955
    C$1                0.929    0.214      4.338
    D$1                1.727    0.229      7.552
    GROUP$1           15.000    0.000      0.000

Categorical Latent Variables

 Means
    G#1                0.000    0.096      0.000
    X#1               -0.888    0.245     -3.629
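The means of the categorical latent variables reproduce the class proportions shown for Table 11. The following is a minimal Python sketch, assuming the logit form in which the last class of each latent variable is the reference, so that the proportion in class 1 of a two-class variable is 1/(1 + exp(-mean)). Because Model H3 omits the statement x#1 on g#1, the same proportion applies in both groups, which is why the two rows of the latent transition table are identical. The function name is illustrative.

```python
import math

def class1_proportion(mean):
    """Proportion in class 1 of a two-class latent variable (class 2 is the reference)."""
    return 1.0 / (1.0 + math.exp(-mean))

# Means copied from the MODEL RESULTS output for Table 12.
p_g1 = class1_proportion(0.000)    # G#1
p_x1 = class1_proportion(-0.888)   # X#1

print(f"P(G = 1) = {p_g1:.3f}, P(G = 2) = {1 - p_g1:.3f}")
print(f"P(X = 1) = {p_x1:.3f}, P(X = 2) = {1 - p_x1:.3f}")
```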