Mplus Textbook Examples
Applied Latent Class Analysis
Chapter 3: Latent Cluster Analysis by Jeroen K. Vermunt and Jay Magidson
Table 1 on page 100, using the diabetes data. We reproduce only the first three columns of the table; the remaining columns are obtained in the same way.
Model 1:1 Class-dep unrestricted Σk with 1 cluster
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(1);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose with insulin;
  glucose with sspg;
  insulin with sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2545.828

Information Criteria
  Number of Free Parameters              9
  Akaike (AIC)                    5109.655
  Bayesian (BIC)                  5136.446
  Sample-Size Adjusted BIC        5107.967
    (n* = (n + 2) / 24)
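The information criteria in the Mplus output can be recomputed from the loglikelihood and the number of free parameters. The following is a minimal Python sketch, not part of the original example; the function name is hypothetical, and the sample size of 145 is taken from the Stata log further down this page.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC, sample-size adjusted BIC) for a fitted model."""
    deviance = -2.0 * loglik
    aic = deviance + 2 * n_params
    bic = deviance + n_params * math.log(n_obs)
    # Mplus prints the adjustment as n* = (n + 2) / 24
    n_star = (n_obs + 2) / 24
    adjusted_bic = deviance + n_params * math.log(n_star)
    return aic, bic, adjusted_bic

# Model 1:1: loglikelihood -2545.828, 9 free parameters, 145 observations
aic, bic, adjusted_bic = information_criteria(-2545.828, 9, 145)
print(f"AIC = {aic:.3f}, BIC = {bic:.3f}, adjusted BIC = {adjusted_bic:.3f}")
```

Up to rounding in the last digit, the three values agree with the ones reported for Model 1:1 above.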
Model 2:1 Class-ind unrestricted Σk with 1 cluster
This is the same as Model 1:1 since there is only one cluster.
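Counting free parameters makes the equivalence concrete. The sketch below is an illustration under stated assumptions, not code from the original page: the function name and arguments are hypothetical, and it assumes each class has its own means, one logit per class except the last, and a covariance block that is either estimated per class or shared across classes.

```python
def free_parameters(n_classes, n_vars, n_free_cov, class_dependent):
    """Count free parameters in a Gaussian latent cluster model.

    n_free_cov is the number of free covariances in one covariance
    matrix (3 = unrestricted for 3 variables, 0 = diagonal).
    """
    means = n_classes * n_vars
    cov_block = n_vars + n_free_cov          # variances + covariances
    n_blocks = n_classes if class_dependent else 1
    class_logits = n_classes - 1
    return means + cov_block * n_blocks + class_logits

# With one cluster the two specifications coincide (9 parameters).
one_dep = free_parameters(1, 3, 3, class_dependent=True)
one_ind = free_parameters(1, 3, 3, class_dependent=False)
# With two clusters they differ (19 versus 13 parameters).
two_dep = free_parameters(2, 3, 3, class_dependent=True)
two_ind = free_parameters(2, 3, 3, class_dependent=False)
print(one_dep, one_ind, two_dep, two_ind)
```

The counts 9, 19, and 13 match the "Number of Free Parameters" lines reported for Models 1:1, 1:2, and 2:2 on this page.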
Model 3:1 Class-dep. diagonal Σk with 1 cluster
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(1);
Analysis:
  Type = mixture ;

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2750.135

Information Criteria
  Number of Free Parameters              6
  Akaike (AIC)                    5512.269
  Bayesian (BIC)                  5530.130
  Sample-Size Adjusted BIC        5511.144
    (n* = (n + 2) / 24)
Model 4:1 Class-ind. diagonal Σk with 1 cluster
This is the same as Model 3:1 since there is only one cluster.
Model 5:1 class-dep Σk with only σ12 free with one cluster
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(1);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2559.462

Information Criteria
  Number of Free Parameters              7
  Akaike (AIC)                    5132.924
  Bayesian (BIC)                  5153.761
  Sample-Size Adjusted BIC        5131.611
    (n* = (n + 2) / 24)
Model 6:1 class-ind Σk with only σ12 free with one cluster
This is the same as Model 5:1 since there is only one cluster.
Model 1:2 Class-dep unrestricted Σk with 2 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose with insulin sspg;
  insulin with sspg;
  %c#1%
  glucose-sspg;
  glucose with insulin sspg;
  insulin with sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2355.905

Information Criteria
  Number of Free Parameters             19
  Akaike (AIC)                    4749.811
  Bayesian (BIC)                  4806.369
  Sample-Size Adjusted BIC        4746.246
    (n* = (n + 2) / 24)
  Entropy                            0.935
Model 2:2 Class-ind unrestricted Σk with 2 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose with insulin sspg;
  insulin with sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2473.144

Information Criteria
  Number of Free Parameters             13
  Akaike (AIC)                    4972.288
  Bayesian (BIC)                  5010.985
  Sample-Size Adjusted BIC        4969.849
    (n* = (n + 2) / 24)
  Entropy                            0.980
Model 3:2 Class-dep. diagonal Σk with 2 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose-sspg;
  %c#1%
  glucose-sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2445.312

Information Criteria
  Number of Free Parameters             13
  Akaike (AIC)                    4916.624
  Bayesian (BIC)                  4955.322
  Sample-Size Adjusted BIC        4914.185
    (n* = (n + 2) / 24)
  Entropy                            0.980
Model 4:2 Class-ind. diagonal Σk with 2 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose-sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2559.816

Information Criteria
  Number of Free Parameters             10
  Akaike (AIC)                    5139.632
  Bayesian (BIC)                  5169.399
  Sample-Size Adjusted BIC        5137.756
    (n* = (n + 2) / 24)
  Entropy                            0.994
Model 5:2 class-dep Σk with only σ12 free with 2 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(2);
Analysis:
  Type = mixture ;
  ! starts = 150 5;
model:
  %overall%
  glucose insulin sspg;
  glucose with insulin;
  %c#1%
  glucose insulin sspg;
  glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2377.427

Information Criteria
  Number of Free Parameters             15
  Akaike (AIC)                    4784.853
  Bayesian (BIC)                  4829.504
  Sample-Size Adjusted BIC        4782.039
    (n* = (n + 2) / 24)
  Entropy                            0.962
Model 6:2 class-ind Σk with only σ12 free with 2 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(2);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose insulin sspg;
  glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2475.119

Information Criteria
  Number of Free Parameters             11
  Akaike (AIC)                    4972.237
  Bayesian (BIC)                  5004.981
  Sample-Size Adjusted BIC        4970.173
    (n* = (n + 2) / 24)
  Entropy                            0.981
Model 1:3 Class-dep unrestricted Σk with 3 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(3);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose insulin sspg;
  glucose with insulin;
  glucose with sspg;
  insulin with sspg;
  %c#1%
  glucose insulin sspg;
  glucose with insulin;
  glucose with sspg;
  insulin with sspg;
  %c#2%
  glucose insulin sspg;
  glucose with insulin;
  glucose with sspg;
  insulin with sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2303.492

Information Criteria
  Number of Free Parameters             29
  Akaike (AIC)                    4664.984
  Bayesian (BIC)                  4751.309
  Sample-Size Adjusted BIC        4659.543
    (n* = (n + 2) / 24)
  Entropy                            0.855
Model 2:3 Class-ind unrestricted Σk with 3 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(3);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose insulin sspg;
  glucose with insulin;
  glucose with sspg;
  insulin with sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2417.341

Information Criteria
  Number of Free Parameters             17
  Akaike (AIC)                    4868.683
  Bayesian (BIC)                  4919.287
  Sample-Size Adjusted BIC        4865.493
    (n* = (n + 2) / 24)
  Entropy                            0.993
Model 3:3 Class-dep. diagonal Σk with 3 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(3);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose insulin sspg;
  %c#1%
  glucose insulin sspg;
  %c#2%
  glucose insulin sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2364.137

Information Criteria
  Number of Free Parameters             20
  Akaike (AIC)                    4768.274
  Bayesian (BIC)                  4827.809
  Sample-Size Adjusted BIC        4764.522
    (n* = (n + 2) / 24)
  Entropy                            0.902
Model 4:3 Class-ind. diagonal Σk with 3 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(3);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose insulin sspg;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2464.383

Information Criteria
  Number of Free Parameters             14
  Akaike (AIC)                    4956.766
  Bayesian (BIC)                  4998.440
  Sample-Size Adjusted BIC        4954.139
    (n* = (n + 2) / 24)
  Entropy                            0.994
Model 5:3 class-dep Σk with only σ12 free with 3 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(3);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose insulin sspg;
  glucose with insulin;
  %c#1%
  glucose insulin sspg;
  glucose with insulin;
  %c#2%
  glucose insulin sspg;
  glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2315.965

Information Criteria
  Number of Free Parameters             23
  Akaike (AIC)                    4677.930
  Bayesian (BIC)                  4746.395
  Sample-Size Adjusted BIC        4673.615
    (n* = (n + 2) / 24)
  Entropy                            0.844
Model 6:3 class-ind Σk with only σ12 free with 3 clusters
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(3);
Analysis:
  Type = mixture ;
  ! starts = 150 2;
model:
  %overall%
  glucose insulin sspg;
  glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                       -2439.523

Information Criteria
  Number of Free Parameters             15
  Akaike (AIC)                    4909.046
  Bayesian (BIC)                  4953.697
  Sample-Size Adjusted BIC        4906.232
    (n* = (n + 2) / 24)
  Entropy                            0.975
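With all fifteen distinct diabetes models estimated, their BIC values can be compared side by side. The Python sketch below simply collects the BIC values reported in the outputs above and picks the smallest; the dictionary keys are the model labels used on this page. Treating the lowest BIC as the preferred model is an assumption made here for illustration.

```python
# BIC values copied from the Mplus outputs above
# (Models 2:1, 4:1, 6:1 duplicate 1:1, 3:1, 5:1 and are omitted)
bic_by_model = {
    "1:1": 5136.446, "3:1": 5530.130, "5:1": 5153.761,
    "1:2": 4806.369, "2:2": 5010.985, "3:2": 4955.322,
    "4:2": 5169.399, "5:2": 4829.504, "6:2": 5004.981,
    "1:3": 4751.309, "2:3": 4919.287, "3:3": 4827.809,
    "4:3": 4998.440, "5:3": 4746.395, "6:3": 4953.697,
}

# Sort from smallest to largest BIC
ranked = sorted(bic_by_model.items(), key=lambda item: item[1])
best_model, best_bic = ranked[0]
runner_up, runner_up_bic = ranked[1]
print(f"lowest BIC: model {best_model} ({best_bic})")
print(f"next lowest: model {runner_up} ({runner_up_bic})")
```

Among the models fitted here, the three-cluster model 5:3 has the lowest BIC, followed closely by model 1:3.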
Table 2 on page 100 using model 5:3, a three-class model with class-dependent variance-covariance matrices and with only a local dependence between y1 and y2.
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(3);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose insulin sspg;
  glucose with insulin;
  %c#1%
  glucose insulin sspg;
  glucose with insulin;
  %c#2%
  glucose insulin sspg;
  glucose with insulin;

FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS
BASED ON ESTIMATED POSTERIOR PROBABILITIES

  Latent Classes
    1       75.34772    0.51964
    2       41.21362    0.28423
    3       28.43866    0.19613

MODEL RESULTS

                     Estimates       S.E.  Est./S.E.

Latent Class 1

  GLUCOSE WITH
    INSULIN             70.840     47.839      1.481

  Means
    GLUCOSE             90.956      0.987     92.109
    INSULIN            356.212      7.055     50.493
    SSPG               161.930      6.762     23.946

  Variances
    GLUCOSE             57.299     10.546      5.433
    INSULIN           1993.055    450.743      4.422
    SSPG              2267.268    441.327      5.137

Latent Class 2

  GLUCOSE WITH
    INSULIN           1185.211    451.959      2.622

  Means
    GLUCOSE            103.193      2.496     41.342
    INSULIN            487.963     21.674     22.514
    SSPG               304.644     27.319     11.151

  Variances
    GLUCOSE            175.814     50.943      3.451
    INSULIN          13292.888    *******      2.366
    SSPG             22383.512    *******      3.805

Latent Class 3

  GLUCOSE WITH
    INSULIN          19226.834    *******      5.990

  Means
    GLUCOSE            231.436     16.011     14.455
    INSULIN           1106.372     63.013     17.558
    SSPG                78.448      9.764      8.035

  Variances
    GLUCOSE           5246.220    836.749      6.270
    INSULIN          78353.539    *******      5.341
    SSPG              2111.967    404.087      5.227

Categorical Latent Variables

  Means
    C#1                  0.974      0.233      4.179
    C#2                  0.371      0.283      1.313
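The means of the categorical latent variable (C#1 and C#2) are multinomial logits with the last class as the reference category, so they can be converted back into the class proportions. A minimal Python sketch follows; the function name is hypothetical and not part of the original example.

```python
import math

def class_proportions(logits):
    """Convert class logits (last class as reference, logit 0) to proportions."""
    full = list(logits) + [0.0]              # append the reference class
    weights = [math.exp(value) for value in full]
    total = sum(weights)
    return [weight / total for weight in weights]

# C#1 = 0.974 and C#2 = 0.371 from the output above
proportions = class_proportions([0.974, 0.371])
for k, p in enumerate(proportions, start=1):
    print(f"class {k}: {p:.3f}")
```

To three decimals the results agree with the proportions 0.51964, 0.28423, and 0.19613 listed under the final class counts; small differences arise because the logits are printed with only three decimals.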
Table 3 on page 101. The Mplus run below saves the class membership to a data file called table3.dat. We then merge that file back into the original data set and cross-tabulate the class membership from the cluster analysis against the true membership recorded in the original data. The merge and the crosstabulation are done in Stata.
Data:
  File is c:alcachapter3diabetes.dat ;
Variable:
  Names are true glucose insulin sspg;
  Missing are all (-9999) ;
  Usev are glucose insulin sspg;
  classes = c(3);
Analysis:
  Type = mixture ;
model:
  %overall%
  glucose insulin sspg;
  glucose with insulin;
  %c#1%
  glucose insulin sspg;
  glucose with insulin;
  %c#2%
  glucose insulin sspg;
  glucose with insulin;
savedata:
  file is c:alcachapter3table3.dat;
  save = cprob;
Stata Code for crosstabulation
Notice that we used the label command to label the three classes predicted by the model.
. infile glucose insulin sspg cprob1 cprob2 cprob3 c using table3.dat
(145 observations read)

. sort glucose insulin sspg

. merge glucose insulin sspg using exam1

. label define g 1 "chemical" 2 "normal" 3 "overt"

. label values c g

. tab true c

           |                c
      true |  chemical     normal      overt |     Total
-----------+---------------------------------+----------
         1 |         8         27          1 |        36
         2 |        71          5          0 |        76
         3 |         0          5         28 |        33
-----------+---------------------------------+----------
     Total |        79         37         29 |       145
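One rough way to summarize the crosstabulation is the share of cases that fall in the largest cell of each row. The Python sketch below is an illustration only: it assumes that each true group corresponds to the predicted class holding most of its cases, which is not stated in the original example.

```python
# Rows are the true groups 1-3; columns are the predicted classes
# in the order shown in the Stata table above.
crosstab = {
    1: [8, 27, 1],
    2: [71, 5, 0],
    3: [0, 5, 28],
}

# The largest cells fall in three different columns (2, 1, 3),
# so each predicted class is matched to exactly one true group.
matched = sum(max(row) for row in crosstab.values())
total = sum(sum(row) for row in crosstab.values())
agreement = matched / total
print(f"{matched} of {total} cases matched ({agreement:.1%})")
```

Under that matching, 126 of the 145 cases are placed in the class that corresponds to their true group.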
Table 4 using prostate cancer data.
Model 1:1 local independence with 1 cluster
Data:
  ! prostate.dat: https://stats.idre.ucla.edu/wp-content/uploads/2016/02/prostate.dat
  File is c:alcachapter3prostate.dat ;
Variable:
  Names are y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12;
  Missing are all (-9999) ;
  usev are y1 - y12;
  classes = c(1);
  nominal are y3 y4 y7 y12;
Analysis:
  Type = mixture ;
model:
  %overall%

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                      -15607.612

Information Criteria
  Number of Free Parameters             27
  Akaike (AIC)                   31269.223
  Bayesian (BIC)                 31381.633
  Sample-Size Adjusted BIC       31295.939
    (n* = (n + 2) / 24)
Model 2:1 Model 1:1 + σ56k
Data:
  File is c:alcachapter3prostate.dat ;
Variable:
  Names are y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12;
  Missing are all (-9999) ;
  usev are y1 - y12;
  classes = c(1);
  nominal are y3 y4 y7 y12;
Analysis:
  Type = mixture ;
  ! starts = 100 10;
model:
  %overall%
  y5 with y6;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                      -15488.042

Information Criteria
  Number of Free Parameters             28
  Akaike (AIC)                   31032.083
  Bayesian (BIC)                 31148.656
  Sample-Size Adjusted BIC       31059.788
    (n* = (n + 2) / 24)
Model 3:1 Model 2:1 + σ28k
Data:
  File is c:alcachapter3prostate.dat ;
Variable:
  Names are y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12;
  Missing are all (-9999) ;
  usev are y1 - y12;
  classes = c(1);
  nominal are y3 y4 y7 y12;
Analysis:
  Type = mixture ;
model:
  %overall%
  y5 with y6;
  y2 with y8;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                      -15471.263

Information Criteria
  Number of Free Parameters             29
  Akaike (AIC)                   31000.527
  Bayesian (BIC)                 31121.263
  Sample-Size Adjusted BIC       31029.221
    (n* = (n + 2) / 24)
Model 1:2 local independence with 2 clusters
Note: The code below does not work with Mplus 3.01; it works with Mplus 3.1 (Aug. 4, 2004).
Data:
  File is c:alcachapter3prostate.dat ;
Variable:
  Names are y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12;
  Missing are all (-9999) ;
  usev are y1 - y12;
  classes = c(2);
  nominal are y3 y4 y7 y12;
Analysis:
  Type = mixture ;
model:
  %overall%
  y1 y2 y5 y6 y8 y9 y10 y11;
  %c#1%
  y1 y2 y5 y6 y8 y9 y10 y11;
output:
  tech1 ;

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                      -13836.484

Information Criteria
  Number of Free Parameters             95
  Akaike (AIC)                   27862.968
  Bayesian (BIC)                 28258.483
  Sample-Size Adjusted BIC       27956.967
    (n* = (n + 2) / 24)
  Entropy                            0.971