Applied Latent Class Analysis, Chapter 3

Mplus Textbook Examples
Applied Latent Class Analysis
Chapter 3 Latent Cluster Analysis by Jeroen K. Vermunt and Jay Magidson

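Table 1 on page 100, using the diabetes data. Note that we reproduce only the first three columns of the table (the one-, two-, and three-cluster models). The remaining columns are fit the same way, with a larger number of classes.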

Model 1:1 Class-dep unrestricted Σ_k with 1 cluster

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(1);
    Analysis:
      Type = mixture ;

  model:
     %overall%
        glucose with insulin;
        glucose with sspg;
        insulin with sspg;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2545.828

Information Criteria

          Number of Free Parameters              9
          Akaike (AIC)                    5109.655
          Bayesian (BIC)                  5136.446
          Sample-Size Adjusted BIC        5107.967
            (n* = (n + 2) / 24)
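The information criteria in this output can be checked by hand from the loglikelihood and the number of free parameters. The following is a minimal Python sketch, not part of the Mplus run. It assumes that all 145 observations in the diabetes data enter the analysis (145 is the count Stata reports when reading the saved file later on this page); the function name is ours.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC, sample-size adjusted BIC) for a fitted model."""
    deviance = -2.0 * loglik
    aic = deviance + 2 * n_params
    bic = deviance + n_params * math.log(n_obs)
    # Adjusted BIC replaces n by n* = (n + 2) / 24, as printed in the output.
    n_star = (n_obs + 2) / 24
    abic = deviance + n_params * math.log(n_star)
    return aic, bic, abic

# Model 1:1: loglikelihood -2545.828, 9 free parameters.
# n_obs = 145 is an assumption (no cases dropped for missing data).
aic, bic, abic = information_criteria(-2545.828, 9, 145)
print(round(aic, 3), round(bic, 3), round(abic, 3))
```

The results agree with the values printed above to within rounding of the loglikelihood.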

Model 2:1 Class-ind unrestricted Σ_k with 1 cluster

This is the same as Model 1:1 since there is only one cluster.

Model 3:1 Class-dep. diagonal Σ_k with 1 cluster

  Data:
      File is c:alcachapter3diabetes.dat ;
  Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(1);
   Analysis:
      Type = mixture ;


THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2750.135

Information Criteria

          Number of Free Parameters              6
          Akaike (AIC)                    5512.269
          Bayesian (BIC)                  5530.130
          Sample-Size Adjusted BIC        5511.144
            (n* = (n + 2) / 24)

Model 4:1 Class-ind. diagonal Σ_k with 1 cluster

This is the same as Model 3:1 since there is only one cluster.

Model 5:1 class-dep Σ_k with only σ₁₂ free with one cluster

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(1);
    Analysis:
      Type = mixture ;

  model:
     %overall%
     glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2559.462

Information Criteria

          Number of Free Parameters              7
          Akaike (AIC)                    5132.924
          Bayesian (BIC)                  5153.761
          Sample-Size Adjusted BIC        5131.611
            (n* = (n + 2) / 24)

Model 6:1 class-ind Σ_k with only σ₁₂ free with one cluster

This is the same as Model 5:1 since there is only one cluster.

Model 1:2 Class-dep unrestricted Σ_k with 2 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(2);
    Analysis:
      Type = mixture ;

  model:
          %overall%
          glucose with insulin sspg;
          insulin with sspg;
          %c#1%
          glucose-sspg;
          glucose with insulin sspg;
          insulin with sspg;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2355.905

Information Criteria

          Number of Free Parameters             19
          Akaike (AIC)                    4749.811
          Bayesian (BIC)                  4806.369
          Sample-Size Adjusted BIC        4746.246
            (n* = (n + 2) / 24)
          Entropy                            0.935
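The number of free parameters reported so far follows from simple bookkeeping: K - 1 class-size parameters, one mean per indicator per class, and one set of variances and covariances either per class (class-dependent) or shared by all classes (class-independent). The Python sketch below is our own illustration of that count for the three diabetes indicators; the function and argument names are hypothetical and are not Mplus options.

```python
def free_parameters(n_classes, n_vars=3, n_cov=0, class_dependent=True):
    """Count free parameters of a Gaussian latent cluster model.

    n_cov is the number of freely estimated covariances (0 for a
    diagonal matrix, 3 for an unrestricted 3 x 3 matrix).
    """
    class_sizes = n_classes - 1            # multinomial logit intercepts
    means = n_classes * n_vars             # means always differ by class
    copies = n_classes if class_dependent else 1
    variances_and_covariances = copies * (n_vars + n_cov)
    return class_sizes + means + variances_and_covariances

# Counts for the models fitted above.
print(free_parameters(1, n_cov=3))   # Model 1:1, unrestricted, 1 cluster
print(free_parameters(1, n_cov=0))   # Model 3:1, diagonal, 1 cluster
print(free_parameters(1, n_cov=1))   # Model 5:1, one covariance, 1 cluster
print(free_parameters(2, n_cov=3))   # Model 1:2, unrestricted, 2 clusters
```

These calls return 9, 6, 7 and 19, which match the "Number of Free Parameters" lines in the corresponding outputs.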

Model 2:2 Class-ind unrestricted Σ_k with 2 clusters

  Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(2);
    Analysis:
      Type = mixture ;

  model:
          %overall%
          glucose with insulin sspg;
          insulin with sspg;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2473.144

Information Criteria

          Number of Free Parameters             13
          Akaike (AIC)                    4972.288
          Bayesian (BIC)                  5010.985
          Sample-Size Adjusted BIC        4969.849
            (n* = (n + 2) / 24)
          Entropy                            0.980

Model 3:2 Class-dep. diagonal Σ_k with 2 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(2);
    Analysis:
      Type = mixture ;

  model:
          %overall%
             glucose-sspg;
          %c#1%
          glucose-sspg;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2445.312

Information Criteria

          Number of Free Parameters             13
          Akaike (AIC)                    4916.624
          Bayesian (BIC)                  4955.322
          Sample-Size Adjusted BIC        4914.185
            (n* = (n + 2) / 24)
          Entropy                            0.980

Model 4:2 Class-ind. diagonal Σ_k with 2 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(2);
    Analysis:
      Type = mixture ;

  model:
          %overall%
             glucose-sspg;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2559.816

Information Criteria

          Number of Free Parameters             10
          Akaike (AIC)                    5139.632
          Bayesian (BIC)                  5169.399
          Sample-Size Adjusted BIC        5137.756
            (n* = (n + 2) / 24)
          Entropy                            0.994

Model 5:2 class-dep Σ_k with only σ₁₂ free with 2 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(2);
    Analysis:
      Type = mixture ;
   !   starts = 150 5;

   model:
       %overall%
       glucose insulin sspg;
       glucose with insulin;
       %c#1%
       glucose insulin sspg;
       glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2377.427

Information Criteria

          Number of Free Parameters             15
          Akaike (AIC)                    4784.853
          Bayesian (BIC)                  4829.504
          Sample-Size Adjusted BIC        4782.039
            (n* = (n + 2) / 24)
          Entropy                            0.962

Model 6:2 class-ind Σ_k with only σ₁₂ free with 2 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(2);
    Analysis:
      Type = mixture ;

   model:
       %overall%
       glucose insulin sspg;
       glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2475.119

Information Criteria

          Number of Free Parameters             11
          Akaike (AIC)                    4972.237
          Bayesian (BIC)                  5004.981
          Sample-Size Adjusted BIC        4970.173
            (n* = (n + 2) / 24)
          Entropy                            0.981

Model 1:3 Class-dep unrestricted Σ_k with 3 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(3);
    Analysis:
      Type = mixture ;

   model:
       %overall%
         glucose insulin sspg;
         glucose with insulin;
         glucose with sspg;
         insulin with sspg;
       %c#1%
         glucose insulin sspg;
         glucose with insulin;
         glucose with sspg;
         insulin with sspg;
       %c#2%
         glucose insulin sspg;
         glucose with insulin;
         glucose with sspg;
         insulin with sspg;


TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2303.492

Information Criteria

          Number of Free Parameters             29
          Akaike (AIC)                    4664.984
          Bayesian (BIC)                  4751.309
          Sample-Size Adjusted BIC        4659.543
            (n* = (n + 2) / 24)
          Entropy                            0.855

Model 2:3 Class-ind unrestricted Σ_k with 3 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(3);
    Analysis:
      Type = mixture ;

   model:
       %overall%
         glucose insulin sspg;
         glucose with insulin;
         glucose with sspg;
         insulin with sspg;


TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2417.341

Information Criteria

          Number of Free Parameters             17
          Akaike (AIC)                    4868.683
          Bayesian (BIC)                  4919.287
          Sample-Size Adjusted BIC        4865.493
            (n* = (n + 2) / 24)
          Entropy                            0.993

Model 3:3 Class-dep. diagonal Σ_k with 3 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(3);
    Analysis:
      Type = mixture ;

   model:
       %overall%
       glucose insulin sspg;
       %c#1%
       glucose insulin sspg;
       %c#2%
       glucose insulin sspg;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2364.137

Information Criteria

          Number of Free Parameters             20
          Akaike (AIC)                    4768.274
          Bayesian (BIC)                  4827.809
          Sample-Size Adjusted BIC        4764.522
            (n* = (n + 2) / 24)
          Entropy                            0.902

Model 4:3 Class-ind. diagonal Σ_k with 3 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(3);
    Analysis:
      Type = mixture ;

   model:
       %overall%
       glucose insulin sspg;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2464.383

Information Criteria

          Number of Free Parameters             14
          Akaike (AIC)                    4956.766
          Bayesian (BIC)                  4998.440
          Sample-Size Adjusted BIC        4954.139
            (n* = (n + 2) / 24)
          Entropy                            0.994

Model 5:3 class-dep Σ_k with only σ₁₂ free with 3 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(3);
    Analysis:
      Type = mixture ;

   model:
       %overall%
       glucose insulin sspg;
       glucose with insulin;
       %c#1%
       glucose insulin sspg;
       glucose with insulin;
       %c#2%
       glucose insulin sspg;
       glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2315.965

Information Criteria

          Number of Free Parameters             23
          Akaike (AIC)                    4677.930
          Bayesian (BIC)                  4746.395
          Sample-Size Adjusted BIC        4673.615
            (n* = (n + 2) / 24)
          Entropy                            0.844

Model 6:3 class-ind Σ_k with only σ₁₂ free with 3 clusters

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(3);
    Analysis:
      Type = mixture ;
    !  starts = 150 2;

   model:
       %overall%
       glucose insulin sspg;
       glucose with insulin;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2439.523

Information Criteria

          Number of Free Parameters             15
          Akaike (AIC)                    4909.046
          Bayesian (BIC)                  4953.697
          Sample-Size Adjusted BIC        4906.232
            (n* = (n + 2) / 24)
          Entropy                            0.975
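To compare the fitted models side by side, the loglikelihoods and parameter counts reported above can be collected and ranked by BIC. This is a small Python sketch of that comparison, again assuming a sample size of 145; models 2:1, 4:1 and 6:1 are omitted because they are identical to 1:1, 3:1 and 5:1.

```python
import math

N_OBS = 145  # assumed sample size of the diabetes data

# (loglikelihood, number of free parameters) copied from the outputs above
models = {
    "1:1": (-2545.828, 9),  "3:1": (-2750.135, 6),  "5:1": (-2559.462, 7),
    "1:2": (-2355.905, 19), "2:2": (-2473.144, 13), "3:2": (-2445.312, 13),
    "4:2": (-2559.816, 10), "5:2": (-2377.427, 15), "6:2": (-2475.119, 11),
    "1:3": (-2303.492, 29), "2:3": (-2417.341, 17), "3:3": (-2364.137, 20),
    "4:3": (-2464.383, 14), "5:3": (-2315.965, 23), "6:3": (-2439.523, 15),
}

# BIC = -2 * loglikelihood + (free parameters) * ln(n)
bic = {name: -2.0 * ll + p * math.log(N_OBS) for name, (ll, p) in models.items()}

# Print the models from lowest (best) to highest BIC.
for name in sorted(bic, key=bic.get):
    print(f"Model {name}: BIC = {bic[name]:.3f}")

best = min(bic, key=bic.get)
```

Among the models fitted here, model 5:3 has the lowest BIC, followed by model 1:3.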

Table 2 on page 100, using model 5:3: a three-class model with class-dependent variance-covariance matrices in which the only local dependence is between y1 and y2 (glucose and insulin).

   Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(3);
    Analysis:
      Type = mixture ;
 
    model:
         %overall%
         glucose insulin sspg;
         glucose with insulin;
         %c#1%
         glucose insulin sspg;
         glucose with insulin;
         %c#2%
         glucose insulin sspg;
         glucose with insulin;

FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS
BASED ON ESTIMATED POSTERIOR PROBABILITIES

    Latent
   Classes

       1         75.34772          0.51964
       2         41.21362          0.28423
       3         28.43866          0.19613

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

Latent Class 1

 GLUCOSE  WITH
    INSULIN           70.840   47.839      1.481

 Means
    GLUCOSE           90.956    0.987     92.109
    INSULIN          356.212    7.055     50.493
    SSPG             161.930    6.762     23.946

 Variances
    GLUCOSE           57.299   10.546      5.433
    INSULIN         1993.055  450.743      4.422
    SSPG            2267.268  441.327      5.137

Latent Class 2

 GLUCOSE  WITH
    INSULIN         1185.211  451.959      2.622

 Means
    GLUCOSE          103.193    2.496     41.342
    INSULIN          487.963   21.674     22.514
    SSPG             304.644   27.319     11.151

 Variances
    GLUCOSE          175.814   50.943      3.451
    INSULIN        13292.888  *******      2.366
    SSPG           22383.512  *******      3.805

Latent Class 3

 GLUCOSE  WITH
    INSULIN        19226.834  *******      5.990

 Means
    GLUCOSE          231.436   16.011     14.455
    INSULIN         1106.372   63.013     17.558
    SSPG              78.448    9.764      8.035

 Variances
    GLUCOSE         5246.220  836.749      6.270
    INSULIN        78353.539  *******      5.341
    SSPG            2111.967  404.087      5.227

Categorical Latent Variables

 Means
    C#1                0.974    0.233      4.179
    C#2                0.371    0.283      1.313
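The means of the categorical latent variable are multinomial logit intercepts, with the last class serving as the reference (its logit is fixed at zero). Converting them to class proportions is a softmax calculation. The Python sketch below is our own illustration of that conversion and is not Mplus output.

```python
import math

def class_proportions(logits):
    """Convert logit means C#1 ... C#(K-1) into K class proportions.

    The last class is the reference class, so its logit is 0.
    """
    exps = [math.exp(value) for value in logits] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]

# C#1 = 0.974 and C#2 = 0.371 from the output above.
props = class_proportions([0.974, 0.371])
for k, p in enumerate(props, start=1):
    print(f"class {k}: {p:.4f}")
```

The resulting proportions agree with the final class proportions listed above to about three decimal places; the small differences come from the logits being printed with only three decimals.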

Table 3 on page 101. The following Mplus run saves the class membership to a data file called table3.dat. We then merge that file back into the original data set and cross-tabulate the class membership from the cluster analysis against the true membership recorded in the original data. The merge and the crosstabulation are done in Stata.

  Data:
      File is c:alcachapter3diabetes.dat ;
    Variable:
      Names are
        true glucose insulin sspg;
      Missing are all (-9999) ;
      Usev are glucose insulin sspg;
      classes = c(3);
    Analysis:
      Type = mixture ;
 
    model:
         %overall%
         glucose insulin sspg;
         glucose with insulin;
         %c#1%
         glucose insulin sspg;
         glucose with insulin;
         %c#2%
         glucose insulin sspg;
         glucose with insulin;
  savedata:
    file is c:alcachapter3table3.dat;
    save = cprob;

Stata Code for crosstabulation

Notice that we used the label command to label the three classes predicted by the model.

. infile glucose insulin sspg cprob1 cprob2 cprob3 c using table3.dat
(145 observations read)

. sort glucose insulin sspg

. merge glucose insulin sspg using exam1

. label define g 1 "normal" 2 "chemical" 3 "overt"

. label values c g

. tab true c

           |                c
      true |    normal   chemical      overt |     Total
-----------+---------------------------------+----------
         1 |         8         27          1 |        36 
         2 |        71          5          0 |        76 
         3 |         0          5         28 |        33 
-----------+---------------------------------+----------
     Total |        79         37         29 |       145
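The crosstabulation can be summarized by the share of cases whose cluster agrees with their true group. Because cluster numbers are arbitrary, the sketch below first searches for the one-to-one pairing of true groups and clusters that gives the most agreement. It is a short Python illustration using the counts in the table above, not part of the Stata session.

```python
from itertools import permutations

# Rows: true = 1, 2, 3; columns: c = 1, 2, 3 (counts from the table above).
table = [
    [8, 27, 1],
    [71, 5, 0],
    [0, 5, 28],
]
n_total = sum(sum(row) for row in table)

def best_matching(counts):
    """Return the column assigned to each row, and the number of agreeing cases."""
    k = len(counts)
    best = max(
        permutations(range(k)),
        key=lambda perm: sum(counts[i][perm[i]] for i in range(k)),
    )
    agree = sum(counts[i][best[i]] for i in range(k))
    return best, agree

mapping, agree = best_matching(table)
accuracy = agree / n_total
print(mapping, agree, n_total, round(accuracy, 3))
```

With these counts, true group 1 pairs with cluster 2, true group 2 with cluster 1, and true group 3 with cluster 3, so 126 of the 145 cases (about 87%) are classified consistently with their true group.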

Table 4 using prostate cancer data.

Model 1:1 local independence with 1 cluster

  Data:
    ! Source: https://stats.idre.ucla.edu/wp-content/uploads/2016/02/prostate.dat
    File is c:alcachapter3prostate.dat ;
  Variable:
    Names are
       y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12;
    Missing are all (-9999) ;
    usev are y1 - y12;
    classes = c(1);
    nominal are y3 y4 y7 y12;

  Analysis:
    Type = mixture ;
  model:
         %overall%


TESTS OF MODEL FIT

Loglikelihood

          H0 Value                      -15607.612

Information Criteria

          Number of Free Parameters             27
          Akaike (AIC)                   31269.223
          Bayesian (BIC)                 31381.633
          Sample-Size Adjusted BIC       31295.939
            (n* = (n + 2) / 24)
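The output shown here does not print the number of observations used, but it can be backed out of the BIC, since BIC = -2 loglikelihood + (number of free parameters) x ln(n). The Python sketch below does that for this model; it is a rough check of our own that relies on the rounded values printed above.

```python
import math

def implied_sample_size(loglik, n_params, bic):
    """Solve BIC = -2*loglik + n_params*ln(n) for n."""
    return math.exp((bic + 2.0 * loglik) / n_params)

# Model 1:1 for the prostate data: values copied from the output above.
n_implied = implied_sample_size(-15607.612, 27, 31381.633)
print(round(n_implied, 2))
```

The implied value rounds to 475, which suggests that 475 cases enter this analysis.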

Model 2:1 Model 1:1 + σ_56k

  Data:
    File is c:alcachapter3prostate.dat ;
  Variable:
    Names are
       y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12;
    Missing are all (-9999) ;
    usev are y1 - y12;
    classes = c(1);
    nominal are y3 y4 y7 y12;

  Analysis:
    Type = mixture ;
   ! starts = 100 10;
  model:
        %overall%
        y5 with y6;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                      -15488.042

Information Criteria

          Number of Free Parameters             28
          Akaike (AIC)                   31032.083
          Bayesian (BIC)                 31148.656
          Sample-Size Adjusted BIC       31059.788
            (n* = (n + 2) / 24)

Model 3:1 Model 2:1 + σ_28k

  Data:
    File is c:alcachapter3prostate.dat ;
  Variable:
    Names are
       y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12;
    Missing are all (-9999) ;
    usev are y1 - y12;
    classes = c(1);
    nominal are y3 y4 y7 y12;

  Analysis:
    Type = mixture ;
  model:
        %overall%
        y5 with y6;
        y2 with y8;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                      -15471.263

Information Criteria

          Number of Free Parameters             29
          Akaike (AIC)                   31000.527
          Bayesian (BIC)                 31121.263
          Sample-Size Adjusted BIC       31029.221
            (n* = (n + 2) / 24)
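The three one-cluster models above are nested: each adds a single covariance to the previous one. Their loglikelihoods can therefore be compared with an ordinary likelihood-ratio test on one degree of freedom. The Python sketch below is our own illustration. It uses the identity P(chi-square with 1 df > x) = erfc(sqrt(x / 2)), so it needs only the standard library, and it is limited to comparisons with one degree of freedom.

```python
import math

def lr_test_1df(loglik_restricted, loglik_general):
    """Likelihood-ratio statistic and p-value for one extra free parameter."""
    stat = 2.0 * (loglik_general - loglik_restricted)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Model 2:1 vs. Model 1:1 (adds y5 with y6)
stat_21, p_21 = lr_test_1df(-15607.612, -15488.042)
# Model 3:1 vs. Model 2:1 (adds y2 with y8)
stat_31, p_31 = lr_test_1df(-15488.042, -15471.263)

print(f"2:1 vs 1:1: LR = {stat_21:.3f}, p = {p_21:.3g}")
print(f"3:1 vs 2:1: LR = {stat_31:.3f}, p = {p_31:.3g}")
```

Both statistics (about 239.1 and 33.6) are far beyond the usual critical values on one degree of freedom, consistent with the drop in BIC from model 1:1 to 2:1 to 3:1. This test applies here only because the number of clusters is the same in all three models; it is not appropriate for comparing models with different numbers of clusters.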

Model 1:2 local independence with 2 clusters

Note: The code that follows does not work with Mplus 3.01. It works with Mplus 3.1 (Aug. 4, 2004).

  Data:
    File is c:alcachapter3prostate.dat ;
  Variable:
    Names are
       y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12;
    Missing are all (-9999) ;
    usev are y1 - y12;
    classes = c(2);
    nominal are y3 y4 y7 y12;

  Analysis:
    Type = mixture ;
  model:
         %overall%
         y1 y2 y5 y6 y8 y9 y10 y11;

         %c#1%
         y1 y2 y5 y6 y8 y9 y10 y11;

  output: tech1 ;

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                      -13836.484

Information Criteria

          Number of Free Parameters             95
          Akaike (AIC)                   27862.968
          Bayesian (BIC)                 28258.483
          Sample-Size Adjusted BIC       27956.967
            (n* = (n + 2) / 24)
          Entropy                            0.971
