This page shows an example of a latent growth curve model (LGCM) with footnotes explaining the output. An LGCM is similar to a multilevel model, a type of model many people have already seen. To help you understand the LGCM and its output, we first fit a multilevel model using HLM and then using Stata, and then analyze the same data in Mplus as an LGCM, relating the Mplus output back to the multilevel results. We suggest viewing this page in two browser windows side by side, with the Stata output in one window and the corresponding Mplus output in the other.
This example is drawn from the Mplus User’s Guide (example 6.10) and we suggest that you see the Mplus User’s Guide for more details about this example. We thank the kind people at Muthén & Muthén for permission to use examples from their manual.
Example using HLM
Each subject is observed on the variable y at four time points. A covariate a is also measured at each of the four time points, and two variables, x1 and x2, are measured once for each person. Conceptualized as a multilevel model, time and a are level-1 variables (note that time is coded 0, 1, 2, and 3), while x1 and x2 are level-2 variables. The model uses time and a to predict y at level 1, and uses x1 and x2 to predict the intercept and the slope for time at level 2. We can write this model as multiple equations, as shown below. This analysis uses the ex610.mdm file.
Level-1 Model

Y = B0 + B1*(A) + B2*(TIME) + R

Level-2 Model

B0 = G00 + G01*(X1) + G02*(X2) + U0
B1 = G10
B2 = G20 + G21*(X1) + G22*(X2) + U2
Here is the output from HLM, condensed to save space. The letters attached to some of the estimates (A, B, C, etc.) mark values that correspond across the HLM, Stata, and Mplus output; they are explained in the footnotes following the Mplus results.
Sigma_squared = 0.54200I
Tau
INTRCPT1,B0 1.08757F 0.05079
TIME,B2 0.05079H 0.20495G
Tau (as correlations)
INTRCPT1,B0 1.000 0.108
TIME,B2 0.108 1.000
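The correlations shown above are simply the Tau covariance rescaled by the two standard deviations. As a quick check (using Stata's display command purely as a calculator, with the numbers taken from the Tau matrix above):

display 0.05079 / sqrt(1.08757 * 0.20495)   // correlation between the intercept and the TIME slope: about 0.108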
Final estimation of fixed effects:
----------------------------------------------------------------------------
Fixed Effect                 Coefficient   Standard Error   T-ratio   Approx. d.f.   P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
INTRCPT2, G00 0.570413A 0.054807 10.408 497 0.000
X1, G01 0.560548B 0.054574 10.271 497 0.000
X2, G02 0.716557B 0.055865 12.827 497 0.000
For A slope, B1
INTRCPT2, G10 0.296872E 0.021381 13.885 1993 0.000
For TIME slope, B2
INTRCPT2, G20 1.010207C 0.025332 39.879 497 0.000
X1, G21 0.263030D 0.025223 10.428 497 0.000
X2, G22 0.473419D 0.025819 18.336 497 0.000
----------------------------------------------------------------------------
Example using Stata
Substituting the level-2 equations into the level-1 equation combines the two levels into a single equation:

Y = (G00 + G01*(X1) + G02*(X2) + U0) + G10*(A) + (G20 + G21*(X1) + G22*(X2) + U2)*(TIME) + R

Collecting terms gives the composite model below, with the random effects placed in square brackets.

Composite model

Y = G00 + G01*(X1) + G02*(X2) + G10*A + G20*TIME + G21*X1*TIME + G22*X2*TIME + [ U0 + U2*TIME + R ]
Based on the composite model, here is the same example using Stata's mixed command (in Stata 12 and earlier this command was called xtmixed). Note that t1 appears twice in the command because it is part of both interaction terms; Stata drops the duplicate main effect and reports it as omitted in the output.
infile y1-y4 x1 x2 a1-a4 using https://stats.idre.ucla.edu/stat/mplus/output/ex6.10.dat
generate id = _n                                    // person identifier
reshape long y a, i(id) j(time)                     // stack to long form: one record per person per time point (time = 1 to 4)
generate t1 = time - 1                              // recode time to 0, 1, 2, 3 to match the multilevel model
mixed y a c.t1##c.x1 c.t1##c.x2 || id: t1, cov(un)  // random intercept and slope for t1 with an unstructured covariance
Mixed-effects ML regression Number of obs = 2000
Group variable: id Number of groups = 500
Obs per group: min = 4
avg = 4.0
max = 4
Wald chi2(6) = 2871.89
Log likelihood = -3075.8519 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
a | .2967777E .0213597 13.89 0.000 .2549135 .3386419
t1 | 1.010207C .0252504 40.01 0.000 .9607168 1.059696
x1 | .560547B .0544035 10.30 0.000 .4539181 .6671759
|
c.t1#c.x1 | .2630303D .0251425 10.46 0.000 .2137519 .3123087
|
t1 | 0 (omitted)
x2 | .716562B .0556897 12.87 0.000 .6074121 .8257118
|
c.t1#c.x2 | .4734171D .0257359 18.40 0.000 .4229756 .5238586
|
_cons | .570413A .0546356 10.44 0.000 .4633292 .6774968
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured |
var(time) | .20304G .0203019 .1669054 .2469978
var(_cons) | 1.178701F .1311813 .9476997 1.466008
cov(time,_cons) | -.1515308H .041814 -.2334848 -.0695768
-----------------------------+------------------------------------------------
var(Residual) | .5416011I .0242353 .4961241 .5912467
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(3) = 1344.80 Prob > chi2 = 0.0000
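If you would like a display that parallels HLM's Tau matrix, the mixed postestimation command estat recovariance (run immediately after the mixed command above) prints the estimated variance-covariance matrix of the random effects; depending on your Stata version it may also accept a correlation option that reports the same matrix as correlations, paralleling HLM's "Tau (as correlations)". A minimal sketch:

estat recovariance                 // variance-covariance matrix of the random effects (the analogue of Tau)
estat recovariance, correlation    // the same matrix as correlations, if your version supports this option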
Mplus example #1
Here is the same example analyzed as a latent growth curve model in Mplus, using the ex6.10.dat data file. We should reiterate that the multilevel model and the LGCM are similar but not identical, so the results are analogous rather than identical; we use the comparison as a way of helping you understand a technique and output that may be new to you.
TITLE:
this is an example of a linear growth
model for a continuous outcome with time-
invariant and time-varying covariates
DATA:
FILE IS ex6.10.dat;
VARIABLE:
NAMES ARE y11-y14 x1 x2 a31-a34;
MODEL:
i s | y11@0 y12@1 y13@2 y14@3;   ! intercept (i) and slope (s) growth factors, time scores 0-3
i s ON x1 x2;                    ! growth factors regressed on the time-invariant covariates
y11 ON a31;                      ! y regressed on the time-varying covariate at each time point
y12 ON a32;
y13 ON a33;
y14 ON a34;
SUMMARY OF ANALYSIS
Number of observations 500
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
Value 25.786
Degrees of Freedom 21
P-Value 0.2147
Chi-Square Test of Model Fit for the Baseline Model
Value 2862.582
Degrees of Freedom 30
P-Value 0.0000
CFI/TLI
CFI 0.998
TLI 0.998
Loglikelihood
H0 Value -7255.873
H1 Value -7242.980
Information Criteria
Number of Free Parameters 17
Akaike (AIC) 14545.745
Bayesian (BIC) 14617.393
Sample-Size Adjusted BIC 14563.434
(n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
Estimate 0.021
90 Percent C.I. 0.000 0.046
Probability RMSEA <= .05 0.978
SRMR (Standardized Root Mean Square Residual)
Value 0.014
MODEL RESULTS
Estimates S.E. Est./S.E.
I |
Y11 1.000 0.000 0.000
Y12 1.000 0.000 0.000
Y13 1.000 0.000 0.000
Y14 1.000 0.000 0.000
S |
Y11 0.000 0.000 0.000
Y12 1.000 0.000 0.000
Y13 2.000 0.000 0.000
Y14 3.000 0.000 0.000
I ON
X1 0.557B 0.054 10.286
X2 0.718B 0.055 12.953
S ON
X1 0.264D 0.025 10.549
X2 0.473D 0.026 18.438
Y11 ON
A31 0.190E 0.044 4.302
Y12 ON
A32 0.323E 0.038 8.433
Y13 ON
A33 0.344E 0.038 9.016
Y14 ON
A34 0.303E 0.050 6.004
S WITH
I 0.055H 0.035 1.588
Intercepts
Y11 0.000 0.000 0.000
Y12 0.000 0.000 0.000
Y13 0.000 0.000 0.000
Y14 0.000 0.000 0.000
I 0.570A 0.054 10.477
S 1.010C 0.025 40.112
Residual Variances
Y11 0.509I 0.068 7.512
Y12 0.597I 0.048 12.348
Y13 0.481I 0.049 9.858
Y14 0.579I 0.088 6.607
I 1.074F 0.098 10.922
S 0.201G 0.022 9.092
- A. This is analogous to G00 in the multilevel model. It is the predicted value of y when time, a, x1, and x2 are all 0.
- B. These are analogous to G01 and G02 in the multilevel model. Each is the predicted increase in the intercept for a one-unit increase in x1 or x2, respectively.
- C. This is analogous to G20 in the multilevel model. It is the slope for time when x1 and x2 are held constant at 0.
- D. These are analogous to G21 and G22 in the multilevel model. Each is the predicted increase in the time slope for a one-unit increase in x1 or x2, respectively (a short worked example follows this list).
- E. These are the four slopes representing the regression of y11 on a31, y12 on a32, y13 on a33, and y14 on a34. Note that in the multilevel model there is only one such coefficient (the slope for a), whereas in this model there is a separate coefficient for each time point.
- F. This is the variance of the intercept, analogous to the variance component for the intercept (B0) in the multilevel model.
- G. This is the variance of the slope for time, analogous to the variance component for the slope for time (B2) in the multilevel model.
- H. This is the covariance of the intercept and slope, analogous to the covariance of B0 and B2 in the multilevel model.
- I. These are the residual variances at the four time points, analogous to sigma-squared (the level-1 residual variance) in the multilevel model. Note that the multilevel model has a single residual variance, whereas the LGCM estimates a separate residual variance at each time point.
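To see how footnotes A through D fit together, here is a small worked example using rounded estimates from the Mplus output above. The values x1 = 1 and x2 = 0 are ours, chosen purely for illustration, and Stata's display command is used only as a calculator:

display 0.570 + 0.557*1 + 0.718*0   // model-implied intercept (I) for a person with x1 = 1, x2 = 0: about 1.127
display 1.010 + 0.264*1 + 0.473*0   // model-implied slope for time (S) for the same person: about 1.274
display 1.127 + 1.274*3             // predicted y at the last time point (time = 3, with a = 0): about 4.95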
Mplus example #2
Here is a second example, a variation that uses equality constraints to make the assumptions of the LGCM more similar to those of the multilevel model. In the input below, the label (1) constrains the four slopes for the time-varying covariate to be equal, and the label (2) constrains the four residual variances to be equal. As before, the two models are still not identical, so the results are analogous rather than identical.
TITLE:
this is an example of a linear growth
model for a continuous outcome with time-
invariant and time-varying covariates
DATA:
FILE IS ex6.10.dat;
VARIABLE:
NAMES ARE y11-y14 x1 x2 a31-a34;
MODEL:
i s | y11@0 y12@1 y13@2 y14@3;
i s ON x1 x2;
y11 ON a31 (1);                  ! the label (1) constrains these four slopes to be equal
y12 ON a32 (1);
y13 ON a33 (1);
y14 ON a34 (1);
y11 y12 y13 y14 (2);             ! the label (2) constrains the four residual variances to be equal
SUMMARY OF ANALYSIS
Number of observations 500
TESTS OF MODEL FIT
Loglikelihood
H0 Value -7261.105
H1 Value -7242.980
Information Criteria
Number of Free Parameters 11
Akaike (AIC) 14544.210
Bayesian (BIC) 14590.571
Sample-Size Adjusted BIC 14555.656
(n* = (n + 2) / 24)
MODEL RESULTS
Estimates S.E. Est./S.E.
I |
Y11 1.000 0.000 0.000
Y12 1.000 0.000 0.000
Y13 1.000 0.000 0.000
Y14 1.000 0.000 0.000
S |
Y11 0.000 0.000 0.000
Y12 1.000 0.000 0.000
Y13 2.000 0.000 0.000
Y14 3.000 0.000 0.000
I ON
X1 0.561B 0.054 10.303
X2 0.717B 0.056 12.867
S ON
X1 0.263D 0.025 10.462
X2 0.473D 0.026 18.395
Y11 ON
A31 0.297E 0.021 13.894
Y12 ON
A32 0.297E 0.021 13.894
Y13 ON
A33 0.297E 0.021 13.894
Y14 ON
A34 0.297E 0.021 13.894
S WITH
I 0.052H 0.031 1.641
Intercepts
Y11 0.000 0.000 0.000
Y12 0.000 0.000 0.000
Y13 0.000 0.000 0.000
Y14 0.000 0.000 0.000
I 0.570A 0.055 10.440
S 1.010C 0.025 40.008
Residual Variances
Y11 0.542I 0.024 22.361
Y12 0.542I 0.024 22.361
Y13 0.542I 0.024 22.361
Y14 0.542I 0.024 22.361
I 1.079F 0.094 11.506
S 0.203G 0.020 10.012
See the footnotes above for descriptions of the results. Exceptions are noted below.
- E. Note how the four coefficients predicting y from a are now constrained to be equal (0.297), essentially matching the single coefficient for a in the multilevel model (G10 = 0.2969). The constraint makes the LGCM more similar to the multilevel model.
- I. Note how the four residual variances are now constrained to be equal (0.542). They are much closer to the single residual variance from the multilevel model (0.54200 in HLM, 0.5416 in Stata), although the two models are still not identical.
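Because this second model is simply the first Mplus model with equality constraints imposed (11 versus 17 free parameters), the two can be compared with a likelihood-ratio test built from the loglikelihood values reported above. A quick check, again using Stata's display only as a calculator:

display 2*(-7255.873 - (-7261.105))                 // likelihood-ratio chi-square: about 10.46
display 17 - 11                                     // degrees of freedom for the test: 6
display chi2tail(6, 2*(-7255.873 - (-7261.105)))    // p-value: about 0.11, so the constraints are not rejected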
