How can I run a growth model in wide form with unequally spaced time points (tscore)?

This page was created using Mplus version 5.2; the output and/or syntax may be different for other versions of Mplus.

Sometimes repeated measures data include measurements at unequal time points. For example, in a medical study, instead of all patients returning for follow-up at 2, 6, 12, and 24 months, the follow-up gaps might vary: one patient might have returned at 1, 5, 7, and 20 months, another at 3, 4, 8, and 12 months, and a third at 2, 5, and 12 months (only three measurement occasions). One common approach is to fit a mixed model with the data structured in long format. However, there may be cases where we want to run a similar analysis using the same data in wide format, for example, to fit a parallel process model. Below we demonstrate how to run such models in wide form. We will start with a relatively simple model and build up to a more complex one. This is mainly for demonstration purposes, but building up a model in steps is good practice in general.

The dataset for this example includes 1000 cases, each with 5 measurement occasions (numbered 0 through 4). The variables t0-t4 give the value of time at each measurement occasion. In this dataset, t0 is equal to zero for all cases, but the first time point could also vary across cases, for example, if the time points were children’s ages and the age at which children entered the study varied. The outcome variables are y0-y4 (one for each measurement occasion), the values of the time varying covariate (predictor) are in the variables a0-a4, and the time invariant covariate (predictor) is the variable x. The dataset is available at https://stats.idre.ucla.edu/wp-content/uploads/2016/02/tscore.dat.

In addition to the Mplus input, we have included Stata commands for equivalent models. The Stata commands assume that the data are in long form and make use of an additional variable, timepoint, calculated using the syntax below.

sort id t
by id: gen timepoint = _n
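
Since the Mplus examples use wide data while the Stata commands use long data, it may help to see how one wide record maps onto long rows. The following is a minimal Python sketch, not part of the original analysis: the function name wide_to_long and the example values are hypothetical, and only the variable names (id, t0-t4, y0-y4, a0-a4, x, timepoint) come from this page.

```python
# Minimal sketch: reshape one wide record into long rows.
# The helper name and the example values are hypothetical.

N_OCCASIONS = 5  # occasions 0..4, as in t0-t4, y0-y4, a0-a4


def wide_to_long(row):
    """Turn one wide record (a dict) into a list of long records."""
    long_rows = []
    for k in range(N_OCCASIONS):
        long_rows.append({
            "id": row["id"],
            "t": row[f"t{k}"],   # value of time at occasion k
            "y": row[f"y{k}"],   # outcome at occasion k
            "a": row[f"a{k}"],   # time varying covariate at occasion k
            "x": row["x"],       # time invariant, repeated on every row
        })
    # Mirror `sort id t` followed by `by id: gen timepoint = _n`.
    long_rows.sort(key=lambda r: r["t"])
    for n, r in enumerate(long_rows, start=1):
        r["timepoint"] = n
    return long_rows


# Made-up values for illustration; these are not taken from tscore.dat.
example = {
    "id": 1, "x": 0.5,
    "t0": 0.0, "t1": 1.2, "t2": 1.9, "t3": 3.1, "t4": 4.4,
    "y0": 10.0, "y1": 11.5, "y2": 12.1, "y3": 14.0, "y4": 15.2,
    "a0": 0.3, "a1": 0.1, "a2": 0.7, "a3": 0.4, "a4": 0.9,
}

for record in wide_to_long(example):
    print(record)
```

Each case contributes five long rows, and timepoint simply numbers the occasions within a case in order of t.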

Modeling the outcome variable over time (intercept and slope for time)

The usevariables option of the variable command lists the variables used in the model, since not all of the variables in the dataset (named in the names option) are used in the current model. The tscores option of the variable command lists the variables giving the value of time at each measurement occasion. The type=random option of the analysis command allows for the estimation of the random intercept and slope. In the model command, the i and s followed by the vertical bar (i.e. | ) specify a random intercept (i) and slope (s). The vertical bar is followed by the list of outcome variables (y0-y4), then the at keyword and the list of time variables (t0-t4), which give the value of time at each measurement occasion. This tells Mplus that y0 is the value of the outcome measured when time is equal to the value given in t0, and so on. Note that the names of the random effects are arbitrary; the type of random effect is determined by the number and order of the terms. The first term (in this case i) specifies the random intercept, the second (in this case s) a random linear slope, a third term a quadratic (squared) term, and a fourth a cubic (cubed) term. Since we have specified two terms, the model includes a random intercept and slope. Had we included only one term, the model would contain only a random intercept; had we included a third term, the model would also include a squared term for growth.
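
To make the role of the number of terms concrete, the sketch below lists the powers of time that multiply each growth factor for a given case and occasion. It is an illustration under stated assumptions rather than anything Mplus runs: the function name time_basis is hypothetical, and the follow-up times are the two example patients from the introduction.

```python
# Minimal sketch: the time terms implied by 1 to 4 growth factors.
# time_basis is a hypothetical helper, not an Mplus function.

def time_basis(t, n_terms):
    """Return [1, t, t**2, ...] with n_terms entries (at most 4)."""
    if not 1 <= n_terms <= 4:
        raise ValueError("n_terms must be between 1 and 4")
    return [t ** power for power in range(n_terms)]


# Follow-up times (months) for the two example patients in the introduction.
patient_a = [1, 5, 7, 20]
patient_b = [3, 4, 8, 12]

# Two terms (i and s): each occasion gets an intercept term and a linear term.
for t in patient_a:
    print("patient A, t =", t, "->", time_basis(t, 2))

# Three terms would add a squared term for the same occasions.
for t in patient_b:
    print("patient B, t =", t, "->", time_basis(t, 3))
```

Because the times differ from case to case, these terms are case-specific, which is why the time values are supplied as variables (t0-t4) rather than as fixed numbers shared by all cases.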

Data:
    file is https://stats.idre.ucla.edu/wp-content/uploads/2016/02/tscore.dat ;
Variable: 
    names are id t0 t1 t2 t3 t4 a0 a1 a2 a3 a4 y0 y1 y2 y3 y4 x;
    usevariables are t0 t1 t2 t3 t4 y0 y1 y2 y3 y4;
    tscores = t0 t1 t2 t3 t4;
Analysis:
    type = random;
Model:
    i s | y0 y1 y2 y3 y4 AT t0 t1 t2 t3 t4;

The same model can be run in Stata using the command:

xtmixed y t || id: t, var cov(un) resid(ind, by(timepoint)) ml

Adding a time invariant covariate

The model shown below is similar to the example above except for two changes. First, in the model command, the line i s on x; indicates that the time invariant covariate x should be used to predict the intercept and slope for growth. Second, we have simplified the syntax by listing sets of variables with dashes rather than writing out the full variable names (e.g. y0-y4, rather than y0 y1…).

Data:
    file is https://stats.idre.ucla.edu/wp-content/uploads/2016/02/tscore.dat ;
Variable:
    names are id t0-t4 a0-a4 y0-y4 x;
    usevariables are t0-t4 y0-y4 x; 
    tscores = t0-t4;
Analysis: 
    type = random ;
Model:
    i s | y0-y4 at t0-t4;
    i s on x;

The same model can be run in Stata using the syntax:

xtmixed y t x c.x#c.t || id: t, var cov(un) resid(ind, by(timepoint)) ml
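
Writing the model out shows why the Stata command needs the c.x#c.t term. The equations below are a sketch in notation chosen for this illustration (the symbols gamma, zeta, and epsilon, and the subscripts k for occasion and j for case, do not appear in the original syntax).

```latex
\begin{aligned}
y_{kj} &= i_j + s_j\, t_{kj} + \varepsilon_{kj} \\
i_j &= \gamma_{00} + \gamma_{01}\, x_j + \zeta_{0j} \\
s_j &= \gamma_{10} + \gamma_{11}\, x_j + \zeta_{1j} \\[4pt]
y_{kj} &= \gamma_{00} + \gamma_{10}\, t_{kj} + \gamma_{01}\, x_j
          + \gamma_{11}\, x_j\, t_{kj}
          + \zeta_{0j} + \zeta_{1j}\, t_{kj} + \varepsilon_{kj}
\end{aligned}
```

The last line is obtained by substituting the second and third equations into the first. In the Stata command, t, x, and c.x#c.t correspond to the coefficients gamma_10, gamma_01, and gamma_11; the part after || id: t corresponds to the random intercept and random slope for time, with cov(un) allowing them to covary; and resid(ind, by(timepoint)) allows the residual variance to differ by occasion.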

Adding time varying covariates

In this model we use the time varying covariates a0-a4 to predict the outcome at each measurement occasion (y0-y4). For example, the line y0 on a0; uses the variable a0 to predict y0. Note that in this model, the relationship (slope) for y on a is allowed to be different at each time point.

Data:
    file is https://stats.idre.ucla.edu/wp-content/uploads/2016/02/tscore.dat ;
Variable:
    names are id t0-t4 a0-a4 y0-y4 x;
    usevariables are t0-t4 y0-y4 x a0-a4;
    tscores = t0-t4;
Analysis:
    type = random ;
Model:
    i s | y0-y4 AT t0-t4;
    y0 on a0;
    y1 on a1;
    y2 on a2;
    y3 on a3;
    y4 on a4;
    i s on x;

The equivalent command in Stata is:

xtmixed y t x c.x#c.t c.a#i.timepoint || id: t, var cov(un) resid(ind, by(timepoint)) ml

Time varying covariates with fixed slopes

Although it is not necessarily the next step in building up our model, you might also want to estimate a model where the relationship between a and y is constant across time. In this model we constrain the slopes for the time varying covariates a0-a4 predicting the outcome at each measurement occasion (y0-y4) to be equal. This is done by placing the same label, (1), after each on statement; parameters that share a label are held equal.

Data:
    file is https://stats.idre.ucla.edu/wp-content/uploads/2016/02/tscore.dat ;
Variable:
    names are id t0-t4 a0-a4 y0-y4 x;
    usevariables are t0-t4 y0-y4 x a0-a4;
    tscores = t0-t4;
Analysis:
    type = random ;
Model:
    i s | y0-y4 AT t0-t4;
    y0 on a0 (1);
    y1 on a1 (1);
    y2 on a2 (1);
    y3 on a3 (1);
    y4 on a4 (1);
    i s on x;

The equivalent command in Stata is:

xtmixed y t x c.x#c.t a || id: t, var cov(un) resid(ind, by(timepoint)) ml
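
The difference between occasion-specific slopes (c.a#i.timepoint in Stata, separate unlabeled on statements in Mplus) and a single common slope (a in Stata, on statements sharing the label (1) in Mplus) can be sketched as follows. This is a minimal illustration: the helper names and all coefficient values are hypothetical and are not estimates from tscore.dat.

```python
# Minimal sketch: fixed-effect columns for the time varying covariate a
# under occasion-specific slopes versus one common slope.
# Helper names and coefficient values are hypothetical.

N_OCCASIONS = 5


def a_columns_free(a, timepoint):
    """One column per occasion; a appears only in its own occasion's column."""
    return [a if tp == timepoint else 0.0 for tp in range(1, N_OCCASIONS + 1)]


def a_columns_equal(a):
    """A single column, so one slope applies at every occasion."""
    return [a]


def a_contribution(columns, coefs):
    """Contribution of a to the predicted outcome for one long-form row."""
    return sum(c * b for c, b in zip(columns, coefs))


FREE_COEFS = [0.25, 0.5, 0.75, 1.0, 1.25]   # made-up slopes, one per occasion
EQUAL_COEF = [0.75]                         # made-up common slope

# A row observed at the third occasion with a = 2.0
print(a_columns_free(2.0, 3))
print(a_contribution(a_columns_free(2.0, 3), FREE_COEFS))
print(a_contribution(a_columns_equal(2.0), EQUAL_COEF))
```

The free specification estimates five slopes for a, while the constrained specification estimates one, so the two models differ by four parameters.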

Adding random slopes for the time varying covariates

This model includes a random effect for the time varying covariates (a0-a4) predicting the outcome at each measurement occasion (y0-y4). In the model command, the random slope is indicated by the name of the random effect (in this case st) followed by a vertical bar (i.e. | ) preceding the desired regression. For example, the line st | y0 on a0; indicates that the slope of a0 predicting y0 is random. Because the same name (st) is used on all five lines, a single random slope applies at every occasion. Looking at the line i s st on x; you can see that we have used the time invariant covariate x to predict the random effect st, along with the random intercept and slope for time (i and s). The with statements (i with s st; and s with st;) estimate the covariances among the three random effects.

Data:
    file is https://stats.idre.ucla.edu/wp-content/uploads/2016/02/tscore.dat;
Variable:
    names are id t0-t4 a0-a4 y0-y4 x;
    usevariables are t0-t4 y0-y4 x a0-a4;
    tscores = t0-t4;
Analysis:
    type = random ;
Model:
    i s | y0-y4 AT t0-t4;
    st | y0 on a0;
    st | y1 on a1;
    st | y2 on a2;
    st | y3 on a3;
    st | y4 on a4;
    i s st on x;
    i with s st;
    s with st;

The equivalent command in Stata is:

xtmixed y t x c.x#c.t a c.x#c.a || id: t a, var cov(un) resid(ind, by(timepoint)) ml
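
The correspondence between the Mplus input and the Stata command for this model can be seen by writing the model out. As a sketch, in notation chosen for this illustration (gamma, zeta, epsilon, and the subscripts k for occasion and j for case are not part of the original syntax):

```latex
\begin{aligned}
y_{kj} &= i_j + s_j\, t_{kj} + st_j\, a_{kj} + \varepsilon_{kj} \\
i_j  &= \gamma_{00} + \gamma_{01}\, x_j + \zeta_{0j} \\
s_j  &= \gamma_{10} + \gamma_{11}\, x_j + \zeta_{1j} \\
st_j &= \gamma_{20} + \gamma_{21}\, x_j + \zeta_{2j} \\[4pt]
y_{kj} &= \gamma_{00} + \gamma_{10}\, t_{kj} + \gamma_{01}\, x_j
          + \gamma_{11}\, x_j\, t_{kj}
          + \gamma_{20}\, a_{kj} + \gamma_{21}\, x_j\, a_{kj} \\
       &\quad + \zeta_{0j} + \zeta_{1j}\, t_{kj} + \zeta_{2j}\, a_{kj}
          + \varepsilon_{kj}
\end{aligned}
```

In the Stata command, a and c.x#c.a correspond to gamma_20 and gamma_21, and listing both t and a after || id: gives each case its own slope for time and for a. The cov(un) option plays the role of the with statements in the Mplus input, allowing the three random effects to covary.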