CFA using Mata optimize | Stata Code Fragments

This code fragment page shows an example using Mata optimize to estimate a confirmatory factor analysis model. The input for the program is the covariance matrix shown below which was obtained from a sample of 200 observations.

         1       2       3       4       5       6
    +-------------------------------------------------+
  1 |  1.032                                          |
  2 |   .473    .915                                  |
  3 |   .492    .528   1.017                          |
  4 |   .269    .194    .235   1.062                  |
  5 |    .18    .252    .249    .472   1.097          |
  6 |   .128     .18    .166     .38    .373   1.126  |
    +-------------------------------------------------+

The code fragment below uses combination of Stata and Mata commands for a two factor CFA model with variables 1 through 3 defining Factor 1 and variables 4 through 6 defining on Factor 2. For this example, we will estimate all six loadings and fix the factor variances at one. In the Mata code, the 13 element vector p contains the parameter values being estimated. The following table describes what each of the elements of p are used for using a quasi-LISREL notation.

p[ 1] - lambda-x[1,1] - variable 1 on factor 1
p[ 2] - lambda-x[2,1] - variable 2 on factor 1
p[ 3] - lambda-x[3,1] - variable 3 on factor 1
p[ 4] - lambda-x[4,2] - variable 4 on factor 2
p[ 5] - lambda-x[5,2] - variable 5 on factor 2
p[ 6] - lambda-x[6,2] - variable 6 on factor 2
p[ 7] - phi[2,1] - correlation between factors 1 and 2
p[ 8] - theta-delta[1,1] - unique factor variance for variable 1
p[ 9] - theta-delta[2,2] - unique factor variance for variable 2
p[10] - theta-delta[3,3] - unique factor variance for variable 3
p[11] - theta-delta[4,4] - unique factor variance for variable 4
p[12] - theta-delta[5,5] - unique factor variance for variable 5
p[13] - theta-delta[6,6] - unique factor variance for variable 6

The code fragment starts off by defining a Mata function called myeval. The last line of myeval gives the discrepancy function which is being optimized (in this case minimized). Following the function definition the covariance matrix is entered as a regular Stata matrix. The program then goes into Mata to call the various optimize commands and options before ending with a series of output commands in Mata.

mata: mata clear  
mata: mata set matastrict off 
mata:
void myeval(todo, p, s, lndetsn, fml, g, H)
{
  l=J(6,2,0)
  l[1..3,1]=p[1..3]'
  l[4..6,2]=p[4..6]'
  phi=I(2)
  phi[2,1]=p[7]
  phi[1,2]=p[7] 
  t=diag(p[8..13])
  sigma = l*phi*l' + t
  fml=log(det(sigma))+trace(invsym(sigma)*s)-lndetsn /* lndetsn = log(det(s))-n */
}
end

/* sample covariance matrix */
mat s = (1.032, .,     .,     .,     .,     .  ///
          .473, .915,  .,     .,     .,     .  ///
          .492, .528, 1.017,  .,     .,     .  /// 
          .269, .194,  .235, 1.062,  .,     .  /// 
          .180, .252,  .249,  .472, 1.097,  .  /// 
          .128, .180,  .166,  .380,  .373, 1.126)

mata:
s=st_matrix("s")
_makesymmetric(s)
n=cols(s)
N=200
lndetsn=log(det(s))+n  /* calculate once */
S=optimize_init()
optimize_init_which(S, "min")
optimize_init_evaluator(S, &myeval())
optimize_init_evaluatortype(S,"d0")
optimize_init_params(S, J(1,13,0.5))
optimize_init_argument(S, 1, s)
optimize_init_argument(S, 2, lndetsn)
p=optimize(S)
v=optimize_result_V(S) 
v=diagonal(sqrt(diag(v)))'/sqrt(N/2)
mff=optimize_result_value(S)

/* output display */
p[1..3]',v[1..3]'      /* lambda-x ksi 1 - se */
p[4..6]',v[4..6]'      /* lambda-x ksi 2 - se */
p[7],v[7]'             /* phi 2,1 - se       */
p[8..13]v[8..13]      /* theta-delta / se   */
1 :- (p[8..13] :/ diagonal(diag(s))') /* r-squared */

printf("Goodness of Fit Statistics n")
df = n*(n+1)/2-cols(p)
chi2=N*mff
printf("degrees of freedom         = %9.0f n", df)
printf("chi-square                 = %9.4f n", chi2)
printf("p-value                    = %9.4f n", chi2tail(df,chi2))
printf("minimum fit function value = %9.4f n", mff)
end

The above program produces the following edited output.

Iteration 0:  f(p) =  .39482661
Iteration 1:  f(p) =  .06234212  (not concave)
Iteration 2:  f(p) =  .04705071  (not concave)
Iteration 3:  f(p) =  .03954061  (not concave)
Iteration 4:  f(p) =  .03362923  (not concave)
Iteration 5:  f(p) =  .02607552  (not concave)
Iteration 6:  f(p) =  .02361655  (not concave)
Iteration 7:  f(p) =    .023476
Iteration 8:  f(p) =  .02265382
Iteration 9:  f(p) =  .02251722  (not concave)
Iteration 10:  f(p) =  .02246136
Iteration 11:  f(p) =  .02235127
Iteration 12:  f(p) =  .02234181
Iteration 13:  f(p) =  .02234179

/* lambda-x ksi 1 - se */
                 1             2
    +-----------------------------+
  1 |  .6650460974   .0742902909  |
  2 |   .710899843   .0697524573  |
  3 |  .7417516958    .073515853  |
    +-----------------------------+

/* lambda-x ksi 2 - se */
                 1             2
    +-----------------------------+
  1 |   .691928814   .0887623678  |
  2 |  .6884025603   .0898251907  |
  3 |  .5401145259   .0874296577  |
    +-----------------------------+

/* phi 2,1 - se       */
                 1             2
    +-----------------------------+
  1 |  .4606859412   .0867052696  |
    +-----------------------------+

/* theta-delta / se   */
                 1             2             3             4             5             6
    +-------------------------------------------------------------------------------------+
  1 |  .5897136906   .4096214137   .4668044224   .5832345176    .623101863    .834276299  |
  2 |  .0779038385   .0694190998   .0768735571   .1030198761   .1050035395    .100803083  |
    +-------------------------------------------------------------------------------------+

/* r-squared */
                 1             2             3             4             5             6
    +-------------------------------------------------------------------------------------+
  1 |  .4285720052   .5523263239   .5409986014   .4508149552   .4319946554   .2590796634  |
    +-------------------------------------------------------------------------------------+

Goodness of Fit Statistics 
chi-square                 =    4.4684 
p-value                    =    0.8126 
minimum fit function value =    0.0223