In the course of an analysis you may wish to save information from a given model, for example to use the output as the basis for a simulation in Mplus or to perform certain types of model diagnostics. The savedata: command allows the user to save information from a model in a text file. This information can then be used by Mplus or read into another statistical package. Unlike the output files, which are formatted for human readers, the files created by savedata: are intended for Mplus or other programs to read. As a result, they are often saved in a relatively unadorned text format, and values are often in scientific notation. This page shows only a few of the savedata: options; see the Mplus manual for a full listing. In some cases more model-specific information can be saved; for example, one can request that factor scores be saved after a confirmatory factor analysis.
Mplus version 5.2 was used for these examples.
1.0 Saving the Data Used in Estimation
The file option of the savedata: command allows you to save the variables used in the analysis to a text file. All variables used in the analysis, including variables that are transformations of other variables, are saved. Categorical variables that have been recoded and weight variables that have been rescaled by Mplus are saved in their new form. Additional variables can be saved using the auxiliary option of the variable: command. The name of the new file follows the file is option; in this case the file name is newdata, with the extension .dat. If no extension is given, the file is produced without one. The input model below is a relatively simple path model, but this command is available for a variety of models. No changes to the model, other than the addition of the savedata: command and file option, are necessary.
Title: Saving data used in estimation
Data:
  file is /mplus/seminars/introMplus_part2/path.dat ;
Variable:
  Names are hs gre col grad;
Model:
  gre on hs col;
  grad on hs col gre;
  hs with col;
savedata:
  file is newdata.dat;
Below is a portion of the output from the above input file. The omitted output is identical to the output from an otherwise identical model run without the savedata: command, because the savedata: command does not change the model. The savedata: command does result in some additional output at the very bottom of the output file (shown). Among other information, the additional output gives the order of variables in the new dataset and the format in which they are saved.
<output omitted>

QUALITY OF NUMERICAL RESULTS

     Condition Number for the Information Matrix              0.348E-04
       (ratio of smallest to largest eigenvalue)

SAVEDATA INFORMATION

  Order and format of variables

    GRE            F10.3
    GRAD           F10.3
    HS             F10.3
    COL            F10.3

  Save file
    newdata.dat

  Save file format
    4F10.3

  Save file record length    5000
The file produced by the file option of the savedata: command contains one line for each case used to estimate our model. The first few lines of the file newdata.dat are shown below.
    52.000    57.000    57.000    41.000
    59.000    61.000    68.000    53.000
    33.000    31.000    44.000    54.000
    44.000    56.000    63.000    47.000
    52.000    61.000    47.000    57.000
<output omitted>
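Because the saved file is plain fixed-width text, it can be read by other software using the variable order and format reported under SAVEDATA INFORMATION. The Python sketch below is one way to do this; it is not part of Mplus. The function names parse_save_format and read_fixed_width are hypothetical, the parser only handles formats of the form 4F10.3, and treating a field containing only * as a missing value is an assumption that should be checked against your own file.

```python
import re


def parse_save_format(fmt):
    """Turn a save format such as '4F10.3' into (field count, field width)."""
    match = re.fullmatch(r"(\d+)F(\d+)\.(\d+)", fmt.strip())
    if match is None:
        raise ValueError(f"unsupported format: {fmt!r}")
    return int(match.group(1)), int(match.group(2))


def read_fixed_width(lines, names, fmt):
    """Read fixed-width records into a list of {variable name: value} dicts."""
    count, width = parse_save_format(fmt)
    if count != len(names):
        raise ValueError("number of names does not match the save format")
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        record = {}
        for i, name in enumerate(names):
            # Slice by position: adjacent fields may have no space between them.
            field = line[i * width:(i + 1) * width].strip()
            # Assumption: a lone '*' marks a missing value.
            record[name] = None if field == "*" else float(field)
        rows.append(record)
    return rows


# Variable order as listed under "Order and format of variables".
names = ["GRE", "GRAD", "HS", "COL"]

# Rebuild the first two records shown above in F10.3 layout for a demo.
demo_lines = [
    "".join(f"{value:10.3f}" for value in record)
    for record in [(52, 57, 57, 41), (59, 61, 68, 53)]
]
rows = read_fixed_width(demo_lines, names, "4F10.3")
print(rows[0])
```

To read the real file, pass an open file object instead of demo_lines, for example with open("newdata.dat") as f: rows = read_fixed_width(f, names, "4F10.3"). Slicing each line by field width, rather than splitting on whitespace, keeps columns aligned even when a wide value fills its whole field.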
1.1 Adding Measures of Influence to Saved Data
The log likelihood distance measure of influence and/or Cook’s D can be requested in conjunction with the file option of the savedata: command. Including save = influence; or save = cooks; adds the log likelihood distance (influence) or Cook’s D (cooks) for each case to the file containing the data used in estimation (i.e., the file specified by the file is option). Below we have used save = influence cooks; to request both measures.
Title: Save data + ll distance + Cook's D
Data:
  file is /mplus/seminars/introMplus_part2/path.dat ;
Variable:
  Names are hs gre col grad;
Model:
  gre on hs col;
  grad on hs col gre;
  hs with col;
savedata:
  file is influence.dat;
  save = influence cooks;
Below is a portion of the output from the above input file. The output is similar to that from savedata: with only the file option, except that two additional variables, outinfl and outcook, are included in the saved dataset.
SAVEDATA INFORMATION

  Order and format of variables

    GRE            F10.3
    GRAD           F10.3
    HS             F10.3
    COL            F10.3
    OUTINFL        F10.3
    OUTCOOK        F10.3

  Save file
    influence.dat

  Save file format
    6F10.3

  Save file record length    5000
As with the previous example, the file influence.dat contains one line for each case used to estimate the model. The first few lines of the file influence.dat are shown below. Note that the file now contains six variables, each in its own column: the four observed variables, plus two variables containing the values of the influence statistics for each case.
    52.000    57.000    57.000    41.000     0.075     0.074
    59.000    61.000    68.000    53.000     0.054     0.054
    33.000    31.000    44.000    54.000     0.276     0.270
    44.000    56.000    63.000    47.000     0.114     0.113
    52.000    61.000    47.000    57.000     0.036     0.036
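Once the influence statistics are in the saved file, a common next step is to sort the cases so that the most influential ones can be inspected first. The Python sketch below illustrates this outside of Mplus. The function name rank_by_influence is hypothetical, the column positions follow the variable order reported in the SAVEDATA INFORMATION output (OUTCOOK is the sixth field, index 5), and case numbers are simply line positions in the saved file, which is an assumption about how you want to identify cases.

```python
def rank_by_influence(lines, column, width=10):
    """Return (case number, value) pairs, most influential case first.

    `column` is the zero-based position of the influence statistic and
    `width` is the field width from the save format (10 for F10.3).
    Case numbers are 1-based positions among the non-blank lines.
    """
    records = [line.rstrip("\n") for line in lines if line.strip()]
    values = []
    for case, line in enumerate(records, start=1):
        field = line[column * width:(column + 1) * width]
        values.append((case, float(field)))
    return sorted(values, key=lambda pair: pair[1], reverse=True)


# Rebuild the five records shown above in 6F10.3 layout for a demo.
records = [
    (52, 57, 57, 41, 0.075, 0.074),
    (59, 61, 68, 53, 0.054, 0.054),
    (33, 31, 44, 54, 0.276, 0.270),
    (44, 56, 63, 47, 0.114, 0.113),
    (52, 61, 47, 57, 0.036, 0.036),
]
demo_lines = ["".join(f"{value:10.3f}" for value in rec) for rec in records]

ranked = rank_by_influence(demo_lines, column=5)  # column 5 is OUTCOOK
for case, cooks_d in ranked:
    print(case, cooks_d)
```

Among the five cases shown, the third has the largest value on both measures, so it appears first in the ranking. Passing column=4 ranks by OUTINFL instead.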
2.0 Saving Sample Statistics (Correlation and Covariance Matrices)
The sample option of the savedata: command saves a sample correlation or covariance matrix in a text file. By default a covariance matrix is produced if all of the variables are continuous, and a correlation matrix is produced if the variables are categorical or a mix of categorical and continuous. The sample option both requests the additional output and specifies the name of the file, in this case, sampledata.dat. The input file below includes the savedata: command with the sample option.
Title: Saving correlation and covariance matrices
Data:
  file is /mplus/seminars/introMplus_part2/path.dat ;
Variable:
  Names are hs gre col grad;
Model:
  gre on hs col;
  grad on hs col gre;
  hs with col;
savedata:
  sample is sampledata.dat;
The output associated with the sample option of the savedata: command is shown below.
SAVEDATA INFORMATION

  Sample/H1/Pooled-Within Matrix

  Save file
    sampledata.dat
  Save type    COVARIANCE
  Save format  Free
The entire contents of the file sampledata.dat are shown below. The file contains two lines of five values each, for a total of ten values. This is the number of unique covariances/correlations in a matrix with four variables: the number of unique values in a covariance matrix is n*(n+1)/2, where n is the number of variables, and 4*5/2 = 10. Note that the values are given in scientific notation.
0.89394375E+02 0.61236125E+02 0.11468097E+03 0.57706750E+02 0.68066850E+02
0.10459710E+03 0.54555125E+02 0.54488775E+02 0.63296650E+02 0.87328975E+02
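To use these values in another program, the ten numbers need to be placed back into a 4 x 4 symmetric matrix. The Python sketch below does this under one assumption: the values are the lower triangle of the matrix, written row by row. The positions of the largest values (1st, 3rd, 6th, and 10th, where the variances would fall) are consistent with this, but the layout should be confirmed against the sample statistics in the Mplus output. The function name unpack_lower_triangle is hypothetical.

```python
import math


def unpack_lower_triangle(text):
    """Rebuild a full symmetric matrix from whitespace-separated values.

    Assumes the values are the lower triangle (diagonal included),
    listed row by row; line breaks in the file are ignored.
    """
    values = [float(token) for token in text.split()]
    # Solve n*(n+1)/2 == len(values) for the number of variables n.
    n = (math.isqrt(8 * len(values) + 1) - 1) // 2
    if n * (n + 1) // 2 != len(values):
        raise ValueError("value count is not of the form n*(n+1)/2")
    matrix = [[0.0] * n for _ in range(n)]
    remaining = iter(values)
    for i in range(n):
        for j in range(i + 1):
            value = next(remaining)
            matrix[i][j] = value
            matrix[j][i] = value  # mirror into the upper triangle
    return matrix


# Contents of sampledata.dat as shown above.
saved = """
0.89394375E+02 0.61236125E+02 0.11468097E+03 0.57706750E+02 0.68066850E+02
0.10459710E+03 0.54555125E+02 0.54488775E+02 0.63296650E+02 0.87328975E+02
"""

cov = unpack_lower_triangle(saved)
for row in cov:
    print(" ".join(f"{value:10.4f}" for value in row))
```

Python's float() reads the scientific notation directly, so no conversion of the saved values is needed before parsing.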