Data From Table 16.3, page 355
The data from Table 16.3 can be set up in two ways. The ‘narrow format’
(table16_3), also called the long format below, enters each score on a separate
record, while the ‘wide format’ (table16_3w) enters all of a subject's scores on
the within-subjects variable on the same record. SAS’s proc glm can analyze
either format, while proc mixed requires the narrow format.
data table16_3;
  input a s y;
  datalines;
1 1 745
2 1 764
3 1 774
1 2 777
2 2 786
3 2 788
1 3 734
2 3 733
3 3 763
1 4 779
2 4 801
3 4 797
1 5 756
2 5 786
3 5 785
1 6 721
2 6 732
3 6 740
;
run;

data table16_3w;
  input s a1 a2 a3;
  datalines;
1 745 764 774
2 777 786 788
3 734 733 763
4 779 801 797
5 756 786 785
6 721 732 740
;
run;
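Because the two data sets hold the same scores, the wide layout can also be derived from the narrow one rather than typed in a second time. The following is a minimal sketch, assuming table16_3 has been created as above; the data set names table16_3_sorted and table16_3w_check are hypothetical and are not part of the original example.

```sas
/* Sketch: build a wide copy from the narrow data set.
   table16_3_sorted and table16_3w_check are hypothetical names. */
proc sort data = table16_3 out = table16_3_sorted;
  by s a;
run;

proc transpose data = table16_3_sorted
               out = table16_3w_check (drop = _name_)
               prefix = a;
  by s;     /* one output record per subject */
  id a;     /* the level of a supplies the suffix: a1, a2, a3 */
  var y;
run;

/* Optional check against the hand-entered wide data set */
proc compare base = table16_3w compare = table16_3w_check;
run;
```

Deriving one layout from the other keeps the two data sets from drifting apart if a score is later corrected in only one of them.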
Table 16.3, page 355. Summary of the analysis of variance for a single-factor within-subject design.
This example will be solved four ways.
With proc glm there are two ways to evaluate the effect of factor A when the data are in long (narrow) format.
The first way approaches the ANOVA as a simple two-factor design, treating subjects ‘s’ as a blocking factor.
/* additive model: a and s as main effects; the residual is the a*s interaction */
proc glm data = table16_3;
  class a s;
  model y = a s / ss3;
run;
quit;
The GLM Procedure
Class Level Information
Class Levels Values
a 3 1 2 3
s 6 1 2 3 4 5 6
Number of observations 18
Dependent Variable: y
Source DF Sum of Squares Mean Square F Value Pr > F
Model 7 10122.83333 1446.11905 26.50 <.0001
Error 10 545.66667 54.56667
Corrected Total 17 10668.50000
R-Square Coeff Var Root MSE y Mean
0.948853 0.966243 7.386925 764.5000
Source DF Type III SS Mean Square F Value Pr > F
a 2 1575.000000 787.500000 14.43 0.0011
s 5 8547.833333 1709.566667 31.33 <.0001
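As a check on the output above, the F ratios can be reproduced by hand from the printed mean squares. This is a worked sketch using only values from the Type III table; the subscripts (A for factor a, S for subjects, error for the residual) are notation introduced here for illustration.

```latex
\begin{align*}
F_A &= \frac{MS_A}{MS_{\text{error}}} = \frac{787.5}{54.56667} \approx 14.43,
    & df &= (2,\ 10) \\
F_S &= \frac{MS_S}{MS_{\text{error}}} = \frac{1709.56667}{54.56667} \approx 31.33,
    & df &= (5,\ 10)
\end{align*}
```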
The second approach models both factors and their interaction (the ‘|’ in the model statement expands a|s to a, s, and a*s). Because this model leaves no degrees of freedom for error, the test statement must explicitly name the numerator effect (h=) and the error term used as the denominator (e=).
/* full factorial model; test a against the a*s interaction */
proc glm data = table16_3;
  class a s;
  model y = a|s / ss3;
  test h = a e = a*s;
run;
quit;
The GLM Procedure
Class Level Information
Class Levels Values
a 3 1 2 3
s 6 1 2 3 4 5 6
Number of observations 18
Dependent Variable: y
Source DF Sum of Squares Mean Square F Value Pr > F
Model 17 10668.50000 627.55882 . .
Error 0 0.00000 .
Corrected Total 17 10668.50000
R-Square Coeff Var Root MSE y Mean
1.000000 . . 764.5000
Source DF Type III SS Mean Square F Value Pr > F
a 2 1575.000000 787.500000 . .
s 5 8547.833333 1709.566667 . .
a*s 10 545.666667 54.566667 . .
Tests of Hypotheses Using the Type III MS for a*s as an Error Term
Source DF Type III SS Mean Square F Value Pr > F
a 2 1575.000000 787.500000 14.43 0.0011
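The missing F values in the overall table, and the agreement with the first approach, both follow from a degrees-of-freedom count. With one score per combination of a and s, the three effects use up all N - 1 = 17 degrees of freedom, so nothing is left for a residual and the a*s interaction has to serve as the error term. A worked sketch using the printed values (the subscript notation is introduced here for illustration):

```latex
\begin{align*}
df_A + df_S + df_{A \times S} &= 2 + 5 + 10 = 17 = N - 1,
    \qquad df_{\text{error}} = 0 \\
SS_{A \times S} &= SS_{\text{total}} - SS_A - SS_S
    = 10668.5 - 1575 - 8547.8333 = 545.6667 \\
F_A &= \frac{MS_A}{MS_{A \times S}} = \frac{787.5}{54.56667} \approx 14.43
\end{align*}
```

The interaction sum of squares equals the Error sum of squares from the additive model, which is why both approaches report the same F for a.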
A third method is to use proc glm with the data in wide format. The left side of the model statement must list the dependent variables that form the levels of the within-subjects factor. The repeated statement indicates that these variables are to be treated as the levels of a within-subjects factor, here named a with 3 levels.
/* wide format: a1-a3 are the three levels of the within-subjects factor a */
proc glm data = table16_3w;
  model a1 a2 a3 = / ss3;
  repeated a 3;
run;
quit;
[a-level output omitted]
The GLM Procedure
Repeated Measures Analysis of Variance
Repeated Measures Level Information
Dependent Variable a1 a2 a3
Level of a 1 2 3
Manova Test Criteria and Exact F Statistics for the Hypothesis of no a Effect
H = Type III SSCP Matrix for a
E = Error SSCP Matrix
S=1 M=0 N=1
Statistic Value F Value Num DF Den DF Pr > F
Wilks' Lambda 0.08092840 22.71 2 4 0.0065
Pillai's Trace 0.91907160 22.71 2 4 0.0065
Hotelling-Lawley Trace 11.35660105 22.71 2 4 0.0065
Roy's Greatest Root 11.35660105 22.71 2 4 0.0065
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects
Source DF Type III SS Mean Square F Value Pr > F Adj Pr > F (G - G) Adj Pr > F (H - F)
a 2 1575.000000 787.500000 14.43 0.0011 0.0029 0.0011
Error(a) 10 545.666667 54.566667
Greenhouse-Geisser Epsilon 0.8052
Huynh-Feldt Epsilon 1.1302
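Two parts of this output can be checked by hand. The multivariate F can be recovered from Wilks' Lambda (exact here because S=1), and the Greenhouse-Geisser adjustment amounts to multiplying both univariate degrees of freedom by the estimated epsilon before looking up the p-value. A worked sketch using the printed values:

```latex
\begin{align*}
F_{\text{Wilks}} &= \frac{1 - \Lambda}{\Lambda} \cdot \frac{df_{\text{den}}}{df_{\text{num}}}
    = \frac{0.91907160}{0.08092840} \cdot \frac{4}{2}
    \approx 11.3566 \times 2 \approx 22.71 \\
df_{\text{num}}^{\,GG} &= \hat{\varepsilon}_{GG} \cdot 2 = 0.8052 \times 2 = 1.6104 \\
df_{\text{den}}^{\,GG} &= \hat{\varepsilon}_{GG} \cdot 10 = 0.8052 \times 10 = 8.052
\end{align*}
```

The Huynh-Feldt estimate of 1.1302 exceeds 1, and the H - F adjusted p-value in the output equals the unadjusted one (0.0011), which is consistent with the estimate being treated as 1 so that no adjustment is made.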
A fourth method is to use proc mixed, which requires the data in long format. The class statement must name both the factor and the subject identifier, but the model statement includes only the non-subject factors. The repeated statement indicates that the data come from a repeated measures (within-subjects) design. The subject=s option indicates that the variable ‘s’ identifies the different subjects, and type=cs specifies the structure of the covariance matrix; in this instance it is assumed to be compound symmetric.
/* long format: s identifies subjects; compound symmetry within subject */
proc mixed data = table16_3;
  class a s;
  model y = a;
  repeated / subject = s type = cs;
run;
quit;
The Mixed Procedure
Model Information
Data Set WORK.TABLE16_3
Dependent Variable y
Covariance Structure Compound Symmetry
Subject Effect s
Estimation Method REML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Between-Within
Class Level Information
Class Levels Values
a 3 1 2 3
s 6 1 2 3 4 5 6
Dimensions
Covariance Parameters 2
Columns in X 4
Columns in Z 0
Subjects 6
Max Obs Per Subject 3
Observations Used 18
Observations Not Used 0
Total Observations 18
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 144.05240866
1 1 125.15764239 0.00000000
Convergence criteria met.
Covariance Parameter Estimates
Cov Parm Subject Estimate
CS s 551.67
Residual 54.5667
Fit Statistics
-2 Res Log Likelihood 125.2
AIC (smaller is better) 129.2
AICC (smaller is better) 130.2
BIC (smaller is better) 128.7
Null Model Likelihood Ratio Test
DF Chi-Square Pr > ChiSq
1 18.89 <.0001
Type 3 Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
a 2 10 14.43 0.0011
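The covariance parameter estimates above can be tied back to the mean squares from the proc glm runs. For balanced data such as these, the REML estimates coincide with the usual ANOVA estimates of the variance components. A worked sketch, where a = 3 is the number of levels of the within-subjects factor and the sigma symbols are notation introduced here:

```latex
\begin{align*}
\hat{\sigma}^2_{\text{Residual}} &= MS_{A \times S} = 54.5667 \\
\hat{\sigma}_{CS} &= \frac{MS_S - MS_{A \times S}}{a}
    = \frac{1709.5667 - 54.5667}{3} = \frac{1655}{3} \approx 551.67
\end{align*}
```

Since the residual variance equals the a*s mean square, the Type 3 test of a has the same F (14.43) and the same degrees of freedom (2, 10) as the proc glm analyses.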
Table 16.6, page 360. Testing a within-subject contrast in a single-factor within-subject design
The simplest way to test a within-subject contrast in a single-factor within-subject design with narrow-format data in proc glm is to create a new variable that holds the contrast weights for the levels of the within-subject factor. This variable is treated as continuous and is crossed with the subject variable in the model statement (NOTE: the factor that the contrast is defined over is no longer in the model). In the test statement, the main effect of the contrast variable is tested against the interaction between subject and the contrast variable.
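/* add the contrast weights: level 1 versus the average of levels 2 and 3 */
data table16_3;
  set table16_3;
  if a=1 then c=-1;
  if (a=2 or a=3) then c=.5;
run;

/* c is not in the class statement, so it is treated as continuous */
proc glm data = table16_3;
  class s;
  model y = s|c / ss3;
  test h = c e = c*s;
run;
quit;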
The GLM Procedure
Class Level Information
Class Levels Values
s 6 1 2 3 4 5 6
Number of observations 18
Dependent Variable: y
Source DF Sum of Squares Mean Square F Value Pr > F
Model 11 10126.00000 920.54545 10.18 0.0049
Error 6 542.50000 90.41667
Corrected Total 17 10668.50000
R-Square Coeff Var Root MSE y Mean
0.949149 1.243789 9.508768 764.5000
Source DF Type III SS Mean Square F Value Pr > F
s 5 8547.833333 1709.566667 18.91 0.0013
c 1 1406.250000 1406.250000 15.55 0.0076
c*s 5 171.916667 34.383333 0.38 0.8460
Tests of Hypotheses Using the Type III MS for c*s as an Error Term
Source DF Type III SS Mean Square F Value Pr > F
c 1 1406.250000 1406.250000 40.90 0.0014
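The sum of squares and F for the contrast can be reproduced by hand. The level means 752, 767 and 774.5 are computed here from the data listing (they are not printed in the output), n = 6 is the number of subjects, and the psi notation is introduced for illustration.

```latex
\begin{align*}
\hat{\psi} &= (-1)(752) + (0.5)(767) + (0.5)(774.5) = 18.75 \\
SS_{\psi} &= \frac{n\,\hat{\psi}^{2}}{\sum_j c_j^{2}}
    = \frac{6\,(18.75)^{2}}{(-1)^{2} + (0.5)^{2} + (0.5)^{2}}
    = \frac{2109.375}{1.5} = 1406.25 \\
F_{\psi} &= \frac{SS_{\psi}/1}{MS_{c \times S}} = \frac{1406.25}{34.38333} \approx 40.90,
    \qquad df = (1,\ 5)
\end{align*}
```

Only the test against c*s (F = 40.90) uses the contrast-specific error term. The F of 15.55 in the main table divides by a residual (Error, 6 df) that is left over after c and c*s are removed, so it is not the within-subject test.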
With wide-format data, the contrast on the within-subject factor is specified through the manova statement in proc glm. The h= option names the effects in the preceding model to use as hypothesis matrices; _ALL_ requests tests for all effects in the model statement (here only the intercept, since the model has no independent variables). The m= option defines the contrast as a linear transformation of the dependent variables.
/* m= applies the weights -1, .5, .5 to a1, a2, a3 */
proc glm data = table16_3w;
  model a1 a2 a3 = / ss3;
  repeated a 3;
  manova h = _ALL_ m = (-1 .5 .5);
run;
quit;
[a-level output omitted]
Repeated Measures Analysis of Variance
Repeated Measures Level Information
Dependent Variable a1 a2 a3
Level of a 1 2 3
Manova Test Criteria and Exact F Statistics for the Hypothesis of no a Effect
H = Type III SSCP Matrix for a
E = Error SSCP Matrix
S=1 M=0 N=1
Statistic Value F Value Num DF Den DF Pr > F
Wilks' Lambda 0.08092840 22.71 2 4 0.0065
Pillai's Trace 0.91907160 22.71 2 4 0.0065
Hotelling-Lawley Trace 11.35660105 22.71 2 4 0.0065
Roy's Greatest Root 11.35660105 22.71 2 4 0.0065
Univariate Tests of Hypotheses for Within Subject Effects
Source DF Type III SS Mean Square F Value Pr > F Adj Pr > F (G - G) Adj Pr > F (H - F)
a 2 1575.000000 787.500000 14.43 0.0011 0.0029 0.0011
Error(a) 10 545.666667 54.566667
Greenhouse-Geisser Epsilon 0.8052
Huynh-Feldt Epsilon 1.1302
M Matrix Describing Transformed Variables
a1 a2 a3
MVAR1 -1 0.5 0.5
Multivariate Analysis of Variance
Characteristic Roots and Vectors of: E Inverse * H, where
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
Variables have been transformed by the M Matrix
Characteristic Characteristic Vector V'EV=1
Root Percent MVAR1
8.17983519 100.00 0.06227237
MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
on the Variables Defined by the M Matrix Transformation
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
S=1 M=-0.5 N=1.5
Statistic Value F Value Num DF Den DF Pr > F
Wilks' Lambda 0.10893442 40.90 1 5 0.0014
Pillai's Trace 0.89106558 40.90 1 5 0.0014
Hotelling-Lawley Trace 8.17983519 40.90 1 5 0.0014
Roy's Greatest Root 8.17983519 40.90 1 5 0.0014
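Because the M matrix reduces the three scores to a single transformed variable, the intercept test above is equivalent to a one-sample t test on each subject's contrast score. The following is a minimal sketch of that cross-check, assuming table16_3w exists as created earlier; the data set name contrast_scores and the variable psi are hypothetical names introduced here.

```sas
/* Hypothetical names: contrast_scores and psi are introduced for illustration */
data contrast_scores;
  set table16_3w;
  psi = -1*a1 + 0.5*a2 + 0.5*a3;   /* same weights as m = (-1 .5 .5) */
run;

/* One-sample t test of the mean contrast score against zero */
proc ttest data = contrast_scores h0 = 0;
  var psi;
run;
```

The t statistic has 5 degrees of freedom, and its square should match the F of 40.90 reported above (t of roughly 6.40), with the same p-value.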
