This is a draft version of this chapter. Comments and suggestions to improve this draft are welcome.
Chapter Outline
6.1. Analysis with 2 categorical variables
6.2. Simple effects
6.2.1 Analyzing Simple Effects Using MANOVA and GLM
6.2.2 Analyzing Simple Effects Using REGRESSION
6.3. Simple Comparisons
6.3.1 Analyzing Simple Comparisons Using MANOVA and GLM
6.3.2 Analyzing Simple Comparisons Using REGRESSION
6.4. Partial Interaction
6.4.1 Analyzing partial interactions Using MANOVA and GLM
6.4.2 Analyzing partial interactions Using REGRESSION
6.5. Interaction contrasts
6.5.1 Analyzing interaction contrasts using MANOVA and GLM
6.5.2 Analyzing interaction contrasts using REGRESSION
6.6 Computing Adjusted Means
6.6.1 Computing Adjusted Means via MANOVA and GLM
6.6.2 Computing Adjusted Means via REGRESSION
6.7 More Details on Meaning of the Coefficients
6.8 Simple Effects via Dummy Coding vs. Effect Coding
6.8.1 Example 1. Simple effects of yr_rnd at levels of mealcat
6.8.2 Example 2. Simple effects of mealcat at levels of yr_rnd
For this chapter we will use the elemapi2 data file that we have been using in prior chapters. We will focus on the variables mealcat and collcat as they relate to the outcome variable api00 (performance on the api in the year 2000). The variable mealcat is the variable meals broken up into 3 categories, and the variable collcat is the variable some_col broken into 3 categories. We could think of mealcat as being the percentage of students receiving free meals, broken up into low, middle and high. The variable collcat can be thought of as the number of parents with some college education, broken up into low, medium and high. For our analysis, we think that both mealcat and collcat may be related to api00, but it is also possible that the impact of mealcat might depend on the level of collcat. In other words, we think that there might be an interaction of these two categorical variables. In this chapter we will look at how these two categorical variables are related to api performance in the school, and we will look at their interaction as well. We will see that there is an interaction of these categorical variables, and we will focus on different ways of further exploring it.
We will first input the elemapi2 data file and have a quick look at the three variables we are interested in.
get file = "c:\spssreg\elemapi2.sav".
means tables = api00 by mealcat by collcat /cells = mean.
We drop the label for mealcat because this can get in the way at some of the points we will be demonstrating.
value labels mealcat.
6.1. Analysis with 2 categorical variables
One traditional way to analyze this would be to perform a 3 by 3 factorial analysis of variance using the glm command, as shown below. The results show a main effect of collcat (F=4.5, p=0.0117), a main effect of mealcat (F=509.04, p=0.0000) and an interaction of collcat by mealcat (F=6.63, p=0.0000).
glm api00 by collcat mealcat /plot = profile( mealcat*collcat ) /emmeans = tables(collcat*mealcat).
The option emmeans (which stands for Estimated Marginal Means) gives the adjusted means broken down by collcat and mealcat shown below.
We can show a graph of the adjusted means as shown below. This is done with the option plot in glm procedure.
We can do these same analyses using the regression command. Below we first create simple coding for both variables collcat and mealcat, along with the product terms for their interaction. Then we run the regression procedure on those variables.
recode collcat (1=-.66667) (2=.33333) (3=.33333) into x2.
recode collcat (1=.33333) (2=-.66667) (3=.33333) into x3.
recode mealcat (1=-.66667) (2=.33333) (3=.33333) into y2.
recode mealcat (1=.33333) (2=-.66667) (3=.33333) into y3.
compute x2y2 = x2*y2.
compute x2y3 = x2*y3.
compute x3y2 = x3*y2.
compute x3y3 = x3*y3.
execute.
regression
  /dependent api00
  /method=enter x2 x3 y2 y3 x2y2 x2y3 x3y2 x3y3.
We use the test keyword on the method subcommand to test the two terms associated with collcat jointly, which gives the main effect of collcat.
regression /dependent api00 /method=enter y2 y3 x2y2 x2y3 x3y2 x3y3 /method = test(x2 x3).
Likewise, we use the test keyword to get the test of the main effect of mealcat.
regression /dependent api00 /method=enter x2 x3 x2y2 x2y3 x3y2 x3y3 /method = test(y2 y3).
Finally, we use the test keyword to test the interaction of collcat by mealcat.
regression /dependent api00 /method=enter x2 x3 y2 y3 /method = test(x2y2 x2y3 x3y2 x3y3).
First, note that the results of the tests correspond to those from the glm command above. This is because collcat and mealcat were coded using simple coding, a coding scheme where the contrasts sum to 0. If dummy coding had been used instead, the tests for mealcat and collcat from the regression command would not have corresponded to the glm results. In addition to simple coding, we could have used deviation or Helmert coding schemes and the results of the tests would have matched the anova results from the glm command, although the meaning of the individual tests would have been different. This point will be explored in more detail later in this chapter.
We can obtain the adjusted means by using the save subcommand to store the predicted values in a variable called pred, and then looking at the mean of pred broken down by collcat and mealcat.
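The sum-to-zero property mentioned above is easy to check by hand. The following is only a small sketch in Python (not part of the SPSS analysis). It uses exact fractions in place of the rounded values -.66667 and .33333, and the dummy variables d1 and d2 are hypothetical, introduced here purely for contrast.

```python
from fractions import Fraction as F

# Codes for levels 1, 2, 3, mirroring the recode statements for x2 and x3
# (exact thirds are assumed in place of the rounded .33333 / -.66667).
x2 = [F(-2, 3), F(1, 3), F(1, 3)]
x3 = [F(1, 3), F(-2, 3), F(1, 3)]

# Hypothetical 0/1 dummy coding with level 3 as the reference group.
d1 = [1, 0, 0]
d2 = [0, 1, 0]

def sums_to_zero(codes):
    """Return True when the codes add up to zero across the three levels."""
    return sum(codes) == 0

for name, codes in [("x2", x2), ("x3", x3), ("d1", d1), ("d2", d2)]:
    print(name, sums_to_zero(codes))
```

The coded variables used in the regression pass the check, while the dummy variables do not, which is the distinction the paragraph above draws.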
regression /dependent api00 /method=enter x2 x3 x2y2 x2y3 x3y2 x3y3 y2 y3 /save pred(pred). means pred by mealcat by collcat.
The graph of the cell means from glm procedure illustrates the interaction between collcat and mealcat. The graph shows the 3 levels of collcat as 3 different lines, and the 3 levels of mealcat as the 3 values on the x axis of the graph. We can see that the effect of collcat differs based on the level of mealcat. For example, when mealcat is low, schools where collcat is 3 have the lowest api00 scores, as compared to schools that are medium or high on mealcat, where schools with collcat of 3 have the highest api00 scores.
Let’s investigate this interaction further by looking at the simple effects of collcat at each level of mealcat.
6.2. Simple effects
We found that the main effect of collcat was significant, but because we have an interaction the effect of collcat depends on the level of mealcat. We might want to ask whether the effect of collcat is significant at each level of mealcat.
6.2.1 Analyzing Simple Effects Using MANOVA and GLM
In SPSS, we can use either the MANOVA procedure or the GLM procedure to look at the simple effects of a variable. For example, to look at the simple effect of collcat at the different levels of mealcat, we can use the following MANOVA statement.
manova api00 by collcat(1,3) mealcat(1,3)
  /error = w
  /design = mealcat
            collcat within mealcat(1)
            collcat within mealcat(2)
            collcat within mealcat(3).

* * * * * * A n a l y s i s   o f   V a r i a n c e * * * * * *

   400 cases accepted.
     0 cases rejected because of out-of-range factor values.
     0 cases rejected because of missing data.
     9 non-empty cells.
     1 design will be processed.

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Tests of Significance for API00 using UNIQUE sums of squares
Source of Variation                 SS    DF         MS        F  Sig of F
WITHIN CELLS                1829957.19   391    4680.20
MEALCAT                     4764843.56     2  2382421.8   509.04      .000
COLLCAT WITHIN MEALCAT(1)     50909.25     2   25454.62     5.44      .005
COLLCAT WITHIN MEALCAT(2)     68628.74     2   34314.37     7.33      .001
COLLCAT WITHIN MEALCAT(3)     29979.15     2   14989.57     3.20      .042
(Model)                     6243714.81     8  780464.35   166.76      .000
(Total)                     8073672.00   399   20234.77

R-Squared =           .773
Adjusted R-Squared =  .769
We can also use the glm procedure with the emmeans subcommand. The compare option gives the simple effect of collcat at each level of mealcat.
glm api00 by collcat mealcat /emmeans tables(collcat*mealcat) compare(collcat).
This shows that collcat is significant at each level of mealcat if we use an alpha level of 0.05. We should note that since we are doing a number of additional tests, you might want to consider a post hoc correction, such as a Bonferroni correction, to guard against Type I errors.
In summary, all 3 of the simple effects of collcat at each level of mealcat were significant at the 0.05 level. However, the effect of collcat when mealcat was 3 (p = .042) might not be significant if we used a post hoc criterion for evaluating its significance.
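To make the post hoc point concrete, here is a minimal Python sketch of a Bonferroni check on the three simple effects. The p-values are the rounded "Sig of F" values from the MANOVA table above; the function name is hypothetical, and Bonferroni is only one of several corrections one might choose.

```python
# Rounded p-values for collcat within mealcat = 1, 2 and 3,
# read off the "Sig of F" column of the MANOVA table above.
p_values = {1: 0.005, 2: 0.001, 3: 0.042}
alpha = 0.05

def bonferroni_threshold(alpha, n_tests):
    """Per-test alpha level when n_tests tests share an overall alpha."""
    return alpha / n_tests

threshold = bonferroni_threshold(alpha, len(p_values))

# A simple effect counts as significant only if its p-value is below the threshold.
significant = {level: p < threshold for level, p in p_values.items()}

print(round(threshold, 4))
print(significant)
```

Under this criterion the per-test level is 0.05/3, so the simple effects at mealcat 1 and 2 remain significant while the one at mealcat 3 does not.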
6.2.2 Analyzing Simple Effects Using REGRESSION
We have demonstrated how to test the simple effect of collcat at each level of mealcat using the MANOVA and GLM procedures in the previous section, that is, through the ANOVA approach. We can also obtain the same analysis through the regression approach; after all, ANOVA is regression. In the regression approach, we create the coding for collcat, mealcat and their interaction, and the coding scheme is specific to the effect we want to see. In this section we will do an analysis parallel to the previous section, so we want the simple effect of collcat at each level of mealcat. We will use simple coding for mealcat, though in our case the type of coding for mealcat does not really matter. The scheme for simple coding is shown in Chapter 5. The reference group for mealcat is group 1.
recode mealcat (1=.33333) (2=.33333) (3=-.66667) into mcat1. recode mealcat (1= .33333) (2=-.66667) (3=.33333) into mcat2.
We use helmert coding for collcat. We should note that these terms are not used in the analysis, but are used for creating the simple effects of collcat at each level of mealcat.
recode collcat (1=.66667) (2=-.33333) (3=-.33333) into ccat1.
recode collcat (1=0) (2=.5) (3=-.5) into ccat2.
compute c1m1 = 0.
compute c2m1 = 0.
compute c1m2 = 0.
compute c2m2 = 0.
compute c1m3 = 0.
compute c2m3 = 0.
if (mealcat = 1) c1m1 = ccat1.
if (mealcat = 1) c2m1 = ccat2.
if (mealcat = 2) c1m2 = ccat1.
if (mealcat = 2) c2m2 = ccat2.
if (mealcat = 3) c1m3 = ccat1.
if (mealcat = 3) c2m3 = ccat2.
Now that we have seen the Helmert coding for collcat, we can see how this is used to create the simple effects of collcat at each level of mealcat. First, we look at the two comparisons of collcat at mealcat of 1. Note that the coding is the same as we saw above, but only when mealcat is 1; otherwise these variables are coded 0. Likewise, we look at the terms that form the effects of collcat when mealcat is 2, and we see that the variables are coded the same way when mealcat is 2, and otherwise 0. The same is true for the case when mealcat is 3. The following matrix is the coding we just used for all the interaction terms.
collcat | mealcat | c1m1 | c2m1 | c1m2 | c2m2 | c1m3 | c2m3 |
1 | 1 | 2/3 | 0 | 0 | 0 | 0 | 0 |
2 | 1 | -1/3 | 1/2 | 0 | 0 | 0 | 0 |
3 | 1 | -1/3 | -1/2 | 0 | 0 | 0 | 0 |
1 | 2 | 0 | 0 | 2/3 | 0 | 0 | 0 |
2 | 2 | 0 | 0 | -1/3 | 1/2 | 0 | 0 |
3 | 2 | 0 | 0 | -1/3 | -1/2 | 0 | 0 |
1 | 3 | 0 | 0 | 0 | 0 | 2/3 | 0 |
2 | 3 | 0 | 0 | 0 | 0 | -1/3 | 1/2 |
3 | 3 | 0 | 0 | 0 | 0 | -1/3 | -1/2 |
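The table above follows mechanically from the compute and if statements. As a rough cross-check, the Python sketch below rebuilds each row from the Helmert codes; the function name is hypothetical and exact fractions stand in for the rounded decimals used in the recode statements.

```python
from fractions import Fraction as F

# Helmert codes for collcat, as in the recode statements for ccat1 and ccat2.
ccat1 = {1: F(2, 3), 2: F(-1, 3), 3: F(-1, 3)}
ccat2 = {1: F(0), 2: F(1, 2), 3: F(-1, 2)}

def simple_effect_codes(collcat, mealcat):
    """Return (c1m1, c2m1, c1m2, c2m2, c1m3, c2m3) for one cell.

    Each pair carries the Helmert codes only at its own level of mealcat
    and is 0 elsewhere, mirroring the compute/if statements.
    """
    row = []
    for level in (1, 2, 3):
        active = (mealcat == level)
        row.append(ccat1[collcat] if active else F(0))
        row.append(ccat2[collcat] if active else F(0))
    return tuple(row)

# Print the rows in the same order as the table: collcat varies within mealcat.
for mealcat in (1, 2, 3):
    for collcat in (1, 2, 3):
        codes = simple_effect_codes(collcat, mealcat)
        print(collcat, mealcat, *[str(c) for c in codes])
```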
Now we are ready for our regression analysis. The test statement used below is for testing the simple effect of collcat at mealcat = 1.
regression /dependent api00 /method=enter mcat1 mcat2 c1m2 c2m2 c1m3 c2m3 /method = test(c1m1 c2m1).
We now see the simple effect of collcat when mealcat = 1 from the ANOVA output and we also see the regression estimates from the regression table. This illustrates how we have coded variables to allow the simple effects analysis. We can get the same analysis for the case when mealcat is 2 or 3 using different test statements. If you wished, you could manually create variables according to this strategy to perform a simple effects analysis.
6.3 Simple Comparisons
In the analyses above we looked at the simple effect of collcat at each level of mealcat. For example, we looked at the overall effect of collcat when mealcat was 1. This is the simple effect of collcat at mealcat=1. Because collcat has more than 2 levels, we may wish to make further comparisons among the 3 levels of collcat within mealcat=1. Simple comparisons allow us to make such comparisons.
6.3.1 Analyzing Simple Comparisons Using MANOVA and GLM
We can also look at the simple comparisons using either MANOVA or GLM, as we did in Section 6.2. First we look at the comparison of collcat 1 vs. 2 and 3 when mealcat is 1, and then at the comparison of collcat 2 vs. 3. Let’s look at the MANOVA code and its output first.
manova api00 by collcat(1,3) mealcat(1,3)
  /error = w
  /contrast (collcat)=helmert
  /design = mealcat
            collcat within mealcat(1)
            collcat within mealcat(2)
            collcat within mealcat(3).

* * * * * * A n a l y s i s   o f   V a r i a n c e * * * * * *

   400 cases accepted.
     0 cases rejected because of out-of-range factor values.
     0 cases rejected because of missing data.
     9 non-empty cells.
     1 design will be processed.

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Tests of Significance for API00 using UNIQUE sums of squares
Source of Variation                 SS    DF         MS        F  Sig of F
WITHIN CELLS                1829957.19   391    4680.20
MEALCAT                     4764843.56     2  2382421.8   509.04      .000
COLLCAT WITHIN MEALCAT(1)     50909.25     2   25454.62     5.44      .005
COLLCAT WITHIN MEALCAT(2)     68628.74     2   34314.37     7.33      .001
COLLCAT WITHIN MEALCAT(3)     29979.15     2   14989.57     3.20      .042
(Model)                     6243714.81     8  780464.35   166.76      .000
(Total)                     8073672.00   399   20234.77

R-Squared =           .773
Adjusted R-Squared =  .769

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Estimates for API00
--- Individual univariate .9500 confidence intervals

MEALCAT
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    2       158.150541    5.21975  30.29846   .00000   147.88824  168.41284
    3       -22.890813    5.49562  -4.16528   .00004   -33.69548  -12.08614

COLLCAT WITHIN MEALCAT(1)
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    4       13.0132326   13.52800    .96195   .33667   -13.58349   39.60995
    5       43.5002194   14.04092   3.09810   .00209    15.89507   71.10536

COLLCAT WITHIN MEALCAT(2)
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    6       -56.771166   16.67866  -3.40382   .00073   -89.56223  -23.98010
    7       -19.033030   13.29175  -1.43194   .15296   -45.16528    7.09922

COLLCAT WITHIN MEALCAT(3)
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    8       -31.364414   12.86955  -2.43710   .01525   -56.66658   -6.06225
    9       -32.900000   20.23653  -1.62577   .10480   -72.68603    6.88603

Since we are only interested in the comparisons when mealcat is 1, we only look at the section of the output for COLLCAT WITHIN MEALCAT(1). Parameter 4 is the comparison of collcat 1 vs. 2 and 3, and parameter 5 is the comparison of collcat 2 vs. 3. We see that collcat 1 is not significantly different from 2 and 3 at mealcat = 1, since the t-value is .96 and the p-value is .337, but collcat 2 is significantly different from 3 at mealcat = 1, with t-value = 3.10 and p-value = .0021.
Now we will use GLM to get the same results. With GLM, we have to use the lmatrix statement and manually put the helmert coding in. Since we are only interested in the comparison of collcat 1 vs. 2 and 3 at mealcat =1, we leave the last two columns for the interaction collcat*mealcat to be zero because they correspond to the level of 2 and 3 of mealcat.
glm api00 by collcat mealcat /lmatrix = 'effect of collcat 1 vs 2+ at mealcat = 1' collcat 1 -1/2 -1/2 collcat*mealcat 1 0 0 -1/2 0 0 -1/2 0 0.
glm api00 by collcat mealcat /lmatrix= 'collcat 2 vs. 3 at mealcat = 1' collcat 0 1 -1 collcat*mealcat 0 0 0 1 0 0 -1 0 0.
6.3.2 Analyzing Simple Comparisons Using REGRESSION
In the analyses above we used helmert coding for collcat. We chose this coding so we could compare group 1 with groups 2 and 3 and then compare groups 2 and 3. For example, if we wanted to compare collcat 1 vs. 2 and 3, we would want to look at the effect c1m1, and if we wanted to compare collcat groups 2 and 3 when mealcat is 1, then we would look at the effect c2m1.
We can use the REGRESSION procedure as above to see the effects for these terms.
regression /dependent api00 /method=enter mcat1 mcat2 c1m1 c2m1 c1m2 c2m2 c1m3 c2m3.
We see that collcat 1 is not significantly different from 2 and 3 at mealcat 1 (t=.96, p=.337), but collcat 2 is significantly different from 3 at mealcat 1 (t=3.10, p=0.002).
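The two simple comparisons can also be read directly off the cell means. The sketch below, in Python, applies the Helmert weights to the three observed means at mealcat = 1, taken from the "Obs. Mean" column of the MANOVA output in Section 6.6.1 (cells 1, 4 and 7). The helper name is hypothetical.

```python
# Observed cell means of api00 at mealcat = 1, keyed by collcat level
# (cells 1, 4 and 7 of the MANOVA output in Section 6.6.1).
means_mealcat1 = {1: 816.914, 2: 825.651, 3: 782.151}

def contrast(means, weights):
    """Weighted sum of the group means for one contrast."""
    return sum(weights[level] * means[level] for level in means)

# collcat 1 vs. the average of collcat 2 and 3.
one_vs_23 = contrast(means_mealcat1, {1: 1.0, 2: -0.5, 3: -0.5})

# collcat 2 vs. collcat 3.
two_vs_3 = contrast(means_mealcat1, {1: 0.0, 2: 1.0, 3: -1.0})

print(round(one_vs_23, 3), round(two_vs_3, 3))
```

Up to rounding, these values agree with the coefficients for parameters 4 and 5 in the MANOVA estimates (13.013 and 43.500). They should likewise match the coefficients for c1m1 and c2m1 in the regression, given that the t-values reported above are the same.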
6.4. Partial Interaction
A partial interaction allows you to apply contrasts to one of the effects in an interaction term. For example, we can draw the interaction of collcat by mealcat like this below.
 | Collcat Low | Collcat Med | Collcat High |
Mealcat Low | | | |
Mealcat Med | | | |
Mealcat High | | | |
Say that we wanted to compare, in the context of this interaction, group 1 for collcat vs. groups 2 and 3. The table of this partial interaction would look like this. The contrast coefficients of -2 1 1 applied to collcat indicate the comparison of group 1 for collcat vs. groups 2 and 3.
 | -2 | 1 | 1 |
 | Collcat Low | Collcat Med | Collcat High |
Mealcat Low | | | |
Mealcat Med | | | |
Mealcat High | | | |
Likewise, we also might want to compare groups 2 and 3 of collcat by mealcat, and the table of this interaction would look like this.
 | 0 | -1 | 1 |
 | Collcat Low | Collcat Med | Collcat High |
Mealcat Low | | | |
Mealcat Med | | | |
Mealcat High | | | |
These are called partial interactions because contrast coefficients are applied to one of the terms involved in the interaction.
6.4.1 Analyzing partial interactions Using MANOVA and GLM
The MANOVA procedure handles partial interactions quite easily. We are interested in the partial interaction of collcat comparing 1 vs. 2 and 3 by mealcat, so we use Helmert coding for collcat. Here collcat(1) compares collcat 1 vs. 2 and 3, and collcat(2) compares collcat 2 vs. 3.
manova api00 by collcat(1,3) mealcat(1,3)
  /error = w
  /contrast(collcat) = helmert
  /design = collcat mealcat
            collcat(1) by mealcat
            collcat(2) by mealcat.

* * * * * * A n a l y s i s   o f   V a r i a n c e * * * * * *

   400 cases accepted.
     0 cases rejected because of out-of-range factor values.
     0 cases rejected because of missing data.
     9 non-empty cells.
     1 design will be processed.

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Tests of Significance for API00 using UNIQUE sums of squares
Source of Variation                 SS    DF         MS        F  Sig of F
WITHIN CELLS                1829957.19   391    4680.20
COLLCAT                       42140.57     2   21070.28     4.50      .012
MEALCAT                     4764843.56     2  2382421.8   509.04      .000
COLLCAT(1) BY MEALCA          54141.41     2   27070.70     5.78      .003
COLLCAT(2) BY MEALCA          66511.60     2   33255.80     7.11      .001
(Model)                     6243714.81     8  780464.35   166.76      .000
(Total)                     8073672.00   399   20234.77

R-Squared =           .773
Adjusted R-Squared =  .769

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Estimates for API00
--- Individual univariate .9500 confidence intervals

COLLCAT
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    2       -25.040783    8.34539  -3.00055   .00287   -41.44823   -8.63333
    3       -2.8109369    9.32938   -.30130   .76335   -21.15296   15.53108

MEALCAT
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    4       158.150541    5.21975  30.29846   .00000   147.88824  168.41284
    5       -22.890813    5.49562  -4.16528   .00004   -33.69548  -12.08614

COLLCAT(1) BY MEALCAT
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    6       38.0540153   11.43013   3.32927   .00095    15.58182   60.52621
    7       -31.730384   12.74250  -2.49012   .01318   -56.78278   -6.67799

COLLCAT(2) BY MEALCAT
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    8       46.3111563   12.35933   3.74706   .00021    22.01210   70.61022
    9       -16.222093   12.08005  -1.34288   .18009   -39.97206    7.52788
Looking at the Analysis of Variance output, we see that the collcat(1) by mealcat effect is significant (F=5.78, p=.003). That means this partial interaction is significant. Similarly, the collcat(2) by mealcat effect is also significant (F=7.11, p=.001).
With the GLM procedure, we can test the partial interactions using the lmatrix subcommand. For example, to test the partial interaction of collcat comparing group 1 vs. 2 and 3 by mealcat, we can use the following lmatrix statement. Because mealcat has 2 degrees of freedom, the test of the partial interaction also has 2 degrees of freedom. The 2 degrees of freedom of mealcat can be broken down into 2 comparisons. The two resulting interaction contrasts are separated by a semicolon, which tells SPSS to join them into a single test with 2 degrees of freedom.
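For readers who want to see where the F values in the table come from, the following Python sketch recomputes them from the sums of squares and degrees of freedom printed in the MANOVA output. It only reproduces the ratio; the p-values are not recomputed here, and the function name is hypothetical.

```python
# Sums of squares and degrees of freedom copied from the MANOVA table above.
ss_within, df_within = 1829957.19, 391
effects = {
    "collcat(1) by mealcat": (54141.41, 2),
    "collcat(2) by mealcat": (66511.60, 2),
}

def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (ss_effect / df_effect) / (ss_error / df_error)

for name, (ss, df) in effects.items():
    print(name, round(f_ratio(ss, df, ss_within, df_within), 2))
```

Each partial interaction has 2 degrees of freedom because mealcat contributes 2 degrees of freedom to a single collcat contrast.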
glm api00 BY collcat mealcat /lmatrix = 'collcat 1 vs.2+ by mealcat' collcat*mealcat -1 0 1 1/2 0 -1/2 1/2 0 -1/2; collcat*mealcat 0 -1 1 0 1/2 -1/2 0 1/2 -1/2.
Similarly, we can test the two terms of interaction that involve the comparison of group 2 vs. 3 on collcat. We omit the syntax and the output here.
6.4.2 Analyzing partial interactions Using REGRESSION
With regression analysis, we can also compare groups 1 vs. 2 and 3 on collcat, or compare groups 2 and 3 on collcat. This implies Helmert coding on collcat, as we did before.
recode collcat (1=.66667) (2=-.33333) (3=-.33333) into ccat1 . recode collcat (1=0) (2=.5) (3=-.5) into ccat2 .
The coding for mealcat is chosen as dummy coding, but could have been any form of effect coding.
recode mealcat (1=1) (2=0) (3=0) into md1. recode mealcat (1=0) (2=1) (3=0) into md2.
The interaction terms are just the product of their respective main effects.
compute c1m1 = ccat1*md1. compute c2m1 = ccat2*md1. compute c1m2 = ccat1*md2. compute c2m2 = ccat2*md2. execute.
Under this coding scheme, the comparison of collcat 1 vs. 2 and 3 at mealcat 3 is simply the coefficient for ccat1, the comparison at mealcat 1 is ccat1 + c1m1, and the comparison at mealcat 2 is ccat1 + c1m2. Therefore, testing whether the comparison of collcat group 1 vs. groups 2 and 3 is the same across all levels of mealcat amounts to testing c1m1 = 0 and c1m2 = 0 simultaneously. Here is the regression with the test.
regression /dependent api00 /method=enter ccat1 ccat2 md1 md2 c2m1 c2m2 /method = test(c1m1 c1m2).
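The reasoning behind this test can be written out term by term. In the sketch below the symbols b_0 through b_8 are labels introduced here for the regression coefficients (they do not appear in the SPSS output); b_5 and b_7 are the coefficients for c1m1 and c1m2.

```latex
\begin{aligned}
\hat{y} &= b_0 + b_1\,\mathrm{ccat1} + b_2\,\mathrm{ccat2} + b_3\,\mathrm{md1} + b_4\,\mathrm{md2} \\
        &\quad + b_5\,\mathrm{c1m1} + b_6\,\mathrm{c2m1} + b_7\,\mathrm{c1m2} + b_8\,\mathrm{c2m2} \\[4pt]
\text{mealcat}=3\ (\mathrm{md1}=\mathrm{md2}=0):\quad
\hat{y} &= b_0 + b_1\,\mathrm{ccat1} + b_2\,\mathrm{ccat2} \\
\text{mealcat}=1\ (\mathrm{md1}=1):\quad
\hat{y} &= (b_0 + b_3) + (b_1 + b_5)\,\mathrm{ccat1} + (b_2 + b_6)\,\mathrm{ccat2} \\
\text{mealcat}=2\ (\mathrm{md2}=1):\quad
\hat{y} &= (b_0 + b_4) + (b_1 + b_7)\,\mathrm{ccat1} + (b_2 + b_8)\,\mathrm{ccat2}
\end{aligned}
```

With Helmert coding, the slope on ccat1 within a level of mealcat is the mean of collcat 1 minus the average of the means of collcat 2 and 3. That comparison is therefore b_1 at mealcat 3, b_1 + b_5 at mealcat 1 and b_1 + b_7 at mealcat 2, and it is the same at all three levels exactly when b_5 = b_7 = 0.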
6.5. Interaction contrasts
Above we saw that a partial interaction allows you to apply contrast coefficients to one of the terms in a 2 way interaction. An interaction contrast allows you to apply contrast coefficients to both of the terms in a two way interaction.
For example, with respect to collcat, let’s say that we wish to compare groups 2 and 3, and with respect to mealcat we wish to compare groups 1 and 2. The table of this looks like this below.
 | | 0 | -1 | 1 |
 | | Collcat Low | Collcat Med | Collcat High |
-1 | Mealcat Low | | | |
1 | Mealcat Med | | | |
0 | Mealcat High | | | |
We also would like to form a second interaction contrast that also compares groups 2 and 3 with respect to collcat, and compares groups 2 and 3 on mealcat. A table of this comparison is shown below.
 | | 0 | -1 | 1 |
 | | Collcat Low | Collcat Med | Collcat High |
0 | Mealcat Low | | | |
-1 | Mealcat Med | | | |
1 | Mealcat High | | | |
If we look at the graph of the predicted values (repeated below) we constructed before, it compares the dashed and dotted lines (collcat 2 vs. 3) by mealcat 1 vs. 2, and then again by mealcat 2 vs. 3.
6.5.1 Analyzing interaction contrasts using MANOVA and GLM
Because we would like to compare groups 1 vs. 2, and then groups 2 vs. 3 on mealcat, this implies forward difference coding for mealcat (which will compare 1 vs. 2, then 2 vs. 3). In SPSS, the forward difference coding is called repeated. For collcat we wish to compare groups 2 and 3, so we can use Helmert coding for that comparison as we did above (since this will compare 1 vs. 2 and 3, then 2 vs. 3).
manova api00 by collcat(1, 3) mealcat(1,3)
  /analysis api00
  /error = w
  /contrast (collcat) = helmert
  /contrast (mealcat) = repeated
  /design = collcat, mealcat, collcat by mealcat.

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Tests of Significance for API00 using UNIQUE sums of squares
Source of Variation                 SS    DF         MS        F  Sig of F
WITHIN CELLS                1829957.19   391    4680.20
COLLCAT                       42140.57     2   21070.28     4.50      .012
MEALCAT                     4764843.56     2  2382421.8   509.04      .000
COLLCAT BY MEALCAT           124167.81     4   31041.95     6.63      .000
(Model)                     6243714.81     8  780464.35   166.76      .000
(Total)                     8073672.00   399   20234.77

R-Squared =           .773
Adjusted R-Squared =  .769

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Estimates for API00
--- Individual univariate .9500 confidence intervals

COLLCAT
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    2       -25.040783    8.34539  -3.00055   .00287   -41.44823   -8.63333
    3       -2.8109369    9.32938   -.30130   .76335   -21.15296   15.53108

MEALCAT
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    4       181.041353    9.07713  19.94479   .00000   163.19527  198.88743
    5       112.368916    9.90759  11.34170   .00000    92.89009  131.84774

COLLCAT BY MEALCAT
Parameter       Coeff.  Std. Err.   t-Value   Sig. t  Lower -95%  CL- Upper
    6       69.7843988   21.47520   3.24953   .00126    27.56308  112.00571
    7       -25.406752   21.06663  -1.20602   .22854   -66.82479   16.01128
    8       62.5332494   19.33438   3.23430   .00132    24.52090  100.54560
    9       13.8669700   24.21132    .57275   .56714   -33.73369   61.46763

Since we have chosen Helmert coding for collcat and forward difference coding for mealcat, the interaction terms are coded in the following way. Parameter 6 is for collcat (1 vs. 2+) & mealcat (1 vs. 2), parameter 7 is for collcat (1 vs. 2+) & mealcat (2 vs. 3), parameter 8 is for collcat (2 vs. 3) & mealcat (1 vs. 2) and parameter 9 is for collcat (2 vs. 3) & mealcat (2 vs. 3).
Remember that our first interest is to compare collcat groups 2 and 3, and with respect to mealcat we wish to compare groups 1 and 2. This is tested by parameter 8, and this term is significant (t=3.23, p=.001). As we expect, the red and green lines are not parallel when we compare mealcat 1 and 2.
Our second interest is to compare groups 2 and 3 with respect to collcat, and groups 2 and 3 with respect to mealcat. This is tested by parameter 9, and this term is not significant (t=.57, p=.567). Looking at the graph, we can see that the red and green lines are mostly parallel between mealcat 2 and 3.
We can also get the same analysis using the GLM procedure. In our first interaction contrast, we compare collcat group 2 vs. 3, and with respect to mealcat we compare groups 1 and 2. This leads to a column vector for the effect of collcat of (0 1 -1)’ and a row vector for the effect of mealcat of (1 -1 0). Multiplying the two yields the lmatrix shown below.
glm api00 by collcat mealcat /lmatrix = 'collcat 2 vs. 3 by mealcat 1 vs. 2' collcat*mealcat 0 0 0 1 -1 0 -1 1 0.
In the same way, we will get our second analysis from the following.
glm api00 by collcat mealcat /lmatrix = 'collcat 2 vs. 3 by mealcat 2 vs. 3' collcat*mealcat 0 0 0 0 1 -1 0 -1 1.
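The lmatrix coefficients in the two statements above are the products of a collcat contrast and a mealcat contrast. The Python sketch below builds those products and, as a rough check, applies them to the observed cell means listed in the "Obs. Mean" column of the MANOVA output in Section 6.6.1. The function names are hypothetical.

```python
# Observed cell means of api00, one row per collcat level and one column
# per mealcat level (cells 1 to 9 of the MANOVA output in Section 6.6.1).
cell_means = [
    [816.914, 589.350, 493.919],  # collcat = 1
    [825.651, 636.605, 508.833],  # collcat = 2
    [782.151, 655.638, 541.733],  # collcat = 3
]

def interaction_weights(collcat_contrast, mealcat_contrast):
    """Outer product: one row per collcat level, one column per mealcat level."""
    return [[c * m for m in mealcat_contrast] for c in collcat_contrast]

def apply_contrast(weights, means):
    """Sum of weight * cell mean over all nine cells."""
    return sum(w * mu
               for w_row, m_row in zip(weights, means)
               for w, mu in zip(w_row, m_row))

# collcat 2 vs. 3 by mealcat 1 vs. 2, then collcat 2 vs. 3 by mealcat 2 vs. 3.
w_first = interaction_weights([0, 1, -1], [1, -1, 0])
w_second = interaction_weights([0, 1, -1], [0, 1, -1])

# Flattened row by row, this is the order used after collcat*mealcat in lmatrix.
flat_first = [w for row in w_first for w in row]
flat_second = [w for row in w_second for w in row]
print(flat_first)
print(flat_second)

first_value = apply_contrast(w_first, cell_means)
second_value = apply_contrast(w_second, cell_means)
print(round(first_value, 3), round(second_value, 3))
```

The flattened weights reproduce the nine coefficients in each lmatrix statement, and the two contrast values agree, up to rounding, with the coefficients for parameters 8 and 9 in the MANOVA estimates.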
6.5.2 Analyzing interaction contrasts using REGRESSION
In regression analysis, we have seen that different coding schemes for the variables give us different contrasts and comparisons. Because we would like to compare groups 1 vs. 2, and then groups 2 vs. 3 on mealcat, we will use forward difference coding for mealcat (which will compare 1 vs. 2, then 2 vs. 3). The Helmert-coded variables ccat1 and ccat2 for collcat are the ones created in Section 6.4.2.
recode mealcat (1=.66667) (2=-.33333) (3=-.33333) into mf1.
recode mealcat (1=.33333) (2=.33333) (3=-.66667) into mf2.
compute c1m1 = ccat1*mf1.
compute c2m1 = ccat2*mf1.
compute c1m2 = ccat1*mf2.
compute c2m2 = ccat2*mf2.
execute.
The regression analysis is then done and we can look at the coefficients for c2m1 and c2m2 to see the two comparisons that we have seen from the previous section.
regression /dependent api00 /method=enter ccat1 ccat2 mf1 mf2 c1m1 c1m2 c2m1 c2m2.
6.6 Computing Adjusted Means
6.6.1 Computing Adjusted Means via MANOVA and GLM
First, we show how you can compute adjusted means using the MANOVA command. Our model will be almost the same as before, except that we include a covariate, emer. MANOVA’s pmeans subcommand computes the adjusted means for us. These are the means that would be expected if every school in the sample were at the mean of emer. The syntax to get the adjusted means using manova is as follows. The last table of the output gives the means adjusted for emer, called combined adjusted means in SPSS.
manova api00 by collcat(1,3) mealcat(1,3) with emer
  /analysis api00 with emer
  /pmeans tables(collcat*mealcat).

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Order of Variables for Analysis
   Variates    Covariates
   API00       EMER
   1 Dependent Variable
   1 Covariate

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Tests of Significance for API00 using UNIQUE sums of squares
Source of Variation                 SS    DF         MS        F  Sig of F
WITHIN CELLS                1671243.73   390    4285.24
REGRESSION                   158713.45     1  158713.45    37.04      .000
COLLCAT                       34730.09     2   17365.04     4.05      .018
MEALCAT                     3017331.85     2  1508665.9   352.06      .000
COLLCAT BY MEALCAT            96789.12     4   24197.28     5.65      .000
(Model)                     6402428.26     9  711380.92   166.01      .000
(Total)                     8073672.00   399   20234.77

R-Squared =           .793
Adjusted R-Squared =  .788

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Regression analysis for WITHIN CELLS error term
--- Individual Univariate .9500 confidence intervals
Dependent variable .. API00    api 2000

COVARIATE          B       Beta  Std. Err.   t-Value  Sig. of t
EMER        -2.00997    -.16598       .330    -6.086      1.000

COVARIATE   Lower -95%  CL- Upper
EMER            -2.659     -1.361

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Adjusted and Estimated Means
Variable .. API00    api 2000
CELL   Obs. Mean   Adj. Mean   Est. Mean   Raw Resid.   Std. Resid.
  1      816.914     797.802     816.914         .000          .000
  2      589.350     597.215     589.350         .000          .000
  3      493.919     510.114     493.919         .000          .000
  4      825.651     812.792     825.651         .000          .000
  5      636.605     636.647     636.605         .000          .000
  6      508.833     524.126     508.833         .000          .000
  7      782.151     768.177     782.151         .000          .000
  8      655.638     653.218     655.638         .000          .000
  9      541.733     550.703     541.733         .000          .000

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Combined Adjusted Means for COLLCAT BY MEALCAT
Variable .. API00
                     COLLCAT           1            2            3
MEALCAT
0-46% fr     UNWGT.            797.80220    812.79202    768.17701
47-80% f     UNWGT.            597.21459    636.64671    653.21792
81-100%      UNWGT.            510.11402    524.12643    550.70340
We can get the same result through procedure GLM. The option emmeans (Estimated Marginal Means) gives the adjusted means.
glm api00 by collcat mealcat with emer /design collcat mealcat collcat*mealcat emer /emmeans = tables(collcat*mealcat).
6.6.2 Computing Adjusted Means via REGRESSION
Now we illustrate how to get the same adjusted means if you were to do the analysis via the REGRESSION command. First, we need to create the coded variables for the categorical variables. The choice of coding scheme does not matter for the purpose of obtaining the adjusted means. We choose simple coding for both mealcat and collcat below. The regression analysis is then done using these coded variables.
recode mealcat (1=-.33333) (2=-.33333) (3=.66667) into ms1.
recode mealcat (1=-.33333) (2=.66667) (3=-.33333) into ms2.
recode collcat (1=-.33333) (2=-.33333) (3=.66667) into cs1.
recode collcat (1=-.33333) (2=.66667) (3=-.33333) into cs2.
compute c1m1 = cs1*ms1.
compute c2m1 = cs2*ms1.
compute c1m2 = cs1*ms2.
compute c2m2 = cs2*ms2.
execute.
regression /dependent api00 /method=enter cs1 cs2 ms1 ms2 c1m1 c1m2 c2m1 c2m2 emer.
To create the adjusted means we wish to assume that all of the schools are at the average on the variable emer. Let us first find out the mean for emer.
descriptives variable=emer /statistics=mean.
Now we create yhat as the predicted value based on the regression equation setting emer at its mean. Since the value of emer is set to the mean of emer, this will be the predicted value assuming that all schools are at the average for emer.
compute yhat = 675.289 + 22.322*cs1 + 22.811*cs2 - 264.609*ms1 - 163.898*ms2 + 70.215*c1m1 + 85.629*c1m2 - .977*c2m1 + 24.442*c2m2 - 2.01*12.66. execute.
Now, we can look at the average of yhat broken down by collcat and mealcat, which you can see corresponds to the adjusted means that we found with glm command above.
means yhat by collcat by mealcat /cells = mean count.
6.7 More Details on Meaning of the Coefficients
So far we have discussed a variety of techniques that you can use to help interpret interactions of categorical variables in regression, but we have not gone into a great detail about the meaning of the coefficients in these analyses. Let’s consider this further. Consider the analysis below using collcat and mealcat, using simple contrasts on both of these variables. The reference group for both variables will be group 1.
recode mealcat (1=-.33333) (2=.66667) (3=-.33333) into ms1.
recode mealcat (1=-.33333) (2=-.33333) (3=.66667) into ms2.
recode collcat (1=-.33333) (2=.66667) (3=-.33333) into cs1.
recode collcat (1=-.33333) (2=-.33333) (3=.66667) into cs2.
compute c1m1 = cs1*ms1.
compute c2m1 = cs2*ms1.
compute c1m2 = cs1*ms2.
compute c2m2 = cs2*ms2.
execute.
regression /dependent api00 /method=enter cs1 cs2 ms1 ms2 c1m1 c1m2 c2m1 c2m2 /save pred(yht1).
We can produce the adjusted means as shown below. These will be useful for interpreting the meaning of the coefficients.
means yht1 by collcat by mealcat /cells = mean count.
Let’s consider the meaning of the coefficient for cs1. The coding for this variable compares group 2 vs. group 1, hence this coefficient corresponds to mean(collcat = 2) – mean(collcat = 1). Note that these are the unweighted means, so we compute the mean for collcat = 2 as the mean of the 3 cells corresponding to collcat = 2, i.e. (825.651+636.605+508.833)/3 . If we compare the result below to the coefficient for cs1 we see that they are the same,
(825.651+636.605+508.833)/3 – (816.914+589.35+493.919)/3 = 23.635333.
Likewise, the coefficient for cs2 is mean(collcat = 3) – mean(collcat = 1), computed below. The value below corresponds to the coefficient for cs2.
(782.151+655.638+541.733)/3 – (816.914+589.35+493.919)/3 = 26.446333
Likewise, the coefficient for ms1 works out to be mean(mealcat = 2) – mean(mealcat = 1), computed below.
(589.35+636.605+655.638)/3 – (816.914+825.651+782.151)/3 = -181.041.
And the coefficient for ms2 is mean(mealcat = 3) – mean(mealcat = 1), computed below.
(493.919+508.833+541.733)/3 – (816.914+825.651+782.151)/3 = -293.41033
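The four calculations above follow one pattern: an unweighted mean over three cells, minus the same quantity for the reference group. As a minimal sketch of that pattern in Python, assuming the cell means quoted above (keyed here by collcat, then mealcat) and using illustrative function names:

```python
# Cell means of api00 keyed by (collcat, mealcat), copied from the calculations above.
CELL = {
    (1, 1): 816.914, (1, 2): 589.350, (1, 3): 493.919,
    (2, 1): 825.651, (2, 2): 636.605, (2, 3): 508.833,
    (3, 1): 782.151, (3, 2): 655.638, (3, 3): 541.733,
}
LEVELS = (1, 2, 3)


def collcat_mean(c):
    """Unweighted mean of the three cells with collcat = c."""
    return sum(CELL[(c, m)] for m in LEVELS) / len(LEVELS)


def mealcat_mean(m):
    """Unweighted mean of the three cells with mealcat = m."""
    return sum(CELL[(c, m)] for c in LEVELS) / len(LEVELS)


# Each coefficient is a difference from the reference group (group 1).
main_effects = {
    "cs1": collcat_mean(2) - collcat_mean(1),
    "cs2": collcat_mean(3) - collcat_mean(1),
    "ms1": mealcat_mean(2) - mealcat_mean(1),
    "ms2": mealcat_mean(3) - mealcat_mean(1),
}
for name, value in main_effects.items():
    print(name, round(value, 3))
```

The means are unweighted on purpose: every cell counts equally regardless of how many schools fall in it, which is what the simple coding estimates.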
To get the meaning of the coefficients for the interaction terms, let’s write out the regression equation and take a closer look at the coefficients. From the parameter estimates, we have the following linear equation for predicted values:
yhat = 650.090 + 23.635*cs1 + 26.446*cs2 - 181.042*ms1 - 293.412*ms2 + 38.518*cs1*ms1 + 6.178*cs1*ms2 + 101.051*cs2*ms1 + 82.578*cs2*ms2.
Because of the simple coding scheme we use for both variables, we have from the above equation,
yhat(collcat = 2) – yhat(collcat = 1) = 23.635 + 38.518*ms1 + 6.178*ms2.
One way to read this equation: moving from group 1 to group 2 on collcat raises cs1 by exactly 1 (from -.33333 to .66667) and leaves cs2 unchanged, so at any level of mealcat the comparison of group 2 vs. group 1 on collcat involves only cs1 and its products with ms1 and ms2. It then follows that the coefficient for c1m1 compares the difference of group 2 vs. 1 on collcat when mealcat is 2 with the difference of group 2 vs. 1 on collcat when mealcat is 1. In other words, c1m1 is
[cell(2,2)-cell(1,2)] – [cell(2,1)-cell(1,1)].
Plugging all the corresponding cell means to the above formula, we get
(636.6047 – 589.3500) – (825.6512 – 816.9143) = 38.5178,
which is the coefficient for c1m1. By the same argument, the four interaction coefficients correspond to the following contrasts:
c1m1 : [cell(2,2)-cell(1,2)] – [cell(2,1)-cell(1,1)],
c1m2 : [cell(2,3)-cell(1,3)] – [cell(2,1)-cell(1,1)],
c2m1 : [cell(3,2)-cell(1,2)] – [cell(3,1)-cell(1,1)],
c2m2 : [cell(3,3)-cell(1,3)] – [cell(3,1)-cell(1,1)].
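The four contrasts listed above can also be read off the fitted equation itself. The following Python sketch, which assumes the coefficients of the regression equation given earlier and the simple coding with group 1 as the reference, evaluates the predicted value in each cell and forms the difference of differences; the function names are illustrative.

```python
# Coefficients copied from the fitted equation above (simple coding, reference group 1).
B = {"const": 650.090, "cs1": 23.635, "cs2": 26.446,
     "ms1": -181.042, "ms2": -293.412,
     "c1m1": 38.518, "c1m2": 6.178, "c2m1": 101.051, "c2m2": 82.578}


def code(level, target):
    """Simple coding: .66667 for the target level, -.33333 otherwise."""
    return 0.66667 if level == target else -0.33333


def yhat(collcat, mealcat):
    """Predicted api00 for one cell from the fitted equation."""
    cs1, cs2 = code(collcat, 2), code(collcat, 3)
    ms1, ms2 = code(mealcat, 2), code(mealcat, 3)
    return (B["const"] + B["cs1"] * cs1 + B["cs2"] * cs2
            + B["ms1"] * ms1 + B["ms2"] * ms2
            + B["c1m1"] * cs1 * ms1 + B["c1m2"] * cs1 * ms2
            + B["c2m1"] * cs2 * ms1 + B["c2m2"] * cs2 * ms2)


def interaction_contrast(c, m):
    """[cell(c,m) - cell(1,m)] - [cell(c,1) - cell(1,1)] using predicted values."""
    return (yhat(c, m) - yhat(1, m)) - (yhat(c, 1) - yhat(1, 1))


for name, (c, m) in {"c1m1": (2, 2), "c1m2": (2, 3),
                     "c2m1": (3, 2), "c2m2": (3, 3)}.items():
    print(name, round(interaction_contrast(c, m), 3), "vs coefficient", B[name])
```

Each contrast isolates a single interaction coefficient because, under this coding, changing one factor from group 1 to another group moves exactly one coded variable by 1 and leaves the other unchanged.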
We can go through the same process to verify the meaning of the coefficients for the other 3 interaction terms. We verify that c1m2 is 6.1775.
(508.8333 – 493.9189) – (825.6512 – 816.9143) = 6.1775.
We also verify that c2m1 is 101.051.
(655.6377 – 589.3500) – (782.1509 – 816.9143) = 101.0511.
Last we verify that c2m2 is 82.5778.
( 541.7333 – 493.9189) – ( 782.1509 – 816.9143) = 82.5778.
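The hand calculations above can be collected into one small routine. This is a sketch in Python, assuming the four-decimal cell means used in those calculations, keyed by (collcat, mealcat); the comparison values are the interaction coefficients from the regression equation, and the names are illustrative.

```python
# Cell means keyed by (collcat, mealcat), copied from the calculations above.
CELL = {
    (1, 1): 816.9143, (1, 2): 589.3500, (1, 3): 493.9189,
    (2, 1): 825.6512, (2, 2): 636.6047, (2, 3): 508.8333,
    (3, 1): 782.1509, (3, 2): 655.6377, (3, 3): 541.7333,
}
# Interaction coefficients from the regression equation, with the cell each one targets.
COEF = {"c1m1": (38.518, (2, 2)), "c1m2": (6.178, (2, 3)),
        "c2m1": (101.051, (3, 2)), "c2m2": (82.578, (3, 3))}


def interaction_contrast(c, m):
    """[cell(c,m) - cell(1,m)] - [cell(c,1) - cell(1,1)] from observed cell means."""
    return (CELL[(c, m)] - CELL[(1, m)]) - (CELL[(c, 1)] - CELL[(1, 1)])


contrasts = {name: interaction_contrast(*cell) for name, (_, cell) in COEF.items()}
for name, value in contrasts.items():
    print(name, round(value, 4), "vs coefficient", COEF[name][0])
```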
6.8 Simple Effects via Dummy Coding vs. Effect Coding
We have used several different coding schemes in this chapter. You may wonder why we have gone to the effort of creating and testing these effects instead of just using dummy coding, how the coding schemes differ, and how to choose among them. In this section, we compare obtaining simple effects with effect coding to obtaining them with dummy coding. We hope to show that effect coding is much easier to use because the interpretation of the coefficients is much more intuitive.
6.8.1 Example 1. Simple effects of yr_rnd at levels of mealcat
Let’s use an example from Chapter 3 (section 3.5). In that example we looked at an analysis using mealcat and yr_rnd and the interaction of these two variables. First, we look at how to do a simple effects analysis of yr_rnd at each level of mealcat using effect coding. To make our results correspond to those from Chapter 3, we will make category 3 of mealcat the reference category.
recode mealcat (1=.66667) (2=-.33333) (3=-.33333) into ms1.
recode mealcat (1=-.33333) (2=.66667) (3=-.33333) into ms2.
recode yr_rnd (0=-.5) (1=.5) into yr1.
compute ym1 = 0.
compute ym2 = 0.
compute ym3 = 0.
if ( mealcat = 1) ym1 = yr1.
if ( mealcat = 2) ym2 = yr1.
if ( mealcat = 3) ym3 = yr1.
regression /dependent api00 /method=enter ms1 ms2 ym1 ym2 ym3.
Now we can obtain the simple effect of yr_rnd at mealcat = 1 by inspecting the coefficient for ym1, the simple effect of yr_rnd at mealcat = 2 by inspecting the coefficient for ym2 and the simple effect of yr_rnd at mealcat = 3 by inspecting the coefficient for ym3.
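To see why each of these coefficients is a simple effect, note that yr1 changes by exactly 1 (from -.5 to .5) between non-year-round and year-round schools, and that each of ym1, ym2 and ym3 is nonzero only within its own level of mealcat. The Python sketch below illustrates this with hypothetical coefficient values, which are NOT estimates from the elemapi2 data; only the coding mirrors the syntax above.

```python
# Hypothetical coefficients for illustration only; they are NOT estimates from elemapi2.
coef = {"const": 650.0, "ms1": 280.0, "ms2": 100.0,
        "ym1": -75.0, "ym2": -50.0, "ym3": -30.0}


def predict(yr_rnd, mealcat):
    """Predicted value for one cell under the coding used in the syntax above."""
    ms1 = 0.66667 if mealcat == 1 else -0.33333
    ms2 = 0.66667 if mealcat == 2 else -0.33333
    yr1 = 0.5 if yr_rnd == 1 else -0.5
    # ymK equals yr1 inside mealcat = K and 0 elsewhere.
    ym = {k: (yr1 if mealcat == k else 0.0) for k in (1, 2, 3)}
    return (coef["const"] + coef["ms1"] * ms1 + coef["ms2"] * ms2
            + sum(coef[f"ym{k}"] * ym[k] for k in (1, 2, 3)))


# The year-round vs. non-year-round difference within each mealcat level
# reproduces the corresponding ymK coefficient.
for k in (1, 2, 3):
    print("mealcat", k, "difference:", predict(1, k) - predict(0, k))
```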
Now let’s perform the same analysis using dummy coding. Again, we will explicitly make the 3rd category of mealcat the omitted category.
recode mealcat (1=1) (2=0) (3=0) into md1.
recode mealcat (1=0) (2=1) (3=0) into md2.
compute ymd1 = yr_rnd*md1.
compute ymd2 = yr_rnd*md2.
regression /dependent api00 /method=enter yr_rnd md1 md2 ymd1 ymd2.
In order to form a test of simple main effects we need to make a table like the one shown below that relates the cell means to the coefficients in the regression. Please see Chapter 3, section 3.5 for information on how this table was constructed.
            mealcat=1      mealcat=2      mealcat=3
-------------------------------------------------
yr_rnd=0    const          const          const
            + md1          + md2
-------------------------------------------------
yr_rnd=1    const          const          const
            + yr_rnd       + yr_rnd       + yr_rnd
            + md1          + md2
            + ymd1         + ymd2
Let’s start by looking at how to get the simple effect of yr_rnd when mealcat is 3. Looking at the table above, we can see that we would want to compare const with const + yr_rnd, which is the same as testing whether the coefficient for yr_rnd is zero. This is a single parameter test and is shown in the output above. The t-value is -2.846 and the p-value is .005.
Note that the coefficient for yr_rnd corresponds to the test of the effect of yr_rnd when all other variables are set to 0 (the reference category), i.e. when mealcat is set to the reference category. You may be tempted to interpret the coefficient for yr_rnd as the overall difference between year round schools and non-year round schools, but in this example we see that it really corresponds to the simple effect of yr_rnd. When using dummy coding people commonly misinterpret the lower order effects to refer to overall effects rather than simple effects.
Now let’s look at the simple effect of yr_rnd when mealcat=1. Looking at the table above we see that this involves the comparison of the coefficients for yr_rnd=1 vs. yr_rnd=0 when mealcat=1, i.e. comparing const + yr_rnd + md1 + ymd1 vs. const + md1. Removing the terms that drop out, we see that testing the simple effect of yr_rnd when mealcat = 1 is the same as testing yr_rnd + ymd1 = 0. This can NOT be done in SPSS through the test subcommand of REGRESSION. We have to use an ANOVA-type command to perform the test.
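The bookkeeping of "removing the terms that drop out" can be made mechanical. As a minimal sketch in Python (the function names are illustrative), each cell of the table above is represented as the set of coefficients that add up to its predicted mean, and the simple effect of yr_rnd at a given mealcat is whatever remains after subtracting the yr_rnd=0 cell from the yr_rnd=1 cell.

```python
def cell_terms(yr_rnd, mealcat):
    """Coefficients that add up to the predicted cell mean under dummy coding,
    with mealcat = 3 and yr_rnd = 0 as the omitted categories."""
    terms = {"const"}
    if yr_rnd == 1:
        terms.add("yr_rnd")
    if mealcat in (1, 2):
        terms.add(f"md{mealcat}")
        if yr_rnd == 1:
            terms.add(f"ymd{mealcat}")
    return terms


def simple_effect_of_yr_rnd(mealcat):
    """Terms left after subtracting the yr_rnd = 0 cell from the yr_rnd = 1 cell."""
    return sorted(cell_terms(1, mealcat) - cell_terms(0, mealcat))


for mealcat in (1, 2, 3):
    print("mealcat", mealcat, ":", " + ".join(simple_effect_of_yr_rnd(mealcat)))
```

Only at the omitted category (mealcat = 3) does the simple effect reduce to a single coefficient; at the other levels it is a sum of two coefficients, which is why a single-parameter test is not enough there.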
These examples illustrate that it is more complicated to form simple effects when using dummy coding, and also that the interpretation of lower order effects when using dummy coding may not have the meaning that you would expect.
6.8.2 Example 2. Simple effects of mealcat at levels of yr_rnd
Example 1 looked at simple effects for yr_rnd, a variable with only 2 levels, and it showed that the test subcommand of the REGRESSION procedure in SPSS is very limited. In this example, let’s consider the simple effects of mealcat at each level of yr_rnd. Because mealcat has more than 2 levels, we will see what is required for doing tests of simple effects for variables with more than 2 levels. We will use the GLM procedure to perform all the necessary tests of the simple effects.
First, let’s show how to get these simple effects using the MANOVA.
manova api00 by yr_rnd(0,1) mealcat(1,3) /error = w /design = yr_rnd mealcat within yr_rnd(1) mealcat within yr_rnd(2).
* * * * * * A n a l y s i s   o f   V a r i a n c e * * * * * *

   400 cases accepted.
     0 cases rejected because of out-of-range factor values.
     0 cases rejected because of missing data.
     6 non-empty cells.
     1 design will be processed.

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Tests of Significance for API00 using UNIQUE sums of squares
Source of Variation                 SS      DF         MS         F  Sig of F

WITHIN CELLS                1868944.18     394    4743.51
YR_RND                        99617.37       1   99617.37     21.00      .000
MEALCAT WITHIN YR_RND(1)    3903569.80       2  1951784.9    411.46      .000
MEALCAT WITHIN YR_RND(2)     476157.45       2  238078.73     50.19      .000

(Model)                     6204727.82       5  1240945.6    261.61      .000
(Total)                     8073672.00     399   20234.77

R-Squared =           .769
Adjusted R-Squared =  .766
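The F ratios and R-squared values in this table can be reproduced from its sums of squares and degrees of freedom. The Python sketch below assumes only the numbers printed in the table; each F is the mean square of the effect divided by the WITHIN CELLS mean square, which is the error term requested by /error = w. The variable names are illustrative.

```python
# Sums of squares and degrees of freedom copied from the MANOVA table above.
SS_WITHIN, DF_WITHIN = 1868944.18, 394
SS_TOTAL, DF_TOTAL = 8073672.00, 399
SOURCES = {
    "yr_rnd": (99617.37, 1),
    "mealcat within yr_rnd(1)": (3903569.80, 2),
    "mealcat within yr_rnd(2)": (476157.45, 2),
    "model": (6204727.82, 5),
}

ms_within = SS_WITHIN / DF_WITHIN  # error mean square


def f_ratio(ss, df):
    """Mean square of the effect divided by the within-cells mean square."""
    return (ss / df) / ms_within


f_values = {name: f_ratio(ss, df) for name, (ss, df) in SOURCES.items()}
r_squared = SOURCES["model"][0] / SS_TOTAL
adj_r_squared = 1 - (1 - r_squared) * DF_TOTAL / DF_WITHIN

for name, f in f_values.items():
    print(name, round(f, 2))
print("R-squared", round(r_squared, 3), "adjusted", round(adj_r_squared, 3))
```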
The simple effect of mealcat when yr_rnd = 0 is shown in the above ANOVA table with F-value 411.46 and p-value .000. The simple effect of mealcat when yr_rnd = 1 is significant with F-value 50.19. Now we show how to get the same analysis using GLM.
glm api00 by yr_rnd mealcat /emmeans tables(yr_rnd*mealcat) compare(mealcat).