MY SUMS OF SQUARES DON'T ADD UP!

David P. Nichols
Senior Support Statistician
SPSS, Inc.

From SPSS Keywords, Volume 53, 1994

Many users of SPSS are confused when they see output from REGRESSION, ANOVA or MANOVA in which the sums of squares for two or more factors or predictors do not add up to the total sum of squares for the model. This is especially true of users whose statistical training is primarily concerned with the analysis of variance of balanced data from planned experiments.

Balanced, or orthogonal, designs are ones in which the independent or predictor variables (be they categorical factors or continuous predictors, the latter sometimes referred to as covariates) are uncorrelated in the sample. They allow a simple additive partitioning of the sum of squares accounted for by a model into unique portions associated with each predictor variable. For this reason, and for reasons of precision in statistical estimation, they are generally preferred when they can be used. However, most of us are forced at least on occasion to work with data from messier nonorthogonal designs.

When predictor variables are nonorthogonal (correlated), there is no unique way to compute an additive decomposition of the model sum of squares into individual parts. Additive decompositions can still be computed (using the SEQUENTIAL method in MANOVA, or by adding up the changes in sums of squares as each variable is brought into a REGRESSION model one at a time), but there is a different decomposition associated with each possible ordering of the predictor variables. In other words, the sum of squares attributed to a particular predictor variable depends not on the overall model, but on which predictors were entered into the model before that variable. This method is primarily used when certain variables are believed on theoretical grounds to have priority over others, so that an ordering is considered appropriate.
However, many users are unaware of the actual statistical hypotheses tested by this approach, which are often not the ones desired. More commonly, users will want to compute sums of squares for each predictor in a symmetric fashion. These UNIQUE sums of squares have been the default in the MANOVA procedure for some time, and in the ANOVA procedure beginning with release 5. For continuous predictors and for categorical factors with only two categories (which are therefore represented by a single dummy or effect coded variable), the F statistics associated with UNIQUE sums of squares are the squares of the t statistics that result from dividing unstandardized regression coefficient estimates (in REGRESSION or MANOVA) or parameter estimates (in MANOVA) by their standard errors.

Both types of sums of squares can be conceptualized and computed as differences between the residual or error sums of squares (SSE) resulting from fitting two hierarchical models. In the case of SEQUENTIAL sums of squares, we begin with a model that includes only a constant or intercept term; the SSE is then the total mean-corrected sum of squares. If we add one predictor to the model (here meaning either a single variable or a set of variables representing a categorical predictor), we obtain a new SSE, which will be less than or equal to the previous value. The difference between the SSE before and after the addition of the new predictor is attributed to that predictor. We can then add another predictor and recalculate the model and the SSE; the difference between the new SSE and the previous SSE is again attributed to the newly added predictor. We proceed in this way until all predictors have been entered, noting the decrease in SSE at each stage. The same process could also be performed in the reverse order, removing each predictor and noting the increase in SSE at each stage.
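The stepwise SSE comparison just described can be sketched outside SPSS. The following Python/numpy sketch (ordinary least squares standing in for the REGRESSION procedure; the correlated data here are synthetic and purely illustrative) computes SEQUENTIAL sums of squares as successive drops in SSE, and shows that the parts depend on the order of entry even though each ordering sums to the same model sum of squares:

```python
import numpy as np

def sse(y, X):
    """Residual sum of squares from an OLS fit of y on X (X includes the intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def sequential_ss(y, predictors):
    """SEQUENTIAL sums of squares: enter predictors in the given order
    and record the drop in SSE at each step."""
    n = len(y)
    X = np.ones((n, 1))           # start with the intercept-only model
    prev = sse(y, X)              # equals the total mean-corrected sum of squares
    drops = []
    for x in predictors:
        X = np.column_stack([X, x])
        cur = sse(y, X)
        drops.append(prev - cur)  # SS attributed to the newly entered predictor
        prev = cur
    return drops

# Synthetic correlated predictors (illustrative only, not the article's data)
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = 0.6 * x1 + rng.normal(size=50)          # x2 is correlated with x1
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=50)

ss_12 = sequential_ss(y, [x1, x2])           # enter X1 first, then X2
ss_21 = sequential_ss(y, [x2, x1])           # reverse the order
# Both orderings sum to the same model sum of squares, but the part
# attributed to each predictor differs between orderings.
```

With orthogonal predictors the two orderings would attribute identical parts to each predictor; the order dependence appears only because x1 and x2 are correlated.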
UNIQUE sums of squares (which can be obtained in REGRESSION by using the TEST option on the METHOD subcommand) can be computed by comparing the SSE for a model including all predictors to that for a model including all predictors except the one of current interest. The increase in SSE resulting from removing a predictor from the model (which is the same as the decrease in SSE resulting from adding it to a model containing all the other predictors) is the UNIQUE sum of squares associated with that predictor. Thus the UNIQUE sum of squares for each predictor is equivalent to the SEQUENTIAL sum of squares for that predictor when it is entered after all other predictors in the model. This is why UNIQUE sums of squares are also referred to as partial or simultaneous sums of squares. Note that as long as there are no linear dependencies among the predictors, these sums of squares will be unique; however, they will not add up to the overall model sum of squares.

Let's make this all a bit more concrete with an example. Below is a small data set consisting of 19 cases with three variables, X1, X2 and Y. First we obtain the default UNIQUE sums of squares for X1 and X2 as joint predictors of the dependent variable Y, using both the MANOVA and REGRESSION procedures:

MANOVA Y WITH X1 X2
 /ANALYSIS=Y
 /DESIGN=X1 X2.

REGRESSION VARIABLES=Y X1 X2
 /DEPENDENT=Y
 /METHOD=TEST (X1) (X2).

----------------------------------------------------------------------
Figure 1: Data

 X1  X2   Y
  6   7   7
  3   4   8
  4   8   5
  6   5   4
  3   9   5
  1   9   7
  3   3   2
  4   5   6
  2   6   2
  0   2   8
  0   9   4
  2   5   3
  8   6   5
  9   8   4
  0   4   6
  7   8   6
  5   3   1
  7   9   3
  7   3   3
----------------------------------------------------------------------
(End Figure 1)

The slightly edited output from MANOVA and REGRESSION appears in Figure 2. Note that the sums of squares given by MANOVA are the same as those given by the METHOD=TEST output from REGRESSION.
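The model comparisons that define the UNIQUE sums of squares can be reproduced directly from the Figure 1 data. A minimal Python/numpy sketch (ordinary least squares in place of the SPSS procedures) compares the full model's SSE to the SSE of each reduced model:

```python
import numpy as np

# Data from Figure 1
x1 = np.array([6, 3, 4, 6, 3, 1, 3, 4, 2, 0, 0, 2, 8, 9, 0, 7, 5, 7, 7], float)
x2 = np.array([7, 4, 8, 5, 9, 9, 3, 5, 6, 2, 9, 5, 6, 8, 4, 8, 3, 9, 3], float)
y  = np.array([7, 8, 5, 4, 5, 7, 2, 6, 2, 8, 4, 3, 5, 4, 6, 6, 1, 3, 3], float)

def sse(y, cols):
    """Residual sum of squares from an OLS fit of y on an intercept plus cols."""
    X = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

sse_full = sse(y, [x1, x2])      # both predictors in the model
sse_no1  = sse(y, [x2])          # model without X1
sse_no2  = sse(y, [x1])          # model without X2
sse_null = sse(y, [])            # intercept only: the total sum of squares

unique_x1 = sse_no1 - sse_full   # about 4.496  (SPSS: 4.49571)
unique_x2 = sse_no2 - sse_full   # about 1.199  (SPSS: 1.19931)
model_ss  = sse_null - sse_full  # about 5.069  (SPSS: 5.06928)
# unique_x1 + unique_x2 is about 5.695, MORE than the model sum of squares
```

The values recovered here match the MANOVA and REGRESSION output in Figure 2, and the two UNIQUE sums of squares together exceed the model sum of squares, which is the puzzle taken up below.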
Note also that the values of the F statistics for each variable are the squared values of the t statistics for the estimated regression coefficients (B), and have the same significance levels. Finally, note that the sums of squares for the two variables do not add up to the overall regression or model sum of squares. In fact, in this problem they add up to MORE than the model sum of squares. While many users are confused by the general failure of regression approach (UNIQUE) sums of squares to sum to the model sum of squares, this situation (individual variable sums of squares adding up to more than the model sum of squares) is even more perplexing to most people. The explanation of how this can occur lies in the structure of the intercorrelations among the three variables.

----------------------------------------------------------------------
Figure 2: MANOVA and REGRESSION Output

* * * * * * A n a l y s i s   o f   V a r i a n c e -- design 1 * * * * * *

Tests of Significance for Y using UNIQUE sums of squares

Source of Variation       SS    DF    MS      F   Sig of F
WITHIN+RESIDUAL        71.04    16  4.44
X1                      4.50     1  4.50   1.01       .329
X2                      1.20     1  1.20    .27       .610
(Model)                 5.07     2  2.53    .57       .576
(Total)                76.11    18  4.23

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

* * * *  M U L T I P L E   R E G R E S S I O N  * * * *

Hypothesis Tests

          Sum of
 DF      Squares   Rsq Chg        F   Sig F   Source
  1      4.49571    .05907  1.01261   .3293   X1
  1      1.19931    .01576   .27013   .6104   X2
  2      5.06928             .57090   .5761   Regression
 16     71.03598                              Residual
 18     76.10526                              Total

Multiple R          .25809
R Square            .06661
Adjusted R Square  -.05007
Standard Error     2.10707

Analysis of Variance

            DF   Sum of Squares   Mean Square
Regression   2          5.06928       2.53464
Residual    16         71.03598       4.43975

F = .57090    Signif F = .5761

------------------ Variables in the Equation ------------------

Variable           B       SE B       Beta        T   Sig T
X1          -.178536    .177421   -.246390   -1.006   .3293
X2           .109418    .210525    .127260     .520   .6104
(Constant)  4.757000   1.422242               3.345   .0041
----------------------------------------------------------------------
(End Figure 2)

----------------------------------------------------------------------
Figure 3: Correlation Matrix

         Y      X1      X2
Y    1.000   -.225    .087
X1   -.225   1.000    .164
X2    .087    .164   1.000
----------------------------------------------------------------------
(End Figure 3)

As you can see in the correlation matrix in Figure 3, there is a positive correlation between the two predictors X1 and X2, while their correlations with the dependent variable have opposite signs. When this occurs, the absolute values of the partial and the part (sometimes called semipartial) correlations between each predictor and the dependent variable will be larger than the absolute value of the simple bivariate or zero-order correlation between that predictor and the dependent variable, resulting in a larger sum of squares attributed to that variable when it is entered into the model after the other predictor than when it is entered first. The important point is that "adjusting" or "controlling" a predictor for the effects of the other predictors can increase the strength of its relationship with the dependent variable instead of decreasing it.

Corrective Appendix (March 1997)

There is an issue of terminological accuracy here that needs clarification. The terms orthogonal and uncorrelated (or nonorthogonal and correlated) have been used as if they were interchangeable. While this is true if the variables or vectors involved are centered (have mean 0), it is not true in the general case. Formally, two vectors are orthogonal if their scalar product (or inner product) is 0. They are uncorrelated if the scalar product of their centered (mean-corrected) forms is 0. All four logical combinations of these two designations are possible.
That is, two vectors can be both orthogonal and uncorrelated, orthogonal but correlated, nonorthogonal but uncorrelated, or nonorthogonal and correlated. Only if the variables or vectors with which one is dealing are by definition mean-corrected or centered are the two terms interchangeable. For this reason, the way the terms have been used in this article is at best sloppy, and technically simply incorrect.

Another clarification may also be useful for some readers. The discussion of the different ways of computing sums of squares assumes that any categorical factors have been handled via reparameterization to full rank. In other words, the description of the sum of squares for a particular effect as the difference between the residual sums of squares for models with and without that term applies only when the model uses K-1 dummy or effect coded variables to represent the K levels of a given factor. This approach is a common one, but it is not the only way to do things, and if one attempts to match the UNIQUE sums of squares by fitting certain models with and without particular terms in some designs using software that does not reparameterize, the expected results will not be obtained. Some of the issues involved here will be the topic of a future article discussing some of the fundamental differences between the MANOVA procedure and the new GLM procedure added to Release 7.0 of SPSS for Windows.
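The four combinations distinguished in the appendix can be exhibited with small concrete vectors. A brief Python/numpy sketch (the example vectors are constructed for illustration, not drawn from the article):

```python
import numpy as np

def orthogonal(u, v):
    """Orthogonal: the scalar (inner) product of the raw vectors is zero."""
    return bool(np.isclose(u @ v, 0.0))

def uncorrelated(u, v):
    """Uncorrelated: the scalar product of the centered (mean-corrected) forms is zero."""
    return bool(np.isclose((u - u.mean()) @ (v - v.mean()), 0.0))

cases = {
    "orthogonal and uncorrelated":    (np.array([1., -1, 1, -1]), np.array([1., 1, -1, -1])),
    "orthogonal but correlated":      (np.array([1., 1, -1]),     np.array([1., 0, 1])),
    "nonorthogonal but uncorrelated": (np.array([2., 0, 2, 0]),   np.array([2., 2, 0, 0])),
    "nonorthogonal and correlated":   (np.array([1., 2, 3]),      np.array([1., 2, 4])),
}

for name, (u, v) in cases.items():
    print(f"{name}: orthogonal={orthogonal(u, v)}, uncorrelated={uncorrelated(u, v)}")
```

Note that the first pair of vectors already has mean zero, which is exactly the centered case in which the two properties coincide; adding a constant to such vectors (as in the third pair) preserves zero correlation while destroying orthogonality.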