MY SUMS OF SQUARES DON'T ADD UP!
David P. Nichols
Senior Support Statistician
SPSS, Inc.
From SPSS Keywords, Volume 53, 1994
Many users of SPSS are confused when they see output from
REGRESSION, ANOVA or MANOVA in which the sums of squares for two
or more factors or predictors do not add up to the total sum of
squares for the model. This is especially true of users whose
statistical training is primarily concerned with the analysis of
variance of balanced data from planned experiments. Balanced, or
orthogonal, designs are ones in which the independent or predictor
variables (be they categorical factors or continuous predictors,
sometimes referred to as covariates) are uncorrelated in the sample.
They allow a simple additive partitioning of the sum of squares
accounted for by a model into unique portions associated with each
predictor variable. For this reason, and for reasons of precision in
statistical estimation, they are generally preferred when they can be
used. However, most of us are forced at least on occasion to work with
data from messier nonorthogonal designs.
When predictor variables are nonorthogonal (correlated),
there is no unique way to compute an additive decomposition of
the model sum of squares into individual parts. Although additive
decompositions can be computed (using the SEQUENTIAL method in
MANOVA or adding up changes in sums of squares as each variable
is brought into a REGRESSION model one at a time), there is a
different decomposition associated with each possible ordering of
a set of predictor variables. In other words, the sum of squares
attributed to a particular predictor variable depends not on the
overall model, but on what predictors were entered into the model
prior to that variable. This method is primarily used when
certain variables are believed on theoretical grounds to have
priority over others, and an ordering is thus considered
appropriate. However, many users are unaware of the actual
statistical hypotheses tested by this approach, which are often
not the ones desired.
More commonly, users will want to compute sums of squares for
each predictor in a symmetric fashion. These UNIQUE sums of
squares have been the default in the MANOVA procedure for some
time, and in the ANOVA procedure beginning with release 5. For
continuous predictors and for categorical factors with only two
categories (which are therefore represented by a single dummy or
effect coded variable), the F-statistics associated with UNIQUE
sums of squares are the squares of the t-statistics that result
from dividing unstandardized regression coefficient estimates (in
REGRESSION) or parameter estimates (in MANOVA) by their standard
errors.
Both types of sums of squares can be conceptualized and
computed as differences between the residual or error sums of
squares (SSE) resulting from fitting two hierarchical models. In
the case of SEQUENTIAL sums of squares we begin with a model
which includes only a constant or intercept term. The SSE would
then be the total mean corrected sum of squares. If we add one
predictor to the model (we can be referring here either to a
single variable or a set of variables representing a categorical
predictor) we will have a new SSE, which will be less than or
equal to the previous value. The difference between the SSE
before and after the addition of the new predictor is attributed
to this predictor. We can then add another predictor and
recalculate the model and SSE. The difference between the new SSE
and the previous SSE is again attributed to the newly added
predictor. We proceed with this process until we have entered all
predictors, noting the SSE decrease at each stage. This process
could also be performed in the reverse order, removing each
predictor and noting the increase in SSE at each stage.
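
To make this bookkeeping concrete, here is a minimal sketch in Python
(using only the numpy library, entirely outside of SPSS). The data are
a small made-up sample; the variable names, sample size, and
coefficients are hypothetical and chosen only so that the two
predictors are correlated. The sketch fits the hierarchy of models
implied by a given entry order and records the drop in SSE at each
step:

import numpy as np

def sse(y, *columns):
    # Residual sum of squares from an OLS fit of y on an intercept plus the given columns.
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def sequential_ss(y, predictors, order):
    # SEQUENTIAL sums of squares for one particular entry order.
    ss, entered = {}, []
    current = sse(y)                    # intercept-only model: total mean corrected SS
    for name in order:
        entered.append(predictors[name])
        new = sse(y, *entered)          # refit with one more predictor entered
        ss[name] = current - new        # drop in SSE credited to this predictor
        current = new
    return ss

# Hypothetical toy data with two correlated predictors.
rng = np.random.default_rng(0)
x1 = rng.normal(size=30)
x2 = 0.6 * x1 + rng.normal(size=30)
y = 1.0 + 0.5 * x1 - 0.8 * x2 + rng.normal(size=30)
predictors = {"X1": x1, "X2": x2}

print(sequential_ss(y, predictors, ["X1", "X2"]))   # X1 entered first
print(sequential_ss(y, predictors, ["X2", "X1"]))   # X2 entered first

The two orderings yield the same total (the model sum of squares) but
divide it differently between X1 and X2, which is exactly the order
dependence described above.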
The sums of squares from the UNIQUE approach (which can be obtained
in REGRESSION by using the TEST option on the METHOD subcommand)
can be computed by comparing the SSE for a model including all
predictors to a model including all predictors except for the one
of current interest. The increase in SSE resulting from removing
a predictor from the model (which is the same as the decrease in
SSE resulting from adding it to a model containing all the other
predictors) is the UNIQUE sum of squares associated with that
predictor. Thus the UNIQUE sum of squares for each predictor is
equivalent to the SEQUENTIAL sum of squares for that predictor
when it is entered after all other predictors in the model. This
is why UNIQUE sums of squares are also referred to as partial or
simultaneous sums of squares. Note that as long as there are no
linear dependencies among the predictors these sums of squares
will be unique. However, they will not add up to the overall
model sum of squares.
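
The UNIQUE computation can be sketched the same way (again with
hypothetical toy data): refit the full model with each predictor
deleted in turn and record the increase in SSE. The last few lines
also check the F = t-squared relationship mentioned earlier for
single degree of freedom predictors:

import numpy as np

def fit(y, cols):
    # OLS fit of y on an intercept plus the given columns; returns SSE, design matrix, coefficients.
    X = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid), X, beta

# Hypothetical toy data with two correlated predictors.
rng = np.random.default_rng(1)
x1 = rng.normal(size=30)
x2 = 0.6 * x1 + rng.normal(size=30)
y = 1.0 + 0.5 * x1 - 0.8 * x2 + rng.normal(size=30)
names, cols = ["X1", "X2"], [x1, x2]

sse_full, X_full, beta = fit(y, cols)
mse = sse_full / (len(y) - X_full.shape[1])           # error mean square from the full model

for i, name in enumerate(names):
    reduced = [c for j, c in enumerate(cols) if j != i]
    sse_reduced, _, _ = fit(y, reduced)
    ss_unique = sse_reduced - sse_full                # increase in SSE when this predictor is dropped
    F = ss_unique / mse                               # single degree of freedom F statistic
    se = np.sqrt(mse * np.linalg.inv(X_full.T @ X_full)[i + 1, i + 1])
    t = beta[i + 1] / se                              # t statistic for the same coefficient
    print(name, round(ss_unique, 3), round(F, 3), round(t ** 2, 3))

For each predictor, the printed unique sum of squares divided by the
error mean square equals the squared t statistic for its coefficient
in the full model.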
Let's make this all a bit more concrete with an example. Below is a
small data set consisting of 19 cases and three variables, X1, X2, and Y.
First we will obtain the default UNIQUE sums of squares for X1 and X2
as joint predictors of the dependent variable Y, using both the MANOVA
and REGRESSION procedures:
MANOVA Y WITH X1 X2
/ANALYSIS=Y
/DESIGN=X1 X2.
REGRESSION VARIABLES=Y X1 X2
/DEPENDENT=Y
/METHOD=TEST (X1) (X2).
----------------------------------------------------------------------
Figure 1: Data
X1 X2 Y
6 7 7
3 4 8
4 8 5
6 5 4
3 9 5
1 9 7
3 3 2
4 5 6
2 6 2
0 2 8
0 9 4
2 5 3
8 6 5
9 8 4
0 4 6
7 8 6
5 3 1
7 9 3
7 3 3
----------------------------------------------------------------------
(End Figure 1)
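Readers who like to check arithmetic by hand can recompute the key
quantities from the Figure 1 data with a rough sketch like the
following (again in Python with numpy, so outside of SPSS entirely).
The printed values should agree with the SPSS output shown in Figure 2
below, up to rounding:

import numpy as np

# Data from Figure 1, columns X1, X2, Y.
data = np.array([
    [6, 7, 7], [3, 4, 8], [4, 8, 5], [6, 5, 4], [3, 9, 5],
    [1, 9, 7], [3, 3, 2], [4, 5, 6], [2, 6, 2], [0, 2, 8],
    [0, 9, 4], [2, 5, 3], [8, 6, 5], [9, 8, 4], [0, 4, 6],
    [7, 8, 6], [5, 3, 1], [7, 9, 3], [7, 3, 3]], dtype=float)
x1, x2, y = data[:, 0], data[:, 1], data[:, 2]

def sse(*columns):
    # Residual sum of squares from an OLS fit of Y on an intercept plus the given columns.
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

total = sse()                               # intercept-only model: about 76.105
sse_full = sse(x1, x2)                      # full model residual SS: about 71.036
print("Model SS :", total - sse_full)       # about 5.069
print("Unique X1:", sse(x2) - sse_full)     # about 4.496
print("Unique X2:", sse(x1) - sse_full)     # about 1.199
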
The slightly edited output from MANOVA and REGRESSION appears in Figure
2. Note that the sums of squares given in MANOVA are the same as those
given by the METHOD=TEST output from REGRESSION. Note also that the
values of the F-statistics for each variable are the squared values of
the t-statistics for the estimated regression coefficients (B), and
have the same significance levels. Finally, note that the sums of squares
for the two variables do not add up to the overall regression or model
sum of squares. As a matter of fact, in this problem they add up to
MORE than the model sum of squares (4.50 + 1.20 = 5.70, versus a model
sum of squares of 5.07). While many users are confused by the general
failure of regression approach or UNIQUE sums of squares to sum to the
model sum of squares, this situation, in which the individual sums of
squares add up to more than the model sum of squares, is even more
perplexing. The explanation of how it can occur lies in the structure
of the intercorrelations among the three variables.
----------------------------------------------------------------------
Figure 2: MANOVA and REGRESSION Output
* * * * * * A n a l y s i s o f V a r i a n c e -- design 1 * * * * * *
Tests of Significance for Y using UNIQUE sums of squares
Source of Variation SS DF MS F Sig of F
WITHIN+RESIDUAL 71.04 16 4.44
X1 4.50 1 4.50 1.01 .329
X2 1.20 1 1.20 .27 .610
(Model) 5.07 2 2.53 .57 .576
(Total) 76.11 18 4.23
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
* * * * M U L T I P L E R E G R E S S I O N * * * *
Hypothesis Tests
Sum of
DF Squares Rsq Chg F Sig F Source
1 4.49571 .05907 1.01261 .3293 X1
1 1.19931 .01576 .27013 .6104 X2
2 5.06928 .57090 .5761 Regression
16 71.03598 Residual
18 76.10526 Total
------------------------------------------------------------
Multiple R .25809
R Square .06661
Adjusted R Square -.05007
Standard Error 2.10707
Analysis of Variance
DF Sum of Squares Mean Square
Regression 2 5.06928 2.53464
Residual 16 71.03598 4.43975
F = .57090 Signif F = .5761
------------------ Variables in the Equation ------------------
Variable B SE B Beta T Sig T
X1 -.178536 .177421 -.246390 -1.006 .3293
X2 .109418 .210525 .127260 .520 .6104
(Constant) 4.757000 1.422242 3.345 .0041
----------------------------------------------------------------------
(End Figure 2)
----------------------------------------------------------------------
Figure 3: Correlation Matrix
Y X1 X2
Y 1.000 -.225 .087
X1 -.225 1.000 .164
X2 .087 .164 1.000
----------------------------------------------------------------------
(End Figure 3)
As you can see in the correlation matrix listed in Figure 3, there is
a positive correlation between the two predictors X1 and X2, while the
two correlations with the dependent variable have opposite signs. When
this occurs, the absolute values of the partial and the part (sometimes
called semipartial) correlations between each predictor and the
dependent variable will be larger than the absolute value of the simple
bivariate, or zero-order, correlation between that predictor and the
dependent variable. As a result, a larger sum of squares is attributed
to the variable when it is entered into the model after the other
predictor than when it is entered first. The important point here is
that "adjusting" or "controlling" a predictor for the effects of the
other predictors can increase the apparent strength of its relationship
with the dependent variable instead of decreasing it.
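
This can be verified directly from the zero-order correlations in
Figure 3 using the standard formulas for partial and part correlations.
Here is a small sketch (Python once more; the helper function is ours,
named only for illustration):

import numpy as np

# Zero-order correlations from Figure 3.
r_y1, r_y2, r_12 = -0.225, 0.087, 0.164

def adjusted(r_ya, r_yb, r_ab):
    # Correlation of Y with predictor A after adjusting for predictor B:
    # the part (semipartial) removes B from A only; the partial removes B from both A and Y.
    num = r_ya - r_yb * r_ab
    part = num / np.sqrt(1 - r_ab ** 2)
    partial = num / np.sqrt((1 - r_yb ** 2) * (1 - r_ab ** 2))
    return partial, part

print("X1:", adjusted(r_y1, r_y2, r_12))   # partial and part correlations of Y with X1, controlling X2
print("X2:", adjusted(r_y2, r_y1, r_12))   # partial and part correlations of Y with X2, controlling X1

Both adjusted correlations for X1 come out near -.24 and both for X2
near .13, larger in absolute value than the zero-order values of -.225
and .087.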
Corrective Appendix (March 1997)
There is an issue of terminological accuracy here that needs clarification.
The terms orthogonal and uncorrelated (or nonorthogonal and correlated) are
used as if they were interchangeable. While this is true if the variables
or vectors involved are centered (have mean 0), it is not true in the
general case. Formally, two vectors are orthogonal if their scalar product
(or inner product) is 0. They are uncorrelated if the scalar product of
their centered (mean corrected) forms is 0. All four logical
combinations of these two designations can occur: two vectors can be
both orthogonal and uncorrelated, orthogonal but correlated,
nonorthogonal but uncorrelated, or nonorthogonal and correlated. Only
when the variables or vectors with which one is dealing are by
definition mean corrected or centered are the two terms
interchangeable. For this reason, the way the terms have been used in
this article is at best sloppy and, strictly speaking, incorrect.
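
A tiny numerical illustration of the four combinations, using made-up
three-element vectors (the particular numbers have no significance
beyond exhibiting each case):

import numpy as np

def status(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    raw = a @ b                                    # scalar product of the raw vectors
    centered = (a - a.mean()) @ (b - b.mean())     # scalar product of the centered vectors
    return ("orthogonal" if np.isclose(raw, 0) else "nonorthogonal",
            "uncorrelated" if np.isclose(centered, 0) else "correlated")

print(status([1, -1, 0], [1, 1, -2]))   # orthogonal and uncorrelated
print(status([2, 0, 0], [0, 1, 1]))     # orthogonal but correlated
print(status([1, 2, 3], [3, 0, 3]))     # nonorthogonal but uncorrelated
print(status([1, 2, 3], [1, 2, 4]))     # nonorthogonal and correlated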
Another clarification may also be useful for some readers. The
discussion of the different ways of computing sums of squares assumes
that any categorical factors have been handled via reparameterization
to full rank. In other words, the description of the sum of squares for
a particular effect as the difference between the residual sums of
squares for models with and without that term applies only when the K
levels of a given factor are represented by K-1 dummy or effect coded
variables. This approach is a common one, but it is not the only way to
do things. If one attempts to match the UNIQUE sums of squares by
fitting models with and without particular terms in some designs using
software that does not reparameterize, the expected results will not be
obtained. Some of the issues involved here will be the topic of a
future article discussing some of the fundamental differences between
the MANOVA procedure and the new GLM procedure added to Release 7.0 of
SPSS for Windows.
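
In the meantime, a small sketch may help make the reparameterization
concrete. It builds the K-1 dummy coded and effect coded columns for a
hypothetical three-level factor; the conventions shown (last level as
the reference category) are common ones but by no means the only
possibilities:

import numpy as np

levels = np.array([1, 1, 2, 2, 3, 3])   # hypothetical factor with K = 3 levels, two cases per level
K = 3

# Dummy (indicator) coding: K - 1 columns, with the last level serving as the reference category.
dummy = np.column_stack([(levels == k).astype(float) for k in range(1, K)])

# Effect (deviation) coding: the same columns, but cases at the reference level are coded -1.
effect = dummy.copy()
effect[levels == K] = -1.0

# A full rank design matrix for this factor: an intercept column plus the K - 1 coded columns.
X = np.column_stack([np.ones(len(levels)), effect])
print(dummy)
print(effect)
print(np.linalg.matrix_rank(X))   # 3, i.e. K columns with no linear dependency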
