Regression Analysis by Example, Third Edition Chapter 2: Simple Linear Regression

Table 2.3, page 25: A data set with a perfect nonlinear relationship between Y and X, yet Cor(X,Y) = 0

get file 'D:p025a.sav'.
list.

       y        x

       1       -7
      14       -6
      25       -5
      34       -4
      41       -3
      46       -2
      49       -1
      50        0
      49        1
      46        2
      41        3
      34        4
      25        5
      14        6
       1        7

Number of cases read:  15    Number of cases listed:  15
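
The zero correlation claimed for Table 2.3 can be cross-checked outside SPSS. The following is a minimal Python sketch, not part of the original SPSS session; the helper name pearson_r is an illustrative assumption. It types in the 15 cases listed above and computes the Pearson correlation from centered sums.

```python
from math import sqrt

# Table 2.3 data, typed in from the listing above.
x = [-7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 14, 25, 34, 41, 46, 49, 50, 49, 46, 41, 34, 25, 14, 1]


def pearson_r(a, b):
    """Pearson correlation from centered sums of squares and cross-products."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)


r = pearson_r(x, y)

# The listed values satisfy y = 50 - x^2 exactly: a perfect but nonlinear relationship.
is_quadratic = all(yi == 50 - xi ** 2 for xi, yi in zip(x, y))

print("r is zero up to rounding:", abs(r) < 1e-12)
print("y equals 50 - x^2 for every case:", is_quadratic)
```

Because x is symmetric about zero and y depends on x only through x^2, the cross-product terms cancel in pairs, which is why the linear correlation vanishes even though y is fully determined by x.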

Figure 2.2, page 25: A scatter plot of Y versus X in Table 2.3

graph 
/scatter = x with y.

Table 2.4, page 25: Anscombe’s Quartet, four data sets having the same values of summary statistics

get file 'D:p025b.sav'.
list.

       y1       x1        y2       x2        y3       x3        y4       x4

     8.04       10      9.14       10      7.46       10      6.58        8
     6.95        8      8.14        8      6.77        8      5.76        8
     7.58       13      8.74       13     12.74       13      7.71        8
     8.81        9      8.77        9      7.11        9      8.84        8
     8.33       11      9.26       11      7.81       11      8.47        8
     9.96       14      8.10       14      8.84       14      7.04        8
     7.24        6      6.13        6      6.08        6      5.25        8
     4.26        4      3.10        4      5.39        4     12.50       19
    10.84       12      9.13       12      8.15       12      5.56        8
     4.82        7      7.26        7      6.42        7      7.91        8
     5.68        5      4.74        5      5.73        5      6.89        8

Number of cases read:  11    Number of cases listed:  11
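
To see in what sense the four data sets share the same summary statistics, the sketch below computes the means, the least squares line, and the correlation for each pair. It is a Python cross-check, not SPSS output, and the names quartet and summarize are illustrative assumptions.

```python
from math import sqrt

# Table 2.4 data, typed in from the listing above.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # x1, x2 and x3 are identical
quartet = {
    "y1 on x1": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "y2 on x2": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "y3 on x3": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "y4 on x4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
                 [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Means, least squares slope and intercept, and Pearson correlation."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": x_bar,
        "mean_y": y_bar,
        "intercept": y_bar - slope * x_bar,
        "slope": slope,
        "r": sxy / sqrt(sxx * syy),
    }


summaries = {name: summarize(x, y) for name, (x, y) in quartet.items()}
for name, s in summaries.items():
    print(name, {key: round(value, 2) for key, value in s.items()})
```

The rounded summaries agree across the four sets, which is the point of the table: only the scatter plots reveal how different the data sets are.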

Figure 2.3, page 26: Scatter plots of the data in Table 2.4 with the fitted lines

graph (a)

formats y1 x1 y2 x2 y3 x3 y4 x4 (f2.0).
GGRAPH
  /GRAPHDATASET NAME="GraphDataset" VARIABLES= y1 x1 
  /GRAPHSPEC SOURCE=INLINE 
      INLINETEMPLATE=["<addFitLine  type='linear' target='pair'/> " ].
BEGIN GPL
SOURCE: s=userSource( id( "GraphDataset" ) )
DATA: y1=col( source(s), name( "y1" ) )
DATA: x1=col( source(s), name( "x1" ) )
GUIDE: axis( dim( 1 ), label( "x1" ) )
GUIDE: axis( dim( 2 ), label( "y1" ) )
ELEMENT: point( position( ( x1 * y1 ) ) )
END GPL.

Image spss_chp2_2_ggraph

graph (b)

GGRAPH
  /GRAPHDATASET NAME="GraphDataset" VARIABLES= y2 x2 
  /GRAPHSPEC SOURCE=INLINE  INLINETEMPLATE=["<addFitLine  type='linear' target='pair'/> "].
BEGIN GPL
SOURCE: s=userSource( id( "GraphDataset" ) )
DATA: y2=col( source(s), name( "y2" ) )
DATA: x2=col( source(s), name( "x2" ) )
GUIDE: axis( dim( 1 ), label( "x2" ) )
GUIDE: axis( dim( 2 ), label( "y2" ) )
ELEMENT: point( position( (x2 * y2 ) ) )
END GPL.

Image spss_chp2_3_ggraph

graph (c)

GGRAPH
  /GRAPHDATASET NAME="GraphDataset" VARIABLES= y3 x3 
  /GRAPHSPEC SOURCE=INLINE INLINETEMPLATE=["<addFitLine  type='linear' target='pair'/> " ].
BEGIN GPL
SOURCE: s=userSource( id( "GraphDataset" ) )
DATA: y3=col( source(s), name( "y3" ) )
DATA: x3=col( source(s), name( "x3" ) )
GUIDE: axis( dim( 1 ), label( "x3" ) )
GUIDE: axis( dim( 2 ), label( "y3" ) )
ELEMENT: point( position( ( x3 * y3 ) ) )
END GPL.

Image spss_chp2_4_ggraph

graph (d)

GGRAPH
  /GRAPHDATASET NAME="GraphDataset" VARIABLES= y4 x4 
  /GRAPHSPEC SOURCE=INLINE INLINETEMPLATE=["<addFitLine  type='linear' target='pair'/> "].
BEGIN GPL
SOURCE: s=userSource( id( "GraphDataset" ) )
DATA: y4=col( source(s), name( "y4" ) )
DATA: x4=col( source(s), name("x4") )
GUIDE: axis( dim( 1 ), label( "x4" ) )
GUIDE: axis( dim( 2 ), label( "y4" ) )
ELEMENT: point( position( ( x4 * y4 ) ) )
END GPL.

Image spss_chp2_5_ggraph

Table 2.5, page 27: Length of service calls (in minutes) and number of units repaired

get file 'D:p027.sav'.
list.

 minutes    units

      23        1
      29        2
      49        3
      64        4
      74        4
      87        5
      96        6
      97        6
     109        7
     119        8
     149        9
     145        9
     154       10
     166       10

Number of cases read:  14    Number of cases listed:  14

Table 2.6, page 28: Quantities needed for the computation of the correlation coefficient between the
length of service calls, Y, and the number of units repaired, X

compute const = 1.
exe.
aggregate outfile "d:p027ag.sav"
 /break =const
 /ymean = mean(minutes)
 /xmean = mean(units).
match files file =  * 
 /table = "d:p027ag.sav"
 /by const.
exe.
compute yd = minutes - ymean.
compute xd = units - xmean.
compute yd2 = yd**2.
compute xd2 = xd**2.
compute xyd = xd*yd.
exe.
list minutes units yd to xyd.

 minutes    units       yd       xd      yd2      xd2      xyd

      23        1   -74.21    -5.00  5507.76    25.00   371.07
      29        2   -68.21    -4.00  4653.19    16.00   272.86
      49        3   -48.21    -3.00  2324.62     9.00   144.64
      64        4   -33.21    -2.00  1103.19     4.00    66.43
      74        4   -23.21    -2.00   538.90     4.00    46.43
      87        5   -10.21    -1.00   104.33     1.00    10.21
      96        6    -1.21      .00     1.47      .00      .00
      97        6     -.21      .00      .05      .00      .00
     109        7    11.79     1.00   138.90     1.00    11.79
     119        8    21.79     2.00   474.62     4.00    43.57
     149        9    51.79     3.00  2681.76     9.00   155.36
     145        9    47.79     3.00  2283.47     9.00   143.36
     154       10    56.79     4.00  3224.62    16.00   227.14
     166       10    68.79     4.00  4731.47    16.00   275.14

Number of cases read:  14    Number of cases listed:  14

descriptive variables = minutes to xyd
 /statistics = sum.
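
The column sums requested above are the ingredients of the correlation coefficient: r = sum(xyd) / sqrt(sum(xd2) * sum(yd2)). The following Python sketch, offered as a cross-check rather than as SPSS output, rebuilds the deviation columns from the raw data and combines their sums. The variable names mirror the SPSS ones but are otherwise illustrative.

```python
from math import sqrt

# Table 2.5 data, typed in from the listing above.
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]

n = len(minutes)
ymean = sum(minutes) / n
xmean = sum(units) / n

# Deviations from the means, as in the compute statements above.
yd = [y - ymean for y in minutes]
xd = [x - xmean for x in units]

# Column sums of yd2, xd2 and xyd.
sum_yd2 = sum(d ** 2 for d in yd)
sum_xd2 = sum(d ** 2 for d in xd)
sum_xyd = sum(a * b for a, b in zip(xd, yd))

# Correlation coefficient from the three sums.
r = sum_xyd / sqrt(sum_xd2 * sum_yd2)

print("sum yd2 =", round(sum_yd2, 2))
print("sum xd2 =", round(sum_xd2, 2))
print("sum xyd =", round(sum_xyd, 2))
print("r =", round(r, 3))
```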

Figure 2.4, page 26: Computer repair data, scatter plot of minutes versus units

graph
 /scatter units with minutes.

Image spss_chp2_6

Table 2.7, page 32: The fitted values, yhat, and the ordinary least squares residuals, e, for the repair data

regression 
 /dependent = minutes
 /method = enter units
 /save resid (e) pred (yhat) sepred (semu).

list units yhat e.

   units        yhat           e

       1    19.67043     3.32957
       2    35.17920    -6.17920
       3    50.68797    -1.68797
       4    66.19674    -2.19674
       4    66.19674     7.80326
       5    81.70551     5.29449
       6    97.21429    -1.21429
       6    97.21429     -.21429
       7   112.72306    -3.72306
       8   128.23183    -9.23183
       9   143.74060     5.25940
       9   143.74060     1.25940
      10   159.24937    -5.24937
      10   159.24937     6.75063

Number of cases read:  14    Number of cases listed:  14
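
The yhat and e columns can be reproduced by hand from the usual least squares formulas, slope = Sxy / Sxx and intercept = ybar - slope * xbar. The Python sketch below is a cross-check of the listing, not the method SPSS uses internally; the names b0 and b1 are illustrative assumptions.

```python
# Table 2.5 data, typed in from the listing earlier on this page.
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]

n = len(minutes)
x_bar = sum(units) / n
y_bar = sum(minutes) / n

# Centered sums of squares and cross-products.
sxx = sum((x - x_bar) ** 2 for x in units)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))

# Ordinary least squares estimates.
b1 = sxy / sxx            # slope
b0 = y_bar - b1 * x_bar   # intercept

# Fitted values and residuals for each case.
yhat = [b0 + b1 * x for x in units]
e = [y - f for y, f in zip(minutes, yhat)]

print("intercept =", round(b0, 5), " slope =", round(b1, 5))
for x, f, res in zip(units, yhat, e):
    print(f"{x:8d} {f:11.5f} {res:11.5f}")
```

With an intercept in the model the residuals sum to zero, which gives a quick sanity check on the e column.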

Figure 2.5, page 32: Plot of minutes versus units with the fitted least squares regression line

GGRAPH
  /GRAPHDATASET NAME="GraphDataset" VARIABLES= minutes units 
  /GRAPHSPEC SOURCE=INLINE 
      INLINETEMPLATE=["<addFitLine  type='linear' target='pair'/> "].
BEGIN GPL
SOURCE: s=userSource( id( "GraphDataset" ) )
DATA: minutes=col( source(s), name( "minutes" ) )
DATA: units=col( source(s), name( "units" ) )
GUIDE: axis( dim( 1 ), label( "units" ) )
GUIDE: axis( dim( 2 ), label( "minutes" ) )
ELEMENT: point( position( ( units * minutes ) ) )
END GPL.

Image spss_chp2_7_ggraph

Table 2.8, page 36: Regression output for the computer repair data

regression 
 /statistics coef
 /dependent = minutes
 /method = enter units.
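
The regression output is not reproduced on this page, so the sketch below shows how the coefficients, their standard errors, and the t statistics of a simple linear regression can be computed directly. It uses the standard textbook formulas, with sigma_hat^2 = SSE / (n - 2); it is a Python cross-check under those assumptions, not a copy of the SPSS table, and all variable names are illustrative.

```python
from math import sqrt

# Table 2.5 data, typed in from the listing earlier on this page.
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]

n = len(minutes)
x_bar = sum(units) / n
y_bar = sum(minutes) / n
sxx = sum((x - x_bar) ** 2 for x in units)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))

# Least squares estimates.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual sum of squares and the estimate of the error standard deviation.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(units, minutes))
sigma_hat = sqrt(sse / (n - 2))

# Standard errors of the intercept and slope.
se_b0 = sigma_hat * sqrt(1 / n + x_bar ** 2 / sxx)
se_b1 = sigma_hat / sqrt(sxx)

# t statistics for testing that each coefficient is zero.
t_b0 = b0 / se_b0
t_b1 = b1 / se_b1

print(f"constant  coef={b0:.3f}  s.e.={se_b0:.3f}  t={t_b0:.2f}")
print(f"units     coef={b1:.3f}  s.e.={se_b1:.3f}  t={t_b1:.2f}")
```

The p-values are omitted because they need the t distribution with n - 2 degrees of freedom, which the Python standard library does not provide.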

Standard error for mean prediction, page 39

list units minutes semu.

   units  minutes        semu

       1       23     2.90717
       2       29     2.48124
       3       49     2.09082
       4       64     1.75969
       4       74     1.75969
       5       87     1.52692
       6       96     1.44100
       6       97     1.44100
       7      109     1.52692
       8      119     1.75969
       9      149     2.09082
       9      145     2.09082
      10      154     2.48124
      10      166     2.48124

Number of cases read:  14    Number of cases listed:  14
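
The semu column saved by sepred matches the usual formula for the standard error of the estimated mean response at x0: sigma_hat * sqrt(1/n + (x0 - xbar)^2 / Sxx). The Python sketch below recomputes it as a cross-check; the function name se_mean_prediction is a hypothetical helper, not an SPSS or library function.

```python
from math import sqrt

# Table 2.5 data, typed in from the listing earlier on this page.
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]

n = len(minutes)
x_bar = sum(units) / n
y_bar = sum(minutes) / n
sxx = sum((x - x_bar) ** 2 for x in units)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))

# Least squares fit and the residual standard deviation.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(units, minutes))
sigma_hat = sqrt(sse / (n - 2))


def se_mean_prediction(x0):
    """Standard error of the estimated mean response at x0."""
    return sigma_hat * sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)


semu = [se_mean_prediction(x) for x in units]
for x, y, s in zip(units, minutes, semu):
    print(f"{x:8d} {y:8d} {s:11.5f}")
```

The standard error is smallest at the mean of units (6) and grows symmetrically as x0 moves away from it, which is the pattern visible in the listing.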

Correlations, page 43

correlations variables = minutes units yhat.

NOTE: Cor(minutes, units) = Cor(minutes, yhat) = .994, and (.994)^2 = .987 = R^2 (up to rounding).
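
The identity in the note can be checked numerically. The Python sketch below, a cross-check rather than SPSS output, computes the correlation of minutes with units and with the fitted values, and compares the square with R^2 = 1 - SSE/SST. The helper name pearson_r is an illustrative assumption.

```python
from math import sqrt

# Table 2.5 data, typed in from the listing earlier on this page.
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]


def pearson_r(a, b):
    """Pearson correlation from centered sums of squares and cross-products."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)


# Least squares fit of minutes on units.
n = len(minutes)
x_bar = sum(units) / n
y_bar = sum(minutes) / n
sxx = sum((x - x_bar) ** 2 for x in units)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
yhat = [b0 + b1 * x for x in units]

# R^2 as the proportion of total variation explained by the fit.
sst = sum((y - y_bar) ** 2 for y in minutes)
sse = sum((y - f) ** 2 for y, f in zip(minutes, yhat))
r_squared = 1 - sse / sst

r_y_x = pearson_r(minutes, units)
r_y_yhat = pearson_r(minutes, yhat)

print("Cor(minutes, units) =", round(r_y_x, 3))
print("Cor(minutes, yhat)  =", round(r_y_yhat, 3))
print("R^2                 =", round(r_squared, 3))
```

Since yhat is an increasing linear function of units, the two correlations coincide, and squaring either one gives R^2.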