Page 122 Regression from chapter 7.
data lung; set "c:\pma5\lung"; ffev1a = ffev1/100; run;
proc reg data = lung; model ffev1a = fheight; run; quit;
<some output omitted>

                        Parameter Estimates

                   Parameter      Standard
Variable     DF     Estimate         Error    t Value    Pr > |t|
Intercept     1     -4.08670       1.15198      -3.55      0.0005
FHEIGHT       1      0.11811       0.01662       7.11      <.0001
Page 122 Descriptive statistics in the middle of the page.
proc means data = lung; var fage fheight ffev1a; run;
The MEANS Procedure

Variable      N            Mean         Std Dev         Minimum         Maximum
-------------------------------------------------------------------------------
FAGE        150      40.1333333       6.8899953      26.0000000      59.0000000
FHEIGHT     150      69.2600000       2.7791892      61.0000000      76.0000000
ffev1a      150       4.0932667       0.6507523       2.5000000       5.8500000
-------------------------------------------------------------------------------
Page 127 Covariance and correlation matrices.
Covariance:
proc corr data = lung cov noprob; var fage fheight fweight; run;
<some output omitted>

               Covariance Matrix, DF = 149

                   FAGE          FHEIGHT          FWEIGHT
FAGE         47.4720358       -1.0751678       -3.6492170
FHEIGHT      -1.0751678        7.7238926       34.6954362
FWEIGHT      -3.6492170       34.6954362      573.7978076
Correlation (page 127):
      Pearson Correlation Coefficients, N = 150

                FAGE       FHEIGHT       FWEIGHT
FAGE         1.00000      -0.05615      -0.02211
FHEIGHT     -0.05615       1.00000       0.52116
FWEIGHT     -0.02211       0.52116       1.00000
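As a check on how the two matrices relate, each correlation is the covariance divided by the product of the two standard deviations (the square roots of the diagonal entries of the covariance matrix). A worked example for FHEIGHT and FWEIGHT, using only the values printed in the output above:

```latex
r_{\text{FHEIGHT},\,\text{FWEIGHT}}
  = \frac{s_{\text{FHEIGHT},\,\text{FWEIGHT}}}
         {\sqrt{s^{2}_{\text{FHEIGHT}} \; s^{2}_{\text{FWEIGHT}}}}
  = \frac{34.6954362}{\sqrt{7.7238926 \times 573.7978076}}
  \approx 0.5212
```

This agrees with the 0.52116 reported in the correlation matrix.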
Page 132 Table 7.2 ANOVA example from the lung function data (fathers).
proc reg data = lung; model ffev1a = fheight fage; run; quit;
<some output omitted>

                      Analysis of Variance

                              Sum of          Mean
Source              DF       Squares        Square    F Value    Pr > F
Model                2      21.05697      10.52848      36.81    <.0001
Error              147      42.04133       0.28600
Corrected Total    149      63.09830
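The entries of the ANOVA table are tied together by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and R-square is the model sum of squares over the corrected total. A short worked check using the numbers in the table above:

```latex
MS_{\text{Model}} = \frac{21.05697}{2} \approx 10.5285, \qquad
MS_{\text{Error}} = \frac{42.04133}{147} \approx 0.2860

F = \frac{MS_{\text{Model}}}{MS_{\text{Error}}}
  = \frac{10.52848}{0.28600} \approx 36.81, \qquad
R^{2} = \frac{SS_{\text{Model}}}{SS_{\text{Total}}}
      = \frac{21.05697}{63.09830} \approx 0.3337
```

The R-square value matches the one SAS reports for the male sample in the Table 7.5 output further down.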
Page 134 The t-test at the top of the page
NOTE: This is given as part of the output for the proc reg above.
                        Parameter Estimates

                   Parameter      Standard
Variable     DF     Estimate         Error    t Value    Pr > |t|
Intercept     1     -2.76075       1.13775      -2.43      0.0165
FHEIGHT       1      0.11440       0.01579       7.25      <.0001
FAGE          1     -0.02664       0.00637      -4.18      <.0001
Page 145 Table 7.5 Statistical output for the lung function data for males and females.
NOTE: To do the top part of the table, you need to reshape the data from wide to long. Please see our FAQ on reshaping data from wide to long using a data step for a further explanation of this code.
data long;
  set lung;
  mfev1a = mfev1/100;
  array asex(2) fsex msex;
  array aage(2) fage mage;
  array aheight(2) fheight mheight;
  array afev1(2) ffev1a mfev1a;
  do parent = 1 to 2;
    sex = asex(parent);
    age = aage(parent);
    height = aheight(parent);
    fev1 = afev1(parent);
    output;
  end;
  keep id sex age height fev1;
run;

proc means data = long mean std;
  var age height fev1;
run;
The MEANS Procedure

Variable            Mean         Std Dev
----------------------------------------
age           38.8466667       6.9124837
height        66.6766667       3.6856572
fev1           3.5332000       0.8025856
----------------------------------------
proc reg data = long; model fev1 = age height / stb; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>

Root MSE              0.52751    R-Square     0.5709
Dependent Mean        3.53320    Adj R-Sq     0.5680
Coeff Var            14.93008

                              Parameter Estimates

                   Parameter      Standard                          Standardized
Variable     DF     Estimate         Error    t Value    Pr > |t|       Estimate
Intercept     1     -6.73699       0.56329     -11.96      <.0001              0
age           1     -0.01860       0.00444      -4.19      <.0001       -0.16018
height        1      0.16486       0.00833      19.79      <.0001        0.75710
The second and third panels of the table can be obtained using the by statement in proc means and proc reg. First, we sort the data by sex and save the sorted data set (which we called longsort). We then use proc format to create value labels for sex so that the output is easier to read. Both the variable and the format are named sex; you can tell them apart because a reference to the format always ends in a period (sex.).
proc sort data = long out=longsort;
  by sex;
run;

proc format;
  value sex 1 = "male" 2 = "female";
run;

proc means data = longsort mean std;
  by sex;
  format sex sex.;
  var age height fev1;
run;
sex=male

The MEANS Procedure

Variable            Mean         Std Dev
----------------------------------------
age           40.1333333       6.8899953
height        69.2600000       2.7791892
fev1           4.0932667       0.6507523
----------------------------------------

sex=female

Variable            Mean         Std Dev
----------------------------------------
age           37.5600000       6.7141841
height        64.0933333       2.4695370
fev1           2.9731333       0.4874136
----------------------------------------
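If only the descriptive statistics are needed, the sort can be skipped. The following is a minimal sketch, not part of the original example, that uses the class statement of proc means instead of by; it assumes the data set long created in the reshaping step above already exists. The numbers should be the same, but the layout differs: proc means prints one table with sex as a leading column rather than one table per by group.

```sas
/* Sketch only: summary statistics by sex without a prior proc sort. */
/* Assumes the data set long from the reshaping step already exists. */
proc format;
  value sex 1 = "male" 2 = "female";
run;

proc means data = long mean std;
  class sex;           /* grouping is done inside the procedure */
  format sex sex.;     /* apply the value labels defined above  */
  var age height fev1;
run;
```

proc reg has no class statement, so the sorted data set and the by statement are still needed for the separate regressions.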
proc reg data = longsort; by sex; format sex sex.; model fev1 = age height / stb; run; quit;
sex=male

The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>

Root MSE              0.53479    R-Square     0.3337
Dependent Mean        4.09327    Adj R-Sq     0.3247
Coeff Var            13.06500

                              Parameter Estimates

                   Parameter      Standard                          Standardized
Variable     DF     Estimate         Error    t Value    Pr > |t|       Estimate
Intercept     1     -2.76075       1.13775      -2.43      0.0165              0
age           1     -0.02664       0.00637      -4.18      <.0001       -0.28205
height        1      0.11440       0.01579       7.25      <.0001        0.48856

sex=female

The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>

Root MSE              0.41305    R-Square     0.2915
Dependent Mean        2.97313    Adj R-Sq     0.2819
Coeff Var            13.89275

                              Parameter Estimates

                   Parameter      Standard                          Standardized
Variable     DF     Estimate         Error    t Value    Pr > |t|       Estimate
Intercept     1     -2.21116       0.89607      -2.47      0.0147              0
age           1     -0.01998       0.00504      -3.96      0.0001       -0.27516
height        1      0.09259       0.01370       6.76      <.0001        0.46913
Page 147 middle of the page
NOTE: The t-test for the fh coefficient (the female-by-height interaction) is the relevant statistic. Its sign is the opposite of the one shown in the text because the order of subtraction was reversed, and the value differs slightly from the text because of rounding.
data long1; set long; female = sex - 1; fh = female*height; fa = female*age; run;
proc reg data = long1; model fev1 = female age height fh; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: fev1

                      Analysis of Variance

                              Sum of          Mean
Source              DF       Squares        Square    F Value    Pr > F
Model                4     125.32516      31.33129     137.39    <.0001
Error              295      67.27377       0.22805
Corrected Total    299     192.59893

Root MSE              0.47754    R-Square     0.6507
Dependent Mean        3.53320    Adj R-Sq     0.6460
Coeff Var            13.51586

                        Parameter Estimates

                   Parameter      Standard
Variable     DF     Estimate         Error    t Value    Pr > |t|
Intercept     1     -2.92255       0.99654      -2.93      0.0036
female        1      0.82968       1.41007       0.59      0.5567
age           1     -0.02339       0.00407      -5.75      <.0001
height        1      0.11485       0.01409       8.15      <.0001
fh            1     -0.02210       0.02121      -1.04      0.2981
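To see why the fh coefficient is the statistic of interest, it may help to write out the fitted model from the parameter estimates above. Because female is 0 for males and 1 for females, the fh coefficient is the difference between the female and male height slopes:

```latex
\widehat{\text{fev1}} = -2.92255 + 0.82968\,\text{female} - 0.02339\,\text{age}
                        + 0.11485\,\text{height} - 0.02210\,(\text{female} \times \text{height})

\text{height slope, males } (\text{female}=0):\quad 0.11485
\text{height slope, females } (\text{female}=1):\quad 0.11485 - 0.02210 = 0.09275

t = \frac{-0.02210}{0.02121} \approx -1.04
```

With p = 0.2981, the output gives no evidence that the height slopes differ between the sexes.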
Page 148 middle of the page
NOTE: The F test is in the last table on the line labeled “Numerator”.
proc reg data = long1; model fev1 = female age height fa fh; test female, fa, fh; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: fev1

                      Analysis of Variance

                              Sum of          Mean
Source              DF       Squares        Square    F Value    Pr > F
Model                5     125.47790      25.09558     109.92    <.0001
Error              294      67.12103       0.22830
Corrected Total    299     192.59893

Root MSE              0.47781    R-Square     0.6515
Dependent Mean        3.53320    Adj R-Sq     0.6456
Coeff Var            13.52345

                        Parameter Estimates

                   Parameter      Standard
Variable     DF     Estimate         Error    t Value    Pr > |t|
Intercept     1     -2.76075       1.01653      -2.72      0.0070
female        1      0.54959       1.45182       0.38      0.7053
age           1     -0.02664       0.00569      -4.68      <.0001
height        1      0.11440       0.01411       8.11      <.0001
fa            1      0.00666       0.00815       0.82      0.4141
fh            1     -0.02180       0.02122      -1.03      0.3050

The REG Procedure
Model: MODEL1

      Test 1 Results for Dependent Variable fev1

                               Mean
Source             DF        Square    F Value    Pr > F
Numerator           3       5.17471      22.67    <.0001
Denominator       294       0.22830
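The test statement compares the full model (female, age, height, fa, fh) with the restricted model that drops the three sex terms, which is the pooled regression of fev1 on age and height shown earlier for Table 7.5. A sketch of the arithmetic behind the "Numerator" line, where SSE_F and SSE_R are our labels for the error sums of squares of the full and restricted models:

```latex
F = \frac{(SSE_R - SSE_F)/3}{SSE_F/294}
  = \frac{5.17471}{67.12103/294}
  = \frac{5.17471}{0.22830} \approx 22.67

SSE_R = SSE_F + 3 \times 5.17471 = 67.12103 + 15.52413 \approx 82.645
```

The implied restricted error sum of squares is consistent with the pooled model's Root MSE of 0.52751 on 297 error degrees of freedom, since 0.52751 squared times 297 is approximately 82.645.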