Page 271 Figure 12.1 Logistic function for the depression data set.
NOTE: We were unable to reproduce this graph.
Page 273 Table 12.1 Classification of individuals by depression level and sex.
data depress;
set "c:\pma5\depress";
run;
proc freq data = depress;
tables sex*cases;
run;
The FREQ Procedure
Table of SEX by CASES
SEX       CASES
Frequency|
Percent  |
Row Pct  |
Col Pct  |       0|       1|  Total
---------+--------+--------+
1        |    101 |     10 |    111
         |  34.35 |   3.40 |  37.76
         |  90.99 |   9.01 |
         |  41.39 |  20.00 |
---------+--------+--------+
2        |    143 |     40 |    183
         |  48.64 |  13.61 |  62.24
         |  78.14 |  21.86 |
         |  58.61 |  80.00 |
---------+--------+--------+
Total         244       50      294
            82.99    17.01   100.00
Page 274 Odds ratios and coefficients
data depress;
set depress;
sex1 = sex - 1;
run;
proc logistic data = depress desc;
model cases = sex1;
run;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -2.3125 0.3315 48.6603 <.0001
sex1 1 1.0385 0.3767 7.6013 0.0058
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
sex1 2.825 1.350 5.911
(some output omitted)
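NOTE: As a check, the odds ratio and intercept reported by proc logistic can be recovered by hand from the counts in Table 12.1. The data step below is only a sketch; the variable names or_table, or_model and b0_table are ours and do not appear in the original program.

```sas
data _null_;
/* odds of being a case: 40/143 for sex = 2, 10/101 for sex = 1 */
or_table = (40*101) / (10*143);
/* exponentiated sex1 coefficient from the output above */
or_model = exp(1.0385);
/* log odds of being a case when sex1 = 0 (sex = 1) */
b0_table = log(10/101);
put or_table= 8.3 or_model= 8.3 b0_table= 8.4;
run;
```

Both odds ratios round to 2.825 and the log odds rounds to -2.3125, which agrees with the Odds Ratio Estimates table and the Intercept estimate.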
Page 275 The coefficients at the bottom of the page.
proc logistic data = depress desc;
model cases = age income;
run;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 0.0280 0.4872 0.0033 0.9542
AGE 1 -0.0202 0.00890 5.1385 0.0234
INCOME 1 -0.0413 0.0141 8.6500 0.0033
(some output omitted)
Page 276 These numbers are obtained from the output from page 275.
Page 278 Table at the top of the page
NOTE: We will create the interaction of the two dummy variables (which we called dincemp) in this data step for use in the example on page 279.
data depress;
set depress;
if income >= 10 then duminc = 0;
else duminc = 1;
if employ = 2 or employ = 3 then dumemp = 1;
else dumemp = 0;
if employ = 7 then dumemp = .;
dincemp = duminc*dumemp;
run;
proc logistic data = depress desc;
model cases = duminc dumemp;
run;
quit;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.9345 0.2259 73.3313 <.0001
duminc 1 0.2723 0.3377 0.6502 0.4200
dumemp 1 1.0285 0.3487 8.6990 0.0032
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
duminc 1.313 0.677 2.545
dumemp 2.797 1.412 5.540
(some output omitted)
Page 279 middle of the page
proc logistic data = depress desc;
model cases = duminc dumemp dincemp;
run;
quit;
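NOTE: The first table below is for the model with the interaction term (duminc dumemp dincemp, 3 DF); the second is for the model without it (duminc dumemp, 2 DF) from page 278.
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 16.8347 3 0.0008
Score 22.4086 3 <.0001
Wald 16.8136 3 0.0008
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 8.6045 2 0.0135
Score 9.5814 2 0.0083
Wald 9.0619 2 0.0108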
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.7346 0.2214 61.3804 <.0001
duminc 1 -0.3756 0.4349 0.7458 0.3878
dumemp 1 0.3175 0.4520 0.4935 0.4824
dincemp 1 2.1981 0.7888 7.7651 0.0053
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
duminc 0.687 0.293 1.611
dumemp 1.374 0.566 3.332
dincemp 9.008 1.919 42.276
(some output omitted)
Page 280 bottom of the page
NOTE: The likelihood ratio chi-square values needed are given in the output for the two models shown above: 16.8347 - 8.6045 = 8.23 (rounded), with 3 - 2 = 1 degree of freedom.
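NOTE: The difference in likelihood ratio chi-squares can be referred to a chi-square distribution to get a p-value for the interaction term. The following is a minimal sketch that types in the two values from the output by hand; the names lr_diff, df and p are ours.

```sas
data _null_;
/* LR chi-square with interaction minus LR chi-square without it */
lr_diff = 16.8347 - 8.6045;
/* one extra parameter (dincemp) in the larger model */
df = 3 - 2;
/* upper-tail probability of a chi-square with df degrees of freedom */
p = 1 - probchi(lr_diff, df);
put lr_diff= df= p=;
run;
```

The p-value is about 0.004, close to the Wald test for dincemp (0.0053) in the output above.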
Page 287 middle of the page
data depress;
set depress;
if age < 28 then age0 = 1;
else age0 = 0;
if age >=28 & age <= 42 then age1 = 1;
else age1 = 0;
if age >=43 & age <= 58 then age2 = 1;
else age2 = 0;
if age >=59 & age <= 89 then age3 = 1;
else age3 = 0;
run;
proc logistic data = depress desc;
model cases = age1 age2 age3 income sex;
run;
quit;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -2.1595 0.7830 7.6056 0.0058
age1 1 0.0747 0.4318 0.0299 0.8626
age2 1 -0.5706 0.4744 1.4468 0.2290
age3 1 -0.8853 0.4563 3.7643 0.0524
income 1 -0.0380 0.0149 6.5298 0.0106
sex 1 0.9238 0.3864 5.7147 0.0168
Page 289 Figure 12.2 Estimated coefficients for age quartiles by midpoint of the quartile
NOTE: We need to use ODS to capture the coefficients in a data set. The ods trace on and ods trace off statements make SAS print the names of the output tables in the log, which is where we find the name of the table (ParameterEstimates) to put on the ods output statement. The proc print steps are not necessary; they just show what the data sets look like before the next addition or modification.
ods trace on;
proc logistic data = depress desc;
model cases = age1 age2 age3 income sex;
ods output ParameterEstimates = parms1;
run;
quit;
ods trace off;
proc print data = parms1;
run;
data parms1;
if _n_ = 1 then do;
variable = "age0";
estimate = 0;
end;
output;
set parms1;
run;
data parms;
set parms1;
if variable = "age0" then newage = 22.5;
if variable = "age1" then newage = 35;
if variable = "age2" then newage = 50.5;
if variable = "age3" then newage = 74;
if variable in("age0" "age1" "age2" "age3");
run;
proc print data = parms;
run;
axis1 label=(a=90 'Coefficient b') order = (-1 to .5 by .5);
axis2 label=("Age") order = (20 to 80 by 10);
symbol1 i=join v=dot;
proc gplot data = parms;
plot estimate*newage / vaxis=axis1 haxis = axis2;
run;
quit;

Page 291 Figure 12.3 Delta beta measures to assess the influence of individual patterns on estimated coefficients
NOTE: We are including the difchisq (delta chi-square) statistic here for use on page 294.
proc logistic data = depress desc;
model cases = sex income age;
output out=pred p=estprob c=deltabeta DIFCHISQ=deltachi;
run;
quit;
symbol1 i=none v=circle;
axis1 order=(0 to .5 by .1);
axis2 label=(angle=90) order=(0 to .25 by .05);
proc gplot data=pred;
plot deltabeta*estprob / haxis=axis1 vaxis=axis2;
run;
quit;
Page 292 Table 12.2 Percent change in estimated parameters when including and excluding influential patterns
*line 1 of table;
proc logistic data = depress desc;
model cases = age income sex;
run;
quit;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.6059 0.8465 3.5987 0.0578
age 1 -0.0210 0.00904 5.3744 0.0204
income 1 -0.0366 0.0141 6.7343 0.0095
sex 1 0.9294 0.3858 5.8032 0.0160
* sort the output data set from page 291 so the largest delta beta values come last;
proc sort data = pred;
by deltabeta;
run;
proc print data = pred(firstobs=292);
var id deltabeta;
run;
Obs id deltabeta
292 288 0.16373
293 99 0.17896
294 68 0.23899
data pred1;
set pred;
x = 0;
if deltabeta > .1637310 then x = 3;
if deltabeta > .1789604 then x = 2;
if deltabeta > .2084397 then x = 1;
run;
proc print data = pred1;
var x deltabeta;
where x ne 0;
run;
Obs x deltabeta
292 3 0.16373
293 2 0.17896
294 1 0.23899
* line 2 of table;
proc logistic data = pred1 desc;
model cases = age income sex;
where x ne 1;
run;
quit;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.6991 0.8737 3.7818 0.0518
age 1 -0.0215 0.00912 5.5826 0.0181
income 1 -0.0421 0.0150 7.8971 0.0050
sex 1 1.0301 0.4008 6.6050 0.0102
* line 3 of table;
proc logistic data = pred1 desc;
model cases = age income sex;
where x ne 2;
run;
quit;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.7570 0.8712 4.0674 0.0437
age 1 -0.0234 0.00925 6.4023 0.0114
income 1 -0.0358 0.0142 6.3918 0.0115
sex 1 1.0505 0.4008 6.8707 0.0088
* line 4 of table;
proc logistic data = pred1 desc;
model cases = age income sex;
where x ne 3;
run;
quit;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.7138 0.8721 3.8619 0.0494
age 1 -0.0229 0.00920 6.1894 0.0129
income 1 -0.0389 0.0145 7.1596 0.0075
sex 1 1.0419 0.4009 6.7562 0.0093
* line 5 of table;
proc logistic data = pred1 desc;
model cases = age income sex;
where x = 0;
run;
quit;
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -2.0252 0.9419 4.6233 0.0315
age 1 -0.0263 0.00953 7.6125 0.0058
income 1 -0.0443 0.0156 8.0351 0.0046
sex 1 1.3094 0.4407 8.8267 0.0030
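NOTE: The output above gives the estimates for each line of Table 12.2 but not the percent changes themselves. The sketch below types in the age, income and sex estimates by hand and computes the change relative to line 1 (all observations). The formula 100*(b - b_full)/b_full and the data set and variable names are our assumptions; the text may use a different sign or rounding convention.

```sas
data estimates;
input line age income sex;
datalines;
1 -0.0210 -0.0366 0.9294
2 -0.0215 -0.0421 1.0301
3 -0.0234 -0.0358 1.0505
4 -0.0229 -0.0389 1.0419
5 -0.0263 -0.0443 1.3094
;
run;
data pctchange;
set estimates;
/* keep the line 1 (full data) estimates on every later row */
retain base_age base_income base_sex;
if line = 1 then do;
base_age = age;
base_income = income;
base_sex = sex;
end;
pct_age = 100 * (age - base_age) / base_age;
pct_income = 100 * (income - base_income) / base_income;
pct_sex = 100 * (sex - base_sex) / base_sex;
run;
proc print data = pctchange noobs;
var line pct_age pct_income pct_sex;
format pct_: 8.1;
run;
```

Under this convention, dropping all three influential patterns (line 5) changes the sex coefficient by roughly 41 percent.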
Page 293 Table 12.3 Estimated probability of being a case (p-hat) for five influential observations
proc print data = pred1 noobs round;
var id age income sex cases estprob;
where id = 288 or id = 99 or id = 143 or id = 232 or id = 68;
run;
143 40 45 1 0 0.04
232 40 45 1 0 0.04
288 61 28 1 1 0.05
99 72 11 1 1 0.07
68 40 45 1 1 0.04
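NOTE: The estprob values can be reproduced by hand from the full-data coefficients on page 292 (line 1 of Table 12.2) using the logistic function. This is only a sketch: the covariate values are typed in from the listing above, the coefficients are the rounded ones from the output, and the names check, xb and phat are ours.

```sas
data check;
input id age income sex;
/* linear predictor from the rounded full-data estimates */
xb = -1.6059 - 0.0210*age - 0.0366*income + 0.9294*sex;
/* logistic function turns the log odds into a probability */
phat = 1 / (1 + exp(-xb));
datalines;
143 40 45 1
232 40 45 1
288 61 28 1
99 72 11 1
68 40 45 1
;
run;
proc print data = check noobs;
format phat 5.2;
run;
```

To two decimals phat matches the estprob column: 0.04 for ids 143, 232 and 68, 0.05 for id 288 and 0.07 for id 99.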
Page 294 Figure 12.4 Delta chi-square measure to assess influence of pattern on overall fit with symbol size proportional to delta beta
NOTE: This graph looks slightly different from the graph in the text. This is probably because SAS and Stata calculate the covariate patterns in different ways.
symbol1 color=black interpol=r value=circle height=1;
axis1 order=(0 to 25 by 5) label=(angle=90 color=black height=0.75);
axis2 order=(0 to .5 by .1);
proc gplot data=pred;
bubble deltachi*estprob=deltabeta / bsize=20 haxis=axis2 vaxis=axis1;
run;
quit;
Page 296 Figure 12.5 Percentage of individuals correctly classified by logistic regression.
NOTE: We were unable to reproduce this graph.
Page 297 Figure 12.6 ROC curve from logistic regression for the depression data set.
NOTE: We were unable to reproduce this graph.


