3.2 Dichotomous independent variable
page 51 Table 3.2 Cross-classification of AGE dichotomized at 55 years and CHD for 100 subjects.
data chdage31;
set 'd:hosmerdatachdage';
aged=0;
if age ge 55 then aged=1;
run;
proc sort data=chdage31 out=chdage32;
by aged;
run;
proc freq data=chdage32;
tables chd*aged;
run;
The FREQ Procedure
Table of CHD by aged
CHD aged
Frequency|
Percent |
Row Pct |
Col Pct | 0| 1| Total
---------+--------+--------+
0 | 51 | 6 | 57
| 51.00 | 6.00 | 57.00
| 89.47 | 10.53 |
| 69.86 | 22.22 |
---------+--------+--------+
1 | 22 | 21 | 43
| 22.00 | 21.00 | 43.00
| 51.16 | 48.84 |
| 30.14 | 77.78 |
---------+--------+--------+
Total 73 27 100
73.00 27.00 100.00
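NOTE: As a cross-check, the odds ratio for aged and the standard error of its log can be computed by hand from the four cell counts in the table above. The DATA step below is a minimal sketch and is not part of the original example; the variable names (n11, n10, n01, n00, oddsr, lnor, se) are assumptions chosen for illustration.

```sas
/* Hand check of the 2 x 2 table above (illustrative names, not from the text) */
data _null_;
  n11 = 21;  /* CHD = 1, aged = 1 */
  n10 = 22;  /* CHD = 1, aged = 0 */
  n01 = 6;   /* CHD = 0, aged = 1 */
  n00 = 51;  /* CHD = 0, aged = 0 */
  oddsr = (n11*n00) / (n10*n01);               /* cross-product odds ratio */
  lnor  = log(oddsr);                          /* log odds ratio */
  se    = sqrt(1/n11 + 1/n10 + 1/n01 + 1/n00); /* large-sample SE of ln(OR) */
  put oddsr= lnor= se=;
run;
```

The values written to the log should agree, up to rounding, with the odds ratio (8.114), coefficient (2.0935) and standard error (0.5285) that PROC LOGISTIC reports for aged in Table 3.3.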
page 52 Table 3.3 Results of fitting the logistic regression model to the data in Table 3.2.
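NOTE: SAS reports the Wald chi-square, which is the square of the Wald statistic (estimate divided by standard error). To get the Wald tests shown in the text, take the square root of the chi-squares given in the SAS output and give the result the sign of the coefficient.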
proc logistic data=chdage32 desc;
model chd = aged;
run;
quit;
The LOGISTIC Procedure
Model Information
Data Set WORK.CHDAGE32
Response Variable CHD
Number of Response Levels 2
Number of Observations 100
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value CHD Frequency
1 1 43
2 0 57
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 138.663 121.959
SC 141.268 127.169
-2 Log L 136.663 117.959
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 18.7039 1 <.0001
Score 18.2516 1 <.0001
Wald 15.6898 1 <.0001
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -0.8408 0.2551 10.8652 0.0010
aged 1 2.0935 0.5285 15.6898 <.0001
The LOGISTIC Procedure
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
aged 8.114 2.880 22.861
Association of Predicted Probabilities and Observed Responses
Percent Concordant 43.7 Somers' D 0.383
Percent Discordant 5.4 Gamma 0.781
Percent Tied 50.9 Tau-a 0.190
Pairs 2451 c 0.692
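NOTE: The square-root conversion described in the note for this table can be done in a short DATA step. This is a sketch only; the estimates and chi-squares are copied from the output above, and the variable names z_int and z_aged are assumptions.

```sas
/* Convert Wald chi-squares to signed Wald statistics */
data _null_;
  /* sign() carries over the sign of the coefficient */
  z_int  = sign(-0.8408) * sqrt(10.8652);  /* Intercept */
  z_aged = sign( 2.0935) * sqrt(15.6898);  /* aged */
  put z_int= z_aged=;
run;
```

The results should be close to -3.30 for the intercept and 3.96 for aged, which is the same as dividing each estimate by its standard error.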
3.3 Polychotomous independent variable
page 56 Table 3.5 Cross-classification of hypothetical data on RACE and CHD status for 100 subjects.
data hypothet1;
input race chd cnt;
cards;
1 1 5
2 1 20
3 1 15
4 1 10
1 0 20
2 0 10
3 0 10
4 0 10
;
run;
proc freq data=hypothet1;
tables chd*race;
weight cnt;
run;
The FREQ Procedure
Table of chd by race
chd race
Frequency|
Percent |
Row Pct |
Col Pct | 1| 2| 3| 4| Total
---------+--------+--------+--------+--------+
0 | 20 | 10 | 10 | 10 | 50
| 20.00 | 10.00 | 10.00 | 10.00 | 50.00
| 40.00 | 20.00 | 20.00 | 20.00 |
| 80.00 | 33.33 | 40.00 | 50.00 |
---------+--------+--------+--------+--------+
1 | 5 | 20 | 15 | 10 | 50
| 5.00 | 20.00 | 15.00 | 10.00 | 50.00
| 10.00 | 40.00 | 30.00 | 20.00 |
| 20.00 | 66.67 | 60.00 | 50.00 |
---------+--------+--------+--------+--------+
Total 25 30 25 20 100
25.00 30.00 25.00 20.00 100.00
data hypothet2;
set hypothet1;
if race = 1 then do; race2 = 0; race3 = 0; race4 = 0; end;
if race = 2 then do; race2 = 1; race3 = 0; race4 = 0; end;
if race = 3 then do; race2 = 0; race3 = 1; race4 = 0; end;
if race = 4 then do; race2 = 0; race3 = 0; race4 = 1; end;
run;
proc logistic data=hypothet2 desc;
model chd = race2 race3 race4;
weight cnt;
run;
quit;
The LOGISTIC Procedure
Model Information
Data Set WORK.HYPOTHET2
Response Variable chd
Number of Response Levels 2
Number of Observations 8
Weight Variable cnt
Sum of Weights 100
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total Total
Value chd Frequency Weight
1 1 4 50.000000
2 0 4 50.000000
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 140.629 132.587
SC 140.709 132.905
-2 Log L 138.629 124.587
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 14.0420 3 0.0028
Score 13.3333 3 0.0040
Wald 11.7715 3 0.0082
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.3863 0.5000 7.6871 0.0056
race2 1 2.0794 0.6325 10.8100 0.0010
race3 1 1.7917 0.6455 7.7048 0.0055
race4 1 1.3863 0.6708 4.2706 0.0388
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
race2 8.000 2.316 27.633
race3 6.000 1.693 21.261
race4 4.000 1.074 14.895
Association of Predicted Probabilities and Observed Responses
Percent Concordant 37.5 Somers' D 0.000
Percent Discordant 37.5 Gamma 0.000
Percent Tied 25.0 Tau-a 0.000
Pairs 16 c 0.500
page 57 Table 3.6 Specification of the design variables for RACE using reference cell coding with white as the reference group.
proc print data=hypothet2 (obs=4);
var race race2 race3 race4;
run;
Obs race race2 race3 race4
1 1 0 0 0
2 2 1 0 0
3 3 0 1 0
4 4 0 0 1
page 58 Table 3.7 Results of fitting the logistic regression model to the data in Table 3.5 using the design variables in Table 3.6.
proc logistic data=hypothet2 desc;
model chd = race2 race3 race4;
weight cnt;
run;
quit;
The LOGISTIC Procedure
Model Information
Data Set WORK.HYPOTHET2
Response Variable chd
Number of Response Levels 2
Number of Observations 8
Weight Variable cnt
Sum of Weights 100
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total Total
Value chd Frequency Weight
1 1 4 50.000000
2 0 4 50.000000
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 140.629 132.587
SC 140.709 132.905
-2 Log L 138.629 124.587
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 14.0420 3 0.0028
Score 13.3333 3 0.0040
Wald 11.7715 3 0.0082
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.3863 0.5000 7.6871 0.0056
race2 1 2.0794 0.6325 10.8100 0.0010
race3 1 1.7917 0.6455 7.7048 0.0055
race4 1 1.3863 0.6708 4.2706 0.0388
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
race2 8.000 2.316 27.633
race3 6.000 1.693 21.261
race4 4.000 1.074 14.895
Association of Predicted Probabilities and Observed Responses
Percent Concordant 37.5 Somers' D 0.000
Percent Discordant 37.5 Gamma 0.000
Percent Tied 25.0 Tau-a 0.000
Pairs 16 c 0.500
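NOTE: The design variables race2, race3 and race4 were built by hand. As an alternative sketch, not used in the original example, the CLASS statement in PROC LOGISTIC can generate reference cell coding itself. The choice of race = 1 as the reference level follows Table 3.6; running the model directly on hypothet1 is an assumption of this sketch.

```sas
/* Reference cell coding generated by the CLASS statement */
proc logistic data=hypothet1 desc;
  class race (ref='1') / param=ref;  /* race = 1 is the reference group */
  model chd = race;
  weight cnt;
run;
quit;
```

The three race parameters should match the race2, race3 and race4 estimates above, although SAS labels them by level of race rather than by design variable name.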
page 59 Table 3.8 Specification of the design variables for RACE using deviation from means coding.
data hypothet2;
set hypothet1;
if race = 1 then do; race2 = -1; race3 = -1; race4 = -1; end;
if race = 2 then do; race2 = 1; race3 = 0; race4 = 0; end;
if race = 3 then do; race2 = 0; race3 = 1; race4 = 0; end;
if race = 4 then do; race2 = 0; race3 = 0; race4 = 1; end;
run;
proc print data=hypothet2 (obs=4);
var race race2 race3 race4;
run;
Obs race race2 race3 race4
1 1 -1 -1 -1
2 2 1 0 0
3 3 0 1 0
4 4 0 0 1
page 60 Table 3.9 Results of fitting the logistic regression model to the data in Table 3.5 using the design variables in Table 3.8.
NOTE: To get the Wald tests shown in the text, take the square root of the chi-squares given in the SAS output. If the coefficient is negative, put a negative sign in front of the square root.
proc logistic data=hypothet2 desc;
model chd = race2 race3 race4;
weight cnt;
run;
quit;
The LOGISTIC Procedure
Model Information
Data Set WORK.HYPOTHET2
Response Variable chd
Number of Response Levels 2
Number of Observations 8
Weight Variable cnt
Sum of Weights 100
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total Total
Value chd Frequency Weight
1 1 4 50.000000
2 0 4 50.000000
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 140.629 132.587
SC 140.709 132.905
-2 Log L 138.629 124.587
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 14.0420 3 0.0028
Score 13.3333 3 0.0040
Wald 11.7715 3 0.0082
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -0.0719 0.2189 0.1079 0.7425
race2 1 0.7651 0.3506 4.7619 0.0291
race3 1 0.4774 0.3623 1.7363 0.1876
race4 1 0.0719 0.3846 0.0350 0.8517
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
race2 2.149 1.081 4.273
race3 1.612 0.792 3.279
race4 1.075 0.506 2.284
Association of Predicted Probabilities and Observed Responses
Percent Concordant 37.5 Somers' D 0.000
Percent Discordant 37.5 Gamma 0.000
Percent Tied 25.0 Tau-a 0.000
Pairs 16 c 0.500
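NOTE: Deviation from means coding can also be requested from the CLASS statement instead of being coded by hand. The step below is a sketch under the assumption that race = 1 should receive the -1 values, as in Table 3.8; it is not part of the original example.

```sas
/* Deviation from means (effect) coding generated by the CLASS statement */
proc logistic data=hypothet1 desc;
  class race (ref='1') / param=effect;  /* race = 1 is coded -1 on every design variable */
  model chd = race;
  weight cnt;
run;
quit;
```

With this coding each coefficient compares a group's logit with the average logit over the four groups, so the exponentiated coefficients listed under "Odds Ratio Estimates" in the hand-coded run above are not odds ratios relative to a reference group.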
3.5 The multivariable model
page 67 Table 3.10 Descriptive statistics for two groups of 50 men on AGE and whether they had seen a physician (PHY) (1 = yes, 0 = no) within the last six months.
NOTE: These data are hypothetical and are not available.
page 69 Table 3.11 Results of fitting the logistic regression model to the data summarized in Table 3.10.
NOTE: These data are hypothetical and are not available.
3.6 Interaction and confounding
page 72 Table 3.12 Estimated logistic regression coefficients, deviance, and the likelihood ratio test statistic (G) for an example showing evidence of confounding but no interaction (n = 400).
NOTE: These data are hypothetical and are not available.
page 73 Table 3.13 Estimated logistic regression coefficients, deviance, and the likelihood ratio test statistic (G) for an example showing evidence of confounding and interaction (n = 400).
NOTE: These data are hypothetical and are not available.
3.7 Estimation of odds ratios in the presence of interaction
page 77 Table 3.14 Estimated logistic regression coefficients, deviance, and the likelihood ratio test statistic (G), and the p-value for the change for models containing lwd and age from the low birthweight data (n = 189).
NOTE: You need to calculate G by hand by subtracting the -2 log likelihood of the full model from the -2 log likelihood of the reduced model.
data lowbwt31;
set 'd:hosmerdatalowbwt';
if race = 1 then do; race2 = 0; race3 = 0; end;
if race = 2 then do; race2 = 1; race3 = 0; end;
if race = 3 then do; race2 = 0; race3 = 1; end;
lwd=(lwt<110);
run;
proc logistic data=lowbwt31 descending;
model low = lwd age lwd*age;
output out=lowbwt32 predicted=pred;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT31
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 229.140
SC 239.914 242.107
-2 Log L 234.672 221.140
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 13.5321 3 0.0036
Score 13.3565 3 0.0039
Wald 12.3553 3 0.0063
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 0.7745 0.9101 0.7241 0.3948
lwd 1 -1.9440 1.7248 1.2704 0.2597
AGE 1 -0.0796 0.0396 4.0305 0.0447
lwd*AGE 1 0.1322 0.0757 3.0497 0.0808
Association of Predicted Probabilities and Observed Responses
Percent Concordant 64.3 Somers' D 0.317
Percent Discordant 32.6 Gamma 0.327
Percent Tied 3.1 Tau-a 0.137
Pairs 7670 c 0.659
proc logistic data=lowbwt31 descending;
model low = lwd age;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT31
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 230.287
SC 239.914 240.012
-2 Log L 234.672 224.287
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 10.3852 2 0.0056
Score 10.6703 2 0.0048
Wald 10.0831 2 0.0065
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -0.0269 0.7621 0.0012 0.9719
lwd 1 1.0101 0.3643 7.6899 0.0056
AGE 1 -0.0442 0.0322 1.8841 0.1699
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
lwd 2.746 1.345 5.607
AGE 0.957 0.898 1.019
Association of Predicted Probabilities and Observed Responses
Percent Concordant 62.8 Somers' D 0.288
Percent Discordant 34.1 Gamma 0.297
Percent Tied 3.1 Tau-a 0.124
Pairs 7670 c 0.644
proc logistic data=lowbwt31 descending;
model low = lwd;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT31
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 230.241
SC 239.914 236.725
-2 Log L 234.672 226.241
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 8.4308 1 0.0037
Score 8.8727 1 0.0029
Wald 8.4917 1 0.0036
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.0537 0.1884 31.2860 <.0001
lwd 1 1.0536 0.3616 8.4917 0.0036
The LOGISTIC Procedure
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
lwd 2.868 1.412 5.826
Association of Predicted Probabilities and Observed Responses
Percent Concordant 29.8 Somers' D 0.194
Percent Discordant 10.4 Gamma 0.483
Percent Tied 59.8 Tau-a 0.084
Pairs 7670 c 0.597
proc logistic data=lowbwt31 descending;
model low=;
run;
quit;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT31
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
-2 Log L = 234.672
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -0.7900 0.1570 25.3270 <.0001
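NOTE: The hand calculation of G can be scripted. The DATA step below is a minimal sketch using the -2 Log L values printed in the four fits above; the variable names (m0 to m3, g1 to g3, p1 to p3) are assumptions, and each comparison adds one parameter, so each G is referred to a chi-square distribution with 1 degree of freedom.

```sas
/* Likelihood ratio statistics for the nested models in Table 3.14 */
data _null_;
  m0 = 234.672;  /* -2 Log L, intercept only   */
  m1 = 226.241;  /* -2 Log L, lwd              */
  m2 = 224.287;  /* -2 Log L, lwd age          */
  m3 = 221.140;  /* -2 Log L, lwd age lwd*age  */
  g1 = m0 - m1;  p1 = 1 - probchi(g1, 1);  /* adding lwd     */
  g2 = m1 - m2;  p2 = 1 - probchi(g2, 1);  /* adding age     */
  g3 = m2 - m3;  p3 = 1 - probchi(g3, 1);  /* adding lwd*age */
  put g1= p1= / g2= p2= / g3= p3=;
run;
```

The differences are 8.431, 1.954 and 3.147; the first agrees with the likelihood ratio chi-square that SAS prints for the model containing only lwd.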
page 78 Figure 3.3 Plot of the estimated logit for women with LWD = 1 and for women with LWD = 0 from Model 3 in Table 3.14.
data lowbwt31;
infile 'D:workdatarawlogisticlowbwt.dat';
input id low age lwt race smoke ptd ht ui ftv bwt;
if race = 1 then do; race2 = 0; race3 = 0; end;
if race = 2 then do; race2 = 1; race3 = 0; end;
if race = 3 then do; race2 = 0; race3 = 1; end;
lwd=(lwt<110);
run;
proc logistic data=lowbwt31 descending;
model low = lwd age lwd*age;
output out=lowbwt32 xbeta=xb;
run;
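/* sort by age within lwd so that each group's line is joined in age order */
proc sort data=lowbwt32;
by lwd age;
run;
symbol1 i=join value=circle;
symbol2 i=join value=dot;
axis1 label = (a=90 'estimated logit');
axis2 label = ("age");
proc gplot data=lowbwt32;
plot xb*age=lwd / vaxis=axis1 haxis=axis2;
run;
quit;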
page 78 Table 3.15 Estimated covariance matrix for the estimated parameters in Model 3 of Table 3.14.
proc logistic data=lowbwt31 descending covout outest=lowbwt33;
model low = lwd age lwd*age;
run;
quit;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT31
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 229.140
SC 239.914 242.107
-2 Log L 234.672 221.140
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 13.5321 3 0.0036
Score 13.3565 3 0.0039
Wald 12.3553 3 0.0063
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 0.7745 0.9101 0.7241 0.3948
lwd 1 -1.9440 1.7248 1.2704 0.2597
AGE 1 -0.0796 0.0396 4.0305 0.0447
lwd*AGE 1 0.1322 0.0757 3.0497 0.0808
Association of Predicted Probabilities and Observed Responses
Percent Concordant 64.3 Somers' D 0.317
Percent Discordant 32.6 Gamma 0.327
Percent Tied 3.1 Tau-a 0.137
Pairs 7670 c 0.659
proc print data=lowbwt33;
where _type_='COV';
var _name_ intercept lwd age lwdage;
run;
Obs _NAME_ Intercept lwd AGE lwdAGE
2 Intercept 0.82827 -0.82827 -0.035266 0.03527
3 lwd -0.82827 2.97495 0.035266 -0.12760
4 AGE -0.03527 0.03527 0.001571 -0.00157
5 lwdAGE 0.03527 -0.12760 -0.001571 0.00573
page 79 Table 3.16 Estimated odds ratios and 95% confidence intervals for LWD, controlling for AGE.
proc genmod data=lowbwt32 descending;
model low = lwd age lwd*age / dist=bin link=logit waldci;
estimate "age = 15" lwd 1 lwd*age 15 /exp;
estimate "age = 20" lwd 1 lwd*age 20 /exp;
estimate "age = 25" lwd 1 lwd*age 25 /exp;
estimate "age = 30" lwd 1 lwd*age 30 /exp;
run;
The GENMOD Procedure
Model Information
Data Set WORK.LOWBWT32 Predicted Values and
Diagnostic Statistics
Distribution Binomial
Link Function Logit
Dependent Variable LOW < 2500g
Observations Used 189
Probability Modeled Pr( LOW = 1 )
Response Profile
Ordered Ordered
Level Value Count
1 0 130
2 1 59
Parameter Information
Parameter Effect
Prm1 Intercept
Prm2 lwd
Prm3 AGE
Prm4 lwd*AGE
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 185 221.1399 1.1954
Scaled Deviance 185 221.1399 1.1954
Pearson Chi-Square 185 187.7843 1.0151
Scaled Pearson X2 185 187.7843 1.0151
Log Likelihood -110.5700
Algorithm converged.
Analysis Of Parameter Estimates
Standard Wald 95% Confidence Chi-
Parameter DF Estimate Error Limits Square Pr > ChiSq
Intercept 1 0.7745 0.9101 -1.0093 2.5583 0.72 0.3948
lwd 1 -1.9441 1.7248 -5.3246 1.4365 1.27 0.2597
AGE 1 -0.0796 0.0396 -0.1573 -0.0019 4.03 0.0447
The GENMOD Procedure
Analysis Of Parameter Estimates
Standard Wald 95% Confidence Chi-
Parameter DF Estimate Error Limits Square Pr > ChiSq
lwd*AGE 1 0.1322 0.0757 -0.0162 0.2806 3.05 0.0807
Scale 0 1.0000 0.0000 1.0000 1.0000
NOTE: The scale parameter was held fixed.
Contrast Estimate Results
Standard Chi-
Label Estimate Error Alpha Confidence Limits Square Pr > ChiSq
age = 15 0.0389 0.6604 0.05 -1.2555 1.3332 0.00 0.9531
Exp(age = 15) 1.0396 0.6866 0.05 0.2849 3.7933
age = 20 0.6998 0.4036 0.05 -0.0912 1.4909 3.01 0.0829
Exp(age = 20) 2.0134 0.8126 0.05 0.9128 4.4411
age = 25 1.3608 0.4197 0.05 0.5382 2.1835 10.51 0.0012
Exp(age = 25) 3.8994 1.6367 0.05 1.7129 8.8770
age = 30 2.0218 0.6899 0.05 0.6697 3.3740 8.59 0.0034
Exp(age = 30) 7.5520 5.2100 0.05 1.9536 29.1940
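NOTE: The same odds ratios and confidence limits can be obtained by hand from the coefficients of Model 3 and the covariance matrix in Table 3.15. The DATA step below is a sketch; the variable names are assumptions. It uses Var(b_lwd + age*b_lwdage) = Var(b_lwd) + age**2 * Var(b_lwdage) + 2*age*Cov(b_lwd, b_lwdage).

```sas
/* Odds ratio for lwd at selected ages, computed from Table 3.15 */
data _null_;
  b_lwd    = -1.9440;   /* coefficient of lwd           */
  b_lwdage =  0.1322;   /* coefficient of lwd*age       */
  v_lwd    =  2.97495;  /* variance of b_lwd            */
  v_lwdage =  0.00573;  /* variance of b_lwdage         */
  c_both   = -0.12760;  /* covariance of the two        */
  do age = 15 to 30 by 5;
    est   = b_lwd + b_lwdage*age;                          /* log odds ratio at this age */
    se    = sqrt(v_lwd + age**2*v_lwdage + 2*age*c_both);  /* its standard error */
    oddsr = exp(est);
    lcl   = exp(est - 1.96*se);
    ucl   = exp(est + 1.96*se);
    put age= est= se= oddsr= lcl= ucl=;
  end;
run;
```

Because the inputs are rounded, the results should match the GENMOD contrast estimates above only to about three decimal places.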
3.8 A comparison of logistic regression and stratified analysis of 2 x 2 tables
page 80 Table 3.17 Cross-classification of low birth weight by smoking status.
proc freq data=lowbwt32;
tables low*smoke;
run;
The FREQ Procedure
Table of LOW by SMOKE
LOW(< 2500g) SMOKE
Frequency|
Percent |
Row Pct |
Col Pct | 0| 1| Total
---------+--------+--------+
0 | 86 | 44 | 130
| 45.50 | 23.28 | 68.78
| 66.15 | 33.85 |
| 74.78 | 59.46 |
---------+--------+--------+
1 | 29 | 30 | 59
| 15.34 | 15.87 | 31.22
| 49.15 | 50.85 |
| 25.22 | 40.54 |
---------+--------+--------+
Total 115 74 189
60.85 39.15 100.00
page 81 Table 3.18 Cross-classification of low birth weight by smoking status stratified by RACE.
proc freq data=lowbwt32;
tables race*low*smoke;
run;
The FREQ Procedure
Table 1 of LOW by SMOKE
Controlling for RACE=1
LOW(< 2500g) SMOKE
Frequency|
Percent |
Row Pct |
Col Pct | 0| 1| Total
---------+--------+--------+
0 | 40 | 33 | 73
| 41.67 | 34.38 | 76.04
| 54.79 | 45.21 |
| 90.91 | 63.46 |
---------+--------+--------+
1 | 4 | 19 | 23
| 4.17 | 19.79 | 23.96
| 17.39 | 82.61 |
| 9.09 | 36.54 |
---------+--------+--------+
Total 44 52 96
45.83 54.17 100.00
Table 2 of LOW by SMOKE
Controlling for RACE=2
LOW(< 2500g) SMOKE
Frequency|
Percent |
Row Pct |
Col Pct | 0| 1| Total
---------+--------+--------+
0 | 11 | 4 | 15
| 42.31 | 15.38 | 57.69
| 73.33 | 26.67 |
| 68.75 | 40.00 |
---------+--------+--------+
1 | 5 | 6 | 11
| 19.23 | 23.08 | 42.31
| 45.45 | 54.55 |
| 31.25 | 60.00 |
---------+--------+--------+
Total 16 10 26
61.54 38.46 100.00
The FREQ Procedure
Table 3 of LOW by SMOKE
Controlling for RACE=3
LOW(< 2500g) SMOKE
Frequency|
Percent |
Row Pct |
Col Pct | 0| 1| Total
---------+--------+--------+
0 | 35 | 7 | 42
| 52.24 | 10.45 | 62.69
| 83.33 | 16.67 |
| 63.64 | 58.33 |
---------+--------+--------+
1 | 20 | 5 | 25
| 29.85 | 7.46 | 37.31
| 80.00 | 20.00 |
| 36.36 | 41.67 |
---------+--------+--------+
Total 55 12 67
82.09 17.91 100.00
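NOTE: For the stratified analysis itself, PROC FREQ can summarize the three tables above in one step. The sketch below is not part of the original example; it adds the CMH option to the TABLES statement, which requests the Cochran-Mantel-Haenszel statistics, the Mantel-Haenszel and logit estimates of the common odds ratio, and the Breslow-Day test for homogeneity of the odds ratios.

```sas
/* Stratified analysis of low by smoke, controlling for race */
proc freq data=lowbwt32;
  tables race*low*smoke / cmh;
run;
```

These summaries can then be set beside the logistic regression results for SMOKE in Table 3.20.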
page 82 Table 3.19 Tabulation of the estimated odds ratios, ln(estimated odds ratios), estimated variance of the ln(estimated odds ratios), and the inverse of the estimated variance, w, for smoking status within each stratum of RACE.
NOTE: You need to square the standard error given by SAS to get the values on the third row of the table.
data lowbwt34;
set lowbwt31;
race2sm = race2*smoke;
race3sm = race3*smoke;
run;
proc genmod data=lowbwt34 descending;
model low = smoke race2 race3 race2sm race3sm / dist=bin link=logit waldci;
estimate 'White' smoke 1 /exp ;
estimate 'Black' smoke 1 race2sm 1 race3sm 0 / exp ;
estimate 'Other' smoke 1 race2sm 0 race3sm 1 / exp ;
run;
The GENMOD Procedure
Model Information
Data Set WORK.LOWBWT34
Distribution Binomial
Link Function Logit
Dependent Variable LOW < 2500g
Observations Used 189
Probability Modeled Pr( LOW = 1 )
Response Profile
Ordered Ordered
Level Value Count
1 0 130
2 1 59
Parameter Information
Parameter Effect
Prm1 Intercept
Prm2 SMOKE
Prm3 race2
Prm4 race3
Prm5 race2sm
Prm6 race3sm
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 183 216.8178 1.1848
Scaled Deviance 183 216.8178 1.1848
Pearson Chi-Square 183 188.9999 1.0328
Scaled Pearson X2 183 188.9999 1.0328
Log Likelihood -108.4089
Algorithm converged.
Analysis Of Parameter Estimates
Standard Wald 95% Confidence Chi-
Parameter DF Estimate Error Limits Square Pr > ChiSq
Intercept 1 -2.3026 0.5244 -3.3304 -1.2748 19.28 <.0001
SMOKE 1 1.7505 0.5983 0.5779 2.9231 8.56 0.0034
The GENMOD Procedure
Analysis Of Parameter Estimates
Standard Wald 95% Confidence Chi-
Parameter DF Estimate Error Limits Square Pr > ChiSq
race2 1 1.5141 0.7523 0.0397 2.9885 4.05 0.0441
race3 1 1.7430 0.5946 0.5775 2.9084 8.59 0.0034
race2sm 1 -0.5566 1.0322 -2.5797 1.4666 0.29 0.5897
race3sm 1 -1.5274 0.8828 -3.2577 0.2029 2.99 0.0836
Scale 0 1.0000 0.0000 1.0000 1.0000
NOTE: The scale parameter was held fixed.
Contrast Estimate Results
Standard Chi-
Label Estimate Error Alpha Confidence Limits Square Pr > ChiSq
White 1.7505 0.5983 0.05 0.5779 2.9231 8.56 0.0034
Exp(White) 5.7576 3.4446 0.05 1.7823 18.5991
Black 1.1939 0.8412 0.05 -0.4548 2.8426 2.01 0.1558
Exp(Black) 3.3000 2.7759 0.05 0.6346 17.1602
Other 0.2231 0.6492 0.05 -1.0492 1.4955 0.12 0.7310
Exp(Other) 1.2500 0.8115 0.05 0.3502 4.4616
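NOTE: The rows of Table 3.19 can be assembled from the contrast estimates above. The steps below are a sketch; the data set name strata and the variable names are assumptions. The second step also forms the inverse-variance weighted average of the log odds ratios (the usual logit-based pooled estimate), which is presumably what the weights w are intended for.

```sas
/* Build the rows of Table 3.19 from the GENMOD contrast results */
data strata;
  input stratum $ lnor se;
  oddsr = exp(lnor);   /* estimated odds ratio        */
  vlnor = se**2;       /* estimated variance of ln(OR) */
  w     = 1/vlnor;     /* inverse-variance weight      */
  cards;
White 1.7505 0.5983
Black 1.1939 0.8412
Other 0.2231 0.6492
;
run;
proc print data=strata;
run;
/* Weighted average of ln(OR) across strata */
data _null_;
  set strata end=last;
  sw  + w;         /* running sum of weights          */
  swl + w*lnor;    /* running sum of weight * ln(OR)  */
  if last then do;
    lnor_w = swl/sw;
    or_w   = exp(lnor_w);
    put lnor_w= or_w=;
  end;
run;
```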
page 84 Table 3.20 Estimated logistic regression coefficients for the variable SMOKE, log-likelihood, the likelihood ratio test statistic (G), and the resulting p-value for estimation of the stratified odds ratio and assessment of homogeneity of odds ratios across strata defined by RACE.
NOTE: SAS gives the -2 log likelihood while the text gives the log likelihood. Therefore, you need to divide the value given by SAS by -2 (don't forget to use the -2 Log L in the "Intercept and Covariates" column). To get the values of G, subtract the -2 log likelihood of the larger model from the -2 log likelihood of the smaller model.
proc logistic data=lowbwt34 desc;
model low = smoke / clparm=wald;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT34
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 233.805
SC 239.914 240.288
-2 Log L 234.672 229.805
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 4.8674 1 0.0274
Score 4.9237 1 0.0265
Wald 4.8516 1 0.0276
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.0870 0.2147 25.6244 <.0001
SMOKE 1 0.7040 0.3196 4.8516 0.0276
The LOGISTIC Procedure
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
SMOKE 2.022 1.081 3.783
Association of Predicted Probabilities and Observed Responses
Percent Concordant 33.6 Somers' D 0.170
Percent Discordant 16.6 Gamma 0.338
Percent Tied 49.7 Tau-a 0.073
Pairs 7670 c 0.585
Wald Confidence Interval for Parameters
Parameter Estimate 95% Confidence Limits
Intercept -1.0870 -1.5078 -0.6661
SMOKE 0.7040 0.0776 1.3305
proc logistic data=lowbwt34 desc;
model low = smoke race2 race3 / clparm=wald;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT34
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 227.975
SC 239.914 240.942
-2 Log L 234.672 219.975
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 14.6973 3 0.0021
Score 14.1265 3 0.0027
Wald 12.8812 3 0.0049
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -1.8405 0.3529 27.2065 <.0001
SMOKE 1 1.1160 0.3692 9.1357 0.0025
race2 1 1.0841 0.4900 4.8951 0.0269
race3 1 1.1086 0.4003 7.6689 0.0056
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
SMOKE 3.053 1.480 6.294
race2 2.957 1.132 7.725
race3 3.030 1.383 6.640
Association of Predicted Probabilities and Observed Responses
Percent Concordant 54.5 Somers' D 0.299
Percent Discordant 24.6 Gamma 0.378
Percent Tied 20.9 Tau-a 0.129
Pairs 7670 c 0.650
Wald Confidence Interval for Parameters
Parameter Estimate 95% Confidence Limits
Intercept -1.8405 -2.5321 -1.1489
SMOKE 1.1160 0.3923 1.8397
race2 1.0841 0.1237 2.0444
race3 1.1086 0.3240 1.8931
proc logistic data=lowbwt34 desc;
model low = smoke race2 race3 race2sm race3sm / clparm=wald;
run;
quit;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT34
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 228.818
SC 239.914 248.268
-2 Log L 234.672 216.818
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 17.8542 5 0.0031
Score 15.8649 5 0.0072
Wald 13.1634 5 0.0219
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -2.3026 0.5244 19.2796 <.0001
SMOKE 1 1.7505 0.5983 8.5611 0.0034
race2 1 1.5141 0.7523 4.0511 0.0441
race3 1 1.7430 0.5946 8.5921 0.0034
race2sm 1 -0.5566 1.0322 0.2907 0.5897
race3sm 1 -1.5274 0.8828 2.9933 0.0836
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
SMOKE 5.758 1.782 18.599
race2 4.545 1.041 19.857
race3 5.714 1.782 18.327
race2sm 0.573 0.076 4.334
race3sm 0.217 0.038 1.225
Association of Predicted Probabilities and Observed Responses
Percent Concordant 54.8 Somers' D 0.305
Percent Discordant 24.3 Gamma 0.386
Percent Tied 20.9 Tau-a 0.132
Pairs 7670 c 0.653
Wald Confidence Interval for Parameters
Parameter Estimate 95% Confidence Limits
Intercept -2.3026 -3.3304 -1.2748
SMOKE 1.7505 0.5779 2.9231
race2 1.5141 0.0397 2.9885
race3 1.7430 0.5775 2.9084
race2sm -0.5566 -2.5797 1.4666
race3sm -1.5274 -3.2577 0.2029
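NOTE: The conversions needed for Table 3.20 can be done in one DATA step. This is a sketch using the -2 Log L values from the three fits above; the variable names are assumptions. Each comparison adds two parameters (race2 and race3, then race2sm and race3sm), so each G is referred to a chi-square distribution with 2 degrees of freedom.

```sas
/* Log likelihoods and likelihood ratio tests for Table 3.20 */
data _null_;
  m1 = 229.805;  /* -2 Log L, smoke                              */
  m2 = 219.975;  /* -2 Log L, smoke race2 race3                  */
  m3 = 216.818;  /* -2 Log L, smoke race2 race3 race2sm race3sm  */
  ll1 = -0.5*m1;  ll2 = -0.5*m2;  ll3 = -0.5*m3;   /* log likelihoods */
  g_race = m1 - m2;  p_race = 1 - probchi(g_race, 2);  /* adding race */
  g_hom  = m2 - m3;  p_hom  = 1 - probchi(g_hom, 2);   /* homogeneity of odds ratios */
  put ll1= ll2= ll3= / g_race= p_race= / g_hom= p_hom=;
run;
```

The two statistics are 9.830 and 3.157. The log likelihood of the largest model, -108.409, agrees with the value PROC GENMOD printed for the same model in the Table 3.19 output.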
3.9 Interpretation of the fitted values
page 86 Figure 3.4 Graph of the estimated logit of low birth weight and 95 percent confidence intervals as a function of weight at the last menstrual period for white women.
proc logistic data=lowbwt34 desc;
model low = lwt race2 race3;
output out=lowbwt35 xbeta=p stdxbeta=sepl;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT34
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 231.259
SC 239.914 244.226
-2 Log L 234.672 223.259
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 11.4129 3 0.0097
Score 10.7572 3 0.0131
Wald 10.1316 3 0.0175
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 0.8057 0.8452 0.9088 0.3404
LWT 1 -0.0152 0.00644 5.5886 0.0181
race2 1 1.0811 0.4881 4.9065 0.0268
race3 1 0.4806 0.3567 1.8156 0.1778
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
LWT 0.985 0.973 0.997
race2 2.948 1.133 7.672
race3 1.617 0.804 3.253
Association of Predicted Probabilities and Observed Responses
Percent Concordant 64.1 Somers' D 0.293
Percent Discordant 34.8 Gamma 0.296
Percent Tied 1.1 Tau-a 0.127
Pairs 7670 c 0.647
data lowbwt36;
set lowbwt35;
/* p is the estimated logit (xbeta) and sepl is its standard error */
lower = p - 1.96*sepl;
upper = p + 1.96*sepl;
run;
proc sort data=lowbwt36;
by lwt;
run;
symbol1 i=join value=none;
proc gplot data=lowbwt36;
plot p*lwt upper*lwt lower*lwt / overlay;
where race = 1;
run;
quit;
page 87 Figure 3.5 Graph of the estimated probability of low weight birth and 95 percent confidence intervals as a function of weight at the last menstrual period for white women.
proc logistic data=lowbwt34 desc;
model low = lwt race2 race3;
output out=lowbwt37 p=p u=u l=l;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.LOWBWT34
Response Variable LOW < 2500g
Number of Response Levels 2
Number of Observations 189
Link Function Logit
Optimization Technique Fisher's scoring
Response Profile
Ordered Total
Value LOW Frequency
1 1 59
2 0 130
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 236.672 231.259
SC 239.914 244.226
-2 Log L 234.672 223.259
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 11.4129 3 0.0097
Score 10.7572 3 0.0131
Wald 10.1316 3 0.0175
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 0.8057 0.8452 0.9088 0.3404
LWT 1 -0.0152 0.00644 5.5886 0.0181
race2 1 1.0811 0.4881 4.9065 0.0268
race3 1 0.4806 0.3567 1.8156 0.1778
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
LWT 0.985 0.973 0.997
race2 2.948 1.133 7.672
race3 1.617 0.804 3.253
Association of Predicted Probabilities and Observed Responses
Percent Concordant 64.1 Somers' D 0.293
Percent Discordant 34.8 Gamma 0.296
Percent Tied 1.1 Tau-a 0.127
Pairs 7670 c 0.647
proc sort data=lowbwt37;
by lwt;
run;
symbol1 i=join value=none;
proc gplot data=lowbwt37;
plot p*lwt u*lwt l*lwt / overlay;
where race = 1;
run;
quit;
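NOTE: Figures 3.4 and 3.5 are linked by the inverse logit transformation. As a sketch that is not part of the original example, the logit-scale limits saved in lowbwt36 can be converted to the probability scale and compared with the p, u and l variables in lowbwt37. The data set name lowbwt38 and the variables phat, plow and pupp are assumptions.

```sas
/* Back-transform the logit and its limits to probabilities */
data lowbwt38;
  set lowbwt36;
  phat = 1/(1 + exp(-p));      /* p holds the estimated logit in lowbwt36 */
  plow = 1/(1 + exp(-lower));  /* lower 95% limit on the probability scale */
  pupp = 1/(1 + exp(-upper));  /* upper 95% limit on the probability scale */
run;
```

Small differences from the limits produced by PROC LOGISTIC are to be expected, since the hand calculation uses 1.96 rather than the exact normal quantile.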
