On this page, we show two examples of using proc logistic to fit conditional logit models. For conditional logit models, proc logistic is easy to use: the strata statement identifies the matched sets, and the procedure handles all kinds of matching, including 1-1, 1-M, and M-N matching.
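For reference, the conditional likelihood that a conditional logit model maximizes can be written as follows for 1-M matching. This is a sketch in standard textbook notation; the symbols (K strata, covariate vector x_{kj} for member j of stratum k, coefficient vector \beta) are introduced here for illustration and do not appear in the SAS output.

```latex
% Stratum k contains one case (j = 0) and M controls (j = 1, ..., M).
L(\beta) = \prod_{k=1}^{K}
  \frac{\exp(x_{k0}^{\top}\beta)}{\sum_{j=0}^{M} \exp(x_{kj}^{\top}\beta)}

% Special case of 1-1 matching (M = 1): only within-pair differences matter.
L(\beta) = \prod_{k=1}^{K}
  \frac{1}{1 + \exp\{-(x_{k0} - x_{k1})^{\top}\beta\}}
```

Any stratum-specific intercept cancels between the numerator and the denominator, which is why the output of a conditional analysis contains no intercept, and why the effect of a covariate that is constant within every stratum cannot be estimated.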
Example 1: 1-1 Matching
This example is adapted from Chapter 7 of Applied Logistic Regression by Hosmer & Lemeshow (2000). You can download the SAS data here.
The first 20 observations are listed below. The variable pairid identifies the matched pairs: each pair contains one case (lbwt = 1) and one control (lbwt = 0).
pairid  lbwt  age  lastwt  race  smoke  ptd  ht  ui  race1  race2  race3
     1     0   14     135     1      0    0   0   0      1      0      0
     1     1   14     101     3      1    1   0   0      0      0      1
     2     0   15      98     2      0    0   0   0      0      1      0
     2     1   15     115     3      0    0   0   1      0      0      1
     3     0   16      95     3      0    0   0   0      0      0      1
     3     1   16     130     3      0    0   0   0      0      0      1
     4     0   17     103     3      0    0   0   0      0      0      1
     4     1   17     130     3      1    1   0   1      0      0      1
     5     0   17     122     1      1    0   0   0      1      0      0
     5     1   17     110     1      1    0   0   0      1      0      0
     6     0   17     113     2      0    0   0   0      0      1      0
     6     1   17     120     1      1    0   0   0      1      0      0
     7     0   17     113     2      0    0   0   0      0      1      0
     7     1   17     120     2      0    0   0   0      0      1      0
     8     0   17     119     3      0    0   0   0      0      0      1
     8     1   17     142     2      0    0   1   0      0      1      0
     9     0   18     100     1      1    0   0   0      1      0      0
     9     1   18     148     3      0    0   0   0      0      0      1
    10     0   18      90     1      1    0   0   1      1      0      0
    10     1   18     110     2      1    1   0   0      0      1      0

proc logistic data = lbwt11 descending;
  model lbwt = lastwt smoke race2 race3 ptd ht ui;
  strata pairid;
run;
The LOGISTIC Procedure

Conditional Analysis

                    Model Information

Data Set                     ATS.LBWT11
Response Variable            lbwt            low brth wt < 2500g
Number of Response Levels    2
Number of Strata             56
Model                        binary logit
Optimization Technique       Newton-Raphson ridge

Number of Observations Read    112
Number of Observations Used    112

          Response Profile

Ordered                    Total
  Value     lbwt       Frequency
      1     1                 56
      2     0                 56

Probability modeled is lbwt=1.

                Strata Summary

               lbwt
Response      ------     Number of
 Pattern      1    0        Strata    Frequency
       1      1    1            56          112

Newton-Raphson Ridge Optimization
Without Parameter Scaling

Convergence criterion (GCONV=1E-8) satisfied.

         Model Fit Statistics

                 Without          With
Criterion     Covariates    Covariates
AIC               77.632        65.589
SC                77.632        84.618
-2 Log L          77.632        51.589

     Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       26.0439     7        0.0005
Score                  20.2669     7        0.0050
Wald                   12.7208     7        0.0792

        Analysis of Maximum Likelihood Estimates

                            Standard          Wald
Parameter    DF  Estimate      Error    Chi-Square    Pr > ChiSq
lastwt        1   -0.0184     0.0101        3.3229        0.0683
smoke         1    1.4007     0.6278        4.9770        0.0257
race2         1    0.5714     0.6896        0.6864        0.4074
race3         1   -0.0253     0.6992        0.0013        0.9711
ptd           1    1.8080     0.7887        5.2557        0.0219
ht            1    2.3612     1.0861        4.7259        0.0297
ui            1    1.4019     0.6962        4.0554        0.0440

          Odds Ratio Estimates

             Point         95% Wald
Effect    Estimate    Confidence Limits
lastwt       0.982     0.963      1.001
smoke        4.058     1.185     13.890
race2        1.771     0.458      6.842
race3        0.975     0.248      3.839
ptd          6.098     1.300     28.609
ht          10.603     1.262     89.115
ui           4.063     1.038     15.901
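To read the Odds Ratio Estimates table, it can help to reproduce it by hand: each point estimate is the exponentiated coefficient, and the 95% Wald limits are the exponentiated values of the estimate plus or minus about 1.96 standard errors. The following is a minimal Python sketch of that arithmetic, not SAS code; the estimates and standard errors are copied from the output above, and the names `estimates`, `odds_ratio_with_ci`, and `odds_ratios` are made up for this illustration.

```python
import math

# (estimate, standard error) pairs copied from the
# "Analysis of Maximum Likelihood Estimates" table of Example 1.
estimates = {
    "lastwt": (-0.0184, 0.0101),
    "smoke": (1.4007, 0.6278),
    "race2": (0.5714, 0.6896),
    "race3": (-0.0253, 0.6992),
    "ptd": (1.8080, 0.7887),
    "ht": (2.3612, 1.0861),
    "ui": (1.4019, 0.6962),
}

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def odds_ratio_with_ci(beta, se, z=Z_975):
    """Return (odds ratio, lower limit, upper limit) for a Wald interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


odds_ratios = {
    name: odds_ratio_with_ci(beta, se) for name, (beta, se) in estimates.items()
}

for name, (point, lower, upper) in odds_ratios.items():
    print(f"{name:<8}{point:>8.3f}{lower:>8.3f}{upper:>9.3f}")
```

Because the inputs are rounded to four decimal places, the results agree with the SAS table only up to rounding in the last digit.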
Example 2: 1-M Matching
This example is adapted from Chapter 7 of Applied Logistic Regression by Hosmer & Lemeshow (2000). You can download the SAS data file here: https://stats.idre.ucla.edu/wp-content/uploads/2016/02/bbdm13-1.sas7bdat
The first 20 observations are listed below. The variable str identifies the matched sets (strata): each stratum contains four observations, one case (fndx = 1) and three controls (fndx = 0).
str  obs  fndx  chk  agmn   wt  mod  wid  nvmr
  1    1     1    1    13  118   55    0     0
  1    2     0    2    11  175    1    0     0
  1    3     0    2    12  135    1    0     0
  1    4     0    1    11  125   55    0     0
  2    1     1    1    14  118   55    0     0
  2    2     0    2    15  183   55    0     0
  2    3     0    2    11  218   55    0     0
  2    4     0    1    13  192   55    0     0
  3    1     1    1    15  125   55    0     0
  3    2     0    2    14  123   55    0     0
  3    3     0    1    13  140   55    0     0
  3    4     0    1    13  160   55    0     0
  4    1     1    1    14  150   55    0     1
  4    2     0    1    13  130    1    0     0
  4    3     0    2    14  140   55    0     0
  4    4     0    1    16  130   55    0     0
  5    1     1    1    17  150    1    0     0
  5    2     0    2    12  148   55    0     0
  5    3     0    1    13  134   55    0     0
  5    4     0    1    14  138   55    1     0

proc logistic data = bbdm13 descending;
  model fndx = chk agmn wt mod wid nvmr;
  strata str;
run;
The LOGISTIC Procedure

Conditional Analysis

                    Model Information

Data Set                     ATS.BBDM13
Response Variable            fndx            final diagnosis
Number of Response Levels    2
Number of Strata             50
Model                        binary logit
Optimization Technique       Newton-Raphson ridge

Number of Observations Read    200
Number of Observations Used    200

          Response Profile

Ordered                    Total
  Value     fndx       Frequency
      1     1                 50
      2     0                150

Probability modeled is fndx=1.

                Strata Summary

               fndx
Response      ------     Number of
 Pattern      1    0        Strata    Frequency
       1      1    3            50          200

Newton-Raphson Ridge Optimization
Without Parameter Scaling

Convergence criterion (GCONV=1E-8) satisfied.

         Model Fit Statistics

                 Without          With
Criterion     Covariates    Covariates
AIC              138.629       102.430
SC               138.629       122.220
-2 Log L         138.629        90.430

     Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       48.1998     6        <.0001
Score                  39.9247     6        <.0001
Wald                   25.2218     6        0.0003

        Analysis of Maximum Likelihood Estimates

                            Standard          Wald
Parameter    DF  Estimate      Error    Chi-Square    Pr > ChiSq
chk           1   -1.1218     0.4474        6.2862        0.0122
agmn          1    0.3561     0.1292        7.6013        0.0058
wt            1   -0.0284    0.00998        8.0771        0.0045
mod           1   0.00376     0.0120        0.0984        0.7538
wid           1   -0.4916     0.8173        0.3618        0.5475
nvmr          1    1.4722     0.7582        3.7701        0.0522

          Odds Ratio Estimates

             Point         95% Wald
Effect    Estimate    Confidence Limits
chk          0.326     0.135      0.783
agmn         1.428     1.108      1.839
wt           0.972     0.953      0.991
mod          1.004     0.980      1.028
wid          0.612     0.123      3.035
nvmr         4.359     0.986     19.264
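The Model Fit Statistics for this example can also be checked by hand. With no covariates, each of the four members of a stratum is equally likely to be the case, so every stratum contributes a conditional probability of 1/4 and -2 Log L is 2 x 50 x ln(4). The Python sketch below reproduces that value along with the likelihood ratio chi-square and AIC. The variable names are made up for this illustration, and the SC line assumes the penalty is the number of parameters times the log of the number of observations, which is consistent with the reported value here.

```python
import math

# Quantities read from the Example 2 output.
n_strata = 50            # Number of Strata
set_size = 4             # 1 case + 3 controls per stratum
n_obs = 200              # Number of Observations Used
n_params = 6             # chk agmn wt mod wid nvmr
neg2logl_full = 90.430   # -2 Log L, With Covariates

# Null model: each stratum contributes probability 1 / set_size.
neg2logl_null = 2 * n_strata * math.log(set_size)

# Likelihood ratio test of BETA=0: drop in -2 Log L, with n_params df.
lr_chisq = neg2logl_null - neg2logl_full

# Information criteria for the fitted model.
aic = neg2logl_full + 2 * n_params
sc = neg2logl_full + n_params * math.log(n_obs)  # assumed penalty form

print(f"-2 Log L without covariates: {neg2logl_null:.3f}")
print(f"Likelihood ratio chi-square: {lr_chisq:.4f} on {n_params} df")
print(f"AIC: {aic:.3f}  SC: {sc:.3f}")
```

The same reasoning applied to Example 1, with 56 pairs of size 2, gives 2 x 56 x ln(2) for the -2 Log L without covariates, in agreement with the 77.632 reported there.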