Specifying contrasts in logistic regression can be tricky. The examples below will illustrate how to write contrast statements in proc logistic for increasingly complicated models. All of our examples will use the logit_sim dataset, which is a simulated dataset created specifically for this page.
Example 1
For our first example, we will use a simple model that has two categorical predictor variables, x1 and x2. Both x1 and x2 have three levels, and for both variables the reference level will be set to 1. In this example, we will look at the comparisons of the three levels of x2 at a single level of x1. We will use the param = ref option on the class statement, which changes the type of coding to “reference” (or “dummy”) coding. We also indicate on the class statement that the reference level for both x1 and x2 will be level 1. Because the reference level of x1 is 1, the contrasts for x2 are most simply done at that level. (This model has no interaction, so the comparisons among the levels of x2 are in fact the same at every level of x1; labeling them at x1 = 1 keeps them parallel to the later examples.)
The contrast coefficients for x2 do not have to sum to 0, because under reference coding the reference level has no parameter of its own in the model. Hence, although the variable x2 has three levels, only two contrast coefficients are needed: one for level 2 and one for level 3. If fewer values are specified than are needed, SAS will set the missing trailing values to 0. (If more contrast coefficients are specified than are needed, SAS will ignore the extra ones.) We should note that the contrasts written for proc logistic are somewhat different from those written for proc glm. This is because of the different ways in which regression models and ANOVA models are parameterized.
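To see why one or two coefficients are enough, it can help to write the model out under reference coding. The following is only a sketch: the beta symbols and the bracketed indicator notation are introduced here for illustration and do not appear in the SAS output.

```latex
% Sketch only: symbols are illustrative, not taken from the SAS output.
% [x2 = 2] is 1 when x2 equals 2 and 0 otherwise (requires amsmath).
\begin{align*}
\operatorname{logit} P(y = 1) &= \beta_0 + \beta_{x1,2}[x1 = 2] + \beta_{x1,3}[x1 = 3]
                               + \beta_{x2,2}[x2 = 2] + \beta_{x2,3}[x2 = 3] \\[4pt]
\text{x2 level 2 vs level 1 (\texttt{x2 1})}:    &\quad \beta_{x2,2} \\
\text{x2 level 3 vs level 1 (\texttt{x2 0 1})}:  &\quad \beta_{x2,3} \\
\text{x2 level 2 vs level 3 (\texttt{x2 1 -1})}: &\quad \beta_{x2,2} - \beta_{x2,3}
\end{align*}
```

Level 1 of x2 contributes nothing to the linear predictor, which is why it needs no coefficient on the contrast statement and why the coefficients need not sum to 0.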
We have used two options on the contrast statement. The first, e, requests that the L matrix be displayed. This allows the user to check that the contrast matrix used is the same as the one desired. The second, the estimate option with the parm keyword, requests that the contrast itself be estimated. In this example, the estimates of the first two contrasts should match the regression coefficients given in the output, which also allows us to check that we have specified the contrast coding that we intended. Finally, while the label of the contrast (given in quotes) has no effect on the results, it is really important. These labels help you (and others) to read the SAS code as well as the output.
proc logistic data = logit_sim desc;
  class x1 (ref = '1') x2 (ref = '1') / param=ref;
  model y = x1 x2;
  contrast 'x1 = 1 x2 1 v 2' x2 1 /e estimate = parm;
  contrast 'x1 = 1 x2 1 v 3' x2 0 1 /e estimate = parm;
  contrast 'x1 = 1 x2 2 v 3' x2 1 -1 /e estimate = parm;
run;

The LOGISTIC Procedure

Model Information

Data Set                      WORK.LOGIT_SIM
Response Variable             y
Number of Response Levels     2
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read         900
Number of Observations Used         900

Response Profile

Ordered                   Total
  Value       y       Frequency
      1       1             491
      2       0             409

Probability modeled is y=1.

Class Level Information

                      Design
Class     Value     Variables
x1        1          0      0
          2          1      0
          3          0      1
x2        1          0      0
          2          1      0
          3          0      1

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

                              Intercept
               Intercept            and
Criterion           Only     Covariates
AIC             1242.183       1150.488
SC              1246.986       1174.500
-2 Log L        1240.183       1140.488

Testing Global Null Hypothesis: BETA=0

Test                 Chi-Square       DF     Pr > ChiSq
Likelihood Ratio        99.6952        4         <.0001
Score                   96.3913        4         <.0001
Wald                    88.8006        4         <.0001

Type 3 Analysis of Effects

                        Wald
Effect      DF    Chi-Square    Pr > ChiSq
x1           2       24.0082        <.0001
x2           2       72.3300        <.0001

Analysis of Maximum Likelihood Estimates

                                Standard          Wald
Parameter      DF    Estimate      Error    Chi-Square    Pr > ChiSq
Intercept       1      0.1547     0.1535        1.0154        0.3136
x1        2     1      0.7680     0.1760       19.0459        <.0001
x1        3     1      0.0291     0.1707        0.0291        0.8645
x2        2     1     -1.0564     0.1719       37.7527        <.0001
x2        3     1      0.3892     0.1737        5.0204        0.0251

Odds Ratio Estimates

                Point          95% Wald
Effect       Estimate      Confidence Limits
x1 2 vs 1       2.155       1.527       3.043
x1 3 vs 1       1.030       0.737       1.439
x2 2 vs 1       0.348       0.248       0.487
x2 3 vs 1       1.476       1.050       2.074

Association of Predicted Probabilities and Observed Responses

Percent Concordant      63.1    Somers' D    0.361
Percent Discordant      27.1    Gamma        0.400
Percent Tied             9.8    Tau-a        0.179
Pairs                 200819    c            0.680

Coefficients of Contrast x1 = 1 x2 1 v 2

Parameter      Row1
Intercept         0
x12               0
x13               0
x22               1
x23               0

Coefficients of Contrast x1 = 1 x2 1 v 3

Parameter      Row1
Intercept         0
x12               0
x13               0
x22               0
x23               1

Coefficients of Contrast x1 = 1 x2 2 v 3

Parameter      Row1
Intercept         0
x12               0
x13               0
x22               1
x23              -1

Contrast Test Results

                                  Wald
Contrast              DF    Chi-Square    Pr > ChiSq
x1 = 1 x2 1 v 2        1       37.7527        <.0001
x1 = 1 x2 1 v 3        1        5.0204        0.0251
x1 = 1 x2 2 v 3        1       66.8458        <.0001

Contrast Rows Estimation and Testing Results

                                             Standard                                       Wald
Contrast           Type    Row    Estimate      Error    Alpha    Confidence Limits    Chi-Square    Pr > ChiSq
x1 = 1 x2 1 v 2    PARM      1     -1.0564     0.1719     0.05    -1.3934   -0.7194       37.7527        <.0001
x1 = 1 x2 1 v 3    PARM      1      0.3892     0.1737     0.05     0.0488    0.7297        5.0204        0.0251
x1 = 1 x2 2 v 3    PARM      1     -1.4456     0.1768     0.05    -1.7922   -1.0991       66.8458        <.0001
Because the output from proc logistic is so long, we will show it in its entirety only once. First, the table “Type 3 Analysis of Effects” shows the results for the two degree-of-freedom tests of x1 and x2. Both variables are statistically significant. In the next table, “Analysis of Maximum Likelihood Estimates”, we see the coefficients for each term in the model. Because both x1 and x2 are statistically significant, we can interpret the point estimates for the dummy variables that represent x1 and x2 in the model. For our purposes, two other parts of the output deserve mention. First, the tables labeled “Coefficients of Contrast” echo the contrasts that we specified, but they also include all of the 0s that we did not include on our contrast statements (and which SAS filled in). These tell us that we communicated clearly with SAS and that the contrasts that we wanted were indeed run. Please remember that sometimes a contrast will run without error, but it is not necessarily the contrast that you intended to specify.
Looking at the values that we specified on the contrast statement for “x1 = 1 x2 1 v 2”, we specified only a 1, and in the “Coefficients of Contrast” table for “x1 = 1 x2 1 v 2”, we see a 1 for x22 and 0s for all other parameters. For this contrast, it is important to note the 0 for x23: it confirms that SAS filled in the coefficient we left off, so level 3 of x2 plays no part in this comparison.
Second, because we did the contrasts for x2 at the reference level of x1, the estimates of the contrasts (and their p-values), given toward the bottom of the output in the “Contrast Rows Estimation and Testing Results” table, should match the estimates (and p-values) given in the “Analysis of Maximum Likelihood Estimates” table. We can see that in both tables, the estimate for “x1 = 1 x2 1 v 2” is -1.0564 (with a p-value of <.0001), and the estimate for “x1 = 1 x2 1 v 3” is 0.3892 (with a p-value of 0.0251). While this may seem like a lot of work to get the same point estimates, notice that we were able to easily specify the contrast for x2 level 2 versus level 3 (estimate -1.4456) without having to change the reference group.
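One way to convince yourself that the “2 v 3” contrast is doing what the label says is to refit the model with level 3 of x2 as the reference level. The sketch below is a check we have not run here; it assumes the same logit_sim dataset and changes only the ref = value for x2.

```sas
/* Sketch of a cross-check (not run for this page):                   */
/* refit Example 1 with level 3 of x2 as the reference level.         */
proc logistic data = logit_sim desc;
  class x1 (ref = '1') x2 (ref = '3') / param=ref;
  model y = x1 x2;
run;
```

Under this coding the coefficient for x2 level 2 compares level 2 with level 3, so it should reproduce the estimate from the “x1 = 1 x2 2 v 3” contrast. The contrast statement gets the same comparison without a second model fit.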
Example 2
In this example, we will add the interaction between x1 and x2. Because we have included the interaction of x1 and x2 in the model, we must also include it in the contrasts. The exception is a contrast done at level 1 of x1: because 1 is the reference level of x1, the dummy variables for x1, and therefore for the interaction, are all 0 there, so the interaction is not needed on the contrast statement. For those contrasts, just the contrast coefficients for the variable x2 are needed.
In the example below, we make the comparisons among the levels of x2 at each level of x1. For the three contrasts at x1 = 1, the interaction is not included because the reference level for x1 is 1; the coding for x2 is the same as in Example 1. The next three contrast statements make the same comparisons at the second level of x1. The contrast coding for the variable x2 is unchanged, and the same coefficients are repeated for the interaction, because the first two interaction parameters (x1*x2 2 2 and x1*x2 2 3) belong to level 2 of x1. In the three contrast statements at the third level of x1, we see that two 0s have been added to the front of the coding of the interaction term. These two 0s are for level 2 of x1, so that the coefficients that follow them apply to x1*x2 3 2 and x1*x2 3 3. The order of the interaction parameters can be read from the “Analysis of Maximum Likelihood Estimates” table.
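The pattern of coefficients follows from writing the linear predictor for a cell of the design. The following is a sketch in notation introduced here for illustration (eta, beta, and gamma are not SAS names); any term whose index is 1 is a reference level and equals 0.

```latex
% Sketch only: illustrative notation, not SAS output (requires amsmath).
% eta(j,k) is the logit of P(y = 1) at x1 = j, x2 = k;
% beta_{x1,1} = beta_{x2,1} = 0 and gamma_{jk} = 0 whenever j = 1 or k = 1.
\begin{align*}
\eta(j,k) &= \beta_0 + \beta_{x1,j} + \beta_{x2,k} + \gamma_{jk} \\[4pt]
\eta(1,2) - \eta(1,1) &= \beta_{x2,2}
  && \texttt{x2 1} \\
\eta(2,2) - \eta(2,1) &= \beta_{x2,2} + \gamma_{22}
  && \texttt{x2 1 x1*x2 1} \\
\eta(3,3) - \eta(3,1) &= \beta_{x2,3} + \gamma_{33}
  && \texttt{x2 0 1 x1*x2 0 0 0 1} \\
\eta(3,2) - \eta(3,3) &= \beta_{x2,2} - \beta_{x2,3} + \gamma_{32} - \gamma_{33}
  && \texttt{x2 1 -1 x1*x2 0 0 1 -1}
\end{align*}
```

The x1 main-effect term cancels in every difference because both cells share the same level of x1, which is why x1 never appears on these contrast statements.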
proc logistic data = logit_sim desc;
  class x1 (ref = '1') x2 (ref = '1') / param=ref;
  model y = x1 x2 x1*x2;
  contrast 'x1 = 1 x2 1 v 2' x2 1 /estimate = parm;
  contrast 'x1 = 1 x2 1 v 3' x2 0 1 /estimate = parm;
  contrast 'x1 = 1 x2 2 v 3' x2 1 -1 /estimate = parm;
  contrast 'x1 = 2 x2 1 v 2' x2 1 x1*x2 1 /estimate = parm;
  contrast 'x1 = 2 x2 1 v 3' x2 0 1 x1*x2 0 1 /estimate = parm;
  contrast 'x1 = 2 x2 2 v 3' x2 1 -1 x1*x2 1 -1 /estimate = parm;
  contrast 'x1 = 3 x2 1 v 2' x2 1 x1*x2 0 0 1 /estimate = parm;
  contrast 'x1 = 3 x2 1 v 3' x2 0 1 x1*x2 0 0 0 1 /estimate = parm;
  contrast 'x1 = 3 x2 2 v 3' x2 1 -1 x1*x2 0 0 1 -1 /estimate = parm;
run;
< some output omitted >
Type 3 Analysis of Effects

                        Wald
Effect      DF    Chi-Square    Pr > ChiSq
x1           2        7.4563        0.0240
x2           2       23.0959        <.0001
x1*x2        4       10.4166        0.0340

Analysis of Maximum Likelihood Estimates

                                   Standard          Wald
Parameter         DF    Estimate      Error    Chi-Square    Pr > ChiSq
Intercept          1      0.2007     0.2010        0.9967        0.3181
x1        2        1      0.6947     0.2983        5.4245        0.0199
x1        3        1     -0.0403     0.2840        0.0202        0.8871
x2        2        1     -1.0961     0.2983       13.5025        0.0002
x2        3        1      0.2889     0.2878        1.0073        0.3156
x1*x2     2 2      1      0.3610     0.4217        0.7330        0.3919
x1*x2     2 3      1     -0.2398     0.4255        0.3177        0.5730
x1*x2     3 2      1     -0.3299     0.4330        0.5806        0.4461
x1*x2     3 3      1      0.4952     0.4156        1.4201        0.2334

Association of Predicted Probabilities and Observed Responses

Percent Concordant      61.3    Somers' D    0.369
Percent Discordant      24.4    Gamma        0.431
Percent Tied            14.3    Tau-a        0.183
Pairs                 200819    c            0.685

Contrast Test Results

                                  Wald
Contrast              DF    Chi-Square    Pr > ChiSq
x1 = 1 x2 1 v 2        1       13.5025        0.0002
x1 = 1 x2 1 v 3        1        1.0073        0.3156
x1 = 1 x2 2 v 3        1       21.0745        <.0001
x1 = 2 x2 1 v 2        1        6.0826        0.0137
x1 = 2 x2 1 v 3        1        0.0245        0.8755
x1 = 2 x2 2 v 3        1        6.8422        0.0089
x1 = 3 x2 1 v 2        1       20.6377        <.0001
x1 = 3 x2 1 v 3        1        6.8422        0.0089
x1 = 3 x2 2 v 3        1       45.2793        <.0001

Contrast Rows Estimation and Testing Results

                                             Standard                                       Wald
Contrast           Type    Row    Estimate      Error    Alpha    Confidence Limits    Chi-Square    Pr > ChiSq
x1 = 1 x2 1 v 2    PARM      1     -1.0961     0.2983     0.05    -1.6807   -0.5114       13.5025        0.0002
x1 = 1 x2 1 v 3    PARM      1      0.2889     0.2878     0.05    -0.2753    0.8530        1.0073        0.3156
x1 = 1 x2 2 v 3    PARM      1     -1.3849     0.3017     0.05    -1.9762   -0.7936       21.0745        <.0001
x1 = 2 x2 1 v 2    PARM      1     -0.7350     0.2980     0.05    -1.3192   -0.1509        6.0826        0.0137
x1 = 2 x2 1 v 3    PARM      1      0.0491     0.3133     0.05    -0.5650    0.6632        0.0245        0.8755
x1 = 2 x2 2 v 3    PARM      1     -0.7841     0.2998     0.05    -1.3717   -0.1966        6.8422        0.0089
x1 = 3 x2 1 v 2    PARM      1     -1.4260     0.3139     0.05    -2.0412   -0.8108       20.6377        <.0001
x1 = 3 x2 1 v 3    PARM      1      0.7841     0.2998     0.05     0.1966    1.3717        6.8422        0.0089
x1 = 3 x2 2 v 3    PARM      1     -2.2101     0.3284     0.05    -2.8539   -1.5664       45.2793        <.0001
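In the table “Contrast Test Results”, we can see that all but two of the contrasts are statistically significant at the .05 level. The two exceptions are the comparisons of x2 level 1 versus level 3 at x1 = 1 (p = 0.3156) and at x1 = 2 (p = 0.8755).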
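As a check on the coding, each contrast estimate can be rebuilt by hand from the “Analysis of Maximum Likelihood Estimates” table by adding up the coefficients that the contrast picks out. The arithmetic below uses only the printed, rounded estimates, so a difference of 0.0001 from the reported contrast is rounding.

```latex
% Hand check using the rounded estimates printed in the output (requires amsmath).
\begin{align*}
\text{x1 = 2, x2 1 v 2}: &\quad -1.0961 + 0.3610 = -0.7351
   && (\text{reported } -0.7350) \\
\text{x1 = 2, x2 1 v 3}: &\quad 0.2889 + (-0.2398) = 0.0491
   && (\text{reported } 0.0491) \\
\text{x1 = 3, x2 1 v 2}: &\quad -1.0961 + (-0.3299) = -1.4260
   && (\text{reported } -1.4260) \\
\text{x1 = 3, x2 1 v 3}: &\quad 0.2889 + 0.4952 = 0.7841
   && (\text{reported } 0.7841) \\
\text{x1 = 3, x2 2 v 3}: &\quad (-1.0961 - 0.2889) + (-0.3299 - 0.4952) = -2.2101
   && (\text{reported } -2.2101)
\end{align*}
```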
Example 3
In the example below, we will add some new contrast statements, and we will use the Output Delivery System (ODS) to output the contrast estimates to a dataset called test1. The output from proc print shows us the contents of test1. We have also added some covariates to the model (c1 and c2). The covariates are held at 0 for the contrasts. In many cases, holding the covariate at 0 does not make substantive sense, and centering the covariate may be more useful.
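If centering is wanted, one possible approach is to center the covariates before fitting the model. The sketch below is an assumption-laden illustration, not part of the original analysis: logit_sim_c is a dataset name made up for this sketch, and it assumes c1 and c2 are numeric variables for which a mean is meaningful.

```sas
/* Sketch only: center c1 and c2 at their sample means.                */
/* logit_sim_c is a hypothetical output dataset name.                  */
/* With mean = 0 and no std = option, proc standard shifts each        */
/* variable to have mean 0 and leaves its spread unchanged.            */
proc standard data = logit_sim mean = 0 out = logit_sim_c;
  var c1 c2;
run;

/* Same model as in this example, fit to the centered data. */
proc logistic data = logit_sim_c desc;
  class x1 (ref = '1') x2 (ref = '1') / param=ref;
  model y = x1 x2 x1*x2 c1 c2;
  contrast 'x1 = 2 x2 1 v 2' x2 1 x1*x2 1 / estimate = parm;
run;
```

Note that the contrasts in this example put no coefficients on c1 or c2 and the model has no interactions involving the covariates, so on the logit scale their estimates should not change with centering; only the intercept moves. Centering matters more when the covariates interact with x1 or x2, or when the intercept is part of what is being estimated.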
In addition to the contrasts that were used in the previous examples, we have added three contrast statements that test the overall effect of x2 at each level of x1. Because x2 is a three-level variable, each of these tests has two degrees of freedom. This is indicated to SAS by the use of a comma between the two rows of the contrast (level 1 versus level 2, and level 1 versus level 3). For the contrast at x1 = 1, because the reference level of x1 is 1, the interaction is 0 and hence does not need to be included on the contrast statement.
ods output ContrastEstimate = test1;
proc logistic data = logit_sim desc;
  class x1 (ref = '1') x2 (ref = '1') / param=ref;
  model y = x1 x2 x1*x2 c1 c2;
  contrast 'x1 = 1 x2 1 v 2' x2 1 /estimate = parm;
  contrast 'x1 = 1 x2 1 v 3' x2 0 1 /estimate = parm;
  contrast 'x1 = 1 x2 2 v 3' x2 1 -1 /estimate = parm;
  contrast 'x1 = 2 x2 1 v 2' x2 1 x1*x2 1 /estimate = parm;
  contrast 'x1 = 2 x2 1 v 3' x2 0 1 x1*x2 0 1 /estimate = parm;
  contrast 'x1 = 2 x2 2 v 3' x2 1 -1 x1*x2 1 -1 /estimate = parm;
  contrast 'effect of x2 at x1 = 1' x2 1,
                                    x2 0 1;
  contrast 'effect of x2 at x1 = 2' x2 1 x1*x2 1,
                                    x2 0 1 x1*x2 0 1;
  contrast 'effect of x2 at x1 = 3' x2 1 x1*x2 0 0 1 0,
                                    x2 0 1 x1*x2 0 0 0 1;
run;
< some output omitted >
Contrast Test Results

                                       Wald
Contrast                   DF    Chi-Square    Pr > ChiSq
x1 = 1 x2 1 v 2             1       13.3359        0.0003
x1 = 1 x2 1 v 3             1        1.1020        0.2938
x1 = 1 x2 2 v 3             1       21.2294        <.0001
x1 = 2 x2 1 v 2             1        5.8453        0.0156
x1 = 2 x2 1 v 3             1        0.0107        0.9177
x1 = 2 x2 2 v 3             1        6.2793        0.0122
effect of x2 at x1 = 1      2       23.1540        <.0001
effect of x2 at x1 = 2      2        8.3588        0.0153
effect of x2 at x1 = 3      2       45.1142        <.0001

Contrast Rows Estimation and Testing Results

                                             Standard
Contrast           Type    Row    Estimate      Error    Alpha    Confidence Limits
x1 = 1 x2 1 v 2    PARM      1     -1.0904     0.2986     0.05    -1.6756   -0.5052
x1 = 1 x2 1 v 3    PARM      1      0.3029     0.2886     0.05    -0.2626    0.8685
x1 = 1 x2 2 v 3    PARM      1     -1.3933     0.3024     0.05    -1.9860   -0.8006
x1 = 2 x2 1 v 2    PARM      1     -0.7217     0.2985     0.05    -1.3067   -0.1366
x1 = 2 x2 1 v 3    PARM      1      0.0324     0.3140     0.05    -0.5830    0.6479
x1 = 2 x2 2 v 3    PARM      1     -0.7541     0.3010     0.05    -1.3440   -0.1643

Contrast Rows Estimation and Testing Results

                                        Wald
Contrast           Type    Row    Chi-Square    Pr > ChiSq
x1 = 1 x2 1 v 2    PARM      1       13.3359        0.0003
x1 = 1 x2 1 v 3    PARM      1        1.1020        0.2938
x1 = 1 x2 2 v 3    PARM      1       21.2294        <.0001
x1 = 2 x2 1 v 2    PARM      1        5.8453        0.0156
x1 = 2 x2 1 v 3    PARM      1        0.0107        0.9177
x1 = 2 x2 2 v 3    PARM      1        6.2793        0.0122

proc print data = test1 noobs;
run;

                                                                       Lower      Upper                   Prob
Contrast                  Type    Row    Estimate    StdErr    Alpha    Limit      Limit    WaldChiSq    ChiSq
x1 = 1 x2 1 v 2           PARM      1     -1.0904    0.2986     0.05  -1.6756    -0.5052      13.3359   0.0003
x1 = 1 x2 1 v 3           PARM      1      0.3029    0.2886     0.05  -0.2626     0.8685       1.1020   0.2938
x1 = 1 x2 2 v 3           PARM      1     -1.3933    0.3024     0.05  -1.9860    -0.8006      21.2294   <.0001
x1 = 2 x2 1 v 2           PARM      1     -0.7217    0.2985     0.05  -1.3067    -0.1366       5.8453   0.0156
x1 = 2 x2 1 v 3           PARM      1      0.0324    0.3140     0.05  -0.5830     0.6479       0.0107   0.9177
x1 = 2 x2 2 v 3           PARM      1     -0.7541    0.3010     0.05  -1.3440    -0.1643       6.2793   0.0122
effect of x2 at x1 = 1    PARM      1     -1.3933    0.3024     0.05  -1.9860    -0.8006      21.2294   <.0001
effect of x2 at x1 = 1    PARM      2     -1.0904    0.2986     0.05  -1.6756    -0.5052      13.3359   0.0003
effect of x2 at x1 = 2    PARM      1     -0.7541    0.3010     0.05  -1.3440    -0.1643       6.2793   0.0122
effect of x2 at x1 = 2    PARM      2     -0.7217    0.2985     0.05  -1.3067    -0.1366       5.8453   0.0156
effect of x2 at x1 = 3    PARM      1     -2.1905    0.3290     0.05  -2.8353    -1.5456      44.3222   <.0001
effect of x2 at x1 = 3    PARM      2     -1.4171    0.3151     0.05  -2.0347    -0.7995      20.2250   <.0001
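Once the contrasts are in a dataset, they can be post-processed like any other data. The sketch below is a possible extension rather than part of the original example: it also captures the “Contrast Test Results” table (ODS table name ContrastTest) and converts the estimates to odds ratios. The dataset names test2 and test1_or and the new variable names are made up for this sketch, and the variable names Estimate, LowerLimit, and UpperLimit are assumed from the column headings in the proc print output above.

```sas
/* Sketch only: save both contrast tables, then exponentiate.          */
/* test2 and test1_or are hypothetical dataset names.                  */
ods output ContrastEstimate = test1 ContrastTest = test2;
proc logistic data = logit_sim desc;
  class x1 (ref = '1') x2 (ref = '1') / param=ref;
  model y = x1 x2 x1*x2 c1 c2;
  contrast 'x1 = 2 x2 1 v 2' x2 1 x1*x2 1 / estimate = parm;
  contrast 'effect of x2 at x1 = 2' x2 1 x1*x2 1,
                                    x2 0 1 x1*x2 0 1;
run;

/* Convert log-odds estimates and confidence limits to odds ratios.    */
/* Estimate, LowerLimit, UpperLimit: names assumed from the listing.   */
data test1_or;
  set test1;
  OddsRatio = exp(Estimate);
  ORLower   = exp(LowerLimit);
  ORUpper   = exp(UpperLimit);
run;

proc print data = test1_or noobs;
run;

/* The two degree-of-freedom tests are in the ContrastTest table. */
proc print data = test2 noobs;
run;
```

Only contrasts that carry the estimate = option contribute rows to the ContrastEstimate table, while every contrast statement contributes a row to ContrastTest, so the joint test appears in test2 even though it has no estimate = option here.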