Inputting the Aspirin data, Table 2.3, p. 20 and calculating the results on p. 21-24.
Note: For 2×2 tables, the measures option in proc freq provides confidence intervals for the odds ratio, labeled "Case-Control (Odds Ratio)" in the output, and for the relative risk, labeled "Cohort (Col1 Risk)". Both proc freq and proc genmod are run to show that each procedure produces the chi-square test statistics: the Deviance and Pearson Chi-Square of the independence model in proc genmod equal the Likelihood Ratio Chi-Square and Chi-Square in proc freq. Only proc freq performs the chi-square test and reports a p-value.
data aspirin;
  input group mi count @@;
  cards;
1 1 189   1 2 10845
2 1 104   2 2 10933
;
run;

proc format;
  value group 1='Placebo' 2='Aspirin';
  value mi 1='Yes' 2='No';
run;

proc freq data=aspirin order=data;
  format group group. mi mi.;
  weight count;
  tables group*mi / chisq expected measures nopercent norow nocol;
run;

proc genmod data=aspirin;
  format group group. mi mi.;
  model count = group mi / dist=poi link=log obstats residuals;
run;
The FREQ Procedure

Table of group by mi

group     mi

Frequency|
Expected |Yes     |No      |  Total
---------+--------+--------+
Placebo  |    189 |  10845 |  11034
         | 146.48 |  10888 |
---------+--------+--------+
Aspirin  |    104 |  10933 |  11037
         | 146.52 |  10890 |
---------+--------+--------+
Total         293    21778    22071

Statistics for Table of group by mi

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1     25.0139    <.0001
Likelihood Ratio Chi-Square    1     25.3720    <.0001
Continuity Adj. Chi-Square     1     24.4291    <.0001
Mantel-Haenszel Chi-Square     1     25.0128    <.0001
Phi Coefficient                       0.0337
Contingency Coefficient               0.0336
Cramer's V                            0.0337

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       189
Left-sided Pr <= F          1.0000
Right-sided Pr >= F      3.253E-07

Table Probability (P)    1.516E-07
Two-sided Pr <= P        5.033E-07
The FREQ Procedure
Statistics for Table of group by mi
Statistic                              Value       ASE
------------------------------------------------------
Gamma                                 0.2938    0.0561
Kendall's Tau-b                       0.0337    0.0065
Stuart's Tau-c                        0.0077    0.0015

Somers' D C|R                         0.0077    0.0015
Somers' D R|C                         0.1471    0.0282

Pearson Correlation                   0.0337    0.0065
Spearman Correlation                  0.0337    0.0065

Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0077    0.0015
Lambda Symmetric                      0.0075    0.0015

Uncertainty Coefficient C|R           0.0081    0.0032
Uncertainty Coefficient R|C           0.0008    0.0003
Uncertainty Coefficient Symmetric     0.0015    0.0006

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      1.8321        1.4400        2.3308
Cohort (Col1 Risk)             1.8178        1.4330        2.3059
Cohort (Col2 Risk)             0.9922        0.9892        0.9953
Sample Size = 22071
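Note: The odds ratio and relative risk rows above can be cross-checked by hand. The sketch below is Python rather than SAS and is not part of the original example; it assumes the usual large-sample (Wald) intervals built on the log scale, and the names wald_ci, odds_ratio and relative_risk are illustrative only.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def wald_ci(estimate, se_log, level=0.95):
    """Wald interval computed on the log scale, then exponentiated."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return exp(log(estimate) - z * se_log), exp(log(estimate) + z * se_log)

# Aspirin data (Table 2.3): rows = Placebo, Aspirin; columns = MI yes, MI no.
n11, n12 = 189, 10845
n21, n22 = 104, 10933

# Odds ratio with the standard error of its logarithm.
odds_ratio = (n11 * n22) / (n12 * n21)
se_log_or = sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
or_ci = wald_ci(odds_ratio, se_log_or)

# Relative risk of MI (column 1), Placebo relative to Aspirin.
p1 = n11 / (n11 + n12)
p2 = n21 / (n21 + n22)
relative_risk = p1 / p2
se_log_rr = sqrt((1 - p1) / n11 + (1 - p2) / n21)
rr_ci = wald_ci(relative_risk, se_log_rr)

print(f"odds ratio    {odds_ratio:.4f}  ({or_ci[0]:.4f}, {or_ci[1]:.4f})")
print(f"relative risk {relative_risk:.4f}  ({rr_ci[0]:.4f}, {rr_ci[1]:.4f})")
```

The printed values should agree, to rounding, with the "Case-Control (Odds Ratio)" and "Cohort (Col1 Risk)" rows of the proc freq output.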
The GENMOD Procedure
Model Information

Data Set              WORK.ASPIRIN
Distribution               Poisson
Link Function                  Log
Dependent Variable           count
Observations Used                4

Criteria For Assessing Goodness Of Fit

Criterion             DF           Value        Value/DF
Deviance               1         25.3720         25.3720
Scaled Deviance        1         25.3720         25.3720
Pearson Chi-Square     1         25.0139         25.0139
Scaled Pearson X2      1         25.0139         25.0139
Log Likelihood               181827.7802

Algorithm converged.

Analysis Of Parameter Estimates

                          Standard    Wald 95% Confidence      Chi-
Parameter  DF  Estimate      Error          Limits           Square   Pr > ChiSq
Intercept   1    0.6781     0.1188     0.4454     0.9109      32.60       <.0001
group       1    0.0003     0.0135    -0.0261     0.0267       0.00       0.9839
mi          1    4.3085     0.0588     4.1932     4.4238    5366.76       <.0001
Scale       0    1.0000     0.0000     1.0000     1.0000

NOTE: The scale parameter was held fixed.

Observation Statistics

Observation  count  group  mi       Pred      Xbeta        Std    HessWgt      Lower      Upper
          1    189      1   1  146.48015  4.9868899  0.0588072  146.48015   130.5335  164.37492
          2  10845      1   2   10887.52  9.2953725  0.0095519   10887.52  10685.587  11093.269
          3    104      2   1  146.51997  4.9871618   0.058807  146.51997  130.56904  164.41955
          4  10933      2   2   10890.48  9.2956443  0.0095506   10890.48  10688.519  11096.257

Observation     Resraw     Reschi     Resdev   StResdev   StReschi     Reslik
          1  42.519853   3.513196  3.3609942   4.784706  5.0013802  4.8956654
          2  -42.51991    -0.4075  -0.407766  -5.004648  -5.001387  -5.001409
          3  -42.51997  -3.512728  -3.707237  -5.278334  -5.001394  -5.139873
          4  42.519913  0.4074449  0.4071802   4.998138  5.0013872  5.0013656
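Note: The Deviance and Pearson Chi-Square reported by proc genmod are the G2 and X2 statistics of the independence model, so they can be reproduced directly from the observed counts and the expected counts (row total times column total over n). The following Python sketch is an illustration only, not part of the original SAS example; independence_fit is a hypothetical helper name.

```python
from math import log

def independence_fit(table):
    """Expected counts under independence, plus Pearson X2 and likelihood-ratio G2."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    cells = [(o, e) for obs_row, exp_row in zip(table, expected)
             for o, e in zip(obs_row, exp_row)]
    x2 = sum((o - e) ** 2 / e for o, e in cells)
    # Cells with a zero count contribute nothing to G2.
    g2 = 2 * sum(o * log(o / e) for o, e in cells if o > 0)
    return expected, x2, g2

# Aspirin data (Table 2.3): rows = Placebo, Aspirin; columns = MI yes, MI no.
aspirin = [[189, 10845], [104, 10933]]
expected, x2, g2 = independence_fit(aspirin)

print([[round(e, 2) for e in row] for row in expected])
print(f"X2 = {x2:.4f}, G2 = {g2:.4f}")
```

The expected counts should match the Pred column of the Observation Statistics, and X2 and G2 should match the Pearson Chi-Square and Deviance rows.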
Inputting the Smoking and MI data, table 2.4, p. 26.
data smoking;
  input group mi count @@;
  cards;
1 1 172   1 2 173
2 1 90    2 2 346
;
run;
Calculations of odds ratio, p. 26-27.
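Note: The measures option already prints the odds ratio with asymptotic confidence limits. The or option in the exact statement adds the table "Odds Ratio (Case-Control Study)", which also reports exact confidence limits for the odds ratio.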
proc format;
  value group 1='Smoker' 2='Non-smoker';
  value mi 1='MI' 2='Control';
run;

proc freq data=smoking order=data;
  format group group. mi mi.;
  weight count;
  tables group*mi / chisq expected measures nopercent norow nocol;
  exact or;
run;
The FREQ Procedure

Table of group by mi

group       mi

Frequency  |
Expected   |MI      |Control |  Total
-----------+--------+--------+
Smoker     |    172 |    173 |    345
           | 115.74 | 229.26 |
-----------+--------+--------+
Non-smoker |     90 |    346 |    436
           | 146.26 | 289.74 |
-----------+--------+--------+
Total           262      519      781

Statistics for Table of group by mi

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1     73.7287    <.0001
Likelihood Ratio Chi-Square    1     74.2583    <.0001
Continuity Adj. Chi-Square     1     72.4241    <.0001
Mantel-Haenszel Chi-Square     1     73.6343    <.0001
Phi Coefficient                       0.3073
Contingency Coefficient               0.2937
Cramer's V                            0.3073

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       172
Left-sided Pr <= F          1.0000
Right-sided Pr >= F      6.762E-18

Table Probability (P)    1.888E-17
Two-sided Pr <= P        1.029E-17
The FREQ Procedure
Statistics for Table of group by mi
Statistic                              Value       ASE
------------------------------------------------------
Gamma                                 0.5853    0.0526
Kendall's Tau-b                       0.3073    0.0343
Stuart's Tau-c                        0.2882    0.0328

Somers' D C|R                         0.2921    0.0332
Somers' D R|C                         0.3232    0.0359

Pearson Correlation                   0.3073    0.0343
Spearman Correlation                  0.3073    0.0343

Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.2377    0.0410
Lambda Symmetric                      0.1351    0.0239

Uncertainty Coefficient C|R           0.0745    0.0168
Uncertainty Coefficient R|C           0.0693    0.0157
Uncertainty Coefficient Symmetric     0.0718    0.0162

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      3.8222        2.7934        5.2299
Cohort (Col1 Risk)             2.4152        1.9532        2.9864
Cohort (Col2 Risk)             0.6319        0.5629        0.7093

Odds Ratio (Case-Control Study)
-----------------------------------
Odds Ratio                   3.8222

Asymptotic Conf Limits
95% Lower Conf Limit         2.7934
95% Upper Conf Limit         5.2299

Exact Conf Limits
95% Lower Conf Limit         2.7607
95% Upper Conf Limit         5.2984
Sample Size = 781
Inputting the General Social Survey data, table 2.5, p. 31.
data Survey;
  input Gender Party count @@;
  cards;
1 1 279   1 2 73   1 3 225
2 1 165   2 2 47   2 3 191
;
run;
Chi-square test of independence, p. 31.
proc format;
  value gender 1='female' 2='Male';
  value party 1='Democrat' 2='Independent' 3='Republican';
run;

proc freq data=survey order=data;
  format gender gender. party party.;
  weight count;
  tables gender*party / chisq expected nopercent norow nocol;
run;
The FREQ Procedure

Table of Gender by Party

Gender    Party

Frequency|
Expected |Democrat|Independ|Republic|  Total
         |        |ent     |an      |
---------+--------+--------+--------+
female   |    279 |     73 |    225 |    577
         | 261.42 | 70.653 | 244.93 |
---------+--------+--------+--------+
Male     |    165 |     47 |    191 |    403
         | 182.58 | 49.347 | 171.07 |
---------+--------+--------+--------+
Total         444      120      416      980

Statistics for Table of Gender by Party

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     2      7.0095    0.0301
Likelihood Ratio Chi-Square    2      7.0026    0.0302
Mantel-Haenszel Chi-Square     1      6.7581    0.0093
Phi Coefficient                       0.0846
Contingency Coefficient               0.0843
Cramer's V                            0.0846
Sample Size = 980
Creating 2×2 tables from the Survey data, p. 33.
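Note: G2 is labeled Likelihood Ratio Chi-Square in the output. The G2 values of the two 2×2 tables (0.1612 and 6.8414) sum to the G2 of the full 2×3 table (7.0026).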
data DemoInd;
  input Gender Party count @@;
  cards;
1 1 279   1 2 73
2 1 165   2 2 47
;
run;

data Collapse;
  input Gender combo count @@;
  cards;
1 1 352   1 2 225
2 1 212   2 2 191
;
run;

proc format;
  value combo 1='Demo./Ind.' 2='Republican';
run;

proc freq data=DemoInd order=data;
  format gender gender. party party.;
  weight count;
  tables gender*party / chisq expected nopercent norow nocol;
run;

proc freq data=collapse order=data;
  format gender gender. combo combo.;
  weight count;
  tables gender*combo / chisq expected nopercent norow nocol;
run;
The FREQ Procedure

Table of Gender by Party

Gender    Party

Frequency|
Expected |Democrat|Independ|  Total
         |        |ent     |
---------+--------+--------+
female   |    279 |     73 |    352
         | 277.11 | 74.894 |
---------+--------+--------+
Male     |    165 |     47 |    212
         | 166.89 | 45.106 |
---------+--------+--------+
Total         444      120      564

Statistics for Table of Gender by Party

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      0.1618    0.6875
Likelihood Ratio Chi-Square    1      0.1612    0.6881
Continuity Adj. Chi-Square     1      0.0876    0.7672
Mantel-Haenszel Chi-Square     1      0.1615    0.6878
Phi Coefficient                       0.0169
Contingency Coefficient               0.0169
Cramer's V                            0.0169

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       279
Left-sided Pr <= F          0.6957
Right-sided Pr >= F         0.3819

Table Probability (P)       0.0776
Two-sided Pr <= P           0.7501
Sample Size = 564
The FREQ Procedure
Table of Gender by combo

Gender    combo

Frequency|
Expected |Demo./In|Republic|  Total
         |d.      |an      |
---------+--------+--------+
female   |    352 |    225 |    577
         | 332.07 | 244.93 |
---------+--------+--------+
Male     |    212 |    191 |    403
         | 231.93 | 171.07 |
---------+--------+--------+
Total         564      416      980

Statistics for Table of Gender by combo

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      6.8528    0.0089
Likelihood Ratio Chi-Square    1      6.8414    0.0089
Continuity Adj. Chi-Square     1      6.5133    0.0107
Mantel-Haenszel Chi-Square     1      6.8458    0.0089
Phi Coefficient                       0.0836
Contingency Coefficient               0.0833
Cramer's V                            0.0836

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       352
Left-sided Pr <= F          0.9963
Right-sided Pr >= F         0.0054

Table Probability (P)       0.0017
Two-sided Pr <= P           0.0104
Sample Size = 980
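Note: The partitioning of G2 can be checked numerically. The Python sketch below is an illustration only and is not part of the original SAS example; g2 is a hypothetical helper that computes the likelihood ratio statistic for independence. It evaluates G2 for the full 2×3 table, for the Democrat versus Independent subtable, and for the table that collapses those two columns against Republican.

```python
from math import log

def g2(table):
    """Likelihood-ratio chi-square statistic for independence in a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # zero cells contribute nothing
                total += observed * log(observed / expected)
    return 2 * total

# General Social Survey data (Table 2.5): rows = female, male.
full = [[279, 73, 225], [165, 47, 191]]      # Democrat, Independent, Republican
dem_vs_ind = [[279, 73], [165, 47]]          # first two columns only
collapsed = [[279 + 73, 225], [165 + 47, 191]]  # (Dem. + Ind.) vs Republican

g2_full = g2(full)
g2_sub = g2(dem_vs_ind)
g2_collapsed = g2(collapsed)

print(f"{g2_sub:.4f} + {g2_collapsed:.4f} = {g2_sub + g2_collapsed:.4f}")
print(f"G2 for the full table: {g2_full:.4f}")
```

The two component statistics add up to the G2 of the full table, each on 1 degree of freedom against 2 for the full table. The Pearson statistics do not add up exactly in this way (0.1618 + 6.8528 is not 7.0095).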
Inputting the Infants data, table 2.7, p. 35.
data infants;
  input malform alcohol count @@;
  cards;
1 0   17066   2 0   48
1 0.5 14464   2 0.5 38
1 1.5 788     2 1.5 5
1 4.0 126     2 4.0 1
1 7.0 37      2 7.0 1
;
run;
Results, p. 36. The G2 statistic is the Likelihood Ratio Chi-Square in the first statistics table of the output, and the X2 statistic is the Chi-Square in the same table. The sample correlation, r, is in the table labeled Pearson Correlation Coefficient. The M2 statistic is in the last table of the output, labeled Cochran-Mantel-Haenszel Statistics; the same value appears as the Mantel-Haenszel Chi-Square in the first statistics table.
proc format;
  value malform 2='Present' 1='Absent';
  value Alcohol 0='0' 0.5='<1' 1.5='1-2' 4.0='3-5' 7.0='>=6';
run;

proc freq data=infants;
  format malform malform. alcohol alcohol.;
  weight count;
  tables alcohol*malform / chisq cmh1 norow nocol nopercent;
  test pcorr;
run;
The FREQ Procedure

Table of alcohol by malform

alcohol   malform

Frequency|Absent  |Present |  Total
---------+--------+--------+
0        |  17066 |     48 |  17114
---------+--------+--------+
<1       |  14464 |     38 |  14502
---------+--------+--------+
1-2      |    788 |      5 |    793
---------+--------+--------+
3-5      |    126 |      1 |    127
---------+--------+--------+
>=6      |     37 |      1 |     38
---------+--------+--------+
Total       32481       93    32574

Statistics for Table of alcohol by malform

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     4     12.0821    0.0168
Likelihood Ratio Chi-Square    4      6.2020    0.1846
Mantel-Haenszel Chi-Square     1      6.5699    0.0104
Phi Coefficient                       0.0193
Contingency Coefficient               0.0193
Cramer's V                            0.0193
WARNING: 30% of the cells have expected counts less than 5. Chi-Square may not be a valid test.
The FREQ Procedure
Statistics for Table of alcohol by malform
Statistic                              Value       ASE
------------------------------------------------------
Gamma                                 0.0571    0.1010
Kendall's Tau-b                       0.0032    0.0058
Stuart's Tau-c                        0.0004    0.0006

Somers' D C|R                         0.0003    0.0006
Somers' D R|C                         0.0311    0.0556

Pearson Correlation                   0.0142    0.0106
Spearman Correlation                  0.0033    0.0059

Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0000    0.0000
Lambda Symmetric                      0.0000    0.0000

Uncertainty Coefficient C|R           0.0049    0.0048
Uncertainty Coefficient R|C           0.0001    0.0001
Uncertainty Coefficient Symmetric     0.0002    0.0002

Pearson Correlation Coefficient
--------------------------------
Correlation                0.0142
ASE                        0.0106
95% Lower Conf Limit      -0.0066
95% Upper Conf Limit       0.0350

Test of H0: Correlation = 0
ASE under H0               0.0107
Z                          1.3226
One-sided Pr >  Z          0.0930
Two-sided Pr > |Z|         0.1860
Sample Size = 32574
The FREQ Procedure
Summary Statistics for alcohol by malform

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1      6.5699    0.0104
Total Sample Size = 32574
Equally spaced row scores for the Infants data set, p. 37.
data infantsx;
  input malform alcoholx count @@;
  cards;
1 0 17066   2 0 48
1 1 14464   2 1 38
1 2 788     2 2 5
1 3 126     2 3 1
1 4 37      2 4 1
;
run;

proc freq data=infantsx;
  weight count;
  tables alcoholx*malform / cmh1 norow nocol nopercent;
run;
The FREQ Procedure

Table of alcoholx by malform

alcoholx  malform

Frequency|       1|       2|  Total
---------+--------+--------+
       0 |  17066 |     48 |  17114
---------+--------+--------+
       1 |  14464 |     38 |  14502
---------+--------+--------+
       2 |    788 |      5 |    793
---------+--------+--------+
       3 |    126 |      1 |    127
---------+--------+--------+
       4 |     37 |      1 |     38
---------+--------+--------+
Total       32481       93    32574

Summary Statistics for alcoholx by malform

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1      1.8278    0.1764
Total Sample Size = 32574
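Note: The M2 statistic depends on the row scores through M2 = (n - 1) r^2, where r is the correlation between row and column scores computed over the cell counts. The Python sketch below is an illustration only, not part of the original SAS example; trend_test is a hypothetical helper, and the column scores 0 and 1 are an arbitrary choice, since with two columns any pair of distinct scores gives the same M2. It evaluates the statistic for the original scores (0, 0.5, 1.5, 4, 7) and for the equally spaced scores (0, 1, 2, 3, 4).

```python
from math import erfc, sqrt

def trend_test(counts, row_scores, col_scores=(0, 1)):
    """Return (r, M2, p-value) for the linear trend test with the given scores."""
    n = sum(sum(row) for row in counts)
    cells = [(u, v, c) for u, row in zip(row_scores, counts)
             for v, c in zip(col_scores, row)]
    mean_u = sum(c * u for u, v, c in cells) / n
    mean_v = sum(c * v for u, v, c in cells) / n
    suv = sum(c * (u - mean_u) * (v - mean_v) for u, v, c in cells)
    suu = sum(c * (u - mean_u) ** 2 for u, v, c in cells)
    svv = sum(c * (v - mean_v) ** 2 for u, v, c in cells)
    r = suv / sqrt(suu * svv)
    m2 = (n - 1) * r ** 2
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(m2 / 2))
    return r, m2, p_value

# Infants data (Table 2.7): rows = alcohol level; columns = Absent, Present.
infants = [[17066, 48], [14464, 38], [788, 5], [126, 1], [37, 1]]

r, m2, p = trend_test(infants, [0, 0.5, 1.5, 4.0, 7.0])
r_eq, m2_eq, p_eq = trend_test(infants, [0, 1, 2, 3, 4])

print(f"midpoint scores:       r = {r:.4f}, M2 = {m2:.4f}, p = {p:.4f}")
print(f"equally spaced scores: r = {r_eq:.4f}, M2 = {m2_eq:.4f}, p = {p_eq:.4f}")
```

The two M2 values should reproduce the Cochran-Mantel-Haenszel statistics in the two outputs above (6.5699 and 1.8278), which shows how strongly the result depends on the scores assigned to the highest alcohol categories.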
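Using the midrank scores, p. 37. The option scores=ridit tells SAS to use ridit scores, which are the midranks divided by the sample size and therefore give the same M2 statistic as the midranks themselves.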
proc format;
  value malform 2='Present' 1='Absent';
  value Alcohol 0='0' 0.5='<1' 1.5='1-2' 4.0='3-5' 7.0='>=6';
run;

proc freq data=infants;
  format malform malform. alcohol alcohol.;
  weight count;
  tables alcohol*malform / cmh1 scores=ridit norow nocol nopercent;
  test pcorr;
run;
The FREQ Procedure

Table of alcohol by malform

alcohol   malform

Frequency|Absent  |Present |  Total
---------+--------+--------+
0        |  17066 |     48 |  17114
---------+--------+--------+
<1       |  14464 |     38 |  14502
---------+--------+--------+
1-2      |    788 |      5 |    793
---------+--------+--------+
3-5      |    126 |      1 |    127
---------+--------+--------+
>=6      |     37 |      1 |     38
---------+--------+--------+
Total       32481       93    32574

Statistics for Table of alcohol by malform

Statistic                              Value       ASE
------------------------------------------------------
Gamma                                 0.0571    0.1010
Kendall's Tau-b                       0.0032    0.0058
Stuart's Tau-c                        0.0004    0.0006

Somers' D C|R                         0.0003    0.0006
Somers' D R|C                         0.0311    0.0556

Pearson Correlation (Ridit Scores)    0.0033    0.0059
Spearman Correlation                  0.0033    0.0059

Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0000    0.0000
Lambda Symmetric                      0.0000    0.0000

Uncertainty Coefficient C|R           0.0049    0.0048
Uncertainty Coefficient R|C           0.0001    0.0001
Uncertainty Coefficient Symmetric     0.0002    0.0002
The FREQ Procedure
Statistics for Table of alcohol by malform
Pearson Correlation Coefficient (Ridit Scores)
----------------------------------------------
Correlation                0.0033
ASE                        0.0059
95% Lower Conf Limit      -0.0082
95% Upper Conf Limit       0.0148

Test of H0: Correlation = 0
ASE under H0               0.0059
Z                          0.5583
One-sided Pr >  Z          0.2883
Two-sided Pr > |Z|         0.5766
Sample Size = 32574
Summary Statistics for alcohol by malform

Cochran-Mantel-Haenszel Statistics (Based on Ridit Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1      0.3514    0.5533
Total Sample Size = 32574
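Note: The midrank of a category is the average of the ranks its observations would receive if all n observations were ranked by alcohol level. The Python sketch below is an illustration only, not part of the original SAS example; midranks and m2_statistic are hypothetical helper names, and the ridit scores are taken to be the midranks divided by n. It computes the midranks from the row totals and confirms that midranks and ridit scores give the same M2.

```python
def midranks(row_totals):
    """Average rank of the observations in each ordered category."""
    scores, below = [], 0
    for total in row_totals:
        # Ranks below+1 .. below+total average to below + (total + 1) / 2.
        scores.append(below + (total + 1) / 2)
        below += total
    return scores

def m2_statistic(counts, row_scores, col_scores=(0, 1)):
    """M2 = (n - 1) * r**2, with r the score correlation over the table."""
    n = sum(sum(row) for row in counts)
    cells = [(u, v, c) for u, row in zip(row_scores, counts)
             for v, c in zip(col_scores, row)]
    mean_u = sum(c * u for u, v, c in cells) / n
    mean_v = sum(c * v for u, v, c in cells) / n
    suv = sum(c * (u - mean_u) * (v - mean_v) for u, v, c in cells)
    suu = sum(c * (u - mean_u) ** 2 for u, v, c in cells)
    svv = sum(c * (v - mean_v) ** 2 for u, v, c in cells)
    return (n - 1) * suv ** 2 / (suu * svv)

# Infants data (Table 2.7): rows = alcohol level; columns = Absent, Present.
infants = [[17066, 48], [14464, 38], [788, 5], [126, 1], [37, 1]]
row_totals = [sum(row) for row in infants]
n = sum(row_totals)

mid = midranks(row_totals)
ridit = [score / n for score in mid]  # assumed definition of ridit scores

m2_mid = m2_statistic(infants, mid)
m2_ridit = m2_statistic(infants, ridit)

print(mid)
print(f"M2 (midranks) = {m2_mid:.4f}, M2 (ridits) = {m2_ridit:.4f}")
```

Because the first two categories hold almost all of the observations, the midranks of the last three categories are nearly equal, which is why this choice of scores gives such a small M2.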
Inputting Tea-Tasting Experiment data, table 2.8, p. 40.
Calculating exact test, p-values in table 2.9, p. 41.
Note: For tables with small cell counts (n < 5), the exact option performs an exact test of independence that treats the variables as nominal. For 2×2 tables this is Fisher's exact test.
Note 2: The p-value and chi-square test statistic were calculated only for the data in Table 2.8, not for all the possible values of n11 shown in Table 2.9, p. 41.
data tea;
  input poured guess count @@;
  cards;
1 1 3   1 2 1
2 1 1   2 2 3
;

proc freq data=tea;
  weight count;
  tables poured*guess / exact;
run;
The FREQ Procedure

Table of poured by guess

poured    guess

Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
       1 |      3 |      1 |      4
         |  37.50 |  12.50 |  50.00
         |  75.00 |  25.00 |
         |  75.00 |  25.00 |
---------+--------+--------+
       2 |      1 |      3 |      4
         |  12.50 |  37.50 |  50.00
         |  25.00 |  75.00 |
         |  25.00 |  75.00 |
---------+--------+--------+
Total           4        4        8
            50.00    50.00   100.00

Statistics for Table of poured by guess

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      2.0000    0.1573
Likelihood Ratio Chi-Square    1      2.0930    0.1480
Continuity Adj. Chi-Square     1      0.5000    0.4795
Mantel-Haenszel Chi-Square     1      1.7500    0.1859
Phi Coefficient                       0.5000
Contingency Coefficient               0.4472
Cramer's V                            0.5000

WARNING: 100% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)         3
Left-sided Pr <= F          0.9857
Right-sided Pr >= F         0.2429

Table Probability (P)       0.2286
Two-sided Pr <= P           0.4857
Sample Size = 8
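Note: With both margins fixed at 4, the count n11 follows a hypergeometric distribution, so the Fisher's Exact Test table can be reproduced by enumerating all possible values of n11, as in Table 2.9. The Python sketch below is an illustration only, not part of the original SAS example; hypergeom_pmf is a hypothetical helper name.

```python
from math import comb

def hypergeom_pmf(k, row1, col1, n):
    """P(n11 = k) when the row 1 total, column 1 total and n are all fixed."""
    return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

# Tea-tasting data (Table 2.8): both margins equal 4, n = 8, observed n11 = 3.
row1, col1, n, observed = 4, 4, 8, 3

support = range(max(0, row1 + col1 - n), min(row1, col1) + 1)
probs = {k: hypergeom_pmf(k, row1, col1, n) for k in support}

p_table = probs[observed]
left_sided = sum(p for k, p in probs.items() if k <= observed)
right_sided = sum(p for k, p in probs.items() if k >= observed)
# Two-sided p-value: total probability of tables no more likely than the
# observed one (small relative tolerance guards against float ties).
two_sided = sum(p for p in probs.values() if p <= p_table * (1 + 1e-7))

for k, p in probs.items():
    print(f"n11 = {k}: P = {p:.4f}")
print(f"left = {left_sided:.4f}, right = {right_sided:.4f}, "
      f"table = {p_table:.4f}, two-sided = {two_sided:.4f}")
```

The four summary values should match the Left-sided, Right-sided, Table Probability and Two-sided rows of the proc freq output.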