First example: the income data for four married couples in Table 16.1.
data temp; /* creating Table 16.2: all 4**4 = 256 bootstrap samples */
  array values{4} (6 -3 5 3);
  array y{4} y1-y4;
  do i=1 to 4;
    y1=values[i];
    do j=1 to 4;
      y2=values[j];
      do k=1 to 4;
        y3=values[k];
        do l=1 to 4;
          y4=values[l];
          output;
        end;
      end;
    end;
  end;
  drop i j k l values1-values4;
run;
data stat; /* computing the mean of each sample */
set temp;
m=mean(of y1-y4);
run;
proc means data=stat vardef=N; /*Bootstrapping Means, known distribution*/
var m;
run;
The MEANS Procedure
Analysis Variable : m
  N            Mean         Std Dev         Minimum         Maximum
-------------------------------------------------------------------
256       2.7500000       1.7455300      -3.0000000       6.0000000
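As a check on this output (a sketch added for illustration, not part of the original program): because temp enumerates all 4^4 = 256 equally likely bootstrap samples, the mean and standard deviation of m can be worked out directly from the four original values. The symbols \bar{y}, \bar{y}^*, and n = 4 are notation assumed here.

```latex
\begin{align*}
\bar{y} &= \frac{6 + (-3) + 5 + 3}{4} = 2.75, \\
\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2
  &= \frac{3.25^2 + (-5.75)^2 + 2.25^2 + 0.25^2}{4}
   = \frac{48.75}{4} = 12.1875, \\
\operatorname{SD}(\bar{y}^*) &= \sqrt{\frac{12.1875}{4}}
   = \sqrt{3.046875} \approx 1.74553 .
\end{align*}
```

Both values agree with the Mean and Std Dev columns. The option vardef=N is appropriate because the 256 samples make up the whole bootstrap distribution, so the divisor is N rather than N-1.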
proc univariate data=stat noprint;
histogram m / midpoints=-3.8 to 8 by .4 href=2.75 lhref=1
haxis=axis1 cfill=green;
label m='Bootstrap Mean';
run;
Second example: 10 married couples, based on Table 16.3. First we run proc means to show the sample mean, the standard deviation, and the normal-theory 95% confidence interval. Then we run the SAS macro boot to get the percentile interval and the improved bootstrap (BCa) interval for the mean. The macro boot is part of the program jackboot. Before running boot, we first need to create a macro called analyze, in which we specify the data set to use and the statistic to analyze. After running boot, we can run bootci to estimate confidence intervals by different methods. By default, the macro boot also produces a histogram of the bootstrap replicates of the mean.
data couples;
  input husinc wifinc @@; /* @@ reads several couples from one data line */
  diff=husinc-wifinc;
cards;
24 18 14 17 40 35 44 41 24 18 19 9 21 10 22 30 30 23 24 15
;
run;
proc means data=couples mean stddev clm alpha=0.05;
  var diff;
run;
The MEANS Procedure
Analysis Variable : diff
                              Lower 95%       Upper 95%
        Mean     Std Dev    CL for Mean     CL for Mean
-----------------------------------------------------------
   4.6000000   5.9479221      0.3451128       8.8548872
-----------------------------------------------------------
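The normal-theory limits can be reproduced by hand. This is a worked check added for illustration; it assumes the usual t interval with n - 1 = 9 degrees of freedom, and the quantile t_{0.975,9} is taken from standard tables.

```latex
\begin{align*}
\operatorname{SE}(\bar{d}) &= \frac{s}{\sqrt{n}} = \frac{5.9479221}{\sqrt{10}} \approx 1.88090, \\
\bar{d} \pm t_{0.975,\,9}\,\operatorname{SE}(\bar{d})
  &\approx 4.6 \pm 2.26216 \times 1.88090
   = 4.6 \pm 4.2549 \\
  &\Rightarrow\ (0.3451,\ 8.8549).
\end{align*}
```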
%include 'jackboot.sas';
%macro analyze(data=, out=);
proc means noprint data=&data;
output out=&out (drop=_freq_ _type_) mean=mean_diff;
var diff;
%bystmt;
run;
%mend;
title2 'Normal Confidence Interval';
%boot(data=couples, samples=2000); /*normal-theory C-I given here*/
title2 'Percentile Confidence Interval';
%bootci(PCTL);/*Percentile Intervals*/
title2 'Improved Bootstrap Confidence Interval';
%bootci(BCa); /*improved Bootstrap Intervals*/
Normal Confidence Interval
[Histogram of the 2000 bootstrap replicates of mean_diff. Vertical axis: Frequency. Horizontal axis: mean_diff Midpoint, from -2.4 to 9.0 in steps of 0.6.]
Normal Confidence Interval
                                                            Approximate
                                               Approximate        Lower
             Observed  Bootstrap  Approximate     Standard   Confidence  Bias-Corrected
Name        Statistic       Mean         Bias        Error        Limit       Statistic
mean_diff         4.6        4.6   6.2172E-15      1.79316      1.08548             4.6

            Approximate
                  Upper              Method for          Minimum    Maximum
             Confidence  Confidence  Confidence        Resampled  Resampled  Number of
Name              Limit   Level (%)  Interval           Estimate   Estimate  Resamples
mean_diff       8.11452          95  Bootstrap Normal       -2.4        9.1       2000
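The limits in this table are consistent with a normal interval centered at the bias-corrected statistic and using the bootstrap standard error. The check below is an illustration under that assumption, with z_{0.975} taken as 1.95996; it is not taken from the macro's source.

```latex
\begin{align*}
4.6 \pm z_{0.975} \times 1.79316
  &\approx 4.6 \pm 1.95996 \times 1.79316
   = 4.6 \pm 3.5145 \\
  &\Rightarrow\ (1.0855,\ 8.1145).
\end{align*}
```

The interval is narrower than the normal-theory interval from proc means because it uses a normal quantile in place of the t quantile and a somewhat smaller standard error (1.79316 against 1.88090).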
Percentile Confidence Interval
                       Approximate  Approximate
                             Lower        Upper              Method for
             Observed   Confidence   Confidence  Confidence  Confidence        Number of
Name        Statistic        Limit        Limit   Level (%)  Interval          Resamples
mean_diff         4.6          0.8          7.8          95  Bootstrap PCTL         2000
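For readers without jackboot, a percentile interval can be approximated with standard procedures. The following is a minimal sketch and not the macro's own algorithm: it assumes the couples data set created earlier, and the seed value and the data set names bootsamp, bootmeans, and pctl_ci are arbitrary choices made for this illustration. Because the resamples are random, the limits should be close to, but not identical with, the 0.8 and 7.8 reported by bootci.

```sas
/* Sketch only: percentile bootstrap interval for the mean of diff.   */
/* Data set names and the seed are hypothetical choices.              */
proc surveyselect data=couples out=bootsamp noprint
     seed=12345      /* arbitrary seed, for reproducibility           */
     method=urs      /* unrestricted sampling, i.e. with replacement  */
     samprate=1      /* each resample has as many rows as couples     */
     outhits         /* one output row per selection                  */
     reps=2000;      /* number of bootstrap resamples                 */
run;

/* Mean of diff within each resample; surveyselect adds Replicate.    */
proc means data=bootsamp noprint;
  by replicate;
  var diff;
  output out=bootmeans mean=mean_diff;
run;

/* 2.5th and 97.5th percentiles of the 2000 bootstrap means.          */
proc univariate data=bootmeans noprint;
  var mean_diff;
  output out=pctl_ci pctlpts=2.5 97.5 pctlpre=ci_;
run;

proc print data=pctl_ci; /* variables ci_2_5 and ci_97_5 */
run;
```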
Improved Bootstrap Confidence Interval
                       Approximate  Approximate
                             Lower        Upper              Method for
             Observed   Confidence   Confidence  Confidence  Confidence
Name        Statistic        Limit        Limit   Level (%)  Interval
mean_diff         4.6         -0.1          7.4          95  Bootstrap BCa

                             Lower        Upper         Bias
            Number of   Percentile   Percentile   Correction
Name        Resamples        Point        Point         (Z0)  Acceleration
mean_diff        2000   .009875210      0.95183    -0.056429     -0.056302
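The two percentile points in this table follow from the bias correction z_0 and the acceleration a through the standard BCa adjustment. The worked check below is added for illustration; \Phi denotes the standard normal distribution function and z_{0.025} = -z_{0.975} = -1.95996.

```latex
\begin{align*}
\alpha_1 &= \Phi\!\left( z_0 + \frac{z_0 + z_{0.025}}{1 - a\,(z_0 + z_{0.025})} \right)
          = \Phi\!\left( -0.056429 + \frac{-2.016393}{0.886473} \right)
          = \Phi(-2.33105) \approx 0.009875, \\
\alpha_2 &= \Phi\!\left( z_0 + \frac{z_0 + z_{0.975}}{1 - a\,(z_0 + z_{0.975})} \right)
          = \Phi\!\left( -0.056429 + \frac{1.903535}{1.107173} \right)
          = \Phi(1.66285) \approx 0.95183 .
\end{align*}
```

The BCa limits -0.1 and 7.4 are therefore the 0.99th and 95.18th percentiles of the bootstrap means, rather than the 2.5th and 97.5th percentiles used by the percentile method.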
Third example: bootstrapping a regression, using the data file duncan. The results below differ from Table 16.5 because we use ordinary least-squares regression (proc reg) rather than Huber robust regression.
%macro analyze(data=, out=);
proc reg noprint data=&data outest=&out (drop= prestige _IN_ _P_ _EDF_ _RMSE_);
model prestige=income educ;
%bystmt;
run;
%mend;
%boot(data=duncan, samples=2000);
%bootci(PCTL);
%bootci(BC);
%bootci(BCa);
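To make the resampling step concrete, the sketch below bootstraps the same regression by resampling whole observations (cases) without the jackboot macros. It is an illustration under stated assumptions, not the macros' implementation: it assumes the duncan data set with the variables prestige, income, and educ, and the seed and the data set names bootreg, bootest, and reg_pctl are hypothetical. Only percentile limits are computed.

```sas
/* Sketch only: case-resampling bootstrap of the OLS coefficients.    */
proc surveyselect data=duncan out=bootreg noprint
     seed=2718       /* arbitrary seed, for reproducibility           */
     method=urs      /* sampling with replacement                     */
     samprate=1      /* resample size equals the original sample size */
     outhits         /* one output row per selection                  */
     reps=2000;
run;

/* Refit the model in every resample; outest= keeps the coefficients. */
proc reg data=bootreg outest=bootest noprint;
  by replicate;
  model prestige=income educ;
run;
quit;

/* Percentile limits for each coefficient, one prefix per variable.   */
proc univariate data=bootest noprint;
  var intercept income educ;
  output out=reg_pctl pctlpts=2.5 97.5 pctlpre=int_ inc_ educ_;
run;

proc print data=reg_pctl;
run;
```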
Improved Bootstrap Confidence Interval
[Histogram of the 2000 bootstrap replicates of Intercept. Vertical axis: Frequency. Horizontal axis: Intercept Midpoint, from -16.8 to 6.0 in steps of 1.2.]
Improved Bootstrap Confidence Interval
[Histogram of the 2000 bootstrap replicates of income. Vertical axis: Frequency. Horizontal axis: income Midpoint, from 0.12 to 1.32 in steps of 0.06.]
Improved Bootstrap Confidence Interval
[Histogram of the 2000 bootstrap replicates of educ. Vertical axis: Frequency. Horizontal axis: educ Midpoint, from -0.15 to 0.93 in steps of 0.06.]
Improved Bootstrap Confidence Interval
                                                            Approximate                  Approximate
                                               Approximate        Lower                        Upper
             Observed  Bootstrap  Approximate     Standard   Confidence  Bias-Corrected   Confidence
Name        Statistic       Mean         Bias        Error        Limit       Statistic        Limit
Intercept    -6.06466   -6.07837    -0.013709      3.06968     -12.0674        -6.05095     -0.03448
educ          0.54583    0.53178    -0.014057      0.13606       0.2932         0.55989      0.82657
income        0.59873    0.61524     0.016504      0.16511       0.2586         0.58223      0.90584

                        Method for          Minimum    Maximum             LABEL OF
            Confidence  Confidence        Resampled  Resampled  Number of  FORMER
Name         Level (%)  Interval           Estimate   Estimate  Resamples  VARIABLE
Intercept           95  Bootstrap Normal   -16.7532    6.23388       2000  Intercept
educ                95  Bootstrap Normal    -0.1439    0.93903       2000
income              95  Bootstrap Normal     0.1101    1.32727       2000
Improved Bootstrap Confidence Interval
                       Approximate  Approximate
                             Lower        Upper              Method for                   LABEL OF
             Observed   Confidence   Confidence  Confidence  Confidence        Number of  FORMER
Name        Statistic        Limit        Limit   Level (%)  Interval          Resamples  VARIABLE
Intercept    -6.06466     -12.1006      0.04230          95  Bootstrap PCTL         2000  Intercept
educ          0.54583       0.2458      0.78030          95  Bootstrap PCTL         2000
income        0.59873       0.3113      0.95969          95  Bootstrap PCTL         2000
Improved Bootstrap Confidence Interval
                       Approximate  Approximate
                             Lower        Upper              Method for
             Observed   Confidence   Confidence  Confidence  Confidence
Name        Statistic        Limit        Limit   Level (%)  Interval
Intercept    -6.06466     -12.1006      0.04230          95  Bootstrap BC
educ          0.54583       0.2603      0.78737          95  Bootstrap BC
income        0.59873       0.2963      0.94522          95  Bootstrap BC

                       LABEL OF          Lower        Upper         Bias
            Number of  FORMER       Percentile   Percentile   Correction
Name        Resamples  VARIABLE          Point        Point         (Z0)
Intercept        2000  Intercept      0.025000      0.97500    -0.000000
educ             2000                 0.030589      0.97971     0.043880
income           2000                 0.019567      0.96835    -0.051409
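The percentile points of the bias-corrected (BC) method depend only on the bias correction z_0. The check below, added for illustration, uses the standard BC adjustment with \Phi the standard normal distribution function and z_{0.975} = 1.95996, applied to the educ row.

```latex
\begin{align*}
\alpha_1 &= \Phi(2 z_0 + z_{0.025}) = \Phi(0.08776 - 1.95996) = \Phi(-1.87220) \approx 0.03059, \\
\alpha_2 &= \Phi(2 z_0 + z_{0.975}) = \Phi(0.08776 + 1.95996) = \Phi(2.04772) \approx 0.97971 .
\end{align*}
```

For the intercept, z_0 is zero to six decimals, so the points reduce to 0.025 and 0.975, and the BC limits coincide with the percentile limits (-12.1006 and 0.04230).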
Improved Bootstrap Confidence Interval
                                                              Estimated                    Estimated
                                                 Estimated        Lower                        Upper
             Observed  Jackknife    Estimated     Standard   Confidence  Bias-Corrected   Confidence
Name        Statistic       Mean         Bias        Error        Limit       Statistic        Limit
Intercept    -6.06466   -6.06599    -0.058182      3.11843     -12.1185        -6.00648      0.10552
educ          0.54583    0.54549    -0.014920      0.14778       0.2711         0.56075      0.85040
income        0.59873    0.59914     0.017827      0.18140       0.2254         0.58091      0.93644

                        Method for          Minimum    Maximum             LABEL OF
            Confidence  Confidence        Resampled  Resampled  Number of  FORMER
Name         Level (%)  Interval           Estimate   Estimate  Resamples  VARIABLE
Intercept           95  Jackknife          -7.34563   -5.24007         45  Intercept
educ                95  Jackknife           0.43303    0.58637         45
income              95  Jackknife           0.54404    0.73155         45
Improved Bootstrap Confidence Interval
                       Approximate  Approximate
                             Lower        Upper              Method for
             Observed   Confidence   Confidence  Confidence  Confidence        Number of
Name        Statistic        Limit        Limit   Level (%)  Interval          Resamples
Intercept    -6.06466     -11.7872      0.45600          95  Bootstrap BCa          2000
educ          0.54583       0.3030      0.82933          95  Bootstrap BCa          2000
income        0.59873       0.2519      0.90165          95  Bootstrap BCa          2000

            LABEL OF          Lower        Upper         Bias
            FORMER       Percentile   Percentile   Correction
Name        VARIABLE          Point        Point         (Z0)  Acceleration
Intercept   Intercept      0.028650      0.97845    -0.000000      0.015823
educ                       0.052622      0.99235     0.043880      0.079128
income                     0.007763      0.94720    -0.051409     -0.074962
