The examples below use Stata 9. Examples for Stata versions 7 and 8 are given on a separate page.
NOTE: To see the design effect or the misspecification effect, run estat effects immediately after the svy estimation command (for example, right after svy: mean momsag).
Chapter 3: Simple random sampling
page 53 simple random sampling
use https://stats.idre.ucla.edu/stat/books/sop/momsag.dta, clear
svyset [pweight=weight1], fpc(birth)
pweight: weight1
VCE: linearized
Strata 1: <one>
SU 1: <observations>
FPC 1: birth
svy: mean momsag
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 1 Number of obs = 25
Number of PSUs = 25 Population size = 773
Design df = 24
--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      momsag |        .92   .0544746      .8075699     1.03243
--------------------------------------------------------------
svy: total momsag
(running total on estimation sample)
Survey: Total estimation
Number of strata = 1 Number of obs = 25
Number of PSUs = 25 Population size = 773
Design df = 24
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      momsag |     711.16   42.10889      624.2515    798.0685
--------------------------------------------------------------
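As a rough cross-check of the two tables above, the Python sketch below recomputes the standard errors from the printed summary numbers alone (n = 25, N = 773, mean = .92). It assumes that momsag is a 0/1 indicator, which this page does not state but which is consistent with the printed results. The function name srs_proportion is hypothetical, and the code illustrates the simple random sampling formulas with a finite population correction rather than Stata's internal routine.

```python
import math


def srs_proportion(p_hat, n, N):
    """Mean, total and their standard errors for a 0/1 variable under
    simple random sampling without replacement (n of N units sampled)."""
    # Assumption: the variable is a 0/1 indicator, so the sample
    # variance is n/(n-1) * p*(1-p).
    s2 = n / (n - 1) * p_hat * (1 - p_hat)
    fpc = 1 - n / N                      # finite population correction
    se_mean = math.sqrt(fpc * s2 / n)    # SE of the estimated mean
    total = N * p_hat                    # estimated population total
    se_total = N * se_mean               # SE of the estimated total
    return se_mean, total, se_total


se_mean, total, se_total = srs_proportion(p_hat=0.92, n=25, N=773)
print(se_mean, total, se_total)
```

The three printed values can be compared with the Std. Err. of the mean, the Total, and the Std. Err. of the total in the tables above.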
page 69 simple random sample and estimation of parameters for subdomains
use https://stats.idre.ucla.edu/stat/books/sop/workers.dta, clear
svyset [pweight=wt1], fpc(popsize)
pweight: wt1
VCE: linearized
Strata 1: <one>
SU 1: <observations>
FPC 1: popsize
svy, over(exposure): mean fvc
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 1 Number of obs = 40
Number of PSUs = 40 Population size = 1200
Design df = 39
1: exposure = 1
2: exposure = 2
3: exposure = 3
--------------------------------------------------------------
             |             Linearized
        Over |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         fvc |
           1 |   83.33333   5.177446      72.86096    93.80571
           2 |       83.5   4.110476      75.18578    91.81422
           3 |   79.10714   2.321226      74.41202    83.80227
--------------------------------------------------------------
Chapter 4: Systematic sampling
page 106 repeated systematic sampling
input cluster xi wt1 xibar M
2 15 9 5 54
13 13 9 4.33 54
31 12 9 4.00 54
34 10 9 3.33 54
46 21 9 7 54
53 10 9 3.33 54
end
svyset [pweight = wt1], fpc(M)
pweight: wt1
VCE: linearized
Strata 1: <one>
SU 1: <observations>
FPC 1: M
svy: mean xibar
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 1 Number of obs = 6
Number of PSUs = 6 Population size = 54
Design df = 5
--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       xibar |   4.498333   .5310139      3.133318    5.863348
--------------------------------------------------------------
svy: total xi
(running total on estimation sample)
Survey: Total estimation
Number of strata = 1 Number of obs = 6
Number of PSUs = 6 Population size = 54
Design df = 5
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
          xi |        729   85.94882      508.0615    949.9385
--------------------------------------------------------------
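Because the data for this example are typed in directly, the estimates can be recomputed by hand. The Python sketch below does so using the textbook linearized variance for a single stratum with a finite population correction of 1 - n/M. The names linearized_se and systematic_estimates are hypothetical helpers, and the code is meant as an illustration of the formulas, not as Stata's implementation.

```python
import math

# Rows exactly as entered in the input block: (cluster, xi, wt1, xibar, M)
ROWS = [
    (2, 15, 9, 5.00, 54),
    (13, 13, 9, 4.33, 54),
    (31, 12, 9, 4.00, 54),
    (34, 10, 9, 3.33, 54),
    (46, 21, 9, 7.00, 54),
    (53, 10, 9, 3.33, 54),
]


def linearized_se(z, sampling_fraction):
    """SE of the sum of per-unit values z, one stratum, with fpc."""
    n = len(z)
    zbar = sum(z) / n
    ss = sum((zi - zbar) ** 2 for zi in z)
    return math.sqrt((1 - sampling_fraction) * n / (n - 1) * ss)


def systematic_estimates(rows):
    n = len(rows)
    f = n / rows[0][4]                  # sampling fraction n / M
    w = [r[2] for r in rows]
    xi = [r[1] for r in rows]
    xibar = [r[3] for r in rows]

    # Weighted total of xi and its standard error
    weighted_xi = [wi * yi for wi, yi in zip(w, xi)]
    total_xi = sum(weighted_xi)
    se_total = linearized_se(weighted_xi, f)

    # Weighted mean of xibar; its SE uses the linearized scores
    # w_i * (y_i - mean) / sum(w)
    w_sum = sum(w)
    mean_xibar = sum(wi * yi for wi, yi in zip(w, xibar)) / w_sum
    scores = [wi * (yi - mean_xibar) / w_sum for wi, yi in zip(w, xibar)]
    se_mean = linearized_se(scores, f)
    return total_xi, se_total, mean_xibar, se_mean


print(systematic_estimates(ROWS))
```

The four values returned correspond, in order, to the Total and Std. Err. for xi and the Mean and Std. Err. for xibar in the tables above.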
Chapter 5: Stratification and stratified random sampling
page 136 stratified random sampling
use https://stats.idre.ucla.edu/stat/books/sop/hospsamp.dta, clear
svyset [pweight=weighta], strata(oblevel) fpc(tothosp)
pweight: weighta
VCE: linearized
Strata 1: oblevel
SU 1: <observations>
FPC 1: tothosp
svy: total births
(running total on estimation sample)
Survey: Total estimation
Number of strata = 3 Number of obs = 15
Number of PSUs = 15 Population size = 158
Design df = 12
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      births |   183982.9   34014.33        109872    258093.8
--------------------------------------------------------------
svy, over(oblevel): total births
(running total on estimation sample)
Survey: Total estimation
Number of strata = 3 Number of obs = 15
Number of PSUs = 15 Population size = 158
Design df = 12
1: oblevel = 1
2: oblevel = 2
3: oblevel = 3
--------------------------------------------------------------
             |             Linearized
        Over |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      births |
           1 |      14931   2669.857      9113.882    20748.12
           2 |   117116.9   33067.66      45068.68    189165.2
           3 |   51934.98   7508.399      35575.58    68294.37
--------------------------------------------------------------
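The "Design df" line in each output header follows from the two counts printed beside it: it is the number of sampled PSUs minus the number of strata. The short Python sketch below checks that rule against several headers printed on this page; design_df is a hypothetical helper name used only for illustration.

```python
def design_df(n_psu, n_strata):
    """Design degrees of freedom: sampled PSUs minus strata."""
    return n_psu - n_strata


# (Number of strata, Number of PSUs, Design df) as printed in outputs on this page
REPORTED = [
    (1, 25, 24),      # simple random sample, momsag
    (3, 15, 12),      # stratified sample, hospsamp
    (18, 831, 813),   # stratified sample, jacktwn2
    (1, 2, 1),        # one-stage cluster sample, tab9_1a
]

for n_strata, n_psu, printed_df in REPORTED:
    print(n_strata, n_psu, design_df(n_psu, n_strata), printed_df)
```

With only two sampled clusters the design has a single degree of freedom, which is why the confidence intervals in the cluster sampling examples are so wide relative to their standard errors.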
Chapter 6: Stratified random sampling: Further issues
page 168 stratified random sampling
use https://stats.idre.ucla.edu/stat/books/sop/jacktwn2.dta, clear
svyset [pweight=sampwt], strata(stratum) fpc(npop)
pweight: sampwt
VCE: linearized
Strata 1: stratum
SU 1: <observations>
FPC 1: npop
svy: total twin
(running total on estimation sample)
Survey: Total estimation
Number of strata = 18 Number of obs = 831
Number of PSUs = 831 Population size = 256998
Design df = 813
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        twin |    26055.4   3791.044      18614.01    33496.78
--------------------------------------------------------------
svy, over(quart1): total twin
(running total on estimation sample)
Survey: Total estimation
Number of strata = 18 Number of obs = 831
Number of PSUs = 831 Population size = 256998
Design df = 813
1: quart1 = 1
2: quart1 = 2
3: quart1 = 3
--------------------------------------------------------------
             |             Linearized
        Over |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        twin |
           1 |    19183.8   2661.629      13959.33    24408.28
           2 |   6737.907   2696.605      1444.778    12031.04
           3 |    133.687   126.7443     -115.0976    382.4715
--------------------------------------------------------------
Chapter 7: Ratio estimation
page 200 ratio estimation under simple random sampling
use https://stats.idre.ucla.edu/stat/books/sop/tab7pt1.dta, clear
svyset [pweight=wt1], fpc(totcnt)
pweight: wt1
VCE: linearized
Strata 1: <one>
SU 1: <observations>
FPC 1: totcnt
svy: ratio pharmexp/totmedex
(running ratio on estimation sample)
Survey: Ratio estimation
Number of strata = 1 Number of obs = 7
Number of PSUs = 7 Population size = 8
Design df = 6
_ratio_1: pharmexp/totmedex
--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .3191489   .0040067       .309345    .3289529
--------------------------------------------------------------
Chapter 9: Simple one-stage cluster sampling
page 247 simple one-stage cluster sampling
use https://stats.idre.ucla.edu/stat/books/sop/tab9_1a.dta, clear
svyset devlpmnt [pweight=wt1], fpc(M)
pweight: wt1
VCE: linearized
Strata 1: <one>
SU 1: devlpmnt
FPC 1: M
svy: total NVSTNRS NGE65
(running total on estimation sample)
Survey: Total estimation
Number of strata = 1 Number of obs = 40
Number of PSUs = 2 Population size = 100
Design df = 1
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
     NVSTNRS |       57.5   1.936492      32.89454    82.10546
       NGE65 |      167.5   1.936492      142.8945    192.1055
--------------------------------------------------------------
svy: mean NVSTNRS hhneedvn
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 1 Number of obs = 40
Number of PSUs = 2 Population size = 100
Design df = 1
--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
     NVSTNRS |       .575   .0193649      .3289454    .8210546
    hhneedvn |       .525   .0193649      .2789454    .7710546
--------------------------------------------------------------
svy: ratio NVSTNRS NGE65
(running ratio on estimation sample)
Survey: Ratio estimation
Number of strata = 1 Number of obs = 40
Number of PSUs = 2 Population size = 100
Design df = 1
_ratio_1: NVSTNRS/NGE65
--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .3432836   .0075924      .2468131    .4397541
--------------------------------------------------------------
svy: mean nge65dv
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 1 Number of obs = 40
Number of PSUs = 2 Population size = 100
Design df = 1
--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
     nge65dv |       33.5   .3872983      28.57891    38.42109
--------------------------------------------------------------
Chapter 10: Two-stage cluster sampling: Clusters sampled with equal probability
page 285 two-stage cluster sampling (clusters sampled with equal probability)
input center nurse m nbar w npatnts nrefrred
1 2 5 3 2.5 44 6
1 3 5 3 2.5 18 6
2 1 5 3 2.5 42 3
2 3 5 3 2.5 10 2
4 1 5 3 2.5 16 5
4 2 5 3 2.5 32 14
end
svyset center [pweight=w]
pweight: w
VCE: linearized
Strata 1: <one>
SU 1: center
FPC 1: <zero>
svy: total nrefrred
(running total on estimation sample)
Survey: Total estimation
Number of strata = 1 Number of obs = 6
Number of PSUs = 3 Population size = 15
Design df = 2
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    nrefrred |         90   30.31089     -40.41723    220.4172
--------------------------------------------------------------
svy: ratio nrefrred npatnts
(running ratio on estimation sample)
Survey: Ratio estimation
Number of strata = 1 Number of obs = 6
Number of PSUs = 3 Population size = 15
Design df = 2
_ratio_1: nrefrred/npatnts
--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .2222222   .0812779     -.1274883    .5719328
--------------------------------------------------------------
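The page 285 data are also typed in directly, so the total and ratio estimates can be recomputed by hand. In the Python sketch below, observations are first summed within each PSU (center), and the variance is then taken between the PSU sums with no finite population correction, matching the "FPC 1: <zero>" line in the svyset output. The helper names psu_sums, se_between_psus and cluster_estimates are hypothetical, and the code illustrates the textbook linearization formulas rather than Stata's internal code.

```python
import math
from collections import defaultdict

# Rows exactly as entered: (center, nurse, m, nbar, w, npatnts, nrefrred)
ROWS = [
    (1, 2, 5, 3, 2.5, 44, 6),
    (1, 3, 5, 3, 2.5, 18, 6),
    (2, 1, 5, 3, 2.5, 42, 3),
    (2, 3, 5, 3, 2.5, 10, 2),
    (4, 1, 5, 3, 2.5, 16, 5),
    (4, 2, 5, 3, 2.5, 32, 14),
]


def psu_sums(rows, value):
    """Sum value(row) within each PSU (center is column 0)."""
    sums = defaultdict(float)
    for r in rows:
        sums[r[0]] += value(r)
    return list(sums.values())


def se_between_psus(z):
    """Linearized SE from PSU-level sums, one stratum, no fpc."""
    n = len(z)
    zbar = sum(z) / n
    return math.sqrt(n / (n - 1) * sum((zi - zbar) ** 2 for zi in z))


def cluster_estimates(rows):
    # Weighted totals of referrals and patients
    total_ref = sum(r[4] * r[6] for r in rows)
    total_pat = sum(r[4] * r[5] for r in rows)
    se_total = se_between_psus(psu_sums(rows, lambda r: r[4] * r[6]))

    # Ratio of totals; its SE uses the scores w * (y - R * x) / X_hat
    ratio = total_ref / total_pat
    se_ratio = se_between_psus(
        psu_sums(rows, lambda r: r[4] * (r[6] - ratio * r[5]) / total_pat)
    )
    return total_ref, se_total, ratio, se_ratio


print(cluster_estimates(ROWS))
```

The four values returned correspond, in order, to the Total and Std. Err. for nrefrred and the Ratio and Std. Err. for nrefrred/npatnts in the tables above.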
page 310 two-stage cluster sampling (clusters sampled with equal probability)
use https://stats.idre.ucla.edu/stat/books/sop/i10pt2.dta, clear
svyset hospno [pweight=w]
pweight: w
VCE: linearized
Strata 1: <one>
SU 1: hospno
FPC 1: <zero>
svy: total dxdead
(running total on estimation sample)
Survey: Total estimation
Number of strata = 1 Number of obs = 708
Number of PSUs = 3 Population size = 23600
Design df = 2
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      dxdead |   499.3792   114.6776      5.961244    992.7971
--------------------------------------------------------------
svy: ratio dxdead lifethrt
(running ratio on estimation sample)
Survey: Ratio estimation
Number of strata = 1 Number of obs = 708
Number of PSUs = 3 Population size = 23600
Design df = 2
_ratio_1: dxdead/lifethrt
--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .1703017    .072273      -.140664    .4812674
--------------------------------------------------------------
page 350 cluster sampling with unequal probabilities: probability proportional to size sampling
This example uses the hospslct data set.
svyset drawing [pw=wstar]
pweight: wstar
VCE: linearized
Strata 1: <one>
SU 1: drawing
FPC 1: <zero>
svy: total lifethrt dxdead
(running total on estimation sample)
Survey: Total estimation
Number of strata = 1 Number of obs = 50
Number of PSUs = 5 Population size = 50056
Design df = 4
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    lifethrt |    6006.72    1001.12      3227.165    8786.275
      dxdead |    2002.24   1226.117     -1402.005    5406.485
--------------------------------------------------------------
svy: mean lifethrt dxdead
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 1 Number of obs = 50
Number of PSUs = 5 Population size = 50056
Design df = 4
--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    lifethrt |        .12        .02      .0644711    .1755289
      dxdead |        .04   .0244949     -.0280087    .1080087
--------------------------------------------------------------
svy: ratio dxdead/lifethrt
(running ratio on estimation sample)
Survey: Ratio estimation
Number of strata = 1 Number of obs = 50
Number of PSUs = 5 Population size = 50056
Design df = 4
_ratio_1: dxdead/lifethrt
--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .3333333   .2324056      -.311928    .9785946
--------------------------------------------------------------
Chapter 11: Cluster sampling in which clusters are sampled with unequal probability: Probability proportional to size sampling
page 351 cluster sampling with unequal probabilities (probability proportional to size sampling)
use https://stats.idre.ucla.edu/stat/books/sop/hospslct.dta, clear
svyset drawing [pweight=wstar]
pweight: wstar
VCE: linearized
Strata 1: <one>
SU 1: drawing
FPC 1: <zero>
svy: total lifethrt dxdead
(running total on estimation sample)
Survey: Total estimation
Number of strata = 1 Number of obs = 50
Number of PSUs = 5 Population size = 50056
Design df = 4
--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    lifethrt |    6006.72    1001.12      3227.165    8786.275
      dxdead |    2002.24   1226.117     -1402.005    5406.485
--------------------------------------------------------------
svy: ratio dxdead lifethrt
(running ratio on estimation sample)
Survey: Ratio estimation
Number of strata = 1 Number of obs = 50
Number of PSUs = 5 Population size = 50056
Design df = 4
_ratio_1: dxdead/lifethrt
--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .3333333   .2324056      -.311928    .9785946
--------------------------------------------------------------
Chapter 12: Variance estimation in complex sample surveys
page 369 linearization
use https://stats.idre.ucla.edu/stat/books/sop/exmp12_2.dta, clear
svyset [pweight=w], fpc(N)
pweight: w
VCE: linearized
Strata 1: <one>
SU 1: <observations>
FPC 1: N
svy: ratio ovpaymnt/payment
(running ratio on estimation sample)
Survey: Ratio estimation
Number of strata = 1 Number of obs = 10
Number of PSUs = 10 Population size = 65
Design df = 9
_ratio_1: ovpaymnt/payment
--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .3864296   .1158187      .1244294    .6484298
--------------------------------------------------------------
