Here is a tiny example showing how to use the survey commands in Stata. Consider the data file svysmall, listed below.
use https://stats.idre.ucla.edu/stat/stata/faq/svysmall, clear
list

house  eth   wt   y  x1  x2  x3
    1    1   .4   3   4   5   3
    1    1   .9   9   4   5   6
    2    1  1.2   9   8   7   3
    2    1    1   8   7   4   2
    2    1  1.1   8   7   6   3
    3    2   .8   8   7   3   2
    4    2   .4   8   2   0   3
    4    2   .7   8   2   5   3
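If the URL above cannot be reached, the same eight observations can be typed in directly. The following is a minimal sketch using Stata's input command, with the values copied from the listing above. It is only a stand-in for the original file: variable storage types and any labels in the original are not known and are not reproduced here.

```stata
* Recreate the svysmall data by hand (values copied from the listing above).
* Assumption: default storage types are acceptable for this illustration.
clear
input house eth wt y x1 x2 x3
1 1 0.4 3 4 5 3
1 1 0.9 9 4 5 6
2 1 1.2 9 8 7 3
2 1 1.0 8 7 4 2
2 1 1.1 8 7 6 3
3 2 0.8 8 7 3 2
4 2 0.4 8 2 0 3
4 2 0.7 8 2 5 3
end

* Show the data to confirm it matches the listing
list
```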
In this tiny example, house is the household, eth is the ethnicity, and wt is the sampling weight for the person. The svyset command tells Stata about these design features, and Stata remembers them. If you save the data file, the settings are saved with it, so you don’t need to enter them again the next time you use the file. Below, we tell Stata that the PSU (primary sampling unit) is the household (house), that the sample was stratified (strata) by ethnicity (eth), and that the weighting variable (pweight) is wt.
The syntax of the svyset command differs among Stata versions 7, 8, and 9. The syntax below requires Stata 9 or later and will not work in earlier versions. Please see this page for examples. Notice that the PSU variable is given first, followed by the pweight in square brackets, and that the strata variable is given as an option after the comma.
svyset house [pweight = wt], strata(eth)
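To confirm what Stata has stored, something like the following sketch may help. Typing svyset with no arguments redisplays the current settings, and svydescribe summarizes the strata and PSUs. Given the data listed earlier, it should report two strata and four PSUs. The file name in the save command is hypothetical and only illustrates that the settings are saved with the data.

```stata
* Redisplay the survey settings currently attached to the data
svyset

* Summarize the design: strata, PSUs per stratum, observations per PSU
svydescribe

* Saving the data keeps the svyset settings with the file
* (svysmall_set is a hypothetical file name used for illustration)
save svysmall_set, replace
```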
Once Stata knows about the survey design via the svyset command, you can add the svy: prefix to estimation commands, using syntax that is quite similar to the non-survey versions. For example, the svy: regress command below looks just like a regular regress command, but it takes the weights, strata, and PSUs you declared into account when computing the estimates and their standard errors.
svy: regress y x1 x2 x3
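The same prefix works with other estimation commands. As an illustration only (these commands are not part of the original example), the sketch below estimates the mean of y with and without the survey design, assuming the data have already been svyset as described above. Only the second command uses the weights, strata, and PSUs.

```stata
* Ordinary (non-survey) mean of y: ignores weights, strata, and PSUs
mean y

* Survey mean of y: uses the design declared with svyset
svy: mean y
```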
The output is below. Its header reports the number of strata, the number of PSUs, the number of observations, and the estimated population size, so you can confirm that the survey design was picked up as intended.
Survey: Linear regression

Number of strata   =         2                  Number of obs      =          8
Number of PSUs     =         4                  Population size    =  6.5000001
                                                Design df          =          2
                                                F(   2,      1)    =          .
                                                Prob > F           =          .
                                                R-squared          =     0.2216

------------------------------------------------------------------------------
             |             Linearized
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .3321757    .294268     1.13   0.376    -.9339573    1.598309
          x2 |   -.138397   .2335074    -0.59   0.613    -1.143098    .8663043
          x3 |   .5504173   .3170068     1.74   0.225    -.8135527    1.914387
       _cons |   5.050307   2.040247     2.48   0.132    -3.728167    13.82878
------------------------------------------------------------------------------
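Two of the header values can be checked by hand from the data listing. The population size is the sum of the weights (.4 + .9 + 1.2 + 1 + 1.1 + .8 + .4 + .7 = 6.5), and the design degrees of freedom equal the number of PSUs minus the number of strata (4 - 2 = 2). A minimal sketch of the same checks in Stata, assuming it is run right after the svy: regress command so that its stored results are still available:

```stata
* Population size: sum of the sampling weights
quietly summarize wt
display r(sum)

* Design df: number of PSUs minus number of strata,
* compared with the value stored by the svy estimation command
display e(N_psu) - e(N_strata)
display e(df_r)
```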