Here are two major weighting methods that statistical packages can use in their OLS regression analyses:
- Analytic weights (aweights). Analytic weights are inversely proportional to the variance of an observation. In regression, aweights are often used on aggregated data, such as state-level data, to adjust for different population sizes or for unequal variances.
- Probability sampling weights (pweights). Probability or sampling weights are the inverse of the probability that an observation from a population will be included in the sample. These weights are used for sample survey designs.
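By default, SPSS uses something like aweights in its regression procedure, except that it treats the sum of the weights as the sample size (note the residual df of 521, rather than 197, in the wt example below). Stata can use aweights or pweights.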
A number of websites recommend using working weights (wwt) in SPSS to approximate the results that would be obtained using pweights.
Working weights are analytic weights divided by the mean weight, so they sum to the number of cases. Supposedly, working weights provide better estimates of standard errors than plain aweights. In the examples below, SPSS with working weights reproduces the coefficients, standard errors, and model F-ratio that Stata produces with aweights. The standard errors are also reasonably close to those from Stata pweights or svyreg, but the model F-ratios are very different.
For these examples, wt = socst/20 and the working weight is wwt = wt/2.62025, where 2.62025 is the mean of wt.
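As a rough illustration of what the working-weight rescaling does, here is a minimal Python sketch. The six data values and the helper name wls_slope are hypothetical and are not taken from the data used in the examples; the sketch only shows that dividing by the mean weight makes the weights sum to the number of cases while leaving the weighted slope unchanged.

```python
# Hypothetical values for illustration only (not the data used in the examples).
x = [34.0, 47.0, 52.0, 57.0, 63.0, 68.0]
y = [39.0, 44.0, 54.0, 52.0, 65.0, 62.0]
wt = [1.3, 2.1, 2.6, 3.0, 2.4, 3.5]    # raw analytic weights

n = len(wt)
mean_wt = sum(wt) / n
wwt = [w / mean_wt for w in wt]         # working weights: weight / mean weight


def wls_slope(x, y, w):
    """Weighted least-squares slope of y on x in a model with an intercept."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


slope_wt = wls_slope(x, y, wt)
slope_wwt = wls_slope(x, y, wwt)

print(sum(wwt), n)            # the working weights sum to the number of cases
print(slope_wt, slope_wwt)    # the slope does not depend on the scale of the weights
```

Because the coefficients are unaffected by the scale of the weights, any difference between the wt and wwt runs has to come from how a package turns the weights into a sample size and degrees of freedom.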
SPSS
/* no weights */
WEIGHT
OFF.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT write
/METHOD=ENTER read female .
Model Summary
R | R Square | Adjusted R Square | Std. Error of the Estimate |
.663(a) | .439 | .434 | 7.13273 |
ANOVA(b)
| Sum of Squares | df | Mean Square | F | Sig. |
Regression | 7856.321 | 2 | 3928.161 | 77.211 | .000(a) |
Residual | 10022.554 | 197 | 50.876 | | |
Total | 17878.875 | 199 | | | |
Coefficients(a)
| B | Std. Error | Beta | t | Sig. |
(Constant) | 20.228 | 2.714 | | 7.454 | .000 |
reading score | .566 | .049 | .612 | 11.459 | .000 |
female | 5.487 | 1.014 | .289 | 5.410 | .000 |
/* weight using analytic weight wt */
WEIGHT
BY wt .
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT write
/METHOD=ENTER read female .
Model Summary
R | R Square | Adjusted R Square | Std. Error of the Estimate |
.653(a) | .427 | .425 | 6.89843 |
ANOVA(b)
| Sum of Squares | df | Mean Square | F | Sig. |
Regression | 18474.435 | 2 | 9237.218 | 194.107 | .000(a) |
Residual | 24795.893 | 521 | 47.588 | | |
Total | 43270.328 | 523 | | | |
Coefficients(a)
| B | Std. Error | Beta | t | Sig. |
(Constant) | 22.179 | 1.658 | | 13.376 | .000 |
reading score | .544 | .029 | .613 | 18.467 | .000 |
female | 4.818 | .607 | .264 | 7.940 | .000 |
/* weight using working weight wwt */
WEIGHT
BY wwt .
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT write
/METHOD=ENTER read female .
Model Summary
R | R Square | Adjusted R Square | Std. Error of the Estimate |
.653(a) | .427 | .421 | 6.93083 |
ANOVA(b)
| Sum of Squares | df | Mean Square | F | Sig. |
Regression | 7050.638 | 2 | 3525.319 | 73.388 | .000(a) |
Residual | 9463.178 | 197 | 48.036 | | |
Total | 16513.817 | 199 | | | |
Coefficients(a)
| B | Std. Error | Beta | t | Sig. |
(Constant) | 22.179 | 2.697 | | 8.225 | .000 |
reading score | .544 | .048 | .613 | 11.355 | .000 |
female | 4.818 | .987 | .264 | 4.882 | .000 |
Stata
gen wt=socst/20
sum wt
gen wwt=wt/r(mean)
/* no weights */
regress write read female
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 2, 197) = 77.21
Model | 7856.32118 2 3928.16059 Prob > F = 0.0000
Residual | 10022.5538 197 50.8759077 R-squared = 0.4394
-------------+------------------------------ Adj R-squared = 0.4337
Total | 17878.875 199 89.843593 Root MSE = 7.1327
------------------------------------------------------------------------------
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .5658869 .0493849 11.46 0.000 .468496 .6632778
female | 5.486894 1.014261 5.41 0.000 3.48669 7.487098
_cons | 20.22837 2.713756 7.45 0.000 14.87663 25.58011
-----------------------------------------------------------------------------
/* weighted using wt as an aweight */
regress write read female [aw=wt]
(sum of wgt is 5.2405e+02)
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 2, 197) = 73.39
Model | 7050.6384 2 3525.3192 Prob > F = 0.0000
Residual | 9463.17824 197 48.0364377 R-squared = 0.4270
-------------+------------------------------ Adj R-squared = 0.4211
Total | 16513.8166 199 82.9840032 Root MSE = 6.9308
------------------------------------------------------------------------------
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .5438772 .0478975 11.36 0.000 .4494195 .6383349
female | 4.818009 .9868682 4.88 0.000 2.871827 6.764191
_cons | 22.1789 2.696658 8.22 0.000 16.86087 27.49692
------------------------------------------------------------------------------
/* weighted using wwt as an aweight */
regress write read female [aw=wwt]
(sum of wgt is 2.0000e+02)
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 2, 197) = 73.39
Model | 7050.63836 2 3525.31918 Prob > F = 0.0000
Residual | 9463.17822 197 48.0364377 R-squared = 0.4270
-------------+------------------------------ Adj R-squared = 0.4211
Total | 16513.8166 199 82.9840029 Root MSE = 6.9308
------------------------------------------------------------------------------
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .5438772 .0478975 11.36 0.000 .4494195 .6383349
female | 4.818009 .9868682 4.88 0.000 2.871827 6.764191
_cons | 22.1789 2.696658 8.22 0.000 16.86087 27.49692
------------------------------------------------------------------------------
/* weighted using wt as a pweight */
regress write read female [pw=wt]
(sum of wgt is 5.2405e+02)
Regression with robust standard errors Number of obs = 200
F( 2, 197) = 95.46
Prob > F = 0.0000
R-squared = 0.4270
Root MSE = 6.9308
------------------------------------------------------------------------------
| Robust
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .5438772 .0418554 12.99 0.000 .4613351 .6264194
female | 4.818009 .99909 4.82 0.000 2.847725 6.788293
_cons | 22.1789 2.456592 9.03 0.000 17.3343 2
/* survey set using wt as a pweight */
svyset [pw=wt]
svy: regress write read female
Survey: Linear regression
Number of strata = 1 Number of obs = 200
Number of PSUs = 200 Population size = 524.04999
Design df = 199
F( 2, 198) = 95.94
Prob > F = 0.0000
R-squared = 0.4270
------------------------------------------------------------------------------
| Linearized
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .5438772 .0416445 13.06 0.000 .461756 .6259984
female | 4.818009 .9940568 4.85 0.000 2.857772 6.778246
_cons | 22.1789 2.444216 9.07 0.000 17.35901 26.99879
------------------------------------------------------------------------------
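The pweight and svy standard errors above are close but not identical. One plausible reading, stated here as an assumption rather than taken from the output, is that regress with pweights scales the robust variance by n/(n-k) while svy with a single stratum scales it by n/(n-1), so the standard errors should differ by a factor of sqrt((n-1)/(n-k)). A quick Python check against the printed values:

```python
import math

n = 200    # number of observations
k = 3      # estimated coefficients: read, female, _cons

# Linearized (svy) standard errors copied from the output above
se_svy = {"read": 0.0416445, "female": 0.9940568, "_cons": 2.444216}
# Robust (pweight) standard errors copied from the output above
se_pw = {"read": 0.0418554, "female": 0.99909, "_cons": 2.456592}

# Assumed ratio of the two small-sample corrections
factor = math.sqrt((n - 1) / (n - k))

# Rescale the svy standard errors and compare with the pweight ones
se_rescaled = {name: se * factor for name, se in se_svy.items()}
for name in se_svy:
    print(name, round(se_rescaled[name], 6), se_pw[name])
```

If the assumption holds, the rescaled svy values match the pweight values to the precision shown, which also accounts for the different denominator degrees of freedom in the two F tests (197 versus 198).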
Summaries of Results
Coefficient and Standard Error for read
| weight | Stata | SPSS |
| none | .566 / .0494 | .566 / .049 |
| wt | .544 / .0479 aw | .544 / .029 |
| wwt | .544 / .0479 aw | .544 / .048 |
| wt | .544 / .0416 svyreg | |
| wt | .544 / .0419 pw | |
Coefficient and Standard Error for female
| weight | Stata | SPSS |
| none | 5.487 / 1.014 | 5.487 / 1.014 |
| wt | 4.818 / .987 aw | 4.818 / .607 |
| wwt | 4.818 / .987 aw | 4.818 / .987 |
| wt | 4.818 / .994 svyreg | |
| wt | 4.818 / .999 pw | |
Model F-ratios
| Package and weight | F |
| No weights (Stata and SPSS) | 77.21 |
| Stata pw | 95.46 |
| Stata svyreg | 95.94 |
| Stata aw (wt or wwt) | 73.39 |
| SPSS wwt | 73.39 |
| SPSS wt | 194.11 |
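The gap between the SPSS wt results and the Stata aweight results can be reproduced from the residual degrees of freedom alone. The sketch below assumes that SPSS takes the sum of the weights (524.05) as the sample size, which is consistent with the residual df of 521 in its ANOVA table; it rescales the Stata aweight standard errors and F-ratio accordingly.

```python
import math

n = 200            # number of cases
k = 3              # estimated coefficients: read, female, constant
sum_wt = 524.05    # sum of wt, as reported by Stata

df_cases = n - k          # residual df when N is the number of cases (197)
df_weights = sum_wt - k   # residual df when N is the sum of the weights (about 521)

# Stata aweight results copied from the output above
se_read_aw = 0.0478975
se_female_aw = 0.9868682
f_aw = 73.39

# A larger residual df shrinks the residual mean square, and with it the SEs
shrink = math.sqrt(df_cases / df_weights)
se_read_spss = se_read_aw * shrink
se_female_spss = se_female_aw * shrink
f_spss = f_aw * df_weights / df_cases

print(round(se_read_spss, 3))     # compare with .029 in the SPSS wt output
print(round(se_female_spss, 3))   # compare with .607 in the SPSS wt output
print(round(f_spss, 2))           # compare with 194.107 in the SPSS wt output
```

On this reading, the smaller standard errors and the much larger F-ratio in the SPSS wt run come from the inflated sample size, not from any change in the fitted model, which is why dividing by the mean weight brings SPSS back in line with Stata aweights.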
