Statistical Methods and Data Analytics
UCLA Office of Advanced Research Computing
Stratified 1 - level Cluster Sampling design (with replacement)
With (30) clusters.
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Probabilities:
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
5.849e-06 2.956e-05 4.615e-05 5.219e-05 6.978e-05 2.181e-04
Stratum Sizes:
           173 174 175 176 177 178 179 180 181 182 183 184 185 186 187
obs        837 832 790 744 832 797 821 797 799 799 696 957 774 676 782
design.PSU   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2
actual.PSU   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2
Data variables:
[1] "dmdborn4" "dmdeduc2" "dmdhhsiz" "dmdhragz" "dmdhredz" "dmdhrgnd"
[7] "dmdhrmaz" "dmdhsedz" "dmdmartz" "dmdyrusr" "dmqmiliz" "female"
[13] "indfmpir" "ohq620" "ohq630" "ohq640" "ohq660" "ohq670"
[19] "ohq680" "ohq845" "pad680" "pad790q" "pad790u" "pad800"
[25] "pad810q" "pad810u" "pad820" "riagendr" "ridagemn" "ridageyr"
[31] "ridexagm" "ridexmon" "ridexprg" "ridreth1" "ridreth3" "ridstatr"
[37] "rxq510" "rxq515" "rxq520" "sddsrvyr" "sdmvpsu" "sdmvstra"
[43] "seqn" "wtint2yr" "wtmec2yr"
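The "Probabilities" summary in the design printout reports sampling probabilities, which are the reciprocals of the weights supplied through wtint2yr. As a rough check, the two extremes can be converted back to weights in base R. This is only a sketch: the object names are illustrative and were not part of the original session.

```r
# Extremes copied from the "Probabilities" summary above
p_min <- 5.849e-06
p_max <- 2.181e-04

# A weight is the reciprocal of the sampling probability, so the
# smallest probability corresponds to the largest weight.
w_max <- 1 / p_min
w_min <- 1 / p_max

round(c(smallest_weight = w_min, largest_weight = w_max))
```

Only the minimum and maximum are converted here, because the mean of the probabilities is not the reciprocal of the mean weight.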
[1] 0
           mean     SE
ridageyr 38.989 0.5149
[1] 204
         mean   SE
ohq620 4.1774 0.02
[1] 3868
         mean     SE
pad680 363.49 5.4261
interaction(female, dmdborn4, ridexprg, dmdeduc2)
    0.0.0.1     1.0.0.1     0.1.0.1     1.1.0.1     0.0.1.1     1.0.1.1
       0.00   824620.69        0.00    43684.05        0.00        0.00
    0.1.1.1     1.1.1.1     0.0.0.2     1.0.0.2     0.1.0.2     1.1.0.2
       0.00        0.00        0.00   861261.38        0.00  1122922.04
    0.0.1.2     1.0.1.2     0.1.1.2     1.1.1.2     0.0.0.3     1.0.0.3
       0.00        0.00        0.00    84449.45        0.00  2060327.67
    0.1.0.3     1.1.0.3     0.0.1.3     1.0.1.3     0.1.1.3     1.1.1.3
       0.00  5732701.94        0.00   146333.46        0.00   435114.57
    0.0.0.4     1.0.0.4     0.1.0.4     1.1.0.4     0.0.1.4     1.0.1.4
       0.00  1380815.39        0.00 10247378.16        0.00    71985.69
    0.1.1.4     1.1.1.4     0.0.0.5     1.0.0.5     0.1.0.5     1.1.0.5
       0.00   236404.64        0.00  3591233.65        0.00 13480603.96
    0.0.1.5     1.0.1.5     0.1.1.5     1.1.1.5
       0.00    59540.46        0.00   522743.95
      dmqmiliz rxq510 female   pad680        se
0.0.0        0      0      0 345.0032 11.938892
1.0.0        1      0      0 366.2002 16.668693
0.1.0        0      1      0 377.0661  9.220377
1.1.0        1      1      0 404.9886 13.913967
0.0.1        0      0      1 348.9575  7.288462
1.0.1        1      0      1 385.2752 35.287306
0.1.1        0      1      1 369.0723  9.474665
1.1.1        1      1      1 369.9402 30.816511
Stratified 1 - level Cluster Sampling design (with replacement)
With (30) clusters.
subset(nhc, female == 0)
Probabilities:
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
5.849e-06 2.822e-05 4.460e-05 5.073e-05 6.848e-05 2.181e-04
Stratum Sizes:
           173 174 175 176 177 178 179 180 181 182 183 184 185 186 187
obs        381 416 364 358 381 391 405 377 372 359 344 430 343 311 343
design.PSU   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2
actual.PSU   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2
Data variables:
[1] "dmdborn4" "dmdeduc2" "dmdhhsiz" "dmdhragz" "dmdhredz" "dmdhrgnd"
[7] "dmdhrmaz" "dmdhsedz" "dmdmartz" "dmdyrusr" "dmqmiliz" "female"
[13] "indfmpir" "ohq620" "ohq630" "ohq640" "ohq660" "ohq670"
[19] "ohq680" "ohq845" "pad680" "pad790q" "pad790u" "pad800"
[25] "pad810q" "pad810u" "pad820" "riagendr" "ridagemn" "ridageyr"
[31] "ridexagm" "ridexmon" "ridexprg" "ridreth1" "ridreth3" "ridstatr"
[37] "rxq510" "rxq515" "rxq520" "sddsrvyr" "sdmvpsu" "sdmvstra"
[43] "seqn" "wtint2yr" "wtmec2yr"
         mean     SE
pad680 368.01 7.7944
         mean     SE
ohq630 4.5052 0.0190
ohq620 4.1281 0.0237
Design-based one-sample t-test
data: I(ohq630 - ohq620) ~ 0
t = 23.107, df = 14, p-value = 1.506e-12
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
0.3421406 0.4121551
sample estimates:
mean
0.3771479
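The confidence interval in the one-sample test can be recovered from the printed mean, t statistic, and degrees of freedom. The following base R sketch uses only the numbers shown in the output above; the object names are illustrative.

```r
# Values copied from the svyttest() output above
est  <- 0.3771479   # mean of ohq630 - ohq620
tval <- 23.107      # reported t statistic
df   <- 14          # reported degrees of freedom

se   <- est / tval                  # standard error implied by t = est / se
crit <- qt(0.975, df)               # two-sided 95% critical value
ci   <- est + c(-1, 1) * crit * se  # confidence interval
pval <- 2 * pt(-abs(tval), df)      # two-sided p-value

round(ci, 4)
signif(pval, 3)
```

Because the t statistic is rounded in the printout, the recomputed limits should agree with the reported interval only to about five decimal places.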
Call:
svyglm(formula = ridageyr ~ female, design = nhc)
Survey design:
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 38.1088 0.5197 73.327 < 2e-16 ***
female 1.7315 0.2720 6.365 1.75e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 520.9637)
Number of Fisher Scoring iterations: 2
Call:
svyglm(formula = ridageyr ~ 1, design = nhc)
Survey design:
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 38.9893 0.5149 75.72 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 521.7131)
Number of Fisher Scoring iterations: 2
Call:
svyglm(formula = pad680 ~ female, design = nhc, na.action = na.omit)
Survey design:
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 368.010 7.794 47.215 <2e-16 ***
female -8.781 6.102 -1.439 0.172
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 49964.36)
Number of Fisher Scoring iterations: 2
Design-based t-test
data: pad680 ~ female
t = -0.90362, df = 14, p-value = 0.3815
alternative hypothesis: true difference in mean is not equal to 0
95 percent confidence interval:
-22.688225 9.237606
sample estimates:
difference in mean
-6.72531
Call:
svyglm(formula = pad680 ~ female + pad800, design = nhc, na.action = na.omit)
Survey design:
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 389.6686 11.6894 33.335 5.61e-14 ***
female -18.1188 7.7054 -2.351 0.0351 *
pad800 -0.3272 0.0575 -5.690 7.42e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 46878.07)
Number of Fisher Scoring iterations: 2
Call:
svyglm(formula = pad680 ~ female * pad800, design = nhc, na.action = na.omit)
Survey design:
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 393.25805 13.00529 30.238 1.07e-12 ***
female -25.72500 10.84002 -2.373 0.035199 *
pad800 -0.37573 0.07615 -4.934 0.000346 ***
female:pad800 0.12010 0.07645 1.571 0.142160
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 46862.61)
Number of Fisher Scoring iterations: 2
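In the interaction model, the pad800 slope differs by group: the main effect applies when female = 0, and the interaction term is added to it when female = 1. A minimal sketch of that arithmetic, using the coefficients printed above (the helper predict_pad680 is hypothetical, and pad800 = 60 is an arbitrary illustration value):

```r
# Coefficients copied from the summary of pad680 ~ female * pad800
b <- c("(Intercept)" = 393.25805, female = -25.72500,
       pad800 = -0.37573, "female:pad800" = 0.12010)

# Simple slopes of pad800 within each group
slope_male   <- b[["pad800"]]
slope_female <- b[["pad800"]] + b[["female:pad800"]]

# Hypothetical helper: fitted pad680 for given female and pad800 values
predict_pad680 <- function(female, pad800) {
  b[["(Intercept)"]] + b[["female"]] * female +
    b[["pad800"]] * pad800 + b[["female:pad800"]] * female * pad800
}

c(male = slope_male, female = slope_female)
predict_pad680(female = c(0, 1), pad800 = 60)
```

The interaction coefficient has p = 0.142 in the output, so the difference between the two slopes is not statistically distinguishable from zero at the 0.05 level.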
Stratified 1 - level Cluster Sampling design (with replacement)
With (30) clusters.
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Call: svyglm(formula = pad680 ~ female * pad800, design = nhc, na.action = na.omit)
Coefficients:
(Intercept) female pad800 female:pad800
393.2581 -25.7250 -0.3757 0.1201
Degrees of Freedom: 6339 Total (i.e. Null); 12 Residual
(5593 observations deleted due to missingness)
Null Deviance: 300500000
Residual Deviance: 297100000 AIC: 86400
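The "12 Residual" degrees of freedom printed above are far smaller than the number of observations. The printed values are consistent with the design degrees of freedom (PSUs minus strata) plus one, minus the number of coefficients. The sketch below checks that reading against the outputs in this document; the helper resid_df is hypothetical.

```r
# Design facts from the printout: 30 clusters in 15 strata (2 PSUs each)
n_psu     <- 30
n_strata  <- 15
design_df <- n_psu - n_strata

# Hypothetical helper: residual df for a model with n_coef coefficients
resid_df <- function(n_coef) design_df + 1 - n_coef

resid_df(4)  # pad680 ~ female * pad800
resid_df(2)  # pad680 ~ female

# Recompute the p-value for female:pad800 (t = 1.571) with those df
p_interaction <- 2 * pt(-abs(1.571), resid_df(4))
round(p_interaction, 4)
```

The recomputed p-value should be close to the 0.142160 shown in the coefficient table, with small differences due to rounding of the t statistic.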
Call:
svyglm(formula = pad680 ~ factor(female) * factor(dmdhrmaz),
design = nhc, na.action = na.omit)
Survey design:
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 396.866 17.316 22.919 5.64e-10 ***
factor(female)1 -17.001 16.670 -1.020 0.3318
factor(dmdhrmaz)2 -70.221 34.156 -2.056 0.0668 .
factor(dmdhrmaz)3 -5.949 49.498 -0.120 0.9067
factor(female)1:factor(dmdhrmaz)2 57.922 49.112 1.179 0.2655
factor(female)1:factor(dmdhrmaz)3 -63.635 63.464 -1.003 0.3397
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 25424.33)
Number of Fisher Scoring iterations: 2
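With two factors and their interaction, each fitted cell mean is the intercept plus whichever dummy and interaction coefficients apply to that cell. A short sketch of that bookkeeping, using the coefficients printed above (the short coefficient names are illustrative):

```r
# Coefficients copied from pad680 ~ factor(female) * factor(dmdhrmaz)
b <- c(intercept = 396.866, female1 = -17.001,
       maz2 = -70.221, maz3 = -5.949,
       female1_maz2 = 57.922, female1_maz3 = -63.635)

# All six combinations of the two factors
cells <- expand.grid(female = 0:1, dmdhrmaz = 1:3)

# Add each coefficient only where its indicator is TRUE
cells$pad680_hat <- with(cells,
  b[["intercept"]] +
  b[["female1"]] * (female == 1) +
  b[["maz2"]] * (dmdhrmaz == 2) +
  b[["maz3"]] * (dmdhrmaz == 3) +
  b[["female1_maz2"]] * (female == 1 & dmdhrmaz == 2) +
  b[["female1_maz3"]] * (female == 1 & dmdhrmaz == 3))

cells
```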
Design-based median test
data: pad680 ~ female
t = -0.34042, df = 14, p-value = 0.7386
alternative hypothesis: true difference in mean rank score is not equal to 0
sample estimates:
difference in mean rank score
-0.005688636
logit1 <- svyglm(tghealth ~ factor(dmdeduc2) + ridageyr, family = quasibinomial,
                 design = nhc1, na.action = na.omit)
summary(logit1)
Call:
svyglm(formula = tghealth ~ factor(dmdeduc2) + ridageyr, design = nhc1,
family = quasibinomial, na.action = na.omit)
Survey design:
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.250701 0.220296 -5.677 0.000205 ***
factor(dmdeduc2)2 0.095245 0.218273 0.436 0.671848
factor(dmdeduc2)3 0.621578 0.185369 3.353 0.007326 **
factor(dmdeduc2)4 0.975127 0.173591 5.617 0.000222 ***
factor(dmdeduc2)5 1.795251 0.193574 9.274 3.16e-06 ***
ridageyr -0.005801 0.001958 -2.963 0.014219 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for quasibinomial family taken to be 1.147022)
Number of Fisher Scoring iterations: 4
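The logistic coefficients above are on the log-odds scale, so exponentiating them gives odds ratios. The sketch below does this in base R from the printed estimates and standard errors. The t-based interval with 10 degrees of freedom is an assumption chosen to match the printed p-values; functions in the survey package may use a different default.

```r
# Estimates and standard errors copied from summary(logit1)
est <- c("factor(dmdeduc2)2" = 0.095245, "factor(dmdeduc2)3" = 0.621578,
         "factor(dmdeduc2)4" = 0.975127, "factor(dmdeduc2)5" = 1.795251,
         ridageyr = -0.005801)
se  <- c(0.218273, 0.185369, 0.173591, 0.193574, 0.001958)

df   <- 10               # assumed residual df (matches the printed p-values)
crit <- qt(0.975, df)

or <- exp(est)                              # odds ratios
ci <- exp(cbind(lower = est - crit * se,    # interval built on the log-odds
                upper = est + crit * se))   # scale, then exponentiated

round(cbind(OR = or, ci), 3)
```

Each education odds ratio is relative to dmdeduc2 = 1, the reference level of the factor.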
subset1 <- subset(nhc1, ridageyr > 20)
logit2 <- svyglm(tghealth ~ factor(dmdeduc2) + ridageyr,
                 family = quasibinomial, design = subset1, na.action = na.omit)
summary(logit2)
Call:
svyglm(formula = tghealth ~ factor(dmdeduc2) + ridageyr, design = subset1,
family = quasibinomial, na.action = na.omit)
Survey design:
subset(nhc1, ridageyr > 20)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.285928 0.217597 -5.910 0.000149 ***
factor(dmdeduc2)2 0.074280 0.207288 0.358 0.727538
factor(dmdeduc2)3 0.601982 0.191304 3.147 0.010391 *
factor(dmdeduc2)4 0.974348 0.175175 5.562 0.000240 ***
factor(dmdeduc2)5 1.801740 0.191866 9.391 2.82e-06 ***
ridageyr -0.005143 0.001891 -2.720 0.021576 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for quasibinomial family taken to be 1.001688)
Number of Fisher Scoring iterations: 4
ologit1 <- svyolr(factor(dmdeduc2) ~ factor(female) + factor(dmdborn4) + pad680,
                  design = nhc, method = c("logistic"))
summary(ologit1)
Call:
svyolr(factor(dmdeduc2) ~ factor(female) + factor(dmdborn4) +
pad680, design = nhc, method = c("logistic"))
Coefficients:
Value Std. Error t value
factor(female)1 0.169926364 0.0472898498 3.593295
factor(dmdborn4)1 0.234984893 0.1373352880 1.711031
pad680 0.001808744 0.0001728276 10.465596
Intercepts:
Value Std. Error t value
1|2 -2.4906 0.2214 -11.2488
2|3 -1.3229 0.1911 -6.9208
3|4 0.3635 0.1564 2.3249
4|5 1.6192 0.1773 9.1339
(4226 observations deleted due to missingness)
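The ordinal model's intercepts and coefficients can be turned into predicted category probabilities. The sketch below assumes svyolr uses the same parameterization as MASS::polr, logit P(Y <= k) = zeta_k - eta; that assumption, the short coefficient names, and the example covariate values (female = 1, dmdborn4 = 1, pad680 = 360) are illustrative and do not come from the original session.

```r
# Coefficients and intercepts copied from summary(ologit1)
beta <- c(female1 = 0.169926364, born1 = 0.234984893, pad680 = 0.001808744)
zeta <- c("1|2" = -2.4906, "2|3" = -1.3229, "3|4" = 0.3635, "4|5" = 1.6192)

# Hypothetical covariate profile
x <- c(female1 = 1, born1 = 1, pad680 = 360)

eta   <- sum(beta * x)          # linear predictor
cum   <- plogis(zeta - eta)     # P(dmdeduc2 <= k) for k = 1..4 (assumed form)
probs <- diff(c(0, cum, 1))     # P(dmdeduc2 == k) for k = 1..5
names(probs) <- paste0("dmdeduc2=", 1:5)

round(probs, 3)
```

The five probabilities sum to one by construction, which is a quick sanity check on the cumulative-to-category conversion.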
Call:
svyglm(formula = pad800 ~ female, design = nhc, family = poisson())
Survey design:
svydesign(id = ~sdmvpsu, weights = ~wtint2yr, strata = ~sdmvstra,
nest = TRUE, survey.lonely.psu = "adjust", data = nhanes2021)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.30368 0.02881 149.376 < 2e-16 ***
female -0.27657 0.02811 -9.839 1.14e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 67.8306)
Number of Fisher Scoring iterations: 5
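The Poisson model uses a log link, so exponentiating the coefficients returns them to the scale of pad800. A brief sketch from the printed estimates (object names are illustrative):

```r
# Coefficients copied from pad800 ~ female with family = poisson()
b0 <- 4.30368    # (Intercept)
b1 <- -0.27657   # female

mean_male   <- exp(b0)        # fitted mean of pad800 when female = 0
rate_ratio  <- exp(b1)        # multiplicative change for female = 1
mean_female <- exp(b0 + b1)   # fitted mean of pad800 when female = 1

round(c(mean_male = mean_male, rate_ratio = rate_ratio,
        mean_female = mean_female), 3)
```

The reported dispersion parameter (67.8306) is far above 1, so the standard errors rest on the design-based variance estimate, not on the Poisson variance assumption.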
Standard deviations (1, .., p=3):
[1] 1.4538469 0.8541645 0.3958942
Rotation (n x k) = (3 x 3):
PC1 PC2 PC3
pad680 -0.4433069 -0.8948138 -0.0527938
ohq620 -0.6281802 0.3521469 -0.6938171
ohq630 -0.6394283 0.2744099 0.7182135
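The component standard deviations above can be converted to the share of variance each component explains by squaring them and dividing by their total. A small base R sketch using the printed values:

```r
# Standard deviations copied from the principal components output
sdev <- c(PC1 = 1.4538469, PC2 = 0.8541645, PC3 = 0.3958942)

vars <- sdev^2                 # component variances (eigenvalues)
prop <- vars / sum(vars)       # proportion of variance explained
cumulative <- cumsum(prop)     # running total across components

round(rbind(variance = vars, proportion = prop, cumulative = cumulative), 4)
```

The variances sum to approximately 3, which is what would be expected if the three variables were standardized before the decomposition.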