Page 126. Regression from chapter 6.
use https://stats.idre.ucla.edu/stat/stata/examples/cama4/lung, clear
generate ffev1a = ffev1/100
regress ffev1a fheight
      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  1,   148) =   50.50
       Model |  16.0531702     1  16.0531702           Prob > F      =  0.0000
    Residual |  47.0451258   148  .317872472           R-squared     =  0.2544
-------------+------------------------------           Adj R-squared =  0.2494
       Total |   63.098296   149  .423478497           Root MSE      =   .5638
------------------------------------------------------------------------------
      ffev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     fheight |   .1181052   .0166194     7.11   0.000     .0852633    .1509472
       _cons |  -4.086702   1.151979    -3.55   0.001    -6.363155    -1.81025
------------------------------------------------------------------------------
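NOTE: As a quick check on the output above, R-squared is the model sum of squares divided by the total sum of squares, and Root MSE is the square root of the residual mean square. The lines below are a minimal sketch, assuming the regression above is still the most recent estimation command, so that Stata's stored results e(mss), e(rss), e(r2) and e(rmse) refer to it.

```stata
* R-squared from the sums of squares printed in the ANOVA table
display 16.0531702/63.098296
* the same quantity from the stored results of the last regress
display e(mss)/(e(mss) + e(rss))
display e(r2)
* Root MSE = square root of the residual mean square
display sqrt(.317872472)
display e(rmse)
```

The first and fourth lines use only the printed numbers, so they agree with the stored results only up to rounding.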
Page 128. Descriptive statistics at the bottom of the page.
summarize fage fheight ffev1a
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
        fage |     150    40.13333   6.889995         26         59
     fheight |     150       69.26   2.779189         61         76
      ffev1a |     150    4.093267   .6507523        2.5       5.85
Page 133. Covariance and correlation matrices.
Covariance:
correlate fage fheight fweight ffev1a, covariance
(obs=150)
             |     fage  fheight  fweight   ffev1a
-------------+------------------------------------
        fage |   47.472
     fheight | -1.07517  7.72389
     fweight | -3.64922  34.6954  573.798
      ffev1a | -1.38762  .912232  2.06716  .423478
Correlation (page 134):
correlate fage fheight fweight ffev1a
(obs=150)
             |     fage  fheight  fweight   ffev1a
-------------+------------------------------------
        fage |   1.0000
     fheight |  -0.0561   1.0000
     fweight |  -0.0221   0.5212   1.0000
      ffev1a |  -0.3095   0.5044   0.1326   1.0000
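NOTE: The covariance matrix ties back to the simple regression on page 126. The slope of ffev1a on fheight is their covariance divided by the variance of fheight, and their correlation is the covariance divided by the product of the two standard deviations. A minimal sketch using the printed values (so the results agree only up to rounding):

```stata
* slope = cov(fheight, ffev1a) / var(fheight); compare with the fheight coefficient .1181052
display .912232/7.72389
* correlation = cov / (sd(fheight) * sd(ffev1a)); compare with 0.5044
display .912232/sqrt(7.72389*.423478)
```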
Table 7.1, page 138.
regress ffev1a fheight fage
      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   36.81
       Model |   21.056968     2   10.528484           Prob > F      =  0.0000
    Residual |   42.041328   147  .285995429           R-squared     =  0.3337
-------------+------------------------------           Adj R-squared =  0.3247
       Total |   63.098296   149  .423478497           Root MSE      =  .53479
------------------------------------------------------------------------------
      ffev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     fheight |    .114397    .015789     7.25   0.000     .0831943    .1455997
        fage |  -.0266393   .0063687    -4.18   0.000    -.0392254   -.0140532
       _cons |  -2.760746   1.137746    -2.43   0.016    -5.009197   -.5122958
------------------------------------------------------------------------------
Page 140. The t-test at the top of the page.
NOTE: This is given in the output above.
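NOTE: The t statistic for each coefficient is the coefficient divided by its standard error. A minimal sketch, assuming the regression from Table 7.1 is still the most recent estimation command; _b[] and _se[] are Stata's stored coefficients and standard errors, and the p-value uses the 147 residual degrees of freedom held in e(df_r).

```stata
* t for fheight; compare with 7.25 in the output above
display _b[fheight]/_se[fheight]
* t for fage; compare with -4.18
display _b[fage]/_se[fage]
* two-sided p-value for fage from the t distribution with e(df_r) degrees of freedom
display 2*ttail(e(df_r), abs(_b[fage]/_se[fage]))
* equivalent Wald test; the reported F is the square of the t statistic
test fage
```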
Table 7.5, page 150.
NOTE: We need to reshape the data from wide to long to get the first panel of the table. We use the Stata command reshape to do this. We put the @ symbol in front of each variable stub as a "wild card" that collects, for example, all of the age variables regardless of the prefix (in this case, "f" and "m"). Before we reshape the data, however, we need to drop the variables for the children so that they will not be picked up by the "wild card". We use the string option because the "j" variable, momdad, takes the string values "f" and "m"; the numeric variable gender is then created from it.
drop oc* mc* yc*
reshape long @age @height @fev1, i(id) j(momdad) string
generate gender = 2 if momdad == "m"
replace gender = 1 if momdad == "f"
label define gend 1 "male" 2 "female"
label values gender gend
generate fev1a = fev1/100
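NOTE: To see what the @ placement does, here is a minimal sketch on a made-up two-family dataset. The values are hypothetical and only illustrate the mechanics. With the stub written as @age, reshape takes whatever precedes "age" in the variable names (here "f" and "m") as the values of the j variable. The preserve and restore lines keep the lung data in memory untouched.

```stata
preserve
clear
* hypothetical toy data in wide form: one row per family
input id fage mage fheight mheight
1 40 38 70 64
2 35 33 68 62
end
* @ marks where the j value sits inside each variable name
reshape long @age @height, i(id) j(momdad) string
* the result has one row per parent, with momdad equal to "f" or "m"
list, sepby(id)
restore
```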
tabstat age height fev1a, statistics(mean sd)
   stats |       age    height     fev1a
---------+------------------------------
    mean |  38.84667  66.67667    3.5332
      sd |  6.912484  3.685657  .8025855
----------------------------------------
regress fev1a age height
      Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  2,   297) =  197.57
       Model |  109.953774     2   54.976887           Prob > F      =  0.0000
    Residual |  82.6451491   297  .278266495           R-squared     =  0.5709
-------------+------------------------------           Adj R-squared =  0.5680
       Total |  192.598923   299  .644143556           Root MSE      =  .52751
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0185978   .0044429    -4.19   0.000    -.0273413   -.0098542
      height |    .164865   .0083327    19.79   0.000     .1484664    .1812635
       _cons |  -6.736985   .5632885   -11.96   0.000    -7.845528   -5.628443
------------------------------------------------------------------------------
To obtain the second and third panels of the table, we need to sort the data by gender and then use the by prefix to run the descriptive statistics and regressions for each gender.
sort gender
by gender: tabstat age height fev1a, statistics(mean sd)
------------------------------------------------------------------------------------------------
-> gender = male
   stats |       age    height     fev1a
---------+------------------------------
    mean |  40.13333     69.26  4.093267
      sd |  6.889995  2.779189  .6507523
----------------------------------------
------------------------------------------------------------------------------------------------
-> gender = female
   stats |       age    height     fev1a
---------+------------------------------
    mean |     37.56  64.09333  2.973133
      sd |  6.714184  2.469537  .4874136
----------------------------------------
by gender: regress fev1a age height, beta
------------------------------------------------------------------------------------------------
-> gender = male
      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   36.81
       Model |   21.056968     2   10.528484           Prob > F      =  0.0000
    Residual |   42.041328   147  .285995429           R-squared     =  0.3337
-------------+------------------------------           Adj R-squared =  0.3247
       Total |   63.098296   149  .423478497           Root MSE      =  .53479
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
         age |  -.0266393   .0063687    -4.18   0.000                -.2820504
      height |    .114397    .015789     7.25   0.000                 .4885592
       _cons |  -2.760746   1.137746    -2.43   0.016                        .
------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-> gender = female
      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   30.24
       Model |  10.3185252     2  5.15926259           Prob > F      =  0.0000
    Residual |  25.0797019   147  .170610217           R-squared     =  0.2915
-------------+------------------------------           Adj R-squared =  0.2819
       Total |  35.3982271   149  .237571994           Root MSE      =  .41305
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
         age |  -.0199755   .0050405    -3.96   0.000                -.2751644
      height |   .0925926   .0137042     6.76   0.000                 .4691313
       _cons |   -2.21116    .896067    -2.47   0.015                        .
------------------------------------------------------------------------------
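NOTE: The Beta column holds standardized coefficients: each raw coefficient multiplied by the ratio of the predictor's standard deviation to the outcome's standard deviation. A minimal sketch for the male panel, using the standard deviations from the tabstat output for males above (results agree with the Beta column only up to rounding):

```stata
* Beta for age (males) = b * sd(age) / sd(fev1a); compare with -.2820504
display -.0266393*6.889995/.6507523
* Beta for height (males) = b * sd(height) / sd(fev1a); compare with .4885592
display .114397*2.779189/.6507523
```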
Page 152. Middle of the page.
NOTE: The coefficients and standard errors for the height variable from the two regressions above (.114 for males and .093 for females) are used in the calculation of the Z test.
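NOTE: A minimal sketch of that calculation, assuming the usual Z statistic for comparing coefficients from two independent samples: the difference in slopes divided by the square root of the sum of the squared standard errors. Check the book for the exact formula it uses. The values are the height coefficients and standard errors from the male and female regressions above.

```stata
* Z = (b_male - b_female) / sqrt(se_male^2 + se_female^2)
display (.114397 - .0925926)/sqrt(.015789^2 + .0137042^2)
* two-sided p-value from the standard normal distribution
display 2*(1 - normal(abs((.114397 - .0925926)/sqrt(.015789^2 + .0137042^2))))
```

The statistic comes out at roughly 1.04, which is not significant at conventional levels.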
xi: regress fev1a age i.gender*height
i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)
i.gender*height   _IgenXheigh_#       (coded as above)
      Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  4,   295) =  137.39
       Model |  125.325155     4  31.3312887           Prob > F      =  0.0000
    Residual |  67.2737685   295  .228046673           R-squared     =  0.6507
-------------+------------------------------           Adj R-squared =  0.6460
       Total |  192.598923   299  .644143556           Root MSE      =  .47754
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0233887   .0040701    -5.75   0.000    -.0313988   -.0153786
  _Igender_2 |   .8296771   1.410069     0.59   0.557    -1.945392    3.604746
      height |   .1148495   .0140881     8.15   0.000     .0871236    .1425754
_IgenXheig~2 |  -.0221023   .0212056    -1.04   0.298    -.0638357    .0196312
       _cons |  -2.922545   .9965403    -2.93   0.004    -4.883774   -.9613153
------------------------------------------------------------------------------
Page 153. Middle of the page.
xi: regress fev1a i.gender*height i.gender*age
i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)
i.gender*height   _IgenXheigh_#       (coded as above)
i.gender*age      _IgenXage_#         (coded as above)
      Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  5,   294) =  109.92
       Model |  125.477893     5  25.0955786           Prob > F      =  0.0000
    Residual |  67.1210299   294  .228302823           R-squared     =  0.6515
-------------+------------------------------           Adj R-squared =  0.6456
       Total |  192.598923   299  .644143556           Root MSE      =  .47781
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  _Igender_2 |   .5495863   1.451823     0.38   0.705    -2.307697     3.40687
      height |    .114397   .0141069     8.11   0.000     .0866338    .1421602
_IgenXheig~2 |  -.0218044   .0212206    -1.03   0.305     -.063568    .0199592
  _Igender_2 |  (dropped)
         age |  -.0266393   .0056902    -4.68   0.000    -.0378381   -.0154406
 _IgenXage_2 |   .0066639   .0081472     0.82   0.414    -.0093704    .0226981
       _cons |  -2.760746   1.016532    -2.72   0.007    -4.761349   -.7601439
------------------------------------------------------------------------------
test _Igender_2 _IgenXheigh_2 _IgenXage_2
 ( 1)  _Igender_2 = 0
 ( 2)  _IgenXheigh_2 = 0
 ( 3)  _IgenXage_2 = 0
       F(  3,   294) =   22.67
            Prob > F =    0.0000
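NOTE: Because every term is interacted with gender, this model reproduces the two separate regressions from Table 7.5. The main effects are the male coefficients, and each female coefficient is the main effect plus its interaction term. The joint F test can also be recovered from the residual sums of squares of the pooled model (page 150) and this model. A minimal sketch, assuming the xi regression above is still the most recent estimation command when the lincom lines run. The last two lines are an alternative that assumes Stata 11 or later, where factor-variable notation replaces xi.

```stata
* female slopes = male slope + interaction; compare with .0925926 and -.0199755
lincom height + _IgenXheigh_2
lincom age + _IgenXage_2
* F = ((RSS_pooled - RSS_full)/3) / (RSS_full/294); compare with 22.67
display ((82.6451491 - 67.1210299)/3)/(67.1210299/294)
* alternative without xi (assumes Stata 11 or later): same model, same joint test
regress fev1a i.gender##c.height i.gender##c.age
testparm i.gender i.gender#c.height i.gender#c.age
```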
