use "https://stats.idre.ucla.edu/stat/stata/examples/greene/TBL19-1.DTA", clear
Example 19.1 on page 816. The linear regression column of Table 19.1 is obtained with an ordinary least squares regression.
regress grade gpa tuce psi

      Source |       SS       df       MS              Number of obs =      32
-------------+------------------------------           F(  3,    28) =    6.65
       Model |  3.00227631     3  1.00075877           Prob > F      =  0.0016
    Residual |  4.21647369    28  .150588346           R-squared     =  0.4159
-------------+------------------------------           Adj R-squared =  0.3533
       Total |     7.21875    31  .232862903           Root MSE      =  .38806

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   .4638517   .1619563     2.86   0.008     .1320992    .7956043
        tuce |   .0104951   .0194829     0.54   0.594    -.0294137    .0504039
         psi |   .3785548   .1391727     2.72   0.011     .0934724    .6636372
       _cons |  -1.498017   .5238886    -2.86   0.008    -2.571154   -.4248801
------------------------------------------------------------------------------
In the linear regression case the slope for each variable is simply its coefficient, since the model is linear, and the density that scales the coefficients is simply the constant 1; the general formula below makes this explicit.
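More generally, for a binary choice model with \(\Pr(y=1 \mid x) = F(\beta'x)\), the slopes reported in Table 19.1 are marginal effects of the form (our summary of the setup, not a formula quoted from the original page)

\[
\frac{\partial E[y \mid x]}{\partial x} = f(\beta'x)\,\beta ,
\]

where \(f\) is the density corresponding to \(F\). Evaluated at the sample means, \(f(\beta'\bar{x})\) is the scale factor computed for each model below; for the linear model \(f \equiv 1\), so the slopes are the coefficients themselves. The column for logistic regression of Table 19.1 is obtained using a logistic regression model.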
logit grade gpa tuce psi

(Intermediate results omitted)

Logit estimates                                   Number of obs   =         32
                                                  LR chi2(3)      =      15.40
                                                  Prob > chi2     =     0.0015
Log likelihood = -12.889633                       Pseudo R2       =     0.3740

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   2.826113   1.262941     2.24   0.025     .3507938    5.301432
        tuce |   .0951577   .1415542     0.67   0.501    -.1822835    .3725988
         psi |   2.378688   1.064564     2.23   0.025       .29218    4.465195
       _cons |  -13.02135   4.931325    -2.64   0.008    -22.68657    -3.35613
------------------------------------------------------------------------------
In Stata 7, we use the mfx command to get the marginal effects; the nodiscrete option makes mfx treat the dummy variable psi as a continuous variable.
mfx compute, nodiscrete

Marginal effects after logit
      y  = Pr(grade) (predict)
         =  .25282025
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     gpa |   .5338589      .23704    2.25   0.024   .069273  .998445   3.11719
    tuce |   .0179755      .02624    0.69   0.493  -.033448  .069399   21.9375
     psi |   .4493393      .19676    2.28   0.022   .063691  .834987   .437500
------------------------------------------------------------------------------
We now calculate the scale factor shown in the last row of the table for logistic regression. Matrix x stores the coefficient estimates, and matrix m stores the mean vector of the independent variables, including the constant term.
matrix x=e(b)
matrix accum a=gpa tuce psi, mean(m)
(obs=32)

matrix list m

m[1,4]
              gpa       tuce        psi      _cons
_cons   3.1171875    21.9375      .4375          1

matrix c=x*m'
matrix list c

symmetric c[1,1]
        _cons
y1  -1.083627
Now we use the density function for the logistic distribution (19-11) on page 815 to get the scale factor below.
di exp(trace(c))/(1+exp(trace(c)))^2
.18890217
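As a cross-check (our addition, mirroring the slope calculation done for the last column of the table further below), multiplying the logit coefficients by this scale factor reproduces the dy/dx column of mfx. Matrix x still holds e(b) from the logit fit:

* sketch: slopes at the means = scale factor * coefficients
local d = exp(trace(c))/(1+exp(trace(c)))^2
matrix y=`d'*x
matrix list y

The first three entries of y match the mfx values .5338589, .0179755, and .4493393 up to rounding.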
The column for probit regression of Table 19.1 is obtained using a probit model.
probit grade gpa tuce psi

(Intermediate results omitted)

Probit estimates                                  Number of obs   =         32
                                                  LR chi2(3)      =      15.55
                                                  Prob > chi2     =     0.0014
Log likelihood = -12.818803                       Pseudo R2       =     0.3775

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |    1.62581   .6938818     2.34   0.019     .2658269    2.985794
        tuce |   .0517289   .0838901     0.62   0.537    -.1126927    .2161506
         psi |   1.426332    .595037     2.40   0.017     .2600814    2.592583
       _cons |   -7.45232   2.542467    -2.93   0.003    -12.43546   -2.469177
------------------------------------------------------------------------------
As with logistic regression, we can get the marginal effects using mfx, treating psi as a continuous variable.
mfx compute, nodiscrete

Marginal effects after probit
      y  = Pr(grade) (predict)
         =  .26580809
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     gpa |   .5333471      .23246    2.29   0.022   .077726  .988968   3.11719
    tuce |   .0169697      .02712    0.63   0.531  -.036184  .070123   21.9375
     psi |   .4679084      .18764    2.49   0.013   .100136   .83568   .437500
------------------------------------------------------------------------------
As in the logistic case, we obtain the scale factor with the following calculation.
matrix x=e(b)
matrix accum a=gpa tuce psi, mean(m)
(obs=32)
matrix c=x*m'
Now we use the standard normal density for the probit model to get the scale factor.
di normalden(trace(c))
.32805002
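The same cross-check works for the probit model (again our addition); with x holding e(b) from the probit fit,

* sketch: probit slopes at the means
local d = normalden(trace(c))
matrix y=`d'*x
matrix list y

reproduces the mfx values .5333471, .0169697, and .4679084 up to rounding.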
The last column of Table 19.1 is obtained with a bit more maximum likelihood programming in Stata.
gen d0 = 1-grade
gen d1 = grade
capture program drop myweib
program define myweib
  version 6
  args lnf theta
  quietly replace `lnf'=-d0*exp(`theta')+d1*log(1-exp(-exp(`theta')))
end
ml model lf myweib (grade=gpa tuce psi)
ml maximize

initial:       log likelihood = -26.045427
alternative:   log likelihood = -21.404965
rescale:       log likelihood = -20.686846
Iteration 0:   log likelihood = -20.686846
Iteration 1:   log likelihood = -16.296453
Iteration 2:   log likelihood = -13.074668
Iteration 3:   log likelihood = -13.008153
Iteration 4:   log likelihood = -13.008003
Iteration 5:   log likelihood = -13.008003

                                                  Number of obs   =         32
                                                  Wald chi2(3)    =      10.19
Log likelihood = -13.008003                       Prob > chi2     =     0.0170

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   2.293553   1.035001     2.22   0.027     .2649881    4.322118
        tuce |    .041156   .1073136     0.38   0.701    -.1691748    .2514867
         psi |   1.562276   .7305064     2.14   0.032     .1305096    2.994042
       _cons |  -10.03142   3.479058    -2.88   0.004    -16.85025   -3.212591
------------------------------------------------------------------------------
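The likelihood coded in myweib is that of the complementary log-log model, Pr(y=1|x) = 1 - exp(-exp(x'b)), so Stata's built-in cloglog command should reproduce these estimates (a cross-check we add here, not part of the original example):

* built-in equivalent of the myweib program above
cloglog grade gpa tuce psi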
Now let’s calculate the slopes at the means; these form the last column of Table 19.1. Using the model specified in Example 19.1, we derive the derivative of the probability and plug in the mean vector. Matrix x stores the coefficient estimates, and matrix m stores the mean vector of the independent variables, including the constant term. Vector y then holds the slope of each independent variable evaluated at its mean, and the last display shows the scale factor computed at the means of the variables.
matrix x=e(b)
matrix accum a=gpa tuce psi, mean(m)
(obs=32)
matrix c=x*m'
matrix list c

symmetric c[1,1]
         _cons
y1  -1.2956303

local d = exp(-exp(trace(c)))*exp(trace(c))
matrix y=`d'*x
matrix list y

y[1,4]
          eq1:        eq1:        eq1:        eq1:
           gpa        tuce         psi       _cons
y1   .47747024   .00856782    .3252335  -2.0883339

di exp(-exp(trace(c)))*exp(trace(c))
.20817929
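The factor stored in d above is the density of this model evaluated at the means (our note): since \(F(z) = 1 - e^{-e^{z}}\),

\[
f(z) = e^{z} e^{-e^{z}},
\qquad
f(\hat\beta'\bar{x}) = e^{-1.2956}\, e^{-e^{-1.2956}} \approx .2082 .
\]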
Example 19.2 on page 817.
* note: in current Stata the norm() function has been renamed normal()
gen ps1_0=norm(-7.45+1.62*gpa+.052*21.938)
gen ps1_1=norm(-7.45+1.62*gpa+.052*21.938+1.4263)
sort gpa
graph twoway scatter ps1_0 ps1_1 gpa if gpa>=2, msymbol(i i) connect(l l)
Example 19.4 and Table 19.2 on page 825.
logit grade gpa tuce psi

(Intermediate results omitted)

Logit estimates                                   Number of obs   =         32
                                                  LR chi2(3)      =      15.40
                                                  Prob > chi2     =     0.0015
Log likelihood = -12.889633                       Pseudo R2       =     0.3740

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   2.826113   1.262941     2.24   0.025     .3507938    5.301432
        tuce |   .0951577   .1415542     0.67   0.501    -.1822835    .3725988
         psi |   2.378688   1.064564     2.23   0.025       .29218    4.465195
       _cons |  -13.02135   4.931325    -2.64   0.008    -22.68657    -3.35613
------------------------------------------------------------------------------

mfx compute, nodiscrete

Marginal effects after logit
      y  = Pr(grade) (predict)
         =  .25282025
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     gpa |   .5338589      .23704    2.25   0.024   .069273  .998445   3.11719
    tuce |   .0179755      .02624    0.69   0.493  -.033448  .069399   21.9375
     psi |   .4493393      .19676    2.28   0.022   .063691  .834987   .437500
------------------------------------------------------------------------------

probit grade gpa tuce psi

(Intermediate results omitted)

Probit estimates                                  Number of obs   =         32
                                                  LR chi2(3)      =      15.55
                                                  Prob > chi2     =     0.0014
Log likelihood = -12.818803                       Pseudo R2       =     0.3775

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |    1.62581   .6938818     2.34   0.019     .2658269    2.985794
        tuce |   .0517289   .0838901     0.62   0.537    -.1126927    .2161506
         psi |   1.426332    .595037     2.40   0.017     .2600814    2.592583
       _cons |   -7.45232   2.542467    -2.93   0.003    -12.43546   -2.469177
------------------------------------------------------------------------------

mfx compute, nodiscrete

Marginal effects after probit
      y  = Pr(grade) (predict)
         =  .26580809
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     gpa |   .5333471      .23246    2.29   0.022   .077726  .988968   3.11719
    tuce |   .0169697      .02712    0.63   0.531  -.036184  .070123   21.9375
     psi |   .4679084      .18764    2.49   0.013   .100136   .83568   .437500
------------------------------------------------------------------------------
Example 19.6 on page 826. The upper right-hand corner of the first output below gives the chi-square statistic and p-value for the probit model; the upper right-hand corner of the second output gives them for the logit model.
probit grade gpa tuce psi

(Intermediate results omitted)

Probit estimates                                  Number of obs   =         32
                                                  LR chi2(3)      =      15.55
                                                  Prob > chi2     =     0.0014
Log likelihood = -12.818803                       Pseudo R2       =     0.3775

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |    1.62581   .6938818     2.34   0.019     .2658269    2.985794
        tuce |   .0517289   .0838901     0.62   0.537    -.1126927    .2161506
         psi |   1.426332    .595037     2.40   0.017     .2600814    2.592583
       _cons |   -7.45232   2.542467    -2.93   0.003    -12.43546   -2.469177
------------------------------------------------------------------------------

logit grade gpa tuce psi

(Intermediate results omitted)

Logit estimates                                   Number of obs   =         32
                                                  LR chi2(3)      =      15.40
                                                  Prob > chi2     =     0.0015
Log likelihood = -12.889633                       Pseudo R2       =     0.3740

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   2.826113   1.262941     2.24   0.025     .3507938    5.301432
        tuce |   .0951577   .1415542     0.67   0.501    -.1822835    .3725988
         psi |   2.378688   1.064564     2.23   0.025       .29218    4.465195
       _cons |  -13.02135   4.931325    -2.64   0.008    -22.68657    -3.35613
------------------------------------------------------------------------------
Now let’s do the second part of the example. We first run the pooled model and then separate models for psi==0 and psi==1. We store the log likelihood of each model in a local macro and compare them afterwards.
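In our notation, the statistic computed below is

\[
\mathrm{LR} = 2\left(\ln L_{0} + \ln L_{1} - \ln L_{\text{pooled}}\right) \sim \chi^{2}(3),
\]

where \(\ln L_{0}\) and \(\ln L_{1}\) are the log likelihoods from the psi==0 and psi==1 subsamples, \(\ln L_{\text{pooled}}\) is the log likelihood from the pooled model, and the 3 degrees of freedom are the number of extra parameters in the separate fits.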
probit grade gpa tuce

(Intermediate results omitted)

Probit estimates                                  Number of obs   =         32
                                                  LR chi2(2)      =       8.88
                                                  Prob > chi2     =     0.0118
Log likelihood = -16.152157                       Pseudo R2       =     0.2156

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   1.409575   .6354662     2.22   0.027     .1640846    2.655066
        tuce |   .0526675   .0755527     0.70   0.486    -.0954132    .2007481
       _cons |  -6.034327   2.121027    -2.85   0.004    -10.19146   -1.877191
------------------------------------------------------------------------------
local l10 = e(ll)

probit grade gpa tuce if psi==0

(Intermediate results omitted)

Probit estimates                                  Number of obs   =         18
                                                  LR chi2(2)      =       8.90
                                                  Prob > chi2     =     0.0117
Log likelihood = -3.6612274                       Pseudo R2       =     0.5486

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   3.092025   1.813587     1.70   0.088    -.4625411    6.646591
        tuce |   .1535286   .2708777     0.57   0.571     -.377382    .6844391
       _cons |  -14.90758    9.85688    -1.51   0.130    -34.22671    4.411545
------------------------------------------------------------------------------

local l0=e(ll)

probit grade gpa tuce if psi==1

(Intermediate results omitted)

Probit estimates                                  Number of obs   =         14
                                                  LR chi2(2)      =       2.69
                                                  Prob > chi2     =     0.2609
Log likelihood = -8.217242                        Pseudo R2       =     0.1405

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   1.102962   .7846481     1.41   0.160    -.4349205    2.640844
        tuce |   .0276161   .0984905     0.28   0.779    -.1654217    .2206539
       _cons |  -3.870036   2.861729    -1.35   0.176    -9.478921    1.738849
------------------------------------------------------------------------------

local l1=e(ll)

di 2*(`l1'+`l0'-`l10')
8.5473743

di chi2tail(3, 8.54743)
.03595439
We can also get the 95% critical value from the chi-squared distribution with 3 degrees of freedom.
di invchi2tail(3, .05)
7.8147278
Example 19.7 and Table 19.3 on page 830. The first set of estimates is obtained by running a probit model and the second by running a heteroskedastic probit model.
probit grade gpa tuce psi

(Intermediate results omitted)

Probit estimates                                  Number of obs   =         32
                                                  LR chi2(3)      =      15.55
                                                  Prob > chi2     =     0.0014
Log likelihood = -12.818803                       Pseudo R2       =     0.3775

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |    1.62581   .6938818     2.34   0.019     .2658269    2.985794
        tuce |   .0517289   .0838901     0.62   0.537    -.1126927    .2161506
         psi |   1.426332    .595037     2.40   0.017     .2600814    2.592583
       _cons |   -7.45232   2.542467    -2.93   0.003    -12.43546   -2.469177
------------------------------------------------------------------------------

hetprob grade gpa tuce psi, het( psi)

Fitting comparison model:

Iteration 0:   log likelihood =  -20.59173
Iteration 1:   log likelihood = -13.315851
Iteration 2:   log likelihood = -12.832843
Iteration 3:   log likelihood = -12.818826
Iteration 4:   log likelihood = -12.818803

Fitting full model:

Iteration 0:   log likelihood = -12.818803
Iteration 1:   log likelihood = -12.080094
Iteration 2:   log likelihood = -11.965838
Iteration 3:   log likelihood = -11.896545
Iteration 4:   log likelihood = -11.895852
Iteration 5:   log likelihood = -11.895851

Heteroskedastic probit model                      Number of obs   =         32
                                                  Zero outcomes   =         21
                                                  Nonzero outcomes=         11
                                                  Wald chi2(3)    =       3.33
Log likelihood = -11.89585                        Prob > chi2     =     0.3438

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
grade        |
         gpa |    3.12155   1.760869     1.77   0.076    -.3296887    6.572789
        tuce |   .1237515   .2134227     0.58   0.562    -.2945493    .5420523
         psi |    2.34322   1.670631     1.40   0.161    -.9311565    5.617597
       _cons |  -14.28904   8.860899    -1.61   0.107    -31.65609    3.077997
-------------+----------------------------------------------------------------
lnsigma2     |
         psi |   1.093371   .8805796     1.24   0.214    -.6325333    2.819275
------------------------------------------------------------------------------
Likelihood ratio test of lnsigma2=0: chi2(1) = 1.85     Prob > chi2 = 0.1743
The LR test comes as part of the hetprob output shown above. Stata does not offer an LM test for this model. The Wald test we can simply calculate manually, as shown in the book; hetprob also has a robust option that produces robust standard errors and the corresponding Wald test, shown after our manual computation.
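For the single parameter in the variance equation, the Wald statistic is just the squared ratio of the estimate to its standard error (our notation):

\[
W = \left(\frac{\hat\gamma}{\operatorname{se}(\hat\gamma)}\right)^{2} \sim \chi^{2}(1).
\]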
di (1.09337/.8805795)^2
1.5416904

hetprob grade gpa tuce psi, het( psi) robust nolog

Heteroskedastic probit model                      Number of obs   =         32
                                                  Zero outcomes   =         21
                                                  Nonzero outcomes=         11
                                                  Wald chi2(3)    =       9.51
Log likelihood = -11.89585                        Prob > chi2     =     0.0232

------------------------------------------------------------------------------
             |               Robust
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
grade        |
         gpa |   3.121551   1.129778     2.76   0.006     .9072259    5.335875
        tuce |   .1237516   .1428788     0.87   0.386    -.1562857    .4037888
         psi |   2.343221   1.502745     1.56   0.119    -.6021045    5.288546
       _cons |  -14.28905   4.881214    -2.93   0.003    -23.85605   -4.722045
-------------+----------------------------------------------------------------
lnsigma2     |
         psi |   1.093371   .7252922     1.51   0.132    -.3281753    2.514918
------------------------------------------------------------------------------
Wald test of lnsigma2=0: chi2(1) = 2.27                 Prob > chi2 = 0.1317
Example 19.18 on page 868 on nested logit. The independent variables are ttme, terminal time (zero for car); hinc, household income; and gc, generalized cost. The dependent variable is mode, the choice a person makes among four travel modes. The dataset is arranged so that each person has as many observations as there are choices: each person faces four choices, and there are 210 people, so the total number of observations is 210*4 = 840. In other words, every four observations correspond to one person's choice. To run the nested logit analysis in Stata 7, we need to create a couple of variables that build up the nested logit tree. First, we need a group variable that identifies a person; in our case every four observations form a group, which we create with generate and the group() function. Second, we need a variable that indexes the possible choices; to this end, we generate a variable called travel based on the group variable.
use "https://stats.idre.ucla.edu/stat/stata/examples/greene/TBL19-2.DTA", clear gen grp = group(210) sort grp by grp: gen travel=_n-1 list in 1/10 mode ttme invc invt gc hinc psize grp travel 1. 0 69 59 100 70 35 1 1 0 2. 0 34 31 372 71 35 1 1 1 3. 0 35 25 417 70 35 1 1 2 4. 1 0 10 180 30 35 1 1 3 5. 0 64 58 68 68 30 2 2 0 6. 0 44 31 354 84 30 2 2 1 7. 0 53 25 399 85 30 2 2 2 8. 1 0 11 255 50 30 2 2 3 9. 0 69 115 125 129 40 1 3 0 10. 0 34 98 892 195 40 1 3 1 label define tlab 0 "air" 1 "train" 2 "bus" 3 "car" label value travel tlab tab travel travel | Freq. Percent Cum. ------------+----------------------------------- air | 210 25.00 25.00 train | 210 25.00 50.00 bus | 210 25.00 75.00 car | 210 25.00 100.00 ------------+----------------------------------- Total | 840 100.00 nlogitgen type = travel(fly: 0, ground: 1 | 2 | 3) new variable type is generated with 2 groups lb_type: 1 fly 2 ground nlogittree travel type tree structure specified for the nested logit model top-->bottom type travel ------------------------ fly air ground train bus car
We have now built the nested logit tree. We still need dummy variables for the four choices and an interaction of the air dummy with household income, which we create below.
gen aasc = mod(_n, 4)==1
gen tasc = mod(_n, 4)==2
gen basc = mod(_n, 4)==3
gen casc = mod(_n, 4)==0
gen hincair = hinc*aasc
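Equivalently (a sketch we add here, not part of the original page), since travel already codes the four alternatives as 0 through 3 within each group, the same dummies can be generated from it:

* hypothetical alternative: the same dummies from the travel variable
gen aasc2 = travel==0
gen tasc2 = travel==1
gen basc2 = travel==2
gen casc2 = travel==3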
The following model corresponds to Table 19.12 on page 869, the unconditional estimate. Since the model involves no nesting, nlogit gives exactly the same output as clogit, shown below; in practice we would simply use clogit, which is much faster.
nlogit mode (travel = aasc tasc basc gc ttme hincair), group(grp)

note: nonnested model, results are the same as -clogit-

initial:       log likelihood = -199.12837
rescale:       log likelihood = -199.12837
Iteration 0:   log likelihood = -199.12837
Iteration 1:   log likelihood = -199.12837

Nested logit                                      Levels          =          1
                                                  Number of obs   =        840
Dependent variable = mode                         LR chi2(6)      =   183.9869
Log likelihood = -199.12837                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
travel       |
        aasc |   5.207443   .7790551     6.68   0.000     3.680523    6.734363
        tasc |   3.869043   .4431269     8.73   0.000      3.00053    4.737555
        basc |   3.163194   .4502659     7.03   0.000     2.280689    4.045699
          gc |  -.0155015    .004408    -3.52   0.000     -.024141    -.006862
        ttme |  -.0961248   .0104398    -9.21   0.000    -.1165865   -.0756631
     hincair |    .013287   .0102624     1.29   0.195    -.0068269     .033401
-------------+----------------------------------------------------------------
(IV params)  |
             |     (none)
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(0)=    0.00   Prob > chi2 = .
------------------------------------------------------------------------------

clogit mode aasc tasc basc gc ttme hincair, group(grp)

Iteration 0:   log likelihood = -262.32267
Iteration 1:   log likelihood = -203.53119
Iteration 2:   log likelihood =  -199.2783
Iteration 3:   log likelihood = -199.12868
Iteration 4:   log likelihood = -199.12837

Conditional (fixed-effects) logistic regression   Number of obs   =        840
                                                  LR chi2(6)      =     183.99
                                                  Prob > chi2     =     0.0000
Log likelihood = -199.12837                       Pseudo R2       =     0.3160

------------------------------------------------------------------------------
        mode |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        aasc |   5.207443   .7790525     6.68   0.000     3.680528    6.734358
        tasc |   3.869043   .4431252     8.73   0.000     3.000533    4.737552
        basc |   3.163194   .4502646     7.03   0.000     2.280692    4.045697
          gc |  -.0155015    .004408    -3.52   0.000     -.024141    -.006862
        ttme |  -.0961248   .0104398    -9.21   0.000    -.1165864   -.0756632
     hincair |    .013287   .0102624     1.29   0.195    -.0068269    .0334009
------------------------------------------------------------------------------
The following model corresponds to Table 19.12 on page 869, the FIML estimate.
nlogit mode (travel = aasc tasc basc gc ttme) (type=hincair), group(grp)

tree structure specified for the nested logit model

 top--> bottom

     type        travel
    ------------------------
     fly         air
     ground      train
                 bus
                 car

initial:       log likelihood = -207.01917
rescale:       log likelihood = -207.01917
rescale eq:    log likelihood = -197.65917
Iteration 0:   log likelihood = -197.65917
Iteration 1:   log likelihood = -197.43396  (backed up)
Iteration 2:   log likelihood = -197.18788  (backed up)
Iteration 3:   log likelihood = -197.17125  (backed up)
Iteration 4:   log likelihood = -197.08544  (backed up)
Iteration 5:   log likelihood = -196.84835
Iteration 6:   log likelihood = -196.30583
Iteration 7:   log likelihood = -194.93294
Iteration 8:   log likelihood = -194.86168
Iteration 9:   log likelihood =  -194.0501
Iteration 10:  log likelihood = -193.91879
Iteration 11:  log likelihood = -193.75031
Iteration 12:  log likelihood = -193.68313
Iteration 13:  log likelihood = -193.66092
Iteration 14:  log likelihood = -193.65718
Iteration 15:  log likelihood =  -193.6562
Iteration 16:  log likelihood = -193.65615
Iteration 17:  log likelihood = -193.65615

Nested logit                                      Levels          =          2
                                                  Number of obs   =        840
Dependent variable = mode                         LR chi2(8)      =   194.9313
Log likelihood = -193.65615                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
travel       |
        aasc |   6.042255   1.198907     5.04   0.000     3.692441     8.39207
        tasc |   5.064679   .6620317     7.65   0.000     3.767121    6.362237
        basc |   4.096302   .6151582     6.66   0.000     2.890614     5.30199
          gc |  -.0315888   .0081566    -3.87   0.000    -.0475754   -.0156022
        ttme |  -.1126183   .0141293    -7.97   0.000    -.1403111   -.0849254
-------------+----------------------------------------------------------------
type         |
     hincair |   .0153337   .0093814     1.63   0.102    -.0030534    .0337209
-------------+----------------------------------------------------------------
(IV params)  |
type         |
        /fly |   .5859993   .1406199     4.17   0.000     .3103894    .8616092
     /ground |   .3889488   .1236623     3.15   0.002     .1465753    .6313224
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(2)=   10.94   Prob > chi2 = 0.0042
------------------------------------------------------------------------------
The following is an attempt to reproduce the LIML (limited information maximum likelihood) estimate of Table 19.12. This is a two-step procedure, and since Stata does not currently support LIML we have to do some extra work. The first step is to estimate the choice model for the three alternatives in the ground branch. Using the variable type that we created earlier for the nested logit tree, we create a new variable called method to indicate the branches, and then run clogit, making use of the group structure.
tab type, gen(method)

       type |      Freq.     Percent        Cum.
------------+-----------------------------------
        fly |        210       25.00       25.00
     ground |        630       75.00      100.00
------------+-----------------------------------
      Total |        840      100.00

clogit mode tasc basc gc ttme if method2==1, group(grp)

note: 58 groups (174 obs) dropped due to all positive or all negative outcomes.

Iteration 0:   log likelihood = -151.84818
Iteration 1:   log likelihood = -93.828969
Iteration 2:   log likelihood = -88.408188
Iteration 3:   log likelihood = -87.943715
Iteration 4:   log likelihood = -87.938161
Iteration 5:   log likelihood =  -87.93816

Conditional (fixed-effects) logistic regression   Number of obs   =        456
                                                  LR chi2(4)      =     158.10
                                                  Prob > chi2     =     0.0000
Log likelihood =  -87.93816                       Pseudo R2       =     0.4734

------------------------------------------------------------------------------
        mode |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        tasc |   4.463668   .6405338     6.97   0.000     3.208245    5.719091
        basc |   3.104744   .6090192     5.10   0.000     1.911088      4.2984
          gc |  -.0636819   .0100424    -6.34   0.000    -.0833646   -.0439992
        ttme |  -.0698778   .0148803    -4.70   0.000    -.0990427    -.040713
------------------------------------------------------------------------------
We have thus correctly estimated the coefficients for tasc, basc, gc, and ttme. We now compute the inclusive values using the formula at the bottom of page 866.
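In our rendering, the inclusive value for a branch is the log-sum of the exponentiated utilities of the alternatives in that branch,

\[
IV_{\text{branch}} = \ln \sum_{j \in \text{branch}} \exp(x_j'\hat\beta),
\]

which is exactly what the replace statements below compute for the three ground alternatives.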
matrix X=e(b)
matrix V1=e(V)
gen myivb=X[1,1]*tasc + X[1,2]*basc + X[1,3]*gc + X[1,4]*ttme
replace myivb = log(exp(myivb)+exp(myivb[_n+1])+exp(myivb[_n+2])) if mod(_n,4)==2
(210 real changes made)
replace myivb = myivb[_n-1] if mod(_n,4)==3 | mod(_n, 4)==0
(420 real changes made)
gen ivair=aasc*myivb
gen ivground=(1-aasc)*myivb
So far we have computed the inclusive-value variables ivair and ivground correctly. The next step is to estimate the branch-level model. Since we now work at a higher level of the tree, we reshape the dataset: there are only two choices, fly versus ground, so we create the variable branch corresponding to the two choices and drop half of the observations, leaving two rows per person instead of four. clogit is used again to obtain the parameter estimates for the branch-level variables. The coefficient estimates are correct, but the standard errors are not, because the two-step LIML procedure estimates the standard errors differently, which involves some additional matrix computation.
preserve
gen branch=mode*type
replace branch = branch[_n+1] | branch[_n+2] | branch[_n] if mod(_n, 4)==2
(152 real changes made)
drop if mod(_n, 4)==3 | mod(_n, 4)==0
(420 observations deleted)
drop grp
gen grp=group(210)

clogit branch aasc hincair ivair ivground, group(grp)

Iteration 0:   log likelihood = -127.30482
Iteration 1:   log likelihood = -115.71699
Iteration 2:   log likelihood = -115.33683
Iteration 3:   log likelihood = -115.33543
Iteration 4:   log likelihood = -115.33543

Conditional (fixed-effects) logistic regression   Number of obs   =        420
                                                  LR chi2(4)      =      60.45
                                                  Prob > chi2     =     0.0000
Log likelihood = -115.33543                       Pseudo R2       =     0.2076

------------------------------------------------------------------------------
      branch |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        aasc |  -.0647031   .9836824    -0.07   0.948    -1.992685    1.863279
     hincair |   .0207877   .0085162     2.44   0.015     .0040962    .0374791
       ivair |   .2266246   .1005949     2.25   0.024     .0294622    .4237869
    ivground |    .158721   .0717853     2.21   0.027     .0180245    .2994175
------------------------------------------------------------------------------

restore
Example 19.22 on Poisson regression. We skip the goodness-of-fit measure based on the standardized residuals and the one based on the deviance.
use "https://stats.idre.ucla.edu/stat/stata/examples/greene/TBL19-3.DTA", clear gen logm=log(months) (6 missing values generated) constraint define 1 logm=1 poisson acc tb tc td te t6569 t7074 t7579 o7579 logm, const(1) Iteration 0: log likelihood = -119.39808 Iteration 1: log likelihood = -81.620956 Iteration 2: log likelihood = -68.436229 Iteration 3: log likelihood = -68.414556 Iteration 4: log likelihood = -68.414554 Poisson regression Number of obs = 34 LR chi2(9) = 575.58 Prob > chi2 = 0.0000 Log likelihood = -68.414554 Pseudo R2 = 0.8079 ------------------------------------------------------------------------------ acc | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- tb | -.5447114 .1776135 -3.07 0.002 -.8928274 -.1965954 tc | -.6887643 .3290358 -2.09 0.036 -1.333663 -.043866 td | -.0743091 .2905578 -0.26 0.798 -.6437919 .4951737 te | .320529 .235752 1.36 0.174 -.1415365 .7825945 t6569 | .6958455 .1496562 4.65 0.000 .4025246 .9891663 t7074 | .8174555 .1698376 4.81 0.000 .4845798 1.150331 t7579 | .4449709 .2332392 1.91 0.056 -.0121695 .9021113 o7579 | .3838589 .1182605 3.25 0.001 .1520726 .6156451 logm | 1 . . . . . _cons | -6.402877 .2175228 -29.44 0.000 -6.829214 -5.97654 ------------------------------------------------------------------------------ lrtest, saving(0) * Stata 8 code. poisgof * Stata 9 code and output. estat gof Goodness-of-fit chi2 = 38.96253 Prob > chi2(24) = 0.0276 poisson acc t6569 t7074 t7579 o7579 logm, const(1) Iteration 0: log likelihood = -1166.5368 Iteration 1: log likelihood = -89.701554 Iteration 2: log likelihood = -80.215736 Iteration 3: log likelihood = -80.201227 Iteration 4: log likelihood = -80.201226 Poisson regression Number of obs = 34 LR chi2(5) = 552.00 Prob > chi2 = 0.0000 Log likelihood = -80.201226 Pseudo R2 = 0.7748 ------------------------------------------------------------------------------ acc | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- t6569 | .7536172 .1487663 5.07 0.000 .4620406 1.045194 t7074 | 1.050336 .1575621 6.67 0.000 .7415201 1.359152 t7579 | .6998992 .2203023 3.18 0.001 .2681146 1.131684 o7579 | .3872451 .1181021 3.28 0.001 .1557692 .618721 logm | 1 . . . . . _cons | -6.946953 .1269425 -54.73 0.000 -7.195756 -6.69815 ------------------------------------------------------------------------------ lrtest, using(0) Poisson: likelihood-ratio test chi2(4) = 23.57 Prob > chi2 = 0.0001 * Stata 8 code. poisgof * Stata 9 code and output. estat gof Goodness-of-fit chi2 = 62.53587 Prob > chi2(28) = 0.0002 poisson acc tb tc td te o7579 logm, const(1) Iteration 0: log likelihood = -1135.6024 Iteration 1: log likelihood = -92.22882 Iteration 2: log likelihood = -84.118981 Iteration 3: log likelihood = -84.115144 Iteration 4: log likelihood = -84.115144 Poisson regression Number of obs = 34 LR chi2(6) = 544.18 Prob > chi2 = 0.0000 Log likelihood = -84.115144 Pseudo R2 = 0.7639 ------------------------------------------------------------------------------ acc | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- tb | -.7437271 .1691475 -4.40 0.000 -1.07525 -.412204 tc | -.7548676 .3276393 -2.30 0.021 -1.397029 -.1127063 td | -.1843232 .2875527 -0.64 0.522 -.7479161 .3792697 te | .3841932 .23479 1.64 0.102 -.0759868 .8443732 o7579 | .5000988 .1115645 4.48 0.000 .2814363 .7187612 logm | 1 . . . . . 
_cons | -5.799973 .1784196 -32.51 0.000 -6.149669 -5.450278 ------------------------------------------------------------------------------ lrtest, using(0) Poisson: likelihood-ratio test chi2(3) = 31.40 Prob > chi2 = 0.0000 * Stata 8 code. poisgof * Stata 9 code and output. estat gof Goodness-of-fit chi2 = 70.36371 Prob > chi2(27) = 0.0000
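Constraining the coefficient on logm to 1 makes log(months) an offset. The same fits can be requested directly with poisson's offset() or exposure() options (a sketch, not part of the original page):

* equivalent specifications: months enters as an exposure/offset
poisson acc tb tc td te t6569 t7074 t7579 o7579, exposure(months)
poisson acc tb tc td te t6569 t7074 t7579 o7579, offset(logm)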