-------------------------------------------------------------------------------
help for simregress
-------------------------------------------------------------------------------
Regression Simulation
simregress, [ n(#) dist(distribution type) p(#) df(#) hetmult(#) corrx(zero|random|matrix name) numxs(1 2 or 3) robust graph reps(#) ]
Description
simregress simulates regression results under different assumptions about the distribution of the residuals.
The regression model is assumed to be of the form
Y = B0 + B1*x1 + B2*x2 + B12*x1x2 + B3*x3 + e
where x1, x2, and x3 are simulated normal variables drawn from a multivariate normal distribution with the correlations specified in corrx(), and the residuals (e) are distributed as specified in dist() (and, optionally, df(), p(), and hetmult()).
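To make the model concrete, here is a minimal sketch of how a single sample of this form could be generated and fitted by hand. It is not the code that simregress runs: the coefficient values (all set to 1), the seed, the sample size, and the use of drawnorm are assumptions chosen for illustration.

```stata
* Illustrative only: one simulated sample, not simregress's internals
clear
set seed 12345

* Uncorrelated predictors (identity correlation matrix), 30 observations
matrix C = (1, 0, 0 \ 0, 1, 0 \ 0, 0, 1)
drawnorm x1 x2 x3, n(30) corr(C)

* Normal residuals; the coefficients of 1 are assumed values
generate e = rnormal()
generate y = 1 + 1*x1 + 1*x2 + 1*x1*x2 + 1*x3 + e

* Fit the same form: main effects, the x1-by-x2 interaction, and x3
regress y c.x1##c.x2 x3
```

Repeating these steps many times and collecting the estimates is the general idea behind a simulation of this kind.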
Options
n(#) permits you to specify the sample size for each iteration.
dist(distribution type) permits you to specify the distribution of the residuals. You can choose chi2, binomial, exponential, expnormal, log, lognormal, uniform, bimodal, or normal.
df() - If you choose chi2 then you can also specify df() for the number of degrees of freedom (the default is 1).
p() - If you choose binomial then you can also specify p() for the probability of the residual being a 1 (as compared to being a 0); the default is .5.
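As a rough illustration of what some of these residual distributions look like, the sketch below draws from four of them with Stata's built-in random-number functions. This is an assumption about the general shapes only; whether simregress centers or rescales the draws before using them as residuals is not stated here, and the variable names are hypothetical.

```stata
* Illustrative only: raw draws from four of the distributions named in dist()
clear
set seed 12345
set obs 1000

generate e_normal  = rnormal()          // standard normal
generate e_chi2    = rchi2(2)           // chi-square with 2 df, as in df(2)
generate e_binom   = rbinomial(1, .5)   // 1 with probability .5, else 0, as in p(.5)
generate e_uniform = runiform()         // uniform on the unit interval

* Compare means, spreads, and ranges
summarize e_*
```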
hetmult(#) permits you to specify a multiplier that determines the degree of heterogeneity. A value of 0 means no heterogeneity. Higher values indicate more heterogeneity.
corrx(zero|random|matrix name) permits you to specify the correlation among the predictors (x's); the default is zero. You can specify zero to indicate no correlation, or random to indicate a correlation structure made up at random. Alternatively, you can provide the name of a matrix that contains the correlations among the x's. For example, define
matrix mycorr = (1, .1, .2 \ .1, 1, .3 \ .2, .3, 1)
and then specify corrx(mycorr). This must be a 3 by 3 matrix.
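Before passing a matrix to corrx(), it can help to confirm that it is entered correctly. The sketch below is one way to do that, assuming the example values for mycorr; note that Stata separates matrix rows with a backslash. The check with drawnorm and correlate is an illustration and is not part of simregress.

```stata
* Define a 3 by 3 correlation matrix; rows are separated by a backslash
matrix mycorr = (1, .1, .2 \ .1, 1, .3 \ .2, .3, 1)
matrix list mycorr

* issymmetric() returns 1 when the matrix is symmetric
display issymmetric(mycorr)

* Draw a large sample with this correlation structure and inspect it
clear
set seed 12345
drawnorm x1 x2 x3, n(10000) corr(mycorr)
correlate x1 x2 x3
```

The sample correlations reported by correlate should be close to the values entered in mycorr.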
numxs(1 2 or 3) allows you to specify the number of predictors in the model, either 1, 2 or 3.
robust indicates that the regression should be run with robust standard errors using the robust option.
graph will perform a separate run with 50,000 observations and then display the distribution of the residuals, the residuals vs. the fitted plots, and a table of the residuals conditioned on six categories of X1.
reps(#) specifies the number of repetitions to perform. A typical value might be about 1000 or 3000 repetitions. The more repetitions, the smaller the confidence intervals.
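To get a feel for how precision improves with reps(), the sketch below computes an approximate 95% margin of error for an estimated proportion. It assumes the quantity of interest is a proportion near .05, such as a rejection rate, and uses the usual normal approximation; this is an illustration, not output from simregress.

```stata
* Approximate 95% margin of error for an estimated proportion near .05
display 1.96*sqrt(.05*.95/1000)    // with 1000 repetitions
display 1.96*sqrt(.05*.95/3000)    // with 3000 repetitions
```

Under these assumptions the margin is roughly .0135 with 1000 repetitions and roughly .0078 with 3000. It shrinks with the square root of the number of repetitions, so tripling reps() narrows the interval by a little over 40 percent.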
Examples
* Sample 30 with normal distribution
simregress , n(30) dist(normal) reps(2000)
* Sample 15 with chi square with 2 df distribution
simregress , n(15) dist(chi2) df(2) reps(2000)
* Sample 30 with normal distribution and random correlation
simregress , n(30) dist(normal) reps(2000) corrx(random)
* Sample 30 with normal distribution and specify correlation of Xs
matrix mycorr = (1, .1, .2 \ .1, 1, .3 \ .2, .3, 1)
simregress , n(30) dist(normal) reps(2000) corrx(mycorr)
* Sample 30 with uniform distribution but heterogeneity factor of 3
simregress , n(30) dist(uniform) hetmult(3) reps(2000)
* Sample 30 with normal distribution and 1 X
simregress , n(30) dist(normal) numxs(1) reps(2000)
* Sample 30 with normal distribution
* Also, do a separate run with N=50,000 and graph the residuals
simregress , n(30) dist(normal) graph reps(2000)
* Sample 30 with normal distribution and use robust standard errors
simregress , n(30) dist(normal) robust reps(2000)
Author
------
Michael Mitchell
Academic Technology Services
UCLA
mnmatucla.edu