Geometric Random Variable: It can be shown that a Geometric random variable can be simulated using the following argument (int(ln(u)/ln(1-p)) + 1) where u is a uniform(0,1) random variable and p is the probability of observing a success (Simulation by Ross, 2003). In this example we are going to generate a Geometric random variable with 1000 observations with probability of success p = 0.25.
clear set obs 1000 local p = .25 gen u = uniform( ) gen g = int(ln(u)/ln(1-`p')) + 1
Negative Binomial Random Variable as a sum of independent Geometric Random Variables: To generate a Negative Binomial random variable we make use of the fact that a Negative Binomial random variable is sum of r independent Geometric random variables, where r is the of trials required to observe the rth success and p is the probability of a success. In this example, we are going to simulate p = 0.25 and let r = 4. Note that unlike in the geometric case, the uniform(0,1) is defined within the foreach argument so that the r trials are independent.
clear set obs 1000 local r = 4 local p = .25 foreach var of newlist nb1-nb`r' { gen `var' = int(ln(uniform( ))/ln(1-`p')) + 1 } egen nb = rsum(nb1-nb`r') drop nb1-nb`r'
Poisson and Zero-Inflated Poisson Random Variable: We can simulate a Poisson and Zero-Inflated Poisson random variable by recoding a uniform(0,1) random variable in terms of the cumulative distribution. What allows us to simulate both types of variables in the same code fragment is when we specify p0–the percent of zeros not explained by a Poisson distribution–to be 0, the Zero-Inflated Poisson reduces to the traditional Poisson distribution. The while loop continues until the cumulative probability exceeds 0.99999 (a problem may exist if a value from the uniform random variable is greater than 0.99999). What is required for this code fragment is the specification of the Poisson parameter lambda and p0.
We need to note that the program was stable for values of lambda less than 100. We recommend checking that x was recoded for all values before proceeding.
clear set obs 1000 gen x = uniform( ) gen u = x local lambda = 10 local p0 = .40 local cp=0 local n =0 while `cp'<=.99999 { local p = `p0'*(`n'==0) + (1-`p0')*exp(-1*`lambda')*(`lambda'^`n')/exp(lnfactorial(`n')) local cp= `cp'+`p' local pcp = `cp'-`p' quietly recode x (`pcp'/`cp' = `n') local n = `n' + 1 }
Negative Binomial and Zero-Inflated Negative Binomial Random Variable that allows for over dispersion. The hallmark of the Poisson distribution is that the mean is equal to the variance. Many times that assumption is not satisfied and the variance is greater than the mean. To account for the over-dispersion, we can use a Negative Binomial as a Gamma mixture of Poisson random variable that accounts for over-dispersion by adding a parameter alpha. Like the prior code fragment, we can simulate both types of variables within the same code fragment (the prior code was more stable when we specified extreme values of lambda and p0, this one is not. We recommend using conservative values of the parameter and checking that x was recoded for all values). When alpha is equal to zero, we end up with a Poisson random variable (this program blows up if you set alpha = 0, instead we recommend setting alpha =0.01). The code fragment requires the specification of the mean, p0 and alpha. The density formula of the negative binomial distribution was taken from Regression Analysis of Count Data by Cameron and Trivedi, formula (4.14).
clear set obs 1000 gen x = uniform( ) gen u = x local mean = 20 local alpha = .1 local p0 = .20 local a = 1/`alpha' local cp=0 local n =0 while `cp'<.99999 { local p = `p0'*(`n'==0) + (1-`p0')*(exp(lngamma(`n'+`a')) / (exp(lngamma(`a'))*(exp(lnfactorial(`n')))))*((`a'/(`a'+`mean'))^`a') /// *((`mean'/(`mean'+`a'))^`n') local cp= `cp'+`p' local pcp = `cp'-`p' quietly recode x (`pcp'/`cp' = `n') local n = `n' + 1 }