* This compares how well the logit performs relative to the probit when the data are generated under the assumptions of the probit.
* First let's generate data that is consistent with the probit assumptions
clear
set seed 101
set obs 1000
* x is the explanatory variable
gen x = rnormal()*(1/2)^(1/2)
* u is the error
gen u = rnormal()*(1/2)^(1/2)
* y is the unobservable structural y
gen y = x + u
* The probit is built on the standard normal distribution. That is a normalization rather than a restriction, so long as you remember it when generating values.
* This is why x and u are each rnormal()*(1/2)^(1/2): var(y) = var(x+u) = (1/2)*1 + (1/2)*1 = 1
sum y
* The standard deviation is pretty close to 1 in the sample
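* A quick check of the variance claim, as a sketch: summarize stores the sample
* variance in r(Var), so it can be displayed directly instead of squaring the SD by eye.
quietly sum y
di "Sample variance of y: " r(Var)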
* y_prob is the probability of observing a 1 given y
gen y_prob = normal(y)
* y_observed holds the actual binary draws
gen y_observed = rbinomial(1,y_prob)
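* A quick sanity check, as a sketch: y is symmetric around zero, so roughly half
* of the binary draws should be 1.
tab y_observed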
* Now let's try estimating the probit first
probit y_observed x
* Save the estimated coefficient to a local macro
local coef_probit = _b[x]
* let's predict the probabilities
predict y_probit
label var y_probit "Probit fitted values"
* Now let's estimate the logit
logit y_observed x
* Save the logit coefficient to a local macro
local coef_logit = _b[x]
* predict the probabilities from the logit
predict y_logit
label var y_logit "Logit fitted values"
* We can see that the probit and logit fitted values are almost identical
two (line y_logit x, sort) (line y_probit x, sort)
di "It is a somewhat well known property that probits and logits are in practice almost linearly equivalent."
di "The ratio of probit to logit is: `=`coef_probit'/`coef_logit''"
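* For reference, the inverse ratio can be displayed the same way. A commonly cited
* rule of thumb is that logit coefficients are roughly 1.6 times probit coefficients;
* that figure is a general approximation, not a result taken from this simulation.
di "The ratio of logit to probit is: `=`coef_logit'/`coef_probit''"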
reg y_probit y_logit
* Check out that R-squared!
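* A minimal sketch of two more direct comparisons (abs_diff is a variable name
* introduced here for illustration): the correlation between the two sets of
* fitted probabilities, and the largest gap between them.
corr y_probit y_logit
gen abs_diff = abs(y_probit - y_logit)
sum abs_diff
di "Largest absolute gap in fitted probabilities: " r(max)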
* So what does all of this practically mean?
* Feel free to switch between probit and logit whenever you want. The choice should not generally have much effect on your fitted probabilities; the raw coefficients differ mainly by a scale factor.
* Note: for mathematical reasons it is sometimes easier to use one rather than the other.
* Finally, if you want to recover the original coefficient on x, the best thing to do is to take the average partial effect (APE).
probit y_observed x
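* A minimal sketch of computing the APE, assuming the probit just estimated is still
* the active estimation. margins averages the partial effect of x over the sample.
margins, dydx(x)
* The same quantity by hand (xb_hat and pe_x are names introduced here for illustration):
* the probit partial effect of x is normalden(xb)*_b[x], averaged over observations.
predict xb_hat, xb
gen pe_x = normalden(xb_hat)*_b[x]
sum pe_x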
di (1/2)^(1/2)
test x==.70710678
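* The probit estimate gets fairly close, but we reject the null. That is expected here:
* y_observed is drawn with probability normal(x+u), so conditional on x alone
* P(y_observed=1) = normal(x/sqrt(1+1/2)), and the implied coefficient is about 0.816 rather than 0.707.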
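* As a sketch, the same test can be run against 1/sqrt(1.5). The binomial draw adds
* standard normal noise on top of u, so the composite error has variance 1.5 and the
* coefficient on x is scaled by 1/sqrt(1.5). b_implied is a local macro name introduced here.
local b_implied = 1/sqrt(1.5)
di `b_implied'
test x == `b_implied'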