Empirically, the distributions are too close to separate, so the choice is usually one of convenience: the logit because it can be transformed to odds; the probit for its latent normal interpretation. The latter allows things like a relatively simple MCMC implementation.
There are a number of different link functions that are used in discrete outcome Binomial/Bernoulli/Multinomial models for transforming the underlying probability of an outcome:
the logit: often favoured in epidemiology because its coefficients are readily transformed to odds ratios, so that we can say things like: those who smoke >25 cigarettes a day have 6 times the odds of dying before 65 years of age compared with non-smokers; it is also more readily extended to the multinomial case, with either ordered or unordered categorical outcomes;
the probit: widely used in econometrics; we can think of it as an underlying normal variable (utility) which turns into a discrete outcome when a threshold is crossed; in more complex models (e.g. multilevel models) this conceptualisation (the underlying normal trick) is very useful for fitting models by Markov chain Monte Carlo (MCMC) estimation, because you can simulate easily from this underlying normal variable;
the complementary log-log (cloglog): widely used to model grouped survival data, where it makes it possible to estimate hazard ratios or relative risks.
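As a rough illustration of the three links listed above, here is a minimal Python sketch using only the standard library. The function names are my own, and the two probabilities passed to odds_ratio are hypothetical values chosen only so that the odds ratio comes out at 6, echoing the smoking example.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))


def probit(p):
    """Standard Normal quantile of a probability p."""
    return _STD_NORMAL.inv_cdf(p)


def cloglog(p):
    """Complementary log-log of a probability p."""
    return math.log(-math.log(1.0 - p))


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    return _STD_NORMAL.cdf(eta)


def inv_cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta))


def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio recovered by exponentiating a difference of logits."""
    return math.exp(logit(p_exposed) - logit(p_unexposed))


# Tabulate the three links over a few proportions.
for p in (0.1, 0.2, 0.5, 0.8, 0.9):
    print(f"p={p:.1f}  logit={logit(p):+.3f}  "
          f"probit={probit(p):+.3f}  cloglog={cloglog(p):+.3f}")

# Hypothetical probabilities: odds of 1.5 against odds of 0.25.
print("odds ratio:", odds_ratio(0.6, 0.2))
```

Exponentiating a difference of logits gives an odds ratio directly, which is the convenience referred to above; the probit and cloglog scales have no equally simple back-transformation to odds.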
The figure (see attachment) plots all three link functions against the corresponding proportions/probabilities. All three are approximately linear when the proportions on which they are based lie between 0.2 and 0.8, and clearly non-linear outside this range. The logit and probit are symmetric around a proportion of 0.5, where both are 0. The probit is based on the standard Normal distribution, while the logit is based on the standard logistic distribution. While both distributions have a mean of zero, they have quite different variances: the Normal has a variance of 1 and the logistic a variance of π²/3 (about 3.29). This results in logits being larger in absolute value than their probit counterparts. The ratio of the standard deviations is π/√3 (about 1.81), but a somewhat smaller factor is usually quoted because it matches the two curves more closely over the central range of probabilities: it has been suggested that to convert from a probit to a logit, the probit is multiplied by about 1.61, and to convert from a logit to a probit, the logit is multiplied by about 0.625.
[The variance scales by 1.61^2, so the coefficients themselves scale by 1.61; see Amemiya, T. (1981) Qualitative response models: a survey, Journal of Economic Literature 19, 1483-1536.]
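The quoted factor of 1.61 can be set beside the ratio of standard deviations implied by the two variances, π/√3. The sketch below is only a numerical check under my own assumptions (the grid limits, the step count and the function names are not from the post): it measures the largest gap between the Normal CDF and a rescaled logistic CDF for each factor.

```python
import math
from statistics import NormalDist


def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))


def max_gap(scale, lo=-4.0, hi=4.0, steps=801):
    """Largest |Phi(z) - logistic_cdf(scale * z)| on an evenly spaced grid of z."""
    normal = NormalDist()
    gap = 0.0
    for i in range(steps):
        z = lo + (hi - lo) * i / (steps - 1)
        gap = max(gap, abs(normal.cdf(z) - logistic_cdf(scale * z)))
    return gap


sd_ratio = math.pi / math.sqrt(3.0)  # logistic SD divided by Normal SD
gap_quoted = max_gap(1.61)           # factor quoted in the post
gap_sd = max_gap(sd_ratio)           # factor implied by matching variances

print(f"SD ratio               : {sd_ratio:.4f}")
print(f"max gap with 1.61      : {gap_quoted:.4f}")
print(f"max gap with SD ratio  : {gap_sd:.4f}")
```

On this grid the quoted factor leaves a smaller worst-case gap between the two curves than the variance-matching factor does, which is one way to see why a value near 1.6 is used in practice.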
The cloglog is slightly different from the others. It is based on a standard extreme value distribution and is not symmetric around 0.5: viewed as a function of the linear predictor, the probability approaches one more sharply than it approaches zero. The reverse pattern, a 'slower' approach to one, can be achieved by re-defining the response so as to apply the model to (say) unemployment rather than employment.
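The asymmetry of the cloglog, and the effect of re-defining the response, can be checked numerically. This is a minimal sketch with illustrative function names of my own; it evaluates the probability at a few values of the linear predictor, once for the cloglog model as it stands and once for the same model applied to the complementary outcome.

```python
import math


def inv_cloglog(eta):
    """P(Y = 1) under a cloglog link with linear predictor eta."""
    return 1.0 - math.exp(-math.exp(eta))


def inv_cloglog_on_complement(eta):
    """P(Y = 1) when the cloglog model is applied to the complement 1 - Y.

    The sign of the linear predictor is flipped so that the curve still
    rises with eta, which makes the two columns directly comparable.
    """
    return 1.0 - inv_cloglog(-eta)


print(" eta   cloglog on Y   cloglog on 1 - Y")
for eta in (-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0):
    print(f"{eta:+4.1f}   {inv_cloglog(eta):12.4f}   "
          f"{inv_cloglog_on_complement(eta):16.4f}")
```

The second column is the mirror image of the first: whichever limit one curve reaches quickly, the other reaches slowly, so the choice of which category to code as the response decides where the long tail sits.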
While the distributions differ in their theoretical rationale, in my view they can be treated as the same for all practical purposes, as it would take an impossibly large sample to distinguish them empirically.
Graph attached
LogitEtc.docx
Probit and logit model? - ResearchGate. Available from: https://www.researchgate.net/post/Probit_and_logit_model2 [accessed Jul 29, 2016].
As far as I know, all these models can be interpreted as latent variable models with different error distributions, and the choice between them should follow from a priori knowledge about the errors.
Another criterion might be the deviance, but latent variable modelling seems more meaningful to me.
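The latent variable reading can be sketched by simulation: an outcome of 1 is recorded whenever a linear predictor plus a random error exceeds zero, and the error distribution decides which link results. The code below is an illustration under my own assumptions (the seed, the sample size, the value of eta and all function names are hypothetical), using Normal, logistic and extreme value (Gumbel) errors.

```python
import math
import random
from statistics import NormalDist


def open_unit(rng):
    """Uniform draw on the open interval (0, 1), so the logs below are finite."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_logistic(rng):
    u = open_unit(rng)
    return math.log(u / (1.0 - u))  # inverse CDF of the standard logistic


def draw_gumbel_max(rng):
    return -math.log(-math.log(open_unit(rng)))  # inverse CDF of exp(-exp(-x))


def simulated_probability(draw_error, eta, n, rng):
    """Share of draws in which the latent variable eta + error exceeds zero."""
    hits = sum(1 for _ in range(n) if eta + draw_error(rng) > 0.0)
    return hits / n


rng = random.Random(2016)  # arbitrary seed, for repeatability only
eta, n = 0.5, 200_000

results = {
    "probit": (simulated_probability(draw_normal, eta, n, rng),
               NormalDist().cdf(eta)),
    "logit": (simulated_probability(draw_logistic, eta, n, rng),
              1.0 / (1.0 + math.exp(-eta))),
    "cloglog": (simulated_probability(draw_gumbel_max, eta, n, rng),
                1.0 - math.exp(-math.exp(eta))),
}

for name, (simulated, exact) in results.items():
    print(f"{name:8s} simulated={simulated:.4f}  inverse link={exact:.4f}")
```

Each simulated share should sit close to the corresponding inverse link evaluated at eta, which is the sense in which the three models differ only in the assumed error distribution.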
In my view the logit and probit models are empirically indistinguishable; you can also convert mathematically from one to the other, so it is just a matter of convenience.