I have RNA-seq count data that I suspect follow a Poisson distribution, except that the right tail is quite heavy. Does anyone know of a suitable heavy-tailed distribution? It would be even better if it has an implementation in R. Thanks so much!
Common heavy-tailed distributions
All commonly used heavy-tailed distributions are subexponential.
Those that are one-tailed include:
the Pareto distribution;
the Log-normal distribution;
the Lévy distribution;
the Weibull distribution with shape parameter less than 1;
the Burr distribution;
the log-gamma distribution;
the log-Cauchy distribution, sometimes described as having a "super-heavy tail" because it exhibits logarithmic decay producing a heavier tail than the Pareto distribution.
Those that are two-tailed include:
The Cauchy distribution, itself a special case of both the stable distribution and the t-distribution;
The family of stable distributions, excepting the special case of the normal distribution within that family. Some stable distributions are one-sided (or supported by a half-line), see e.g. Lévy distribution. See also financial models with long-tailed distributions and volatility clustering.
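To make "heavy-tailed" concrete: a right tail is heavy when the survival function S(x) = P(X > x) is not exponentially bounded, i.e. exp(t*x) * S(x) grows without bound for every t > 0. Here is a minimal sketch in Python (standard library only) that checks this numerically for three of the one-tailed distributions listed above. The parameter values, the choice t = 0.1, and the grid of x values are arbitrary assumptions for illustration, and checking one t on a finite grid is only suggestive, not a proof.

```python
import math

# Log survival functions, log P(X > x), written in closed form.
def log_sf_exponential(x, rate=1.0):
    return -rate * x

def log_sf_pareto(x, alpha=2.0, xm=1.0):
    # Valid for x >= xm.
    return alpha * (math.log(xm) - math.log(x))

def log_sf_weibull(x, shape=0.5, scale=1.0):
    # shape < 1 is the heavy-tailed case.
    return -((x / scale) ** shape)

def log_sf_lognormal(x, mu=0.0, sigma=1.0):
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return math.log(0.5 * math.erfc(z))

t = 0.1                      # arbitrary tilt parameter (assumption)
xs = [50, 100, 200, 400]     # arbitrary evaluation grid (assumption)

families = {
    "exponential": log_sf_exponential,
    "pareto": log_sf_pareto,
    "weibull": log_sf_weibull,
    "lognormal": log_sf_lognormal,
}

# log(exp(t*x) * S(x)) = t*x + log S(x); work in logs to avoid overflow.
tilted = {name: [t * x + f(x) for x in xs] for name, f in families.items()}

for name, values in tilted.items():
    print(name, [round(v, 2) for v in values])
```

For the heavy-tailed families the printed values keep increasing along the grid, while for the exponential (included only as a light-tailed contrast) they keep decreasing.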
I think it should be log-normal. So just log-transform it and throw a normality test at it. I know there's still a lot of debate about how legitimate a log transformation really is, but I think it is fine.
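A rough sketch of that "log it and test for normality" idea, in Python with only the standard library. The counts are made-up numbers, the pseudocount of 1 (needed because counts can be zero) is my assumption, and I have swapped in the Jarque-Bera statistic because it is easy to compute by hand; in R, `shapiro.test()` from base stats would be the more usual choice. The p-value relies on a large-sample chi-square approximation, so treat it as indicative only for a sample this small.

```python
import math

def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form, as used by the JB statistic.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Hypothetical counts for one gene across samples (made-up numbers).
counts = [0, 3, 5, 8, 12, 15, 21, 30, 44, 70, 120, 410]

# log(count + 1): the pseudocount of 1 is an assumption to handle zeros.
log_counts = [math.log(c + 1) for c in counts]

jb_raw, p_raw = jarque_bera(counts)
jb_log, p_log = jarque_bera(log_counts)
print("raw counts: JB = %.3f, p = %.4f" % (jb_raw, p_raw))
print("log counts: JB = %.3f, p = %.4f" % (jb_log, p_log))
```

A small JB (large p) on the log scale is consistent with log-normality, but it does not prove it, and the pseudocount distorts the low end of the distribution.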
RNA-seq data generally follow a Poisson log-normal distribution. I can send you R and C++ code to evaluate and fit that distribution if you are interested. The negative binomial is easier to handle, but its tails are not as heavy as the ones found in real data. Email: [email protected]
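To illustrate the tail comparison (this is not the R/C++ code offered above, just a small simulation sketch under my own assumptions): both models are Poisson counts with a random rate, log-normal for the Poisson log-normal and gamma for the negative binomial. The sketch below, in Python with only the standard library, matches the two on mean and variance and then compares an extreme upper quantile. The mean of 10, sigma of 1, sample size, and seed are all arbitrary choices. In R the same draws could be made with `rpois(n, rlnorm(n, mu, sigma))` and `rnbinom(n, size = shape, mu = mean)`.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Large rates are split into chunks (a sum of independent Poissons is
    Poisson), so exp(-chunk) never underflows to zero.
    """
    total = 0
    while lam > 0:
        chunk = min(lam, 500.0)
        limit = math.exp(-chunk)
        k, p = 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        total += k
        lam -= chunk
    return total

def upper_quantile(xs, q):
    """Simple empirical quantile (no interpolation)."""
    s = sorted(xs)
    return s[int(q * (len(s) - 1))]

rng = random.Random(1)       # fixed seed so the run is reproducible
n = 100_000                  # arbitrary sample size (assumption)
mean, sigma = 10.0, 1.0      # arbitrary model settings (assumption)

# Log-normal rate with the requested mean.
mu = math.log(mean) - sigma ** 2 / 2.0
rate_var = (math.exp(sigma ** 2) - 1.0) * mean ** 2

# Gamma rate with the same mean and variance (shape = NB "size").
shape = mean ** 2 / rate_var
scale = rate_var / mean

pln = [rpois(rng.lognormvariate(mu, sigma), rng) for _ in range(n)]
nb = [rpois(rng.gammavariate(shape, scale), rng) for _ in range(n)]

q999_pln = upper_quantile(pln, 0.999)
q999_nb = upper_quantile(nb, 0.999)
print("means:", sum(pln) / n, sum(nb) / n)
print("99.9% quantile, Poisson log-normal:", q999_pln)
print("99.9% quantile, negative binomial: ", q999_nb)
```

With these settings the two samples have about the same mean, but the Poisson log-normal reaches noticeably larger values at the extreme quantile, which is the "heavier tail" being described. How well either model fits real RNA-seq counts is a separate question that this simulation does not address.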