Negative binomial panel count data model - can anyone help?

The negative binomial distribution is appropriate to model count data in the case of overdispersion (the variance is greater than the mean). If the variance is equal to the mean the Poisson distribution would be more appropriate.
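As a quick first check for overdispersion, you can compare the sample variance of the counts with their sample mean. The sketch below is only a rough screen, not a formal test. The function name `dispersion_ratio` and the example data are made up for illustration and do not come from any package.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean of the counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the dispersion ratio is undefined")
    return variance(counts) / m


# Hypothetical count data, invented for illustration only.
counts = [0, 0, 1, 1, 2, 3, 5, 8, 13, 21]
ratio = dispersion_ratio(counts)
print(f"mean={mean(counts):.2f} variance={variance(counts):.2f} ratio={ratio:.2f}")
```

A ratio close to 1 is consistent with a Poisson model, while a ratio well above 1 points toward the negative binomial. Note that this ignores covariates, so a regression-based check is still needed once predictors enter the model.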

SPSS fits models for count data assuming a negative binomial distribution and a logarithmic link function through its Generalized Linear Models procedure.

Kofi Nkansah

Stata, SPSS, SAS: the UCLA IDRE site has examples and data sets to play with for all of them: ats.ucla.edu

Just search for the topic and the software you currently have and you are good to go. The previous contribution by Liberato is the simplest explanation of nbreg you can get. Good luck.

Brandy R Sinco

Other options are SAS procs GenMod and Glimmix. To use a GEE model, use Proc GenMod. For a generalized linear mixed model, use Proc Glimmix.

Example with Proc GenMod:

http://support.sas.com/documentation/cdl/en/statug/65328/HTML/default/viewer.htm#statug_genmod_examples09.htm

Example with Proc Glimmix:

http://support.sas.com/documentation/cdl/en/statug/65328/HTML/default/viewer.htm#statug_glimmix_examples15.htm

Link to a presentation on Proc Glimmix given by a statistics professor at the Midwest SAS Users Group conference:

http://www.mwsug.org/proceedings/2013/AA/MWSUG-2013-AA11.pdf

Hope this helps.

Pedro Joel Mendes Rosa

Hi Sumati!

You can use negative binomial regression for modeling count variables, usually when they are over-dispersed. Assuming that you are using SPSS, I am leaving you here an example of a negative binomial regression with all the steps and the interpretation. Hope it helps. Cheers.

Jeff Skinner

Arne, please note that the "gamlss" package library is for fitting "generalized ADDITIVE models" (GAM), not for fitting "generalized linear models". These GAM techniques are really for "data mining" applications, where you are investigating a very large number of predictor variables (e.g. 100s or 1000s of predictors) that might have complicated, but totally independent relationships with your dependent variable. These applications might include genomic data (microarray, next gen sequencing), biometric data, web traffic data, financial data, etc.

If you have a simple experiment with a single response and a small number of predictors (e.g. 1 to 10 predictors), then you should probably stick with regular generalized linear models. If you just want to use regular generalized linear models, then you should use the glm() function from the base "stats" package library. If you need to test a few random-effect predictors, then you could consider using the glmer() function from the "lme4" package library.

Daniel Moya

I agree with Liberato, both on the application of the negative binomial distribution and on the use of SPSS as a very useful tool when working with large databases.

Anıl Aktas Samur

Stata, SAS, and R are all suitable for negative binomial data.

http://www.rutgerscps.org/docs/CountRegressionModels.pdf

http://www.ats.ucla.edu/stat/stata/dae/nbreg.htm

Livia Valentin

Dear Varma,

Models for count data have been prominent in many branches of the recent applied literature, for example in health economics, management, and industrial organization. The foundational building block in this modeling framework is the Poisson regression model. However, because of its implicit restriction on the distribution of observed counts (in the Poisson model, the variance of the random variable is constrained to equal the mean), researchers generally employ more general specifications such as the negative binomial (NB) model, which is the standard choice for a basic count data model. There are two well-known, non-nested forms of the negative binomial model, denoted NB1 and NB2 in the literature. Researchers have typically chosen one form or the other (usually NB2) without actually articulating a preference for either. You need to choose which form of the negative binomial is better suited to your research.

The form for the negative binomial model and applied the techniques in an analysis of a large sample of population.

Since the NB1 and NB2 models are not nested, there is no simple parametric test that one can employ to choose between them. This appeal to the estimation algorithm appears to be the closest to a preference for one or the other as appears in the recent literature.
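To make the difference between the two forms concrete, here is a minimal sketch of their variance functions as they are usually written in the count data literature: NB1 has a variance that is linear in the mean, and NB2 has a variance that is quadratic in the mean. The function names and the value of `alpha` are chosen only for illustration.

```python
def nb1_variance(mu, alpha):
    """NB1: variance is linear in the mean, Var(y) = mu + alpha * mu."""
    return mu + alpha * mu


def nb2_variance(mu, alpha):
    """NB2: variance is quadratic in the mean, Var(y) = mu + alpha * mu**2."""
    return mu + alpha * mu ** 2


alpha = 0.5  # hypothetical dispersion value, for illustration only
for mu in (1.0, 2.0, 5.0, 10.0):
    print(f"mu={mu:5.1f}  NB1 var={nb1_variance(mu, alpha):6.2f}  "
          f"NB2 var={nb2_variance(mu, alpha):6.2f}")
```

Both forms reduce to the Poisson variance when alpha is 0, and they coincide at a mean of 1. They diverge as the mean grows, which is why the choice between them matters most when the fitted means vary widely.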

You can use SPSS, Stata, or GraphPad Prism 6 to analyze your data.

All the best and success in your research!...

Laura Stancampiano

I used Stata, but pay attention: Stata fits both GLM and negative binomial regression. Here is the link to a former discussion about it: https://www.researchgate.net/post/STATA_GLM_and_negative_binomial_regression1

In negative binomial regression, Stata estimates the parameter alpha, which is simply the inverse of the k parameter of the negative binomial distribution, well known to parasitologists. The k parameter is inversely related to aggregation and can be estimated in Stata with the additional module nbfit. The K parameter requested by GLM is exactly the alpha parameter of negative binomial regression (and it is equal to 1/k); its default value is 1, so if alpha is not 1, the results will be different.
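The relationship alpha = 1/k can be illustrated with a simple method-of-moments estimate of k, which uses the negative binomial variance, mean + mean^2/k. This is a rough sketch and not what nbreg or nbfit compute (those rely on maximum likelihood). The function name and the data are hypothetical.

```python
from statistics import mean, variance


def moment_estimate_k(counts):
    """Method-of-moments estimate of k from Var = mean + mean**2 / k."""
    m = mean(counts)
    v = variance(counts)
    if v <= m:
        raise ValueError("no overdispersion: the moment estimate of k is undefined")
    return m * m / (v - m)


# Hypothetical, strongly aggregated counts, invented for illustration only.
counts = [0, 0, 0, 1, 1, 2, 4, 7, 12, 23]
k_hat = moment_estimate_k(counts)
alpha_hat = 1.0 / k_hat  # alpha is the inverse of k
print(f"k_hat={k_hat:.3f}  alpha_hat={alpha_hat:.3f}")
```

A small k (large alpha) indicates strong aggregation, and as k grows the distribution approaches the Poisson.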

Elena Gervilla

Hi Sumati!

I've used the Poisson regression model (PRM) in two papers to measure the influence of risk factors on the number of joints and units of alcohol consumed per week. In the two articles, most of the sub-samples analyzed did not comply with the assumption of equidispersion (Cameron and Trivedi, 1990), one of the basic assumptions of the PRM. So I ran alternative models: the Negative Binomial Regression Model (NBRM), Zero Inflated Poisson (ZIP), and Zero Inflated Negative Binomial (ZINB), and the last of these gave the best fit to the data in all cases.

You can find the info I included in the paper:

https://www.researchgate.net/publication/49714137_Quantification_of_the_influence_of_friends_and_antisocial_behaviour_in_adolescent_consumption_of_cannabis_using_the_ZINB_model_and_data_mining?ev=prf_pub

I hope this helps you. Good luck!

Sumati Varma

Thanks Elena !!

Sumati Varma

Thank you for all this input. Does anyone use E-Views for negative binomial models? Is there any preference between E-Views and Stata?

Gudeta Weldesemayat Sileshi

For data with a large number of zeros, the ordinary negative binomial distribution (NBD) is inadequate, as the data will be overdispersed. In that case a better approach is to use a zero-inflated negative binomial or a hurdle model. For panel data, a non-linear mixed effects model (NLMIXED) assuming Poisson or NBD is better than a generalized linear model (GLM), as the former allows one to include both fixed effects (e.g. covariates of interest) and random effects (e.g. within-individual variation) and still estimate the zero-inflation. I know of this only in NLMIXED of SAS. I am not familiar with R code for that. I have included the SAS code for NLMIXED in an earlier paper in Pedobiologia 52:1-17 (2008). Hope you can access it; due to copyright requirements I will not be able to upload it.
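For readers who want to see what the zero-inflated negative binomial looks like, here is a minimal sketch of its probability mass function: a point mass at zero (probability `pi`) mixed with an NB2 count distribution. It does not include the random effects discussed above and is not the NLMIXED code from the paper. The function names and parameter values are assumptions made for illustration.

```python
import math


def nb_pmf(y, mu, alpha):
    """NB2 probability of count y, with mean mu > 0 and dispersion alpha > 0."""
    r = 1.0 / alpha  # the "size" parameter, often written k
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu))
             + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def zinb_pmf(y, mu, alpha, pi):
    """Zero-inflated NB: extra zeros with probability pi, NB2 counts otherwise."""
    base = (1.0 - pi) * nb_pmf(y, mu, alpha)
    return pi + base if y == 0 else base


# Hypothetical parameter values, for illustration only.
mu, alpha, pi = 2.0, 0.8, 0.3
total = sum(zinb_pmf(y, mu, alpha, pi) for y in range(500))
print(f"P(0) plain NB = {nb_pmf(0, mu, alpha):.4f}")
print(f"P(0) ZINB     = {zinb_pmf(0, mu, alpha, pi):.4f}")
print(f"sum of ZINB probabilities over 0..499 = {total:.6f}")
```

The zero-inflated version always puts more probability on zero than the plain negative binomial with the same mean and dispersion, which is what lets it absorb an excess of zeros.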

Aziz Kutlar

You can use STATA

Cem Payaslıoğlu

There are several good references. First, there is Greene, W. (2007), "Functional Form and Heterogeneity in Models for Count Data," Foundations and Trends in Econometrics, Vol. 1, No. 2, 113–218, which is more like a monograph than an article and also presents tabular output from several applications. This study focuses specifically on the heterogeneity issue plaguing microeconometric studies, and page 196 features negative binomial panel output. There is also Hilbe's treatise Negative Binomial Regression, which discusses negative binomial panel models extensively in Chapter 14.
