I am interested in using an offset term as the first fixed term in my linear regression models. I am doing this because I am interested in specific fungal taxa (i.e. plant pathogens), and rarefying my data removed a large proportion of these (likely) biologically meaningful OTUs, which matters particularly for my research theme.
I base the offset term on library size calculated from OTU read counts from NGS data.
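For concreteness, here is a minimal Python sketch of how I understand the offset to be built. The OTU table, sample names, and function names are made up for illustration. The assumptions are that library size is the total read count per sample and that the offset enters a log-link model on the log scale.

```python
import math

# Hypothetical OTU table: sample -> read counts per OTU (illustrative values only).
otu_table = {
    "sample_A1": [120, 0, 35, 845],
    "sample_A2": [60, 12, 0, 428],
    "sample_B1": [300, 5, 90, 1605],
}


def library_size(counts):
    """Total number of reads in one sample."""
    return sum(counts)


def log_offset(counts):
    """Offset for a log-link count model: log of the library size."""
    return math.log(library_size(counts))


# One offset value per sample, to be supplied alongside the predictor.
offsets = {sample: log_offset(counts) for sample, counts in otu_table.items()}

for sample, value in offsets.items():
    size = library_size(otu_table[sample])
    print(f"{sample}: library size = {size}, offset = {value:.3f}")
```

With a log link, an offset like this fixes the coefficient on log(library size) at 1, so the model describes the response per read rather than the raw count.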
Is anyone familiar with this, and more specifically, is anyone aware of any problems associated with this approach?
I have a simple model with a single predictor with two levels. When the graphs obviously show that OTU richness is distinct between the two levels, the results are in fact saying the difference is not significant. And vice versa! To be more specific, I am running negative binomial GLMs to account for the overdispersion of the count data, and the residual plots indicate this is a good fit.
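To show the kind of disagreement I mean, here is a small Python sketch with invented numbers. It uses a Poisson model as a simplification (not the negative binomial I actually fit), because with a log(library size) offset and a single two-level factor the Poisson estimate of each level's rate has a closed form: total response divided by total library size. All names and values are hypothetical.

```python
# Hypothetical data: (OTU richness, library size) per sample, for two levels.
groups = {
    "level_1": [(80, 10_000), (90, 12_000), (85, 11_000)],
    "level_2": [(120, 30_000), (130, 33_000), (125, 31_000)],
}


def mean_richness(samples):
    """Average raw richness, i.e. what a plot of the raw data shows."""
    return sum(r for r, _ in samples) / len(samples)


def rate_per_read(samples):
    """Poisson estimate of the level's rate under a log(library size) offset."""
    total_richness = sum(r for r, _ in samples)
    total_reads = sum(n for _, n in samples)
    return total_richness / total_reads


# Ratio of level 2 to level 1 on the raw scale and on the per-read scale.
raw_ratio = mean_richness(groups["level_2"]) / mean_richness(groups["level_1"])
offset_ratio = rate_per_read(groups["level_2"]) / rate_per_read(groups["level_1"])

print(f"raw richness ratio (level 2 / level 1): {raw_ratio:.2f}")
print(f"per-read ratio with offset (level 2 / level 1): {offset_ratio:.2f}")
```

In this made-up case raw richness is higher in level 2 while richness per read is lower, so a plot of raw richness and a model with the offset would point in opposite directions. I do not know whether this is what is happening in my data.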
Any light that can be shed on this will be greatly appreciated.
Thank you in advance!