Hello all,
I have sequencing results from a TnSeq library of Agrobacterium tumefaciens grown under two different conditions. My overall goal is to find out which genes are important or beneficial for survival in each condition. To do that, I want to look at the under- or over-representation of transposon insertions per gene (insertion number or insertion density) and compare the diversity of transposon insertions per gene between the two conditions. Genes with the lowest insertion density would be considered beneficial in that condition, on the assumption that an insertion disrupts the gene's function and that mutants lacking a function needed for survival are lost from the population.
I have both the transposon insertion density per gene and the frequency of each individual insertion within a gene (read count) in an Excel file.
First, I am planning to use the Shannon diversity index to check whether transposon insertion diversity differs between the conditions. I chose this index because it accounts for both richness (the number of different insertions) and evenness (the relative frequency of each insertion) within a community. However, I am not sure whether it tells us specifically where the dissimilarities are. Also, would the Bray-Curtis similarity index be helpful in this kind of situation?
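To make the two indices concrete, here is a minimal Python sketch of what they would compute for a single gene. It assumes each gene is treated as a "community" whose "species" are the distinct insertion sites, with read counts as abundances. The function names and the example counts are hypothetical and are not taken from the actual data.

```python
import math

def shannon_index(read_counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over sites with nonzero reads."""
    total = sum(read_counts)
    if total == 0:
        return 0.0  # no insertions detected in this gene
    return -sum((c / total) * math.log(c / total) for c in read_counts if c > 0)

def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two count vectors over the same sites.

    0 means identical counts; 1 means no shared insertions.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both vectors must cover the same insertion sites")
    denom = sum(counts_a) + sum(counts_b)
    if denom == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(counts_a, counts_b)) / denom

# Hypothetical read counts at the same four insertion sites of one gene.
condition_a = [10, 10, 10, 10]  # reads spread evenly across sites
condition_b = [37, 1, 1, 1]     # one insertion dominates

print(shannon_index(condition_a))             # equals ln(4) for four even sites
print(shannon_index(condition_b))             # lower, because evenness is lower
print(bray_curtis(condition_a, condition_b))
```

Note that the Shannon index gives one number per gene per condition, so it summarizes diversity but does not show which insertion sites differ. Bray-Curtis compares the two count vectors site by site, and the corresponding similarity is one minus the value returned here.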
Once I have calculated the diversity index, I want to test whether the difference in diversity between conditions is statistically significant. I have heard about ANOSIM, ADONIS and PERMANOVA, but I am not sure whether these methods would be appropriate in this case. Could anyone please clarify this?
I hope my questions are clear enough, but if not, please let me know!
Thanks in advance for your help.