Microbial ecologists often apply diversity indices from the macro world (e.g., animals or plants) to the microbial world. If you get certain numbers from these calculations, what do the indices really mean for microbial communities?
Hello Daniel, These indices are certainly used for measuring diversity.
As we know, there are basically three types of diversity: alpha, beta and gamma (terms used when we quantify diversity; otherwise we simply speak of species diversity, ecosystem diversity, genetic diversity, etc.). The diversity indices you have mentioned are used to quantify the diversity of one distinct population in a particular area at a particular point in time. In this case, the Shannon-Wiener and Simpson (Simpson-Yule) indices can be used in combination with the McIntosh index.
Using these, you can compare the numbers that indicate the diversity of two (or more) distinct populations.
Moreover, with the Shannon-Wiener index a higher number indicates higher diversity. For the Simpson-Yule index, however, there are three different expressions: D, 1-D and 1/D. When you use the value of D alone, a higher number indicates lower diversity, which is counterintuitive, so two more expressions of the same index were developed, 1-D and 1/D, both of which increase with diversity.
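To make the three Simpson expressions concrete, here is a minimal Python sketch. The function name `simpson_forms` and the two example communities are illustrative assumptions, not taken from any package; D is computed as the sum of squared proportional abundances.

```python
def simpson_forms(counts):
    """Return (D, 1 - D, 1 / D) for a list of per-species counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    d = sum(p * p for p in proportions)  # Simpson's D: high when one species dominates
    return d, 1 - d, 1 / d


# Hypothetical communities of 100 individuals each
even = [25, 25, 25, 25]
uneven = [97, 1, 1, 1]

for name, counts in (("even", even), ("uneven", uneven)):
    d, one_minus_d, inverse_d = simpson_forms(counts)
    print(f"{name}: D={d:.4f}  1-D={one_minus_d:.4f}  1/D={inverse_d:.4f}")
```

D is larger for the uneven community, while 1-D and 1/D are both larger for the even one, which is why the latter two forms are easier to read as "diversity".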
Diversity indices can be applied to anything - living or not - so there's no worry about applying them to microorganisms.
But they're affected by two components: the number of different entities (species, genes, or whatever else) and the variance between the numbers of individuals in each of these entities. Different indices are affected to different degrees by these two components. Species richness, for example, is obviously affected only by the number of species. Simpson's, however, is relatively unaffected by the number of species and so highlights differences between the numbers of individuals of each species. There are very many diversity indices. Which of these you choose depends on which of the two components you want to highlight. The key reference for such indices is, of course, Magurran's book on measuring diversity.
Please note that the Shannon index is usually wrongly used. It is valid only for complete counts and not for samples. See Pielou's book Mathematical Ecology for the reasons behind this.
They are summary statistics, well described by Andrish, but of limited value compared to multivariate approaches. Primer6 added Taxonomic Distinctness, similar to what molecular biologists often call Genetic Diversity; both indices incorporate the path lengths of relatedness in pairwise comparisons across phylogenetic tree structures, whether Linnaean or sequence based. Even these, which contain much more information than Shannon, Simpson and similar indices, are limited summary statistics that leave behind a great deal of information. There are equations for comparing alpha, beta and gamma diversity, as described by Manthan; a somewhat recent science paper addressed this for plant communities and could, I suspect, be found with a quick search.
The number of species alone does not tell as much about the diversity of an ecosystem as a diversity index does. In macro ecosystems, a forest with 1 bear, 1 wolf, 1 fox and 100 rabbits is a less diverse ecosystem than a forest with 1 bear, 2 wolves, 5 foxes and 50 rabbits. This is due to the different relative abundances of the species present in the two ecosystems. It is very similar in the microbial world: only by taking into account the relative abundances of all species present can we measure the diversity of an ecosystem. A higher reciprocal Simpson index means higher diversity. The Shannon index is a bit more confusing because different authors use different bases for the logarithm in the equation (in the original theory the base was 2), so results from different studies cannot be compared directly unless they are converted to a common base. As long as the same base is used, however, a higher index again corresponds to higher diversity.
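The two forests above can be checked numerically. This is a minimal sketch, assuming the counts given in the example; the function names are illustrative. It also shows that changing the logarithm base rescales the Shannon index without changing which forest ranks higher.

```python
import math


def shannon(counts, base=math.e):
    """Shannon index H' = -sum(p * log(p)) in the chosen logarithm base."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def inverse_simpson(counts):
    """Reciprocal Simpson index 1 / sum(p^2)."""
    total = sum(counts)
    return 1 / sum((c / total) ** 2 for c in counts)


forest_a = [1, 1, 1, 100]  # bear, wolf, fox, rabbits
forest_b = [1, 2, 5, 50]

for base in (2, math.e, 10):
    print(f"base {base:.3g}: H'(A)={shannon(forest_a, base):.3f}  "
          f"H'(B)={shannon(forest_b, base):.3f}")
print(f"1/D: A={inverse_simpson(forest_a):.3f}  B={inverse_simpson(forest_b):.3f}")
```

A value reported in one base can be converted to another by dividing by the logarithm of the new base, e.g. H'(base 2) = H'(natural log) / ln 2.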
To me the question is a little unclear. Diversity indices are ecologically relevant, and most were developed for that purpose, although some originated in information science (e.g., the Shannon index). However, as discussed above, diversity indices are not perfect, as each focuses more on one aspect of diversity or another; the sheer diversity of diversity indices suggests as much. Diversity indices may be relevant as part of an "ecological" question, such as the well-known diversity-ecosystem function relationship. For the diversity of the community itself, a more informative approach, such as the species abundance distribution, should be pursued in addition. HTH. SK.
As was suggested by some of the others, there is not necessarily any "ecological meaning" to these diversity indices. These indices can be calculated for macro-organisms, micro-organisms or even inanimate objects like fasteners in bins in a hardware shop. They are simply a summary measure of how many different things are there, and they reflect the relative abundance of these (since the relative proportion of each "type" of "thing" in the community is the key factor in the calculation of each index). To back up further, as pointed out by Dr. Rajilic-Stojanovic above, if the proper form of the index is used, a higher value means higher "diversity". The relationship between the diversity of a community (now talking about microbial communities, to get back to your question) and a given function is not necessarily straightforward. Some would like to think that they are directly related, but since these indices are not generally calculated from any functional variable (for example, a 16S rRNA-encoding gene sequence can't be directly assigned to an ecosystem function with any certainty), this proposed relationship is not necessarily true. In summary, the relationship between something like Shannon or Simpson and an ecological "meaning" isn't quite straightforward.
I agree with most of what has been said and just want to highlight one thing. There are different things beneath the Shannon index: one of them is species richness, as said before, and another is the frequency distribution of these species. For instance, a community of 100 animals is more diverse (higher Shannon value) if it has 10 species with 10 animals each than if it has 10 species, one with 55 individuals and the other 9 with 5 each. If you want to know how different factors are affecting your indices, it is a good idea to calculate species richness and the probability of interspecific encounter in each sample.
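As a rough illustration of the two 100-animal communities above, the sketch below reports species richness, the Shannon index and the probability of interspecific encounter separately. The PIE formula used here is Hurlbert's, (N / (N - 1)) * (1 - sum(p^2)); the function names are assumptions made for this example.

```python
import math


def richness(counts):
    """Number of species with at least one individual."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index with natural logarithms."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pie(counts):
    """Hurlbert's probability of interspecific encounter."""
    n = sum(counts)
    return (n / (n - 1)) * (1 - sum((c / n) ** 2 for c in counts))


even = [10] * 10          # 10 species, 10 animals each
uneven = [55] + [5] * 9   # 1 dominant species, 9 rare ones

for name, counts in (("even", even), ("uneven", uneven)):
    print(f"{name}: S={richness(counts)}  H'={shannon(counts):.3f}  "
          f"PIE={pie(counts):.3f}")
```

Richness is identical in both cases, so the drop in the Shannon index for the second community comes entirely from the unequal abundances, which PIE picks up on its own.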
Although the Shannon index assumes complete counts, you can address that by resampling. EcoSim (http://garyentsminger.com/ecosim/) does it for you by applying rarefaction. You can use the same software to estimate species richness and other indices in a sound way. It has some very useful examples that will help you use the software and grasp the concepts.
The only thing I would add is that, although these measures do have a long history of use for understanding community structure in macro-organisms, their application to bacteria (especially when applied to data generated using 16S based sequencing) is really just based on convention (or dare I say, habit). The ecological implications (or meaning), and our ability to interpret changes in these measures, are a big unknown at this point. There is a lot of work to be done in the development of meaningful and useful measures of the structural diversity of microbial assemblages. When something as basic as the species concept begins to break down... how useful is counting the number of species? Exciting times!
The two statistics measured by diversity indices are richness and evenness. Richness basically measures how many different species are found within a sample or habitat, while evenness describes the distribution of abundances among those species. For example, if you compare population A(a1,2;a2,2;a3,2;a4,2) (where a1..a4 are different species, each with 2 individuals) with population B(b1,4;b2,2;b3,1;b4,1), then the richness in both is the same but the evenness in A is greater than in B. By definition, greater evenness means higher diversity, so in our case population A is considered more diverse than population B.
What most diversity indices do is to assign a certain weight to these 2 statistics and return an estimate of diversity.
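The A versus B comparison can be sketched in a few lines of Python. Pielou's J (Shannon index divided by the log of richness) is used here as one common evenness measure; that choice, and the function names, are assumptions made for illustration rather than something prescribed in this thread.

```python
import math


def shannon(counts):
    """Shannon index with natural logarithms."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's J = H' / ln(S); equals 1 when all species are equally abundant."""
    s = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(s)


population_a = [2, 2, 2, 2]  # a1..a4
population_b = [4, 2, 1, 1]  # b1..b4

for name, counts in (("A", population_a), ("B", population_b)):
    print(f"{name}: richness={len(counts)}  H'={shannon(counts):.3f}  "
          f"J={pielou_evenness(counts):.3f}")
```

Both populations have a richness of 4, but J is 1 for A and 0.875 for B, so any index that weights evenness ranks A as the more diverse of the two.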
For microbial ecology, particularly for sequence based surveys, you can only meaningfully compare samples obtained by the exact same methodology. The exact number you get is less important than how it compares to other samples. Naturally, species in all samples must be called using the same parameters, particularly the % similarity used for OTUs. The assembly of the sequence collection (i.e. how rigorous the screening for bad sequences, chimeras, sequencing errors etc. was) is of particular importance, as singletons (sequences appearing only once in a sample) artificially inflate the richness. This is particularly problematic when the samples have different resistance to PCR, resulting in different noise levels. Finally, when samples of different sizes are compared, you must make sure to compare percent of sample rather than actual numbers, and again to control for singletons (not easy). If you have enough sequences, as is often the case now with next gen tag-sequencing, it is better to generate random sub-samples from each sample and compare the average indices for these.
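The random sub-sampling suggested above might look like the following sketch. It assumes each sample is a list of per-OTU sequence counts, draws sequences without replacement down to a common depth, and averages the Shannon index over the draws. The function name `rarefied_shannon`, the default of 200 draws, the fixed seed and the example counts are all hypothetical.

```python
import math
import random
from collections import Counter


def shannon(counts):
    """Shannon index with natural logarithms."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def rarefied_shannon(counts, depth, n_draws=200, seed=0):
    """Mean Shannon index over random sub-samples of `depth` sequences."""
    # Expand the counts into one label per sequence so each read can be drawn once.
    pool = [otu for otu, c in enumerate(counts) for _ in range(c)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of sequences in the sample")
    rng = random.Random(seed)
    values = []
    for _ in range(n_draws):
        subsample = rng.sample(pool, depth)  # sampling without replacement
        values.append(shannon(list(Counter(subsample).values())))
    return sum(values) / len(values)


# Hypothetical samples sequenced to very different depths
deep_sample = [500, 300, 100, 50, 30, 10, 5, 3, 1, 1]  # 1000 sequences
shallow_sample = [60, 25, 10, 5]                       # 100 sequences

depth = min(sum(deep_sample), sum(shallow_sample))
print("deep, rarefied:   ", round(rarefied_shannon(deep_sample, depth), 3))
print("shallow, rarefied:", round(rarefied_shannon(shallow_sample, depth), 3))
```

Sub-sampling to the smaller sample's depth means rare OTUs and singletons have the same chance of being detected in both samples, so the averaged indices are compared on an equal footing.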
Put simply, the indices indicate how diverse the organisms living in a certain habitat are. A higher index value indicates more diverse organisms in that habitat. The Shannon and Simpson indices can also be used to measure evenness, which describes how equal the numbers of the different organisms in that habitat are. The highest evenness value is 1, indicating that the numbers of the different organisms in the habitat are roughly the same.
Those indices by themselves might not tell how diverse a community really is. By comparing index values alone, it is also hard to know the real difference in diversity between two or more communities. However, if we convert the diversity indices into the effective number of species (ENS), it becomes much easier and more straightforward to describe the biological diversity in a community. The effective number of species (ENS) is the real biological diversity rather than an index. After converting the different diversity indices into the ENS, it is easier to compare the real difference in biodiversity among communities.
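The conversions to ENS are short enough to write out. The sketch below uses the standard relations ENS = exp(H') for the Shannon index (natural log), ENS = 1/D for Simpson's D and ENS = 1/(1 - GS) for the Gini-Simpson index; the function names and the 8 versus 16 species example are illustrative assumptions.

```python
import math


def ens_from_shannon(h, base=math.e):
    """Effective number of species from a Shannon index computed in `base`."""
    return base ** h


def ens_from_simpson(d):
    """Effective number of species from Simpson's D = sum(p^2)."""
    return 1 / d


def ens_from_gini_simpson(gs):
    """Effective number of species from the Gini-Simpson index 1 - sum(p^2)."""
    return 1 / (1 - gs)


# Two hypothetical communities with 8 and 16 equally common species
for s in (8, 16):
    h = math.log(s)   # Shannon index of a perfectly even community
    gs = 1 - 1 / s    # Gini-Simpson index of the same community
    print(f"S={s}: H'={h:.3f} -> ENS={ens_from_shannon(h):.1f};  "
          f"1-D={gs:.4f} -> ENS={ens_from_gini_simpson(gs):.1f}")
```

Doubling the number of equally common species raises the Shannon index by only about a third and barely moves the Gini-Simpson index, whereas the ENS doubles in both cases, which is what makes it easier to compare across communities.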
Hello there! I think the ecological meaning and conclusions we can derive from diversity analysis applied to bacterial assemblages could be quite constrained since, at the molecular level, bacteria do not always satisfy the species definition.
For bacterial communities, people often use operational taxonomic units (OTUs) to classify bacteria into different species/groups. OTUs usually depend on the similarity of the 16S rRNA gene. The problems behind this include: What is the threshold similarity (e.g. 97%)? How well do OTUs correlate with bacterial functions/activities? How does diversity affect bacterial functions in an ecosystem?
To put it very simply, Shannon's index measures information: high H = more information, that is, more diversity. But there are no coefficients to use to calculate, e.g., how many genes or species this refers to.
Simpson's index simply gives the probability that two randomly drawn sequences from the pool are the same.
Shannon's index is just a single point estimate (as are species counts or Simpson's index). Shannon's index in particular mixes up the two fundamental aspects of structural diversity, i.e. richness and evenness, and its absolute value does not say much by itself. What is important for comparing communities in time and space is, instead, the summary of diversity indices through the diversity spectrum profile, which shows how the mixture of diversity components changes in going from richness to evenness. In this respect, the diversity index family A (Patil & Taillie, 1976, 1979), which is easy to calculate and to which Shannon's and Simpson's indices belong, is likely to be preferred for its desirable properties, in particular for the possibility of drawing diversity spectrum profiles that allow a continuum of analysis between the two extremes of species richness (species count) and evenness (Simpson's index). The diversity index family was proposed at the end of the 1970s; do not be afraid to use it, because being old does not make it rubbish! This structural diversity spectrum is an important property of each community, reflecting its complex dynamics, and it has unfortunately been neglected in biodiversity research. It could instead be the basis for better understanding what is behind the observed changes in structural diversity, i.e. their ecological meaning, no matter which community is being considered.
The ideal answer is that if you read Magurran, A.E. 1988, you can understand this at least briefly. Alpha, beta and gamma diversity may be studied.
There are two components to the original question.
1) What do the diversity measures mean in a conceptual sense? In other words, what phenomenon are you quantifying when calculating e.g. the Shannon index?
2) What is the ecological meaning of the values of the diversity measures in a particular case? In other words, given the sampling setup and other characteristics of the obtained dataset, what of ecological interest can be learned from the resulting numbers?
Answer to the second question can only be given when the exact details of a dataset are known, but some general principles are worth keeping in mind. Firstly, all diversity measures are based on classifying observed entities (such as units of biomass, or individual animals or microbes) into non-overlapping types (such as species, higher taxa, haplotypes or functional types). Secondly, all diversity indices are calculated using equations that include the proportional abundances of the types (abundance can be measured, for example, as the number of individuals, cover, volume, or biomass). For the diversity measures to give useful results, it is therefore crucial that the classification that is being used is appropriate, that the abundances of the types can be recorded with sufficient accuracy and using a relevant measurement unit, and that the entities that are observed in the first place are chosen and observed in a way that is relevant. What is appropriate, sufficiently accurate and relevant depends on the questions at hand.
As to the conceptual meaning of the diversity measures, there is a lot of confusion out there, but personally I prefer the following interpretations. For simplicity, the descriptions are written in terms of species diversity, although the same principles apply with other classifications.
Diversity (D) = the effective number of species. This is also known as 'true diversity' and is calculated as the number of equally-abundant species that would give the same mean proportional species abundance as is observed in the dataset of interest (where species may not be equally abundant). D is quantified as the inverse of the weighted mean of the proportional species abundances. The proportional abundances themselves are used as weights, which gives abundant species more weight than rare ones. Ultimately the weight given to rare vs abundant species is determined by the order of the diversity, i.e. which mean is used when calculating mean proportional species abundance (harmonic mean = order zero, geometric mean = order 1, arithmetic mean = order 2, etc.).
Shannon index (H') = a measure of the uncertainty in the species identity of an individual (or other unit of abundance) that is picked at random from the dataset of interest. Increasing the number of species and/or the evenness of their proportional abundances increases the value of the Shannon index. The Shannon index quantifies an entropy, and it is monotonically related to true diversity (of order 1) by H' = log(D) = log(1/mean[p]) where mean[p] is the weighted geometric mean of the proportional species abundances.
Simpson index = several indices have been referred to by this name, and although all of them are monotonic transformations of true diversity, each actually means something different. The original Simpson index equals 1/D = sum(p*p) = mean(p) where mean(p) refers to the weighted arithmetic mean of the species proportional abundances. This index quantifies the probability that two individuals (or other units of abundance) that are picked at random from the dataset (with replacement) represent the same species. The Gini-Simpson index = 1 - 1/D quantifies the probability that the two randomly picked individuals represent different species. The inverse Simpson index equals D itself (of order 2).
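The relationships described above between true diversity of different orders and the Shannon and Simpson indices can be sketched as follows. The function `true_diversity` and the example counts are illustrative assumptions; the formula is the usual one for Hill numbers, with order 1 handled as the limiting case.

```python
import math


def true_diversity(counts, q):
    """Effective number of species (Hill number) of order q."""
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit as q -> 1: exponential of the Shannon entropy
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1 / (1 - q))


counts = [50, 30, 20]  # hypothetical dataset with three species

d0 = true_diversity(counts, 0)  # species richness
d1 = true_diversity(counts, 1)
d2 = true_diversity(counts, 2)  # inverse Simpson index

print(f"order 0: {d0:.3f}   order 1: {d1:.3f}   order 2: {d2:.3f}")
print(f"Shannon H' = log(D) of order 1:       {math.log(d1):.3f}")
print(f"original Simpson = 1/D of order 2:    {1 / d2:.3f}")
print(f"Gini-Simpson = 1 - 1/D of order 2:    {1 - 1 / d2:.3f}")
```

Because higher orders give more weight to abundant species, the value can only stay the same or decrease as the order goes from 0 to 2; the three orders coincide when all species are equally abundant.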
As to the partitioning of diversity into alpha, beta and gamma diversity, these become relevant only when a dataset consists of (or gets divided into) subunits in such a way that each individual (or other unit of abundance) belongs to exactly one subunit (species can occur in more than one subunit).
Gamma diversity = total diversity of species (or other types) in the dataset of interest.
Alpha diversity = mean species diversity per subunit in the dataset of interest.
Beta diversity = diversity of subunits, i.e. the number of subunits that would be needed to contain the observed total species diversity, if the subunits shared no species but had the same species diversity as the actual subunits do on average.
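The three definitions above can be tried out on a small dataset. This is a minimal sketch that assumes diversity of order 1, weights each subunit by its share of the total abundance when averaging, and obtains beta as gamma divided by alpha; the function `partition_diversity` and the example data are hypothetical.

```python
import math


def shannon(counts):
    """Shannon entropy with natural logarithms."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def partition_diversity(subunits):
    """Return (alpha, beta, gamma) of order 1 for a list of subunits.

    Each subunit is a list of abundances, one entry per species,
    with species in the same order in every subunit.
    """
    totals = [sum(u) for u in subunits]
    grand_total = sum(totals)
    weights = [t / grand_total for t in totals]

    pooled = [sum(column) for column in zip(*subunits)]
    gamma = math.exp(shannon(pooled))
    # Mean diversity per subunit: exponential of the weighted mean entropy
    alpha = math.exp(sum(w * shannon(u) for w, u in zip(weights, subunits)))
    beta = gamma / alpha
    return alpha, beta, gamma


# Two subunits that share no species, each with 4 equally abundant species
disjoint = [[5, 5, 5, 5, 0, 0, 0, 0],
            [0, 0, 0, 0, 5, 5, 5, 5]]
# Two subunits with identical composition
identical = [[5, 5, 5, 5], [5, 5, 5, 5]]

print("disjoint :", partition_diversity(disjoint))
print("identical:", partition_diversity(identical))
```

With no shared species, beta equals the number of subunits (2); with identical subunits it equals 1, matching the reading of beta as an effective number of compositionally distinct subunits.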
As to literature, Magurran has updated the diversity book (Magurran 2004: Measuring biological diversity), but a lot has already been written about the different diversity measures since then.
What I explained above has been discussed more extensively in the following (including illustrations of the concepts and equations for their quantification):
Tuomisto, H. 2010. A consistent terminology for quantifying species diversity? Yes, it does exist. Oecologia 164: 853–860.
Tuomisto, H. 2011. Commentary: do we have a consistent terminology for species diversity? Yes, if we choose to use it. Oecologia 167: 903–911. (see also commentaries by Gorelick, Jurasinski & Koch and Moreno & Rodriguez in the same issue)
Numerous other papers (mostly more technical than the above) have also been written on diversity since Lou Jost revived the ecologists' interest in Hill numbers (which equal true diversity) in the following papers:
Jost, L. 2006. Entropy and diversity. Oikos 113: 363–375.
Jost, L. 2007. Partitioning diversity into independent alpha and beta components. Ecology 88: 2427–2439.
Sorry about the lengthy reply, but I hope it is of some help.
The effective number of species (ENS) is a real biological diversity (equivalent), and it also converts different indices (e.g., Shannon's or Simpson's) into one uniform number that makes biological sense and allows comparison of diversity. But for the microbial world, where the concept of species is not very clear, does ENS still make sense?
Diversity is an effective number of types -- and these types do not need to be species. Depending on the research questions, it may be more relevant to quantify the diversity of genera, families, OTU's, haplotypes or something else. These should not be considered as 'surrogates for species', because it is perfectly valid to quantify them in their own right. Things just get unnecessarily confusing if all diversity measures are interpreted in terms of species diversity, whether they really address species or not.
In practice, diversity is calculated for a dataset consisting of observation units (such as individuals, microbial cells or bacterial colonies) that have been classified into types. In my opinion, the effective number of types makes sense if the following conditions hold:
1) The dataset has been compiled in a way that makes sense.
2) The classification into types makes sense.
3) The abundances of the types (e.g., number of individuals, coverage of colonies) have been quantified in a way that makes sense.
What makes sense is context-specific and depends above all on the objectives of the study.
Diversity indices compound two aspects of assemblage composition: one is species richness (i.e. the number of species) and the other is species evenness (are the individuals evenly spread amongst the species?). Using separate measures of these two aspects is preferable to a single confounded index. Hurlbert has discussed this issue. You may also wish to consider diversity spectra, i.e. how the diversity measure changes as you increase sample size. The indices are supposed to be independent of sample size, but rarely are. Margalef has published on diversity spectra.
The key point in the definition of diversity in the microbial context is how the units of description of the bacterial community are compared. In the conventional species-based diversity indices (e.g. Simpson), it is assumed that any two individuals belonging to different species have maximum dissimilarity, but such a crisp definition of types is not always interesting or possible in microbial communities. The units in microbial communities may be gene sequences for which no species identification in the conventional sense may exist, but for which a genetic or functional distance matrix is defined. Even if a classification is defined, assuming that every type is equally contributing to diversity no matter how distinct the types are from each other, may not be the best option. Measures such as Rao entropy, that combine the proportional presence or abundance of the units in the community with their genetic or functional dissimilarities, may be more meaningful and useful for interpretation.
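A small sketch may help show how a measure like Rao entropy differs from Simpson-type indices. It uses the common form Q = sum over i, j of d_ij * p_i * p_j, where d_ij is the dissimilarity between units i and j; the function name, the proportions and the distance matrices are made-up values for illustration.

```python
def rao_quadratic_entropy(proportions, distances):
    """Expected dissimilarity between two units drawn at random with replacement."""
    n = len(proportions)
    return sum(distances[i][j] * proportions[i] * proportions[j]
               for i in range(n) for j in range(n))


proportions = [0.5, 0.3, 0.2]  # hypothetical relative abundances of three sequence types

# Every pair of different types is maximally dissimilar (distance 1)
crisp = [[0.0, 1.0, 1.0],
         [1.0, 0.0, 1.0],
         [1.0, 1.0, 0.0]]

# Types 0 and 1 are close relatives (distance 0.1); type 2 is distant from both
graded = [[0.0, 0.1, 1.0],
          [0.1, 0.0, 1.0],
          [1.0, 1.0, 0.0]]

gini_simpson = 1 - sum(p * p for p in proportions)
print("Gini-Simpson:          ", round(gini_simpson, 3))
print("Rao Q, crisp distances:", round(rao_quadratic_entropy(proportions, crisp), 3))
print("Rao Q, graded:         ", round(rao_quadratic_entropy(proportions, graded), 3))
```

With all between-type distances set to 1, Q reduces to the Gini-Simpson index; once two abundant types are treated as close relatives, Q drops, reflecting that the community is less diverse than a count of types alone would suggest.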
A diversity index is a quantitative measure that reflects how many different types (such as species) there are in a dataset (a community), and simultaneously takes into account how evenly the basic entities (such as individuals) are distributed among those types. When diversity indices are used in ecology ... Whatever method is used to define operational taxonomic units (OTUs), suitable measures are required to describe and, more importantly, compare these highly ... of diversity indices have been used with bacterial communities, in particular the ubiquitous Shannon index, the evenness indices derived from it. There are many ecological diversity measures, but their suitability for use with highly diverse bacterial communities is unclear and seldom considered. ... index alpha; the Q statistic (but only if coverage is 50% or more); the Berger-Parker and Simpson's indices, although their ecological relevance may be limited. Why choose the Shannon index (H′) as a default? The index, which is the negative sum of each OTU's proportional abundance multiplied by the log of its proportional abundance, is a measure of the amount of information (entropy) in the system and hence is a measure of the difficulty in predicting the identity of the next individual sampled [8]. It is positively correlated with species richness and evenness and gives more weight per individual to rare than common species.