I am scoring SSR markers for genetic diversity analysis and found that some of the markers have 4 alleles, but when I analysed them I got negative PIC values. Do you have any suggestions about why this is?
Hi, I do not know why you get negative values, but I would suggest reading De Riek et al. (2001) on PIC determination for dominant markers (such as ISSR), which should give a PIC value between 0 and 0.5, whilst codominant markers should give a PIC between 0 and 1 (please see Shete et al. (2000), Theoretical Population Biology 57:265-271).
Maybe something is wrong with your equation. PIC values should be positive and vary between 0 and 1. Take a look at this article: Tessier, C., J. David, P. This, J. M. Boursiquot, and A. Charrier, 1999: Optimization of the choice of molecular markers for varietal identification in Vitis vinifera L. Theor. Appl. Genet. 98, 171-177.
PIC values are never negative, and there is no min or max value. It is said that markers with a PIC above 0.5 are informative. You can get the PIC in PowerMarker if the data is allelic.
I have some SSR markers that give more than one specific band per genotype; one of them has 5 alleles. I tried changing the annealing temperature, but the bands are still there. So if you calculate it with the formula:
$$\mathrm{PIC}_i = 1 - \sum_{j=1}^{n} P_{ij}^{2}$$
Allele 1 (present in 37 of the 70 genotypes): frequency 0.53
Does somebody have a solution for calculating the PIC of a marker with 5 alleles where some genotypes show more than one allele (Fig-1)? Further, is this formula also incorrect if there are some heterozygous lines? That leads to the same problem: there are more bands than genotypes, so the allele frequencies will be incorrect. I have attached some pictures for your reference.
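To see how the formula above behaves, here is a minimal Python sketch. It assumes the negative values come from plugging band-presence proportions (the share of genotypes showing each band) into 1 - ΣP², which is a guess about the cause and not something confirmed in this thread. Apart from the 0.53 quoted above, all frequencies are made up for illustration, and `pic_simple` is a hypothetical helper name.

```python
def pic_simple(freqs):
    """Return 1 - sum(p^2) for the frequencies of the alleles at one locus."""
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical allele frequencies for a 5-allele locus; they sum to 1.
allele_freqs = [0.40, 0.25, 0.15, 0.12, 0.08]

# Hypothetical band-presence proportions: the share of genotypes showing each
# band. Only 0.53 comes from the post. Because several genotypes carry more
# than one band, these proportions sum to 2.63 rather than 1.
band_presence = [0.53, 0.60, 0.55, 0.50, 0.45]

print(round(pic_simple(allele_freqs), 4))   # 0.7342
print(round(pic_simple(band_presence), 4))  # -0.3959
```

With true allele frequencies the result stays between 0 and 1. Once the inputs sum to more than 1, the sum of squares can exceed 1 and the result turns negative.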
An SSR marker may interrogate alleles at more than one locus. If your marker amplifies more than one locus in the genome you are studying, first separate the bands by locus and then calculate the PIC for each locus. I hope this helps.
The PIC value will be almost zero if there is no allelic variation, and it can reach a maximum of 1.0 if a genotype has only new alleles, which is a rare phenomenon. PIC is mainly used to assess the diversity of a gene or DNA segment in a population, which throws light on the evolutionary pressure on the allele and the mutations the locus might have undergone over time.
@Anand, if you could run your PCR fragments on a capillary machine to get the peak sizes, then score the peaks and run your codominant microsatellite data through the PowerMarker software, I am sure you will get everything.
Dear Anand, you mentioned that more than one band was amplified in some genotypes. When you use SSR markers for genetic diversity analysis, the real problem is scoring, because sometimes more than one band is amplified when there is a mutation at the primer annealing site in the flanking region. Secondly, if there is residual heterozygosity, more than one band will also be amplified; this will be the case when landraces are used in diversity analysis. So the best way is to compare your results with the PCR product size of that particular marker from a database like Gramene. Confirm the results by performing electrophoresis in MetaPhor agarose or PAGE.
You should score each distinct band as a separate locus so that the total of all allele frequencies for a single locus does not exceed 1.0. This is critical for getting correct values.
A negative PIC value is possible when many of the genotypes analyzed show more than three alleles per genotype. The subtracted portion of the formula becomes larger than 1, and hence the PIC value becomes negative.
Perhaps you should give us more details about your model species so that we can better answer your question. For instance, we might need to know the ploidy of the species, how you score your alleles, and how you compute the PIC for each locus.
Gene diversity, PIC and expected heterozygosity (He) are all analogous and give us a diversity measure. For example, the common biased estimator for He is calculated by subtracting the expected frequencies of homozygotes (under Hardy-Weinberg equilibrium) from 1, so the range of this measure is 0-1.
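As a rough illustration of how close these measures are, the Python sketch below computes the biased He estimate described above from diploid genotype scores, alongside the PIC definition commonly attributed to Botstein et al. (1980), PIC = 1 - Σp_i² - Σ_{i<j} 2p_i²p_j². That second formula is not quoted in this thread, so treat its use here as my assumption; the genotypes and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Count every allele copy across all genotypes and convert to frequencies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """Biased He: 1 minus the expected homozygote frequencies under HW."""
    return 1.0 - sum(p * p for p in freqs)


def pic_botstein(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    freqs = list(freqs)
    homozygotes = sum(p * p for p in freqs)
    cross_terms = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygotes - cross_terms


# Hypothetical diploid genotypes at one SSR locus (allele sizes in bp).
genotypes = [(150, 150), (150, 156), (156, 162), (150, 162), (156, 156)]

freqs = allele_frequencies(genotypes)  # 150: 0.4, 156: 0.4, 162: 0.2
print(round(expected_heterozygosity(freqs.values()), 4))  # 0.64
print(round(pic_botstein(freqs.values()), 4))             # 0.5632
```

Because the frequencies are built from allele counts, they always sum to 1, and both measures stay within 0-1; the PIC here is slightly lower than He because of the extra cross term.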
Yes, sometimes a higher number of loci is amplified, primarily because of PCR annealing temperature problems. With codominant markers like SSR, this should not be mistaken for genetic variation or mutation, because repeating the experiment may reveal the true capability of that primer (the optimum number of loci). So if you have a PIC value greater than 1, it is probably due to a PCR problem, and it is advisable to check the reproducibility of the primer.
Shubhneet Kaur and Philippe Cubry, thanks for the detailed elaboration.
When running an SSR analysis, for each primer pair targeting a certain locus, I usually run a gradient PCR to determine the best Tm for the primers. The best temperature is then selected for PCR amplification and SSR analysis. Under local conditions, the optimum Tm may not always be the same as the one suggested by the company that synthesized the primers.
As for the maximum number of bands (alleles) for "each individual":
(1) if it is a haploid (a haploid derived from a diploid plant): the maximum number of bands is one (1).
(2) if it is a diploid: the maximum number of bands is two (2), indicating a heterozygous individual, but it can also show one (1) band (a homozygous individual) or zero bands (homozygous for a null allele).
(3) if it is a triploid: the maximum number of bands is three (3), but it can also be two (2), one (1), or zero (0).
(4) and so on, depending on the ploidy level of the plant.
As for the maximum number of bands (alleles) for the "population":
That is a different matter. A population of 100 diploid individuals may actually have more than two (2) alleles in the population. However, if we look at a certain locus in only a single individual, the maximum number of bands (alleles) is only two (2). It can also be one (1) or zero (0).
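The band-count rules above can be turned into a quick sanity check before any PIC calculation. This is only a sketch in Python: the scoring layout (a dict of band sizes per individual at one locus), the sample data and the function names are all hypothetical.

```python
def flag_excess_bands(scores, ploidy):
    """Return the individuals that show more distinct bands than the ploidy allows."""
    return [name for name, bands in scores.items() if len(set(bands)) > ploidy]


def population_alleles(scores):
    """Return the sorted distinct band sizes seen across the whole population."""
    return sorted({band for bands in scores.values() for band in bands})


# Hypothetical band sizes (bp) scored at one SSR locus in a diploid species.
scores = {
    "G1": [150, 156],       # heterozygous
    "G2": [150],            # homozygous
    "G3": [],               # no band, possibly homozygous for a null allele
    "G4": [150, 156, 162],  # three bands: too many for a diploid
}

print(flag_excess_bands(scores, ploidy=2))  # ['G4']
print(population_alleles(scores))           # [150, 156, 162]
```

Individuals flagged this way are the ones worth re-checking (extra locus, PCR artefact, or wrong ploidy assumption) before their bands go into an allele-frequency table.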
I am not a good statistician, but there are a number of software packages that can be used to determine PIC once you have the genotype score for each individual you analyze. I usually use CERVUS to get the PIC information from my SSR genotype score data. It may not be the most appropriate tool, but it helps me do the PIC calculation based on SSR scores. I believe there is other software that can do the same.
Good luck with your analysis. I hope I can add some additional information, even though it may not answer your question directly.
When we try several solutions to a given problem and do not succeed, it means that more than one error is involved, and in that case we must first identify them.
Possible errors.
1. Are you actually working with SSR, or could it be ISSR?
2. Are you counting the bands in the gel correctly?
3. Are you using the appropriate method of calculation?
Possible solutions.
1. SSR is a codominant marker, while ISSR is dominant.
2. Consider what Sudarsono explained here about the number of expected bands for each individual.
3. As explained here by Marguerite Blignaut, different PIC values are expected depending on the type of marker. For a dominant marker the correct formula is PIC = 2PiQi, where Pi is the frequency of presence and Qi is the frequency of absence of a particular band (Tehrani et al. 2008).
You apply this formula to each band (allele) and then calculate the average for the primer.
For a dominant RAPD marker that exhibits many bands, using the wrong formula leads to a negative PIC value, because the larger number of bands (alleles) per primer results in a high value for the sum of the squared frequencies.
The same problem will occur if you score more than two bands per diploid individual for an SSR marker. Many bands result in a high value for the sum of the squared frequencies.
I do not know whether non-polymorphic bands should be included in the calculation (as asked by Anu Cyriac). I would also like to know, because for these bands the PIC value is zero, which reduces the PIC value for the primer (marker).
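For the dominant-marker case, the per-band formula PIC = 2PiQi and the averaging per primer can be sketched in Python as follows. The 0/1 band matrix and the function names are hypothetical, and whether monomorphic bands belong in the average is left open here, as in the post; the sketch only shows what including them does to the number.

```python
def pic_dominant_band(p_present):
    """PIC for one band of a dominant marker: 2 * P * Q, with Q = 1 - P."""
    return 2.0 * p_present * (1.0 - p_present)


def pic_dominant_primer(band_matrix):
    """Average the per-band PIC over all bands scored for one primer.

    band_matrix holds one list per band, with 1 (present) or 0 (absent)
    for each genotype.
    """
    values = [pic_dominant_band(sum(band) / len(band)) for band in band_matrix]
    return sum(values) / len(values)


# Hypothetical presence/absence scores for three bands in four genotypes.
bands = [
    [1, 1, 0, 0],  # P = 0.50 -> PIC 0.5 (the maximum for a dominant marker)
    [1, 0, 0, 0],  # P = 0.25 -> PIC 0.375
    [1, 1, 1, 1],  # monomorphic band, P = 1.00 -> PIC 0
]

print(round(pic_dominant_primer(bands), 4))      # 0.2917
print(round(pic_dominant_primer(bands[:2]), 4))  # 0.4375, polymorphic bands only
```

Each per-band value is capped at 0.5 and can never be negative, so the primer average cannot be negative either, however many bands the primer produces.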
I am scoring SSR markers for genetic diversity analysis in sugarcane genotypes and found that some of the markers have more than 4 alleles, but when I analysed them I got negative values for PIC. Does anybody have a suggestion for me?
For sugarcane, a highly polyploid genome, the number of alleles may always be more than three or so! Allelic diversity should usually be higher, so the negative value needs to be clarified. Can you please redo the analysis with some other software too?
Primer pairs with similar allele numbers can differ greatly in their PIC values because the PIC value depends on the allelic distribution. Therefore, analysis of amplification patterns based on PIC value indicates allelic richness because of the presence of a significantly higher frequency of unique alleles, which make a greater contribution to the genetic diversity of the entries.
Since you are getting more than 2 alleles, there is a chance of negative values, but it is very rare. Usually PIC values range from 0 to 0.5 for dominant markers like RAPD and ISSR, and from 0 to 1 for codominant markers like SSR.