Can I use only one parameter? I know that they use different equations to calculate but I still do not understand why we have to analyze both of them. Many thanks.
The differences between heterozygosity and polymorphisms are subtle yet distinct. Let’s start by breaking down these words into their Greek roots in order to understand their meanings. For heterozygosity, “hetero” means “different” and “zygote” comes from “zygotos,” meaning “yoked” or “joined.” From a genetics perspective, what is joined is the pair of alleles brought together at fertilization. Thus, the term heterozygous would mean “different alleles.” Diploid organisms carry two alleles for each gene. When an individual carries two different alleles for a given gene, they are “heterozygous.” Often, one allele is dominant over the other and the dominant phenotype is observed.
Polymorphisms
For polymorphisms, “poly” means “many” and “morph” means “form.” From a genetics perspective, “form” could refer to the “genotype” or “phenotype,” depending on the context. Thus, the term polymorphism could mean “many genotypes” or “many phenotypes.” The term has been adopted by a variety of biological disciplines (zoology, molecular biology, genetics, etc.) where it now has slightly different meanings. Polymorphism most often refers to genotypic differences within a species resulting from genetic variation, but it may also describe phenotypic differences.
Polymorphism Information Content (PIC)
At the molecular level, the term polymorphism is most often used by molecular biologists to describe genotypic variation, including single base pair variability (single nucleotide polymorphisms, SNPs) and larger changes (such as duplications or deletions) within a gene or the genome. The PIC value is zero if there is no allelic variation, and it approaches a maximum of 1.0 only when a locus has a very large number of equally frequent alleles, which is rare. PIC is mainly used to assess the diversity of a gene or DNA segment in a population, which can shed light on the evolutionary pressure on the locus and the mutations it may have undergone over time.
You do NOT have to; both are measures of diversity. H (also called gene diversity for haploid markers) is the best: it is robust and unbiased (Nei M. Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics. 1978 Jul;89(3):583-90). PIC was designed for linkage analysis (Botstein D, White RL, Skolnick M, Davis RW. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet. 1980 May;32(3):314-31). [The polymorphism information content of a marker is the probability that the marker genotype of the offspring of a heterozygous parent affected with a dominant disease allows one to deduce which marker allele the offspring inherited from the parent. It is a measure of a marker's usefulness for linkage analysis. (http://onlinelibrary.wiley.com/doi/10.1002/0470011815.b2a05078/abstract)]
As I stressed before, you do NOT need to use both. The formulas can be found in the references I sent previously. It is difficult to help you further without knowing the aims of your work.
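For readers who want to see the two statistics side by side, here is a minimal Python sketch. It assumes the usual textbook definitions: H = 1 − Σp² with Nei's (1978) 2n/(2n−1) small-sample correction, and PIC as defined by Botstein et al. (1980). The function names and the example genotypes are made up for illustration and do not come from either paper.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes given as 2-tuples."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def expected_heterozygosity(freqs, n_individuals=None):
    """H = 1 - sum(p_i^2); if n_individuals is given, apply Nei's
    small-sample correction 2n / (2n - 1)."""
    h = 1.0 - sum(p * p for p in freqs)
    if n_individuals is not None:
        two_n = 2 * n_individuals
        h *= two_n / (two_n - 1)
    return h


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    freqs = list(freqs)
    homozygous = sum(p * p for p in freqs)
    shared_het = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygous - shared_het


# Hypothetical genotypes at one microsatellite locus (allele labels invented).
sample = [("A", "B"), ("A", "A"), ("B", "C"), ("A", "C")]
p = allele_frequencies(sample)
print("H   =", expected_heterozygosity(p.values()))
print("H^  =", expected_heterozygosity(p.values(), n_individuals=len(sample)))
print("PIC =", pic(p.values()))
```

Both functions take only allele frequencies, so either statistic can be computed from the same data; which one to report depends on the aim of the study.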
Thx Alan and Antonio. I would like to do parentage testing in pigs, and I know that PIC and H values are used to evaluate microsatellite markers. I now understand a little better that PIC accounts for the loss of information due to homozygous parents (MiMi) and heterozygous parents both with the same genotype (MiMj). The latter part does not appear in the H value calculation.
The best statistic to assess the informativeness of a marker for parentage is H: the higher the better. Perhaps reading Amorim A, Pereira L (2005) Pros and cons in the use of SNPs in forensic kinship investigation: a comparative analysis with STRs. Forensic Sci Int. 150:17-21 will be of some help. Do not forget that the choice of markers has a heavy technical component (multiplexing capacity, reliability of typing, ...)
I have received a new msg from you outside this Q&A, which I was not able to answer via ResearchGate.
I do not see what the problem is, since, as I have said before, PIC is a diversity measure not appropriate for forensic (namely paternity) investigation.
In any case, the approach of Botstein is explained in the attached file. The final formula is the summation of the mating type frequencies weighted by the probability of informative offspring (for linkage purposes).
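The attached file is not reproduced in this thread, but with allele frequencies p_i for n alleles (notation assumed here), the summation described above is usually written as:

```latex
\mathrm{PIC} \;=\; 1 \;-\; \sum_{i=1}^{n} p_i^{2} \;-\; \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2\,p_i^{2}\,p_j^{2}
```

The first sum is the chance that the parent is homozygous (MiMi) and therefore uninformative. Each term 2p_i²p_j² is the chance of an MiMj × MiMj mating (2p_ip_j · 2p_ip_j) multiplied by the one half of the offspring that are themselves MiMj, for which the transmitted allele cannot be deduced. Since H = 1 − Σp_i², the double sum is exactly the amount by which PIC falls below H.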
Thanks for your concern. I attach a file showing how I calculate PIC values with 2 equations for your consideration. Could you please give me some suggestions?
Finally I've got it!! Thank you very much for your valuable suggestions, Prof. Amorim. I also learned that Eq.1 and Eq.2 give the same results, which is remarkable!!
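The two equations from the attached file are not shown in this thread. Assuming they are the two forms of PIC that are commonly quoted, the pairwise form from Botstein et al. (1980) and a compact form built from Σp² and Σp⁴, the agreement follows from the identity (Σp²)² = Σp⁴ + 2Σ_{i<j} p_i²p_j². A short Python sketch that checks this numerically (function names and the example frequencies are illustrative only):

```python
from itertools import combinations


def pic_pairwise(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygous = sum(p ** 2 for p in freqs)
    shared_het = sum(2 * p ** 2 * q ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - shared_het


def pic_compact(freqs):
    """PIC = 1 - S2 - S2^2 + S4, where S2 = sum(p^2) and S4 = sum(p^4)."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 1.0 - s2 - s2 ** 2 + s4


# Illustrative allele frequencies (must sum to 1).
freqs = [0.4, 0.3, 0.2, 0.1]
print(pic_pairwise(freqs), pic_compact(freqs))
```

The compact form avoids the double loop, which is convenient for loci with many alleles.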
The PIC is a good index for evaluating genetic diversity. Botstein et al. (1980) reported that the PIC index can be used to evaluate the level of gene variation: when PIC>0.5, the locus is of high diversity; when PIC
Picking up on this thread late...thanks for the file and for the insight on PIC.
From the formula:
1. PIC must always be less than H
2. This difference will generally be slight because it is due to the product of squared frequencies. Even when summed across all possible heterozygote combinations these are still likely to be small.
3. PIC is relevant to parentage calculations, and is a better measure than H. It incorporates the chance that an offspring receives the same allele from both parents AND the chance that the offspring is heterozygous but either allele could have been received from either parent. With a known maternal genotype this correction accounts for the increased difficulty in inferring the paternal allele from the offspring genotype.
The difference between H and PIC is instructive. PIC will be closer to H at loci with many alleles at similar frequencies (where it is less likely that individuals will have identical heterozygote genotypes). It is therefore a useful measure of effective diversity that may be compared across loci and populations.
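The effect of allele number on the gap can be checked numerically. The sketch below assumes n equally frequent alleles, a simplification chosen for illustration, and uses the fact that H − PIC = (Σp²)² − Σp⁴, which for equal frequencies reduces to (n − 1)/n³. The function name is hypothetical.

```python
def h_minus_pic(freqs):
    """Gap between H and PIC: (sum p^2)^2 - sum p^4,
    which equals the sum over i<j of 2 * p_i^2 * p_j^2."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return s2 ** 2 - s4


for n in (2, 4, 10, 20):
    even = [1.0 / n] * n           # n equally frequent alleles
    h = 1.0 - sum(p ** 2 for p in even)
    gap = h_minus_pic(even)
    print(f"n={n:2d}  H={h:.4f}  PIC={h - gap:.4f}  H-PIC={gap:.4f}")
```

With two equally frequent alleles the gap is 0.125 (H = 0.5, PIC = 0.375); with ten it is already below 0.01.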
My understanding of H and PIC is that while the former can be interpreted simply (assuming Hardy-Weinberg equilibrium) as the probability of a randomly selected individual having a heterozygous genotype for a variant / marker, the latter is the probability of a randomly selected individual having a heterozygous genotype for a variant / marker that is not the same as BOTH of his / her parents. The latter is, of course, more directly appropriate for family linkage studies, as a parent-offspring trio in which all three have the same heterozygous genotype is not informative. Both H and PIC are similar in that both are maximal for variants / markers with a larger number of alleles and where the distribution of frequency across the alleles is as close to equal as possible. In practice, the PIC value will always be lower than the H value.
An online calculator for H and PIC for variants with up to 20 alleles is available here: https://www.genecalculators.net/pq-chwe-polypicker.html.