Genotype data for a genome-wide association study (GWAS) are provided in 0, 1, 2 format, with individuals in rows and one SNP per column. The minor allele frequency (MAF) is calculated as
MAF <- colMeans(geno) / 2
In fact, what you compute here is not the MAF but the frequency of the second allele, which can exceed 0.5.
To get the real MAF you can use existing functions (maf in the HardyWeinberg package or in vcfR).
You can also compute it easily:
maf <- ifelse(colMeans(geno)/2 > 0.5, 1 - colMeans(geno)/2, colMeans(geno)/2)
Or, if you keep your "MAF" (the one you calculated), you can simply filter on both tails:
geno_filtered <- geno[, which(MAF > 0.05 & MAF < 0.95)]
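As a minimal sketch of the calculation and filter described above, here is a toy example in base R. The matrix values and the SNP names are made up for illustration, and the 0.05 cutoff is simply the threshold used in this thread, not a recommendation.

```r
# Toy genotype matrix: 6 individuals (rows) x 4 SNPs (columns), coded 0/1/2.
# matrix() fills column by column, so each line below is one SNP.
# All values are invented for illustration only.
geno <- matrix(
  c(0, 0, 0, 0, 0, 1,   # snp1: counted allele is rare
    2, 2, 2, 2, 1, 2,   # snp2: counted allele is common
    0, 1, 1, 2, 1, 0,   # snp3: intermediate frequency
    0, 0, 0, 0, 0, 0),  # snp4: monomorphic
  nrow = 6,
  dimnames = list(NULL, c("snp1", "snp2", "snp3", "snp4"))
)

# Frequency of the counted (second) allele; missing genotypes are ignored.
p <- colMeans(geno, na.rm = TRUE) / 2

# The minor allele frequency is the smaller of p and 1 - p.
maf <- pmin(p, 1 - p)

# Keep SNPs whose MAF is at least the chosen threshold (assumed: 0.05).
maf_threshold <- 0.05
geno_filtered <- geno[, maf >= maf_threshold, drop = FALSE]
```

Using pmin(p, 1 - p) gives the same result as the ifelse version and folds both tails into one comparison, so a single threshold is enough. The drop = FALSE argument keeps the result a matrix even if only one SNP passes.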
For the HWE test, you can use the HWChisq function in the HardyWeinberg package or HWE.test in the genetics package.
Whether to keep or remove loci in disequilibrium depends on your question...
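To show what such a test does on 0/1/2 data, here is a hand-rolled chi-square sketch in base R. It is not the implementation used by HWChisq or HWE.test: the function name hwe_chisq is hypothetical, no continuity correction is applied, and the example genotype vectors are invented, so p-values may differ from what the packages report.

```r
# Chi-square test for Hardy-Weinberg equilibrium on one SNP coded 0/1/2.
# hwe_chisq is a hypothetical helper written for this illustration.
hwe_chisq <- function(g) {
  g <- g[!is.na(g)]                       # drop missing genotypes
  n <- length(g)
  obs <- c(sum(g == 0), sum(g == 1), sum(g == 2))
  p <- (2 * obs[3] + obs[2]) / (2 * n)    # frequency of the counted allele
  q <- 1 - p
  expected <- n * c(q^2, 2 * p * q, p^2)  # expected counts under HWE
  if (any(expected == 0)) {
    return(NA_real_)                      # monomorphic SNP: test undefined
  }
  stat <- sum((obs - expected)^2 / expected)
  pchisq(stat, df = 1, lower.tail = FALSE)
}

# Invented example: three SNPs typed in 100 individuals.
geno_hwe <- cbind(
  in_hwe   = rep(c(0, 1, 2), times = c(25, 50, 25)),  # exact HWE proportions
  no_het   = rep(c(0, 2),    times = c(50, 50)),      # no heterozygotes
  monomorf = rep(0, 100)                              # no variation
)

# One p-value per SNP (column).
hwe_p <- apply(geno_hwe, 2, hwe_chisq)
```

The resulting vector can then be used like the MAF vector, for example to flag or drop SNPs whose p-value falls below a cutoff of your choosing.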
You can use R packages such as GenABEL, SNPassoc, or genetics to do it.
Some very useful R scripts for this purpose or related ones can be found here
http://www.evachan.org/rscripts.html
Cheers
Duy DN
geno_filtered = NAM::snpQC( geno )