What is the best way to do QC with plink ?

19 May 2020

This post is going to be a little long, so I apologize in advance for bothering you. I learned that quality control is super important so I want to make sure that I do this right.

I have several questions about my QC pipeline, hopefully nothing complex. I am having a hard time finding the right commands, and when I do find them I am not sure what the inputs should be. Note that I am using plink v1 (because all the examples I found use plink v1), so if you can give me the v2 version that would be helpful, but it is not my main concern. Note also that I am using data from UK Biobank, so every chromosome is in a separate set of files (genotyped: .bed .bim .fam / imputed: .bgen .mfi .sample).

My pipeline has two parts:

1 - per-individual filtering

a. Removal of individuals with excess missing genotypes
b. Removal of individuals with outlying homozygosity values
c. Removal of samples showing a discordant sex
d. Removal of related or duplicate samples
e. Removal of ancestry outliers

2 - per-SNP filtering

a. Removal of SNPs with excess missing genotypes
b. Removal of SNPs that deviate from Hardy-Weinberg equilibrium
c. Removal of SNPs with low minor allele frequency
d. Comparing minor allele frequency to known values

My code:

1.a: ./plink --bfile ukb_cal_chr{}_v2_reduced --missing --out output1 comment: use R to remove individuals with missingness >.05
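As a possible alternative to the R step, the filter can be sketched in shell with awk. This assumes the usual PLINK 1.x `.imiss` layout (FID, IID, MISS_PHENO, N_MISS, N_GENO, F_MISS); the function name and the output file name in the usage comment are hypothetical.

```shell
# List "FID IID" for samples whose missing-genotype rate exceeds a threshold.
# Assumes column 6 of the .imiss file is F_MISS and line 1 is the header.
high_missing_samples() {
    imiss_file=$1
    threshold=${2:-0.05}
    awk -v t="$threshold" 'NR > 1 && $6 + 0 > t + 0 { print $1, $2 }' "$imiss_file"
}

# Possible usage (file names are placeholders):
#   high_missing_samples output1.imiss 0.05 > fail_missing.txt
#   ./plink --bfile ukb_cal_chr{}_v2_reduced --remove fail_missing.txt --make-bed --out output1_clean
```

PLINK's own `--mind 0.05` filter applies the same kind of threshold directly. Either way, with one fileset per chromosome the rate is computed only over the SNPs in the fileset that was loaded.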

1.b: ./plink --bfile ukb_cal_chr{}_v2_reduced --het --out output2 comment: use R to remove individuals with the absolute value of F > 0.05
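The |F| rule described here could also be applied without R. The sketch below assumes the standard PLINK 1.x `.het` columns (FID, IID, O(HOM), E(HOM), N(NM), F); the function name is hypothetical, and the 0.05 cutoff is simply the one stated in the post.

```shell
# List "FID IID" for samples whose inbreeding coefficient F has |F| above a threshold.
# Assumes column 6 of the .het file is F and line 1 is the header.
het_outliers() {
    het_file=$1
    threshold=${2:-0.05}
    awk -v t="$threshold" '
        NR > 1 {
            f = $6 + 0
            if (f < 0) f = -f          # absolute value
            if (f > t + 0) print $1, $2
        }
    ' "$het_file"
}

# Possible usage (file names are placeholders):
#   het_outliers output2.het 0.05 > fail_het.txt
```

Heterozygosity checks are commonly run on autosomal, LD-pruned SNPs, so values computed from a single chromosome fileset may be noisier than values from a merged dataset.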

1.c: ./plink --bfile ukb_cal_chr{}_v2_reduced --check-sex --out output3 comment: I am not sure what the input should be. Then, in the output, remove individuals with STATUS = PROBLEM

1.d: ./plink --bfile ukb_cal_chr{}_v2_reduced --genome --min 0.05 --out output4 comment: using R on the output output4.genome, for every pair remove the individual with the lowest genotyping rate, unless there is a command for that in plink (is that right?)
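`--check-sex` bases its call on X-chromosome genotypes, so with one fileset per chromosome the X-chromosome fileset is the one it needs. Pulling the flagged samples out of the result can be sketched as follows, assuming the standard `.sexcheck` columns (FID, IID, PEDSEX, SNPSEX, STATUS, F). The function name and the file names in the usage comment are hypothetical.

```shell
# List "FID IID" for samples that --check-sex marked as PROBLEM.
# Assumes column 5 of the .sexcheck file is STATUS and line 1 is the header.
sex_problems() {
    awk 'NR > 1 && $5 == "PROBLEM" { print $1, $2 }' "$1"
}

# Possible usage (file names are placeholders):
#   sex_problems output3.sexcheck > fail_sex.txt
#   ./plink --bfile <fileset> --remove fail_sex.txt --make-bed --out <cleaned fileset>
```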

!!! However, I found that --genome takes too much time, is there another way?
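The "drop the member with the lower genotyping rate" rule from 1.d can be sketched with awk instead of R. This is a simple greedy pass, not a built-in PLINK feature: it assumes the `.imiss` layout from step 1.a (F_MISS in column 6) and the standard `.genome` layout (FID1, IID1, FID2, IID2, ..., PI_HAT in column 10). The function name is hypothetical.

```shell
# For each pair with PI_HAT above the threshold, print the "FID IID" of the
# member with the higher missing rate, unless one of the two is already dropped.
related_to_remove() {
    imiss_file=$1
    genome_file=$2
    threshold=${3:-0.05}
    awk -v t="$threshold" '
        # First file (.imiss): store each sample missing rate, keyed by "FID IID".
        NR == FNR { if (FNR > 1) miss[$1 " " $2] = $6 + 0; next }
        # Second file (.genome): skip the header, keep pairs above the threshold.
        FNR > 1 && $10 + 0 > t + 0 {
            a = $1 " " $2
            b = $3 " " $4
            if ((a in drop) || (b in drop)) next   # pair already broken up
            worse = (miss[a] > miss[b]) ? a : b    # ties drop the second member
            drop[worse] = 1
            print worse
        }
    ' "$imiss_file" "$genome_file"
}

# Possible usage (file names are placeholders):
#   related_to_remove output1.imiss output4.genome 0.05 > fail_related.txt
```

On run time: `--genome` is usually run on an LD-pruned SNP subset (for example one selected with `--indep-pairwise`), which reduces the work considerably. PLINK 2 also offers `--king-cutoff`, which removes one member of each related pair directly and may be worth trying as a faster route.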

1.e: ..... comment: I found this command :

plink --file data --cluster --neighbour 1 5

comment: but I am not sure what it did and how to use the output to filter the individuals and what the input file is (file or bfile)

2 - a,b,c : ./plink --bfile input --maf 0.01 --hwe 1e-6 --mind .1 --geno .1 --make-bed --out output

That's it for my pipeline. My main questions relate to the steps I marked as unsure above, so just 3 questions. Also, if you find errors in my pipeline, can you please correct me?

In conclusion here are my 3 questions:

- Since I have one file for each chromosome, is the input of command 1.c the chromosome X file?

- The --genome command takes a lot of time. Is there a way to speed it up, or to estimate the relatedness of individuals in another way?

- I am still not sure how to filter ancestry outliers using PCA.

Can you please help me? Thank you.

Peymaneh Davoodi

Dear Emile Dimas

first Q:

plink --bfile data --missing-genotype N --make-bed --mind 0.05 --maf 0.05 --geno 0.1 --hwe 1e-6 --recode --out gwasclean

Your way of QC is correct.

second Q:

It is better to use GCTA: either run it multi-threaded with the --thread-num option, or compute a GRM for each chromosome with --make-grm and then merge them into a single GRM with --mgrm.
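A rough sketch of the per-chromosome route, under some assumptions: the `gcta64` binary is on the PATH, the `{}` in the question's file names stands for the chromosome number, and the `grm_*` output prefixes and function names are made up for illustration.

```shell
# Compute one GRM per autosome from the per-chromosome filesets.
make_chr_grms() {
    threads=${1:-4}
    for chr in $(seq 1 22); do
        gcta64 --bfile "ukb_cal_chr${chr}_v2_reduced" --make-grm \
               --thread-num "$threads" --out "grm_chr${chr}"
    done
}

# Print the list of per-chromosome GRM prefixes, one per line, as --mgrm expects.
write_mgrm_list() {
    for chr in $(seq 1 22); do
        echo "grm_chr${chr}"
    done
}

# Merge the per-chromosome GRMs into a single GRM.
merge_grms() {
    write_mgrm_list > grm_chrs.txt
    gcta64 --mgrm grm_chrs.txt --make-grm --out grm_all
}

# Keep one member of each pair whose estimated relatedness exceeds the cutoff;
# the retained individuals are listed in grm_unrelated.grm.id.
prune_related() {
    gcta64 --grm grm_all --grm-cutoff "${1:-0.05}" --make-grm --out grm_unrelated
}
```

Running `make_chr_grms`, then `merge_grms`, then `prune_related` would give a list of approximately unrelated samples that can be passed to PLINK with `--keep`.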

third Q:

One approach is to include the first few principal components, chosen according to their eigenvalues, as covariates in the association analysis. This adjusts for population structure and lowers the inflation rate.
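Covariate adjustment does not remove anyone, so for the outlier-filtering step asked about in the question, one common heuristic is to flag samples that lie many standard deviations from the mean on the leading PCs. The sketch below is only that heuristic, not a method given in this thread. It assumes an `.eigenvec` file such as the one written by PLINK 1.9's `--pca` (FID, IID, PC1, PC2, ...), checks PC1 and PC2 only, and uses a hypothetical function name and an arbitrary default of 6 SD.

```shell
# List "FID IID" for samples more than n_sd standard deviations from the mean
# on PC1 or PC2. The file is read twice: once for the moments, once to flag.
pca_outliers() {
    eigenvec=$1
    n_sd=${2:-6}
    awk -v k="$n_sd" '
        /^#/ || $1 == "FID" { next }           # skip an optional header line
        NR == FNR {                            # pass 1: accumulate sums for PC1, PC2
            n++
            for (j = 3; j <= 4; j++) { s[j] += $j; ss[j] += $j * $j }
            next
        }
        !ready {                               # start of pass 2: mean and SD per PC
            for (j = 3; j <= 4; j++) {
                m[j] = s[j] / n
                v = ss[j] / n - m[j] * m[j]
                sd[j] = (v > 0) ? sqrt(v) : 0
            }
            ready = 1
        }
        {                                      # pass 2: flag samples beyond k SD
            for (j = 3; j <= 4; j++) {
                d = $j - m[j]
                if (d < 0) d = -d
                if (d > k * sd[j]) { print $1, $2; break }
            }
        }
    ' "$eigenvec" "$eigenvec"
}

# Possible usage (file names are placeholders):
#   ./plink --bfile <pruned fileset> --pca 10 --out pcs
#   pca_outliers pcs.eigenvec 6 > fail_ancestry.txt
```

The threshold and the number of PCs to inspect are judgment calls; plotting PC1 against PC2 before choosing a cutoff is usually worthwhile.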

Regards
