Hello everyone!
I have finished all the APT steps on my UK Biobank CEL files. I understand that once this is done, SNPolisher should be used to verify the quality of the probes/SNPs and to extract the data into a suitable format to start the analysis. I have read the SNPolisher guide and vignettes, but I don't understand the following points:
- How critical is SNPolisher once the APT steps have completed satisfactorily? I understand that SNPolisher gives specific details on the quality of probes/SNPs and helps you judge how clean or trustworthy your data are. Is the output of, say, Ps_Visualization useful for deciding which probes/SNPs to exclude? Or are there specific thresholds (as in some of the APT QC steps) that let you extract your data without the steps described in SNPolisher?
- I'm trying to run Ps_Visualization: Ps_Visualization(pidFile,summaryFile,callFile,confidenceFile)
The input files for summaryFile, callFile, and confidenceFile were generated by APT. I'm trying to find the pidFile, or an example of what it looks like so I can make it myself, but I can't find either. As I understand it, this file is basically a list of probesets, but I don't know where to get that list.
Considering all the SNPs will be tested, is the pidFile really necessary?
- I can't find details in the SNPolisher guide on how CEL files or APT outputs can be converted into suitable formats such as PLINK or VCF. Could you please share any documentation explaining this step and the input files it requires?
Apologies if these questions are silly. I'm completely new to the GWAS world, and I find these Thermo Fisher documents very hard to understand.
Kind regards,
B