I will soon be applying a statistical test to the 1000 Genomes data. Because the amount of data is huge, I was wondering whether anyone is willing to share their preferred strategies for working with data of this size. For example: do you use R packages to read VCF files directly, or not? Do you preprocess the VCF files (with bash, Perl, etc.) before reading them into R? Do you use VCFtools? Do you do everything on the fly, or save intermediate versions containing nothing but the data you actually need? I appreciate any suggestions.
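Not an answer on the R side, but to illustrate the "intermediate file with only the data you need" idea mentioned above: a minimal, hypothetical Python sketch that streams a (possibly gzipped) VCF and writes out only a few chosen columns. In practice you would likely use bcftools or VCFtools for this; this is just a sketch of the principle, and the function names are my own invention.

```python
import gzip

def slim_vcf(lines, keep=("CHROM", "POS", "REF", "ALT")):
    """Yield tab-joined rows containing only the requested VCF columns.

    Streams line by line, so memory use stays constant regardless of
    file size -- the point when working with 1000 Genomes-scale data.
    """
    header = None
    for line in lines:
        if line.startswith("##"):          # meta-information lines: skip
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):      # the column-header line
            fields[0] = fields[0].lstrip("#")
            header = {name: i for i, name in enumerate(fields)}
            yield "\t".join(keep)
            continue
        yield "\t".join(fields[header[c]] for c in keep)

def slim_vcf_file(path, out_path, keep=("CHROM", "POS", "REF", "ALT")):
    """Write a slim TSV that is cheap to read into R afterwards."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as src, open(out_path, "w") as dst:
        for row in slim_vcf(src, keep):
            dst.write(row + "\n")
```

The resulting TSV can then be loaded into R with `read.table()` or `data.table::fread()` without R ever touching the full VCF.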
