I would like to learn how to split the x-axis, on which the chromosomes are represented, into segments of unequal length. The y-axis shows the p-values of the average ranks.
I have attached the diagram I want to draw. It comes from a genome-wide linkage scan, not a GWAS. Can R split the x-axis into unequal segments? I do not know R, so an existing script would help me a lot.
I compared your attached picture with the package output; the two should look similar once you replace the GWAS p-value on the y-axis with the value you want to plot.
Basically, the package requires an input file like this:
  SNP CHR BP      P
1 rs1   1  1 0.9148
2 rs2   1  2 0.9371
3 rs3   1  3 0.2861
4 rs4   1  4 0.8304
5 rs5   1  5 0.6417
6 rs6   1  6 0.5191
The first column is the SNP ID, the second the chromosome, the third the genomic position, and the last the value plotted on the y-axis. (The leading numbers 1 to 6 are R's row indices, not a data column.) You may need to rename the headers of your data sheet to exactly these strings so that the software recognizes the columns; it will then generate the plot with the x-axis split into unequal segments.
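A minimal sketch of the header renaming, in case it helps. The original column names and all values below are made up for illustration, and the recoding of chromosome X to 23 assumes the package is qqman (the one that ships the "gwasResults" example data), which expects a numeric CHR column.

```r
# Hypothetical linkage-scan results with non-matching headers (made-up values).
scan <- data.frame(
  marker   = c("m1", "m2", "m3"),
  chrom    = c("1", "1", "X"),
  position = c(10, 25, 5),
  pvalue   = c(0.91, 0.04, 0.29),
  stringsAsFactors = FALSE
)

# Rename the columns to the exact headers the package looks for.
names(scan) <- c("SNP", "CHR", "BP", "P")

# Assumption: CHR has to be numeric, so recode non-numeric chromosomes (X -> 23).
scan$CHR[scan$CHR == "X"] <- "23"
scan$CHR <- as.numeric(scan$CHR)

# Inspect the column names and types.
str(scan)
```

The same renaming can of course be done by hand in Excel before exporting the file.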
The code for the plotting is all in the previous link; you just need to:
1. install R and the required packages
2. re-format your data into the indicated format with the proper headers
3. load the packages and your data into R and run the code
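The steps above can be sketched roughly as follows, assuming the package in question is qqman. The example uses the "gwasResults" data set bundled with the package rather than a file of your own, and it skips the plot if qqman is not installed yet. The variable ran_plot is only there for illustration.

```r
# Step 1: install the package once (uncomment on first use).
# install.packages("qqman")

# Step 3: load the package and plot its bundled example data.
if (requireNamespace("qqman", quietly = TRUE)) {
  library(qqman)
  print(head(gwasResults))  # shows the expected SNP / CHR / BP / P layout
  manhattan(gwasResults)    # draws one x-axis segment per chromosome
  ran_plot <- TRUE
} else {
  message("qqman is not installed; run install.packages('qqman') first.")
  ran_plot <- FALSE
}
```

Note that "gwasResults" lives inside the package itself, so there is no separate data file to look for on disk.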
I installed the qq package and successfully produced the Manhattan plot, but I cannot find the data file, so I cannot replace it with my own data. The data file must be an Excel or txt file, is that right? I have downloaded RStudio and it is easy to use. The only problem is that I could not find the file containing the data. As I understand it, the script is the set of commands on the HTML page you pointed me to, right?
I would be really grateful if you could help me. I hope to hear from you soon.
Good. If you are sure that the Manhattan plot is what you want, you can use Excel or any text editor to prepare your own data. Convert your data to a format with the same headers as the "gwasResults" data set, using a comma, space, or tab as the delimiter, and save it as a CSV or plain-text file accordingly.
After you convert your data into the proper format, use the R function "read.csv" or "read.table" to import it into your R workspace and store it as a data.frame. Use the "names" function to check that the column names of the data.frame are correct, and use "dim" to check the dimensions of your data set.
After that, you can directly substitute the data.frame you created for the original "gwasResults" in the code and make the plot.
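Here is a small sketch of the import and the two checks. It writes a tiny tab-delimited file to a temporary location first so that it runs on its own; the file contents and the names my_file and my_scan are made up for illustration, and in practice you would point read.table at the file you exported from Excel.

```r
# Write a tiny example file so the sketch is self-contained (made-up values).
my_file <- tempfile(fileext = ".txt")
writeLines(c(
  "SNP\tCHR\tBP\tP",
  "m1\t1\t10\t0.9148",
  "m2\t1\t25\t0.0371",
  "m3\t2\t5\t0.2861"
), my_file)

# Import the tab-delimited file as a data.frame, using the first line as headers.
my_scan <- read.table(my_file, header = TRUE, sep = "\t",
                      stringsAsFactors = FALSE)

names(my_scan)  # column names, should be SNP, CHR, BP, P
dim(my_scan)    # number of rows, then number of columns
```

For a comma-separated file, read.csv(my_file) does the same job with the comma as the default delimiter.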
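To show where the unequal x-axis segments come from, here is a base-R sketch that lays the chromosomes end to end by hand. It is only an illustration of the idea, not the package's own code: the data are random, the positions are assumed to be in cM, and all names (my_scan, chr_len, offset, mid) are made up.

```r
# Made-up linkage-scan data: three chromosomes of different lengths.
set.seed(1)
my_scan <- data.frame(
  SNP = paste0("m", 1:60),
  CHR = rep(1:3, times = c(30, 20, 10)),
  BP  = c(seq(0, 290, by = 10), seq(0, 190, by = 10), seq(0, 90, by = 10)),
  P   = runif(60)
)

# Length of each chromosome, taken as its largest position.
chr_len <- sapply(split(my_scan$BP, my_scan$CHR), max)

# Offset each chromosome by the total length of the chromosomes before it,
# so longer chromosomes take up a wider stretch of the x-axis.
offset <- c(0, cumsum(chr_len)[-length(chr_len)])
names(offset) <- names(chr_len)
my_scan$pos <- my_scan$BP + offset[as.character(my_scan$CHR)]

# Put one tick label at the middle of each chromosome segment.
mid <- offset + chr_len / 2

plot(my_scan$pos, -log10(my_scan$P), pch = 20,
     col = c("gray10", "gray60")[my_scan$CHR %% 2 + 1],
     xaxt = "n", xlab = "Chromosome", ylab = "-log10(p)")
axis(1, at = mid, labels = names(chr_len))

# With qqman installed, the substitution described above would simply be:
# qqman::manhattan(my_scan)
```

In this example chromosome 1 spans 290 units and chromosome 3 only 90, so their segments on the x-axis differ in width accordingly.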
Try to get familiar with R, which is a real necessity for bioinformatics researchers. Detailed explanations of the basic functions you might need for this task can be found in this Coursera online course:
https://www.coursera.org/learn/r-programming
Please give it a try, and feel free to contact me if you have any further questions or run into practical problems with the coding.