Hello everyone,

Could someone please explain what the data values in a WIG file mean? I have read the documentation for this format on the UCSC webpage and also searched for an answer on the Biostars forum.

This question is a bit similar to the one posted earlier:

https://www.biostars.org/p/16261/

I understand that a WIG file can hold data values such as log2(IP/input), p-values, base coverage, etc., but I am trying to understand them in the context of peaks called using MACS.

On performing peak calling using MACS version 1.4.2, the header for the wiggle file says "Extended tag pileup from MACS version 1.4.2 20120305 for every 10 bp".

Does this mean that the data value in the WIG file at a given location represents the number of reads aligned within a 10 bp window at that location? To check this, I visualised the WIG file along with its BAMs (Input and ChIP) in IGV but didn't see any agreement between the two. For instance, one peak had 18 reads at its summit while the WIG data value there was 63, and the peak consisted of 94 reads in total. The read length is 50 bp and d = 211 (MACS), so I counted the total number of reads (and also the number of reads after extending them to d or d/2) within a window of length d or d/2 centered at the summit (WIG value 63), and even then the read count was not 63. In addition, for regions without any aligned reads (outside the peak region), the data value was 1.
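
For reference, here is a minimal sketch of one possible reading of "extended tag pileup ... for every 10 bp". It is an illustration under stated assumptions, not MACS's actual code: each read is assumed to be extended to d bp from its 5' end in the direction of its strand, and the number of extended fragments covering a position is sampled every 10 bp. The function name `extended_pileup` and the `(five_prime_pos, strand)` input format are hypothetical.

```python
def extended_pileup(reads, d, step=10):
    """Count extended fragments covering every `step`-th position.

    reads: iterable of (five_prime_pos, strand) tuples, strand is "+" or "-".
    d:     assumed fragment length each read is extended to.
    Returns a dict {position: number of extended fragments covering it}.
    Assumption: half-open intervals [start, end); not taken from MACS source.
    """
    depth = {}
    for pos, strand in reads:
        # Extend d bp downstream of the 5' end, following the strand.
        if strand == "+":
            start, end = pos, pos + d
        else:
            start, end = pos - d, pos
        start = max(start, 0)
        # First sampled position at or after the fragment start.
        first = -(-start // step) * step
        for p in range(first, end, step):
            depth[p] = depth.get(p, 0) + 1
    return depth


# Toy data: three reads, d = 211 as reported by MACS in this post.
example = extended_pileup([(100, "+"), (150, "+"), (400, "-")], d=211)
print(example[200])
```

Under this reading, the value at a position counts the extended fragments overlapping it, so it need not match the number of 50 bp reads drawn at that base in IGV.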

So how does MACS calculate the data value for WIG and what does it mean if not the read count?

EDIT: Duplicate reads were removed from the BAM file prior to peak calling, so it was the duplicate-free BAM that I compared with its wiggle file.
