Data from multiple ChIP-Seq studies for a particular transcription factor have been deposited in the GEO database in wiggle (WIG) format.

How does one go about correlating the data from all these datasets? I would like to see whether any regions are conserved (that is, shared) across all of them and, by extension, whether those regions overlap with my own ChIP-Seq data.

So far, I have used wig2bed (BEDOPS) to convert the wig files to bed and then intersected them with Bedtools. However, I doubt whether this is the right approach, because the scores are reported on different scales in different studies: some are probability scores, some are tag counts, and some are normalised reads per million.
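
To make the intersection step concrete, here is a minimal sketch of what it amounts to, written in plain Python rather than with Bedtools. The function names and the toy intervals are hypothetical and are not taken from any of the GEO studies. It keeps only regions covered in every dataset and ignores the scores completely, which is exactly why the differing score scales worry me.

```python
from collections import defaultdict

def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]

def intersect_two(a, b):
    """Intersect two sorted, non-overlapping interval lists."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def common_regions(datasets):
    """Return (chrom, start, end) regions covered in every dataset.

    Each dataset is a list of (chrom, start, end) tuples, BED-style.
    """
    per_chrom = []
    for ds in datasets:
        by_chrom = defaultdict(list)
        for chrom, start, end in ds:
            by_chrom[chrom].append((start, end))
        per_chrom.append({c: merge(iv) for c, iv in by_chrom.items()})

    # Only chromosomes present in every dataset can hold shared regions.
    shared = set(per_chrom[0])
    for d in per_chrom[1:]:
        shared &= set(d)

    result = []
    for chrom in sorted(shared):
        current = per_chrom[0][chrom]
        for d in per_chrom[1:]:
            current = intersect_two(current, d[chrom])
        result.extend((chrom, s, e) for s, e in current)
    return result

# Toy intervals, made up for illustration only.
study_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
study_b = [("chr1", 150, 250), ("chr1", 580, 700)]
study_c = [("chr1", 120, 180), ("chr1", 590, 610), ("chr2", 60, 70)]

common = common_regions([study_a, study_b, study_c])
print(common)
```

Note that a weak signal and a strong peak count equally here, as long as the converted bed file has an interval at that position.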

In addition, I tried wigCorrelate (UCSC) on the files, but it does not give any information about conserved regions. I'm not sure whether my approach here is correct.

Is there a way to identify peaks using wig files? Or is there a way to decide the threshold for the scores?
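
One idea I am unsure about: since the scores are on different scales, each dataset could be thresholded relative to its own score distribution, for example by keeping only its top-scoring fraction of intervals before intersecting. The sketch below assumes the scores have already been parsed into (chrom, start, end, score) tuples. The function name, the toy values and the chosen fraction are all hypothetical, and I do not know whether a rank-based cutoff like this is statistically defensible.

```python
def top_fraction(records, fraction=0.05):
    """Keep records whose score falls in the top `fraction` of this dataset.

    records: list of (chrom, start, end, score) tuples.
    Ties at the cutoff are all kept, so slightly more than `fraction`
    of the records may be returned.
    """
    if not records:
        return []
    scores = sorted((r[3] for r in records), reverse=True)
    keep = max(1, int(len(scores) * fraction))
    cutoff = scores[keep - 1]
    return [r for r in records if r[3] >= cutoff]

# Two made-up datasets over the same bins, scored on different scales.
tag_counts = [3, 50, 7, 2, 41, 5, 1, 8, 4, 6]
probabilities = [0.2, 0.99, 0.4, 0.1, 0.95, 0.3, 0.05, 0.5, 0.25, 0.35]

by_tags = [("chr1", i * 100, i * 100 + 100, s) for i, s in enumerate(tag_counts)]
by_prob = [("chr1", i * 100, i * 100 + 100, s) for i, s in enumerate(probabilities)]

top_tags = top_fraction(by_tags, fraction=0.2)
top_prob = top_fraction(by_prob, fraction=0.2)

# Compare the selected regions with the scores stripped off.
regions_tags = [r[:3] for r in top_tags]
regions_prob = [r[:3] for r in top_prob]
print(regions_tags)
print(regions_prob)
```

Because only the rank within each dataset matters, tag counts and probability scores can select the same regions even though their raw values are not comparable. Whether the same fraction is appropriate for every study is a separate question.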

Any other pointers to address the issue?
