I'll preface this by saying our NGS Core facility handles most of our analysis, so with my limited understanding of the analysis workflow I may not be articulating correctly what I'm looking to do. Our core has been helpful, but they have limited ChIP-seq experience and aren't sure that what I want to do is possible.
We have run several ChIP-seq experiments to identify where the Aryl Hydrocarbon Receptor binds in the mouse genome in response to various agonists. The current analysis workflow has produced a list of the most common motifs identified, the genomic region associated with each peak (promoter, intergenic, intragenic), and the prevalence of each motif in the dataset. This is useful; however, I'm specifically interested in which genes are located near these motifs, or more precisely, near one particular motif. Our lab has previously reported a non-canonical binding site consisting of the tetranucleotide repeat GGGA. I want to know the locations of the peaks in the ChIP-seq dataset that contain this motif, preferably those that fall within gene promoters (within 5 kb of the TSS, I suppose). The goal is to identify putative non-canonical target genes. Is this possible?
Basically, we have a list of peaks, the distribution of genomic regions associated with those peaks, and a list of motifs. How do I cross-reference them to find the peaks that contain one specific motif and lie in one specific type of region (promoters)?
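To make the question concrete, here is a rough sketch of the cross-referencing I have in mind, written in plain Python with toy data. Everything in it is an assumption for illustration: the `Peak` and `Gene` records, the made-up coordinates and sequences, the requirement of at least two consecutive GGGA units (checked on both strands), and the symmetric 5 kb window around each TSS. With real data, the peak sequences would have to be extracted from the mouse genome and the TSS positions taken from a gene annotation.

```python
import re
from typing import NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int  # 0-based, half-open interval
    end: int
    seq: str    # sequence under the peak (hypothetical toy data here)


class Gene(NamedTuple):
    name: str
    chrom: str
    tss: int    # transcription start site position


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def repeat_pattern(unit="GGGA", min_repeats=2):
    """Regex matching >= min_repeats tandem copies of unit on either strand.

    min_repeats=2 is an assumption; adjust to the lab's motif definition.
    """
    rc = reverse_complement(unit)
    return re.compile(
        f"(?:{unit}){{{min_repeats},}}|(?:{rc}){{{min_repeats},}}"
    )


def peaks_with_motif(peaks, pattern):
    """Keep only peaks whose sequence contains the motif."""
    return [p for p in peaks if pattern.search(p.seq.upper())]


def promoter_targets(peaks, genes, flank=5000):
    """Map gene name -> peaks overlapping the window TSS +/- flank."""
    targets = {}
    for gene in genes:
        win_start = max(0, gene.tss - flank)
        win_end = gene.tss + flank
        hits = [
            p for p in peaks
            if p.chrom == gene.chrom and p.start < win_end and p.end > win_start
        ]
        if hits:
            targets[gene.name] = hits
    return targets


# Toy input (all values invented for illustration).
peaks = [
    Peak("chr1", 1000, 1020, "ACGTGGGAGGGAGGGATTAC"),    # motif, near GeneA
    Peak("chr1", 50000, 50020, "ACGTGGGAGGGAGGGATTAC"),  # motif, no gene nearby
    Peak("chr1", 2500, 2520, "ACGTTGCGTGAGAACGTACG"),    # near GeneA, no motif
    Peak("chr2", 9000, 9020, "AATCCCTCCCTCCCAATTGG"),    # motif on minus strand
]
genes = [
    Gene("GeneA", "chr1", 3000),
    Gene("GeneB", "chr2", 10000),
    Gene("GeneC", "chr1", 200000),
]

motif_peaks = peaks_with_motif(peaks, repeat_pattern())
targets = promoter_targets(motif_peaks, genes)

for name, hits in sorted(targets.items()):
    for p in hits:
        print(name, p.chrom, p.start, p.end)
```

The idea is two filters applied in sequence: first keep the peaks that contain the GGGA repeat, then keep those that overlap a promoter window, which leaves a list of candidate target genes. What I don't know is how to do the equivalent with the peak and motif files our core has already produced.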