06 September 2024

I am curious how position weight matrices (PWMs) are generated from a ChIP-seq data set (i.e., how is the data processed?).

My understanding is that a PWM is an aggregate way of presenting the binding preference of a DNA-binding protein. The raw data starts out as a set of sequences from perhaps thousands of 'hits' / matches to the genome.

How are these then aligned? Is the entirety of each 'hit' sequence used? Can the same dataset give multiple different PWMs?
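
To make concrete what I mean by a PWM, here is a minimal sketch of the final step only, assuming the binding sites have already been trimmed to equal length and aligned (which is exactly the part I am asking about). The toy sites, the pseudocount of 1, the uniform background, and the function names `build_pwm` and `score` are all my own assumptions for illustration, not the output or API of any particular tool.

```python
import math

ALPHABET = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=None):
    """Build a log-odds PWM from equal-length, pre-aligned DNA sites.

    Returns one dict per position mapping base -> log2(frequency / background).
    """
    if background is None:
        # Assumption: uniform background; real analyses often use genomic frequencies.
        background = {base: 0.25 for base in ALPHABET}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    pwm = []
    for pos in range(width):
        # Start every count at the pseudocount so unseen bases do not give log(0).
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos].upper()] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background[base])
             for base in ALPHABET}
        )
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds for a sequence as wide as the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq.upper()))

# Hypothetical, hand-made sites (not real ChIP-seq data).
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pwm = build_pwm(sites)
print(score(pwm, "TGACTCA"), score(pwm, "AAAAAAA"))
```

A sequence resembling the aligned sites should score higher than an unrelated one. What this sketch leaves out is how the aligned, equal-width sites are obtained from the much longer ChIP-seq 'hit' regions in the first place.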
