I am curious how position weight matrices (PWMs) are generated from a ChIP-seq data set (i.e., how is the data processed?).
My understanding is that a PWM is an aggregate way of presenting the binding preference of a DNA-binding protein. The raw data starts out as a set of sequences from perhaps thousands of 'hits' (matches to the genome).
How are these then aligned? Is the entirety of each 'hit' sequence used? Can the same dataset give multiple different PWMs?
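
To make concrete what I mean by a PWM, here is a minimal sketch of the final step only, assuming the binding sites have somehow already been aligned and trimmed to the same length (which is exactly the part I am asking about). The function name `build_pwm`, the pseudocount of 1, the uniform background of 0.25 per base, and the example sites are all my own assumptions for illustration, not taken from any particular tool.

```python
import math
from collections import Counter

def build_pwm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length, already-aligned sites.

    Returns a list with one dict per position, mapping each base
    to log2(observed frequency / background frequency).
    Assumes uppercase A/C/G/T only.
    """
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")

    pwm = []
    for pos in range(length):
        # Count how often each base occurs at this position
        counts = Counter(site[pos] for site in aligned_sites)
        # Pseudocounts avoid log(0) for bases never seen at a position
        total = len(aligned_sites) + 4 * pseudocount
        column = {}
        for base in "ACGT":
            freq = (counts[base] + pseudocount) / total
            column[base] = math.log2(freq / background)
        pwm.append(column)
    return pwm

# Made-up sites, purely illustrative (not from a real ChIP-seq experiment)
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pwm = build_pwm(sites)
for pos, column in enumerate(pwm, start=1):
    print(pos, {base: round(score, 2) for base, score in column.items()})
```

Positive scores mark bases that are enriched at a position relative to the background, negative scores mark depleted ones. What I do not understand is how the ChIP-seq hits get from raw peak sequences to an aligned set like `sites` in the first place.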