Pre-processing can find significant points in the image and then rotate the randomly selected patches around those points until a canonical alignment is reached. These are all standard object recognition techniques. But how can RBMs or DBNs, or any deep network for that matter, learn a basis that recognizes an object or a shape just from patches (or whole images, for all I care) in an orientation-invariant way?
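To make the pre-processing concrete, here is a rough sketch of what I have in mind (using OpenCV purely as an illustration; the detector choice, patch size, and helper name are my own assumptions, not a fixed recipe): detect keypoints that carry a dominant orientation, then rotate the image around each keypoint so that orientation becomes canonical before the patch is handed to the RBM/DBN.

```python
import cv2

def canonicalize_patches(image, patch_size=32):
    """Illustrative sketch: extract fixed-size patches around keypoints,
    each rotated to its dominant orientation, so the learner sees
    orientation-normalized inputs. Detector and patch size are assumptions."""
    orb = cv2.ORB_create()              # ORB keypoints carry an orientation (kp.angle, degrees)
    keypoints = orb.detect(image, None)
    h, w = image.shape[:2]
    half = patch_size // 2
    patches = []
    for kp in keypoints:
        x, y = kp.pt
        # Rotate the image around the keypoint so its dominant angle
        # becomes the canonical 0 degrees (flip the sign if your angle
        # convention differs), then crop a patch centered on the keypoint.
        M = cv2.getRotationMatrix2D((x, y), kp.angle, 1.0)
        rotated = cv2.warpAffine(image, M, (w, h))
        xi, yi = int(round(x)), int(round(y))
        if half <= xi < w - half and half <= yi < h - half:
            patches.append(rotated[yi - half:yi + half, xi - half:xi + half])
    return patches
```

With patches aligned this way, the network never has to learn rotated copies of the same filter, which is exactly the work I am asking whether a deep network could do on its own.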
I asked a related question before and got many useful answers (regarding pre-processing). Thanks! This question is more general: why do we need to pre-process at all if the most common bases learned by deep learning algorithms are "edge detectors"?