In general, you need to extract features from the region of interest (ROI). The ROI can be selected, e.g., based on moving objects. You can then apply classification techniques to those features to detect objects.
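To make the "ROI from moving objects" idea concrete, here is a minimal sketch using plain frame differencing with NumPy. The function name `motion_roi`, the threshold of 25 and the synthetic frames are my own assumptions for illustration; a real system would more likely use a proper background-subtraction method.

```python
import numpy as np

def motion_roi(prev_frame, curr_frame, thresh=25):
    """Return (top, left, bottom, right) of the changed region, or None."""
    # Subtract in a signed type so uint8 values do not wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    moving = diff > thresh
    if not moving.any():
        return None
    rows = np.flatnonzero(moving.any(axis=1))
    cols = np.flatnonzero(moving.any(axis=0))
    # Bottom/right are exclusive so the box can be used directly as a slice.
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1

# Synthetic example (assumed data): a bright 4x3 block appears in a static scene.
prev_frame = np.zeros((20, 20), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[5:9, 10:13] = 200

box = motion_roi(prev_frame, curr_frame)
top, left, bottom, right = box
roi = curr_frame[top:bottom, left:right]  # crop that would be passed to feature extraction
```

The cropped `roi` is what you would then hand to a feature extractor and classifier.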
For pedestrian detection, histogram of oriented gradients (HOG) features are well known and perform well.
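As a rough illustration of what a HOG feature is, the sketch below computes one magnitude-weighted orientation histogram per cell with NumPy. It is a simplified version under my own assumptions (8x8 cells, 9 unsigned bins, per-cell normalisation, the name `hog_features`), not the full descriptor, which also normalises over overlapping blocks of cells.

```python
import numpy as np

def hog_features(image, cell=8, bins=9):
    """Simplified HOG: one L2-normalised orientation histogram per cell."""
    img = image.astype(np.float64)
    gy, gx = np.gradient(img)  # gradients along rows, then columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_idx = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    n_rows, n_cols = img.shape[0] // cell, img.shape[1] // cell
    feats = np.zeros((n_rows, n_cols, bins))
    for r in range(n_rows):
        for c in range(n_cols):
            sl = (slice(r * cell, (r + 1) * cell),
                  slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist = np.bincount(bin_idx[sl].ravel(),
                               weights=magnitude[sl].ravel(),
                               minlength=bins)
            feats[r, c] = hist / (np.linalg.norm(hist) + 1e-9)
    return feats.ravel()

# Synthetic 128x64 window (assumed data) with a vertical step edge in the middle.
window = np.zeros((128, 64))
window[:, 32:] = 255.0
features = hog_features(window)  # 16 x 8 cells x 9 bins = 1152 values
```

The resulting vector is what gets fed to the classifier. In practice you would probably rely on an existing implementation (OpenCV and scikit-image both ship one) rather than this sketch.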
In addition, you need positive and negative images for learning, i.e., images that contain your object and images that show only the scene. You have to train on those images with a suitable learning algorithm. For preprocessing, you also need to normalize your dataset (cropping, background removal, scaling, and so on). In general, to detect anything you have to go this way:
image acquisition -> image enhancement -> image restoration -> morphological processing -> segmentation of your pattern -> object detection
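A toy version of this chain could look like the sketch below, using NumPy only. All function names, the 0.5 threshold and the synthetic image are assumptions for illustration. I skip the restoration step, and for simplicity the morphological opening is applied to the binary mask after thresholding rather than before segmentation.

```python
import numpy as np
from collections import deque

def stretch_contrast(img):
    """Enhancement: linearly rescale intensities to [0, 1]."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

def _shifted_views(mask):
    """Return the nine 3x3-neighbourhood shifts of a boolean mask."""
    p = np.pad(mask, 1, constant_values=False)
    h, w = mask.shape
    return [p[dr:dr + h, dc:dc + w] for dr in range(3) for dc in range(3)]

def opening(mask):
    """Morphology: 3x3 erosion then dilation, which removes small specks."""
    eroded = np.logical_and.reduce(_shifted_views(mask))
    return np.logical_or.reduce(_shifted_views(eroded))

def bounding_boxes(mask):
    """Detection: one (top, left, bottom, right) box per 4-connected blob."""
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        seen[r0, c0] = True
        queue = deque([(r0, c0)])
        top, left, bottom, right = r0, c0, r0, c0
        while queue:  # breadth-first flood fill over the blob
            r, c = queue.popleft()
            top, left = min(top, r), min(left, c)
            bottom, right = max(bottom, r), max(right, c)
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                if inside and mask[nr, nc] and not seen[nr, nc]:
                    seen[nr, nc] = True
                    queue.append((nr, nc))
        boxes.append((int(top), int(left), int(bottom) + 1, int(right) + 1))
    return boxes

def detect(img, thresh=0.5):
    enhanced = stretch_contrast(img)  # image enhancement
    mask = enhanced > thresh          # segmentation by global threshold
    cleaned = opening(mask)           # morphological processing
    return bounding_boxes(cleaned)    # object detection

# Synthetic "acquired" image (assumed data): one 6x6 object plus one noisy pixel.
image = np.full((30, 30), 10, dtype=np.uint8)
image[4:10, 5:11] = 200
image[20, 20] = 220
boxes = detect(image)
```

Here the single noisy pixel is removed by the opening, so only the real object ends up with a bounding box.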
Image segmentation is one of the most effective ways of detecting objects (that is, the regions of interest) in images. However, some pre-processing should be applied before segmenting the image. Image enhancement techniques often prove useful for this purpose, and neighbourhood processing (a family of spatial image enhancement techniques) is particularly beneficial here.
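Neighbourhood processing simply means that each output pixel is computed from a small window around the corresponding input pixel. A minimal NumPy sketch, assuming a 3x3 window and edge padding (the name `neighbourhood_filter` and the test image are mine):

```python
import numpy as np

def neighbourhood_filter(img, size=3, reducer=np.median):
    """Replace each pixel by a statistic of its size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    h, w = img.shape
    # Stack every shifted copy so that axis 0 enumerates the neighbourhood.
    stack = np.stack([padded[r:r + h, c:c + w]
                      for r in range(size) for c in range(size)])
    return reducer(stack, axis=0)

# Flat image (assumed data) with a single impulse-noise pixel in the centre.
noisy = np.full((7, 7), 50.0)
noisy[3, 3] = 255.0

median = neighbourhood_filter(noisy)                  # median filter
mean = neighbourhood_filter(noisy, reducer=np.mean)   # box (mean) filter
```

On this example the median filter removes the outlier completely, while the mean filter only spreads it over the neighbouring pixels. That is why a median filter is often preferred for impulse noise before segmentation.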
There are many algorithms, and each performs differently depending on its parameters and hyper-parameters. Ultimately, accuracy depends on the quality of your dataset. A few high-quality datasets: