I am thinking of implementing a HOG feature with temporal context for video data. The Dalal and Triggs HOG is defined for 2D images. I want to extend it to video sequences as a feature for human action recognition. The idea is to compute the gradient in three directions (x, y and t) and then follow the traditional HOG procedure with minor changes. Has anyone already used this technique? Is this concept worthwhile as an efficient feature?
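
To make the idea concrete, here is a rough sketch of what I have in mind, using NumPy. The function name `hog3d_sketch`, the parameters `cell`, `n_theta` and `n_phi`, and the way the 3D gradient direction is quantized (a spatial angle plus a temporal elevation angle) are my own assumptions for illustration, not something taken from the Dalal and Triggs paper. It also skips overlapping block normalization and bin interpolation, so treat it as a starting point only.

```python
import numpy as np

def hog3d_sketch(video, cell=(4, 8, 8), n_theta=8, n_phi=4, eps=1e-6):
    """Hypothetical spatio-temporal HOG. video: array shaped (T, H, W)."""
    video = np.asarray(video, dtype=np.float64)

    # np.gradient returns one array per axis, in axis order (t, y, x).
    gt, gy, gx = np.gradient(video)

    spatial = np.hypot(gx, gy)                  # in-plane gradient strength
    mag = np.sqrt(gx**2 + gy**2 + gt**2)        # full 3D gradient magnitude
    theta = np.arctan2(gy, gx)                  # spatial angle in [-pi, pi]
    phi = np.arctan2(gt, spatial)               # temporal elevation in [-pi/2, pi/2]

    # Assumed quantization: uniform bins over (theta, phi).
    theta_bin = np.minimum(((theta + np.pi) / (2 * np.pi) * n_theta).astype(int),
                           n_theta - 1)
    phi_bin = np.minimum(((phi + np.pi / 2) / np.pi * n_phi).astype(int),
                         n_phi - 1)
    bin_idx = theta_bin * n_phi + phi_bin
    n_bins = n_theta * n_phi

    ct, cy, cx = cell
    T, H, W = video.shape
    nt, ny, nx = T // ct, H // cy, W // cx      # leftover border voxels are ignored
    hist = np.zeros((nt, ny, nx, n_bins))

    # Magnitude-weighted orientation histogram for each space-time cell.
    for it in range(nt):
        for iy in range(ny):
            for ix in range(nx):
                sl = (slice(it * ct, (it + 1) * ct),
                      slice(iy * cy, (iy + 1) * cy),
                      slice(ix * cx, (ix + 1) * cx))
                hist[it, iy, ix] = np.bincount(bin_idx[sl].ravel(),
                                               weights=mag[sl].ravel(),
                                               minlength=n_bins)

    # Simplification: L2-normalize each cell on its own (no overlapping blocks).
    hist /= np.linalg.norm(hist, axis=-1, keepdims=True) + eps
    return hist.ravel()
```

For a grayscale clip of 8 frames at 16x16 pixels with the default settings, this gives 2 x 2 x 2 cells of 32 bins each, so a 256-dimensional descriptor that could be fed to a classifier such as an SVM.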