I think that also depends on what specific kind of video analysis you want to do. Is it about recognizing something? Is it about predicting the next frames in a video sequence? Is it about measuring some property of the video (e.g. the attentiveness or sentiment of its content)? Or is it about automatically classifying the video's age rating? And so on.