I am currently working on video transcoding. While going through the literature, I noticed that the authors consider a group of pictures (GOP), rather than the whole video, as the unit for transcoding.
Is there any difference between a video and a group of pictures?
How can we extract groups of pictures from a video?
Which is more efficient: group-of-pictures transcoding or video transcoding?
The GOP is, as its name suggests, a group of frames arranged in a specific order (I-frames, B-frames and P-frames in the case of the H.264/AVC standard). The coded video stream is actually a succession of GOPs of a specific size (e.g., 8 or 12 frames), which is chosen in the encoder configuration rather than fixed by the standard.
A GOP always starts with an I-frame (an intra-coded frame, which serves as a reference for the frames that follow), also called a "key frame". The size of the GOP is the distance between two consecutive I-frames (e.g., IBBPBBPBBPBBI means that the size of the GOP is 12).
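To make the counting concrete, here is a minimal Python sketch that splits a string of frame types into GOPs, assuming each GOP begins at an I-frame as described above. The function name split_into_gops and the sample sequence are made up for illustration.

```python
def split_into_gops(frame_types):
    """Split a string of frame types (e.g. "IBBP...") into GOPs.

    A new GOP is opened at every I-frame. Frames that appear before
    the first I-frame are kept together in a leading partial group.
    """
    gops = []
    for frame_type in frame_types:
        if frame_type == "I" or not gops:
            gops.append("")
        gops[-1] += frame_type
    return gops


# Two back-to-back GOPs with the structure used in the example above.
sequence = "IBBPBBPBBPBB" * 2
gops = split_into_gops(sequence)
print(gops)                    # ['IBBPBBPBBPBB', 'IBBPBBPBBPBB']
print([len(g) for g in gops])  # [12, 12]
```

At 25 frames per second, a GOP of 12 frames covers 12 / 25 = 0.48 seconds of video.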
To extract GOPs from a coded video, you should look at the ffmpeg documentation. One solution could be
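A minimal sketch of one possible approach (not necessarily the one intended above): ask ffprobe, which ships with ffmpeg, for the key-frame flag and picture type of every video frame, then count the frames between key frames. It assumes ffprobe is on the PATH; the helper names probe_frames and gop_lengths and the file name input.mp4 are hypothetical.

```python
import json
import subprocess


def probe_frames(path):
    """Return one dict per video frame with 'key_frame' and 'pict_type'."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",                     # first video stream only
        "-show_entries", "frame=key_frame,pict_type",
        "-of", "json",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout).get("frames", [])


def gop_lengths(frames):
    """Count frames per GOP, opening a new GOP at every key frame."""
    lengths = []
    for frame in frames:
        if int(frame.get("key_frame", 0)) == 1 or not lengths:
            lengths.append(0)
        lengths[-1] += 1
    return lengths


# Example usage (needs a real file, so it is left commented out):
# frames = probe_frames("input.mp4")
# print(gop_lengths(frames))
```

This only reports where the GOP boundaries are. To write each GOP to its own file without re-encoding, ffmpeg's segment muxer combined with stream copy is worth a look, since stream copy can only cut at key frames.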
I just want to add some supplementary information to Ichraf Lahouli's answer. The concept of the GOP is superseded in H.265 and later video coding standards. Since H.265 it has been redefined as the "coded video sequence" (CVS), which is more flexible in terms of hierarchical structure and frame types. Within a CVS, a repeated structure is used to reduce the bits needed to signal the frame structure, and a P-frame is signalled in a way similar to an I-frame. This is clearest in the Random Access configuration.
Note that H.265 also has the concept of an intra period, which is how often an I-frame occurs.
If you want to transcode from H.264 to H.265, you need to understand this difference correctly.
Besides, you should mention which tools you are working with. Are they the reference software for H.264/H.265, or open-source implementations such as x264 and x265?
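If the open-source route is the one in use, a hedged sketch of an H.264 to H.265 transcode with an explicit intra period might look like the following. It assumes an ffmpeg build that includes libx265; the function name, the file names and the intra period of 48 are illustrative choices, not recommendations.

```python
def build_hevc_transcode_cmd(src, dst, intra_period=48):
    """Return an ffmpeg argument list that re-encodes the video with libx265.

    keyint sets the maximum distance between I-frames; setting min-keyint
    to the same value asks x265 for a fixed intra period.
    """
    x265_params = f"keyint={intra_period}:min-keyint={intra_period}"
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx265",
        "-x265-params", x265_params,
        "-c:a", "copy",          # leave the audio stream untouched
        dst,
    ]


cmd = build_hevc_transcode_cmd("input_h264.mp4", "output_h265.mp4")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) to start the transcode.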
I would like to thank the colleagues Ichraf and Thuong for giving a precise definition of the term group of pictures and for pointing out that this expression is related to the advanced video coding standards H.264 and H.265.
I would like to add that the video itself is a sequence of frames. In progressive scanning, every frame contains one picture. So, in principle, the video can be considered a sequence of pictures played at a specific rate, such as 25 or 30 pictures per second (pps), depending on the video system.
It remains to ask what you mean by transcoding. Is it, as the colleague Thuong hinted, transcoding from H.264 to H.265, for example?