Hello,
I am searching for literature on methods for choosing a single object classification when frame-by-frame analysis of a live video feed yields several different classifications (i.e. labels) for the same object, each with relatively high confidence (e.g. frame0: truck, frame1: sedan, frame2: sedan, frame3: truck).
For instance, say a large truck enters the field of view of the video camera and is detected. The image segment within the bounding box provided by the detection algorithm is classified as a sedan with 95% confidence in the first frame, then as a truck with 95% confidence in the following frame. This continues, so the object accumulates multiple classifications (i.e. labels) that alternate between sedan and truck, all with high confidence values. How can I select the correct classification? Are there methods that use all of an object's per-frame classifications to select the best one?
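To make the question concrete, the simplest thing I can think of is confidence-weighted voting over all the labels collected for one tracked object. The sketch below is only an illustration under assumptions of mine: the function name `aggregate_labels` and the sample confidences are made up, and it assumes the detections have already been associated with the same object across frames.

```python
from collections import defaultdict


def aggregate_labels(observations):
    """Pick one label for a tracked object from its per-frame classifications.

    observations: iterable of (label, confidence) pairs, one per frame.
    Returns (best_label, share), where share is the winning label's
    fraction of the total confidence mass.
    """
    scores = defaultdict(float)
    for label, confidence in observations:
        # Each frame votes for its label, weighted by its confidence.
        scores[label] += confidence

    total = sum(scores.values())
    if total <= 0:
        raise ValueError("need at least one observation with positive confidence")

    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label] / total


# Hypothetical per-frame results for a single tracked object.
frames = [("sedan", 0.95), ("truck", 0.95), ("truck", 0.90), ("sedan", 0.60)]
label, share = aggregate_labels(frames)
print(label, round(share, 3))
```

This treats every frame as equally reliable and ignores frame order, which is why I am asking whether the literature offers something more principled.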