Object Detection: You'll need a library or model that detects objects in each frame of the video. OpenCV's DNN module, loading a pre-trained model such as YOLO or SSD, is a good starting point.
Object Tracking: Once objects are detected, you need to track them across frames. Common techniques include a Kalman filter (to predict each object's next position) or optical flow (to follow pixel motion between frames).
Bounding Boxes: For each tracked object, you'll need to compute and update a bounding box in every frame.
2. Libraries and Frameworks:
Here are some C++ libraries that can help you with different parts:
OpenCV: Provides computer vision functionalities like object detection, tracking, and drawing bounding boxes.
Dlib: Another computer vision library with object detection and tracking capabilities.
TensorFlow/PyTorch (with C++ API): If you want to use more advanced deep learning models for object detection, both frameworks offer C++ APIs (PyTorch's is distributed as LibTorch).
3. Putting it together:
The C++ code will involve:
Using OpenCV or another library to read the video frame by frame.
Applying an object detection model on each frame to identify objects.
Implementing a tracking algorithm to associate detections across frames.
Drawing bounding boxes around the tracked objects.
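The association step is often the least obvious of these, so here is a minimal sketch of one simple approach: greedy matching by intersection-over-union (IoU). It is plain C++ and independent of any detection library. The names Box, Track, iou and associate, and the 0.3 threshold, are assumptions made for illustration. Production trackers usually add motion prediction and keep unmatched tracks alive for a few frames, which this sketch deliberately leaves out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Axis-aligned box: top-left corner (x, y) plus width and height, in pixels.
struct Box { float x, y, w, h; };

// A tracked object: a persistent id plus its most recent box.
struct Track { int id; Box box; };

// Intersection-over-union of two boxes, in the range [0, 1].
float iou(const Box& a, const Box& b) {
    float x1 = std::max(a.x, b.x);
    float y1 = std::max(a.y, b.y);
    float x2 = std::min(a.x + a.w, b.x + b.w);
    float y2 = std::min(a.y + a.h, b.y + b.h);
    float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy association: each detection takes the id of the unclaimed track it
// overlaps most, provided the IoU reaches minIou; otherwise it starts a new
// track. Simplification: tracks with no matching detection are dropped.
std::vector<Track> associate(const std::vector<Track>& tracks,
                             const std::vector<Box>& detections,
                             int& nextId, float minIou = 0.3f) {
    std::vector<Track> updated;
    std::vector<bool> claimed(tracks.size(), false);
    for (const Box& det : detections) {
        int best = -1;
        float bestIou = minIou;
        for (std::size_t i = 0; i < tracks.size(); ++i) {
            float v = iou(tracks[i].box, det);
            if (!claimed[i] && v >= bestIou) {
                best = static_cast<int>(i);
                bestIou = v;
            }
        }
        if (best >= 0) {
            claimed[best] = true;
            updated.push_back({tracks[best].id, det});
        } else {
            updated.push_back({nextId++, det});
        }
    }
    return updated;
}
```

Called once per frame with the previous tracks and the new detections, associate returns the tracks for the current frame, and those boxes are what you would then draw. Greedy matching can pick a suboptimal pairing in crowded scenes; an optimal assignment method such as the Hungarian algorithm is the usual upgrade.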
4. Resources:
Here are some resources to get you started:
OpenCV Tutorials: https://docs.opencv.org/4.x/d9/df8/tutorial_root.html (Look for tutorials on object detection and tracking)
TensorFlow C API (install guide for the TensorFlow C library, which can be called from C++): https://www.tensorflow.org/install/lang_c
PyTorch C++ API: https://pytorch.org/cppdocs/
Additional Notes:
Auto-anchoring bounding boxes will likely require additional logic specific to your needs, such as adjusting a box's size or position as the object moves or interacts with other objects.
This is a complex task and might require significant coding effort depending on the desired level of accuracy and functionality.
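As one possible reading of the auto-anchoring note above, the sketch below keeps a box stable by blending each new detection into the stored box with exponential smoothing, so the box follows the object without jittering from frame to frame. This is an assumption about what "auto-anchoring" should mean for your project. The AnchoredBox name and the alpha parameter are hypothetical.

```cpp
// Exponentially smoothed box: each new detection pulls the stored box toward
// it by a factor alpha in (0, 1]. Small alpha = steadier box, slower response;
// alpha = 1 snaps straight to the new detection.
// AnchoredBox and alpha are illustrative names, not part of any library.
struct AnchoredBox {
    float x, y, w, h;   // top-left corner, width, height (pixels)

    void update(float nx, float ny, float nw, float nh, float alpha) {
        x += alpha * (nx - x);
        y += alpha * (ny - y);
        w += alpha * (nw - w);
        h += alpha * (nh - h);
    }
};
```

A value of alpha around 0.3 to 0.5 is a reasonable place to start experimenting. If objects move quickly, a larger alpha or a motion model such as a Kalman filter may track them more closely.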