I am looking for work on scenario understanding. I found this paper, in which the authors
process a video and create a relationship chart of the different entities that appear in it:
https://ieeexplore.ieee.org/document/9079879
I would like to know whether there are papers where the system is able to understand the situation and context