We have wearable eye-tracking glasses that video-record the stimulus presented in front of the subject and measure gaze position in the video's coordinate system. However, we need to map the gaze position from the video's coordinates to the static stimulus image's coordinates, frame by frame (because the subject's head is moving).
I think this is a classical computer vision problem. Consider the following two images (the first is a frame from the recorded video, and the second is the static stimulus image); what I need is an algorithm that can register the static image to every video frame. But I am not an expert in computer vision/artificial intelligence. What is the current best algorithm to accomplish this task?
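For concreteness, here is a minimal sketch of what I imagine the per-frame pipeline might look like, based on my reading about feature matching and homography estimation in OpenCV. The function name `map_gaze_to_stimulus`, the choice of ORB features, and all parameter values are my own guesses at a standard approach, not a known-good solution; it also assumes the stimulus is roughly planar, so a single homography per frame can describe the frame-to-stimulus mapping.

```python
import cv2
import numpy as np

def map_gaze_to_stimulus(frame, stimulus, gaze_xy):
    """Map one gaze point from video-frame coordinates to stimulus-image
    coordinates by estimating a frame->stimulus homography."""
    # Detect and describe local features in both images. ORB is fast and
    # license-free; SIFT is another common choice.
    orb = cv2.ORB_create(nfeatures=2000)
    kp_f, des_f = orb.detectAndCompute(frame, None)
    kp_s, des_s = orb.detectAndCompute(stimulus, None)

    # Match descriptors between the frame and the stimulus image.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_f, des_s), key=lambda m: m.distance)

    src = np.float32([kp_f[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_s[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Robustly estimate the homography with RANSAC, which should tolerate
    # some fraction of incorrect matches.
    H, _mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # Transform the gaze point with the estimated homography.
    pt = np.float32([[gaze_xy]])  # shape (1, 1, 2) as perspectiveTransform expects
    mapped = cv2.perspectiveTransform(pt, H)
    return tuple(mapped[0, 0])

# Hypothetical usage on a single frame (file names are placeholders):
# frame = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)
# stimulus = cv2.imread("stimulus.png", cv2.IMREAD_GRAYSCALE)
# print(map_gaze_to_stimulus(frame, stimulus, (640.0, 360.0)))
```

If something like this is the right direction, I would still like to know whether there are more robust or more modern (e.g., learning-based) methods for this registration task.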