I have images of medical lab reports captured with an Android device. All of the images contain text. I want to summarize each report. By summarization I mean, for example, that for a USG report the summary would be something like "No abnormality seen". I want this kind of text for every report. The structure of the model would be: the input is an image and the output is a text summary. I thought of image captioning, but the images are all very similar to one another. How can I do this?
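To make the desired input and output concrete, a minimal sketch of the interface might look like the following. It is an illustration under stated assumptions, not a working solution: the OCR step is stubbed out (`fake_ocr` is a hypothetical placeholder for whatever text extraction is used), the names `extract_summary` and `summarize_report` are made up for this example, and it assumes the report contains a heading such as "Impression:" or "Conclusion:", which real reports may not.

```python
import re
from typing import Callable

# Hypothetical type: any function that turns raw image bytes into plain text.
OcrFn = Callable[[bytes], str]

# Assumption: the report's conclusion follows a heading like one of these.
_HEADING = re.compile(
    r"^\s*(impression|conclusion|summary)\s*[:\-]\s*(.*)$", re.IGNORECASE
)


def extract_summary(report_text: str) -> str:
    """Return the text after the first conclusion-style heading, or '' if none."""
    lines = report_text.splitlines()
    for i, line in enumerate(lines):
        match = _HEADING.match(line)
        if not match:
            continue
        same_line = match.group(2).strip()
        if same_line:
            return same_line
        # Heading sits on its own line: use the next non-empty line instead.
        for following in lines[i + 1:]:
            if following.strip():
                return following.strip()
    return ""


def summarize_report(image: bytes, ocr: OcrFn) -> str:
    """Image in, summary text out: run OCR, then pick out the conclusion."""
    return extract_summary(ocr(image))


def fake_ocr(image: bytes) -> str:
    """Stand-in for a real OCR engine; returns canned text for illustration."""
    return "USG ABDOMEN\nLiver: normal in size\nImpression: No abnormality seen\n"


if __name__ == "__main__":
    print(summarize_report(b"", fake_ocr))
```

With the stubbed OCR text above, `summarize_report` returns the line after "Impression:". Reports without such a heading would return an empty string, which is exactly the case where a learned model would be needed.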
Any ideas?