I have studied the attention mechanism and seen its applications in Natural Language Processing (NLP) and Computer Vision (CV). NLP is not my area of interest; CV is. However, the CV applications of attention I have found are mostly in image captioning, which seems closer to information retrieval: it generates text sequences that describe an image, turning unstructured image data into structured textual data. It does not work on images for their own sake but extracts text-based information from them. I want to apply attention directly to images by integrating it with CNNs, especially for better feature extraction, feature selection, and classification. Suggestions would be of great help.
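
To make the question concrete, here is a rough sketch of the kind of integration I have in mind. It is only my assumption of one possible approach (channel-wise gating in the style of squeeze-and-excitation), not an established recipe for my problem. The function name `channel_attention`, the reduction ratio `r`, and the random weights `w1` and `w2` are all hypothetical stand-ins; in a real network the weights would be learned and the feature map would come from a convolutional layer.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(features, w1, w2):
    """Reweight the channels of a CNN feature map of shape (C, H, W).

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are
    hypothetical placeholders for weights a network would learn.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))        # shape (C,)
    # Excite: small bottleneck MLP with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(w1 @ squeezed, 0.0)      # shape (C // r,)
    gates = sigmoid(w2 @ hidden)                 # shape (C,)
    # Scale: each channel is multiplied by its attention weight.
    attended = features * gates[:, None, None]
    return attended, gates


# Random data standing in for a convolutional feature map (assumption).
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

attended, gates = channel_attention(features, w1, w2)
print(attended.shape)  # (8, 4, 4)
```

The output keeps the shape of the input feature map, so a block like this could in principle sit between convolutional layers, with the gates acting as a soft form of feature selection before classification. I would like to know whether this is a sensible direction or whether other forms of attention (for example spatial attention) are better suited.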