I am looking for a multimodal dataset for image classification.

Description of the dataset

Each sample should consist of an image and a corresponding text description of that image.

Each image should belong to a single class (one label per image, not multi-label).
