I'm working with the RVL-CDIP dataset (https://www.cs.cmu.edu/~aharley/rvl-cdip/), a large collection of document images, ranging from ads to scientific articles, with 16 classes labelled at the image level. I'm trying to use weights from Faster R-CNN and Mask R-CNN models pretrained on different ICDAR datasets (focused scene text, receipts, and similar). The results are not very good compared to baseline ResNet18/34/50 models. To pretrain Faster R-CNN and Mask R-CNN on more relevant data, I need a document dataset labelled at the object level (boxes or masks), however small. Does such a dataset exist?