* To classify images, convolutional neural networks (CNNs) are the state of the art, achieving high accuracy on well-known datasets such as ImageNet.
* For text extraction from images, you need an algorithm that proposes regions of interest (ROIs) that could contain written text. Here is a useful link (https://deepsense.ai/region-of-interest-pooling-explained/). Once you have a region, it becomes a simple OCR problem.
* The third problem is a straightforward save to a database.
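
For the database step, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `image_results`, its columns, and the helper functions `init_db`, `save_result`, and `find_by_label` are my own assumptions for illustration; the label and text values would come from whatever classifier and OCR tool you end up using.

```python
import sqlite3


def init_db(conn):
    """Create the results table if it does not exist yet (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS image_results (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            image_path TEXT NOT NULL UNIQUE,
            label TEXT,
            extracted_text TEXT
        )
        """
    )
    conn.commit()


def save_result(conn, image_path, label, extracted_text):
    """Store one image's class label and OCR text.

    Saving the same image_path again replaces the earlier row.
    """
    # Parameterized query: values are never concatenated into the SQL string.
    conn.execute(
        "INSERT OR REPLACE INTO image_results (image_path, label, extracted_text) "
        "VALUES (?, ?, ?)",
        (image_path, label, extracted_text),
    )
    conn.commit()


def find_by_label(conn, label):
    """Return (image_path, extracted_text) pairs for a given class label."""
    cursor = conn.execute(
        "SELECT image_path, extracted_text FROM image_results "
        "WHERE label = ? ORDER BY image_path",
        (label,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # In-memory database for the demo; pass a file path to persist results.
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    save_result(conn, "scans/invoice_001.png", "invoice", "Total: 120.00")
    print(find_by_label(conn, "invoice"))
```

SQLite is only one option here; the same insert-and-query pattern carries over to any other relational database.
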
Sonia Lalwani
Adobe Acrobat Professional has a built-in OCR (Optical Character Recognition) option to read text from scanned images. You can also use other OCR tools to read text from your scanned images.