  • I'm currently working on a project to extract text from document images (such as passports and driving licenses) and store the passport number and driving license number, along with the person's name, in a database.
  • I have used Pytesseract for this.
  • Does Pytesseract use any neural network algorithms?
  • The code, with the sample image and output, is attached below.

from PIL import Image
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
im = Image.open('C:/Users/Kiran Lalwani/Desktop/dss/56db21ec-8d4d-4128-889a-948e81eb7127.jpg')
text = pytesseract.image_to_string(im, lang='eng')

  • Is there a more efficient method?
  • What about Tesseract-OCR, OpenCV, a CNN, or MATLAB for text extraction?
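
For context, here is a rough sketch of the step that would follow OCR: pulling the name and passport number out of the text returned by `image_to_string` and storing them in a database. Everything in it is an assumption for illustration only: the regular expressions (one letter followed by seven digits for the passport number, a "Name:" label for the name), the `documents` table, the helper names `parse_fields` and `save_record`, and the made-up sample text. Real documents will likely need different patterns.

```python
import re
import sqlite3

# Hypothetical patterns: real formats vary by country and issuing authority.
PASSPORT_RE = re.compile(r"\b[A-Z][0-9]{7}\b")
NAME_RE = re.compile(r"Name\s*[:\-]?\s*([A-Za-z ]+)")


def parse_fields(ocr_text):
    """Pick the name and passport number out of raw OCR text."""
    passport = PASSPORT_RE.search(ocr_text)
    name = NAME_RE.search(ocr_text)
    return {
        "name": name.group(1).strip() if name else None,
        "passport_number": passport.group(0) if passport else None,
    }


def save_record(conn, record):
    """Insert one parsed record into an assumed 'documents' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (name TEXT, passport_number TEXT)"
    )
    conn.execute(
        "INSERT INTO documents (name, passport_number) VALUES (?, ?)",
        (record["name"], record["passport_number"]),
    )
    conn.commit()


# Made-up text standing in for the output of pytesseract.image_to_string.
sample_text = "Name: Jane Doe\nPassport No: K1234567\n"

conn = sqlite3.connect(":memory:")  # in-memory database, just for the sketch
record = parse_fields(sample_text)
save_record(conn, record)
rows = conn.execute("SELECT name, passport_number FROM documents").fetchall()
print(rows)
```

In the real project, `sample_text` would be replaced by the `text` variable from the OCR call, and the in-memory database by a file or server database.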