Create a Basic OCR using PyTesseract and OpenCV2 in Python

Topic: Create a Basic OCR using Pytesseract and OpenCV2 in Python.

To Do: Extract Text and/or data from the image(s).

Link to my GitHub Repository: https://github.com/eternaldemon/OCR-PyTesseract

Programming Language Used: Python 3.7

Software/Editor Used: Spyder 4.1.0 on Anaconda Navigator

Libraries Used: pytesseract and opencv-python

How to Install OpenCV-Python in Windows:

  1. Open cmd (Command Prompt) as administrator: search for cmd in the Windows search bar on the taskbar, right-click it and choose "Run as administrator". Pressing Win + R, typing cmd.exe and pressing Enter also opens a prompt, but without administrative rights.
  2. Type the following: pip install opencv-python
  3. Wait for it to finish. OpenCV is now installed for Python and is imported under the name cv2.
       Note: If an error occurs, it is most often because cmd was not opened with administrative rights.
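
To confirm the install before moving on, a quick check can help. The sketch below uses only the standard library; missing_modules is a hypothetical helper name chosen for this example, and it simply reports which of the given module names Python cannot find.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that Python cannot locate."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# opencv-python is imported under the name cv2.
missing = missing_modules(["cv2"])
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("OpenCV is importable as cv2")
```

If cv2 is reported as missing, re-running the pip command from an administrator prompt is the first thing to try.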

How to install and set-up PyTesseract in Windows:

  1. Install tesseract using windows installer available at: https://github.com/UB-Mannheim/tesseract/wiki
  2. Choose the 32-bit or 64-bit installer according to your system. To check your PC type, type System Information in the Windows search bar and press Enter, then find System Type on the page and download accordingly.
  3. During installation, note the directory Tesseract is installed in and add that directory to the Path variable under Environment Variables.
  4. Now open cmd with administrative privileges, type pip install pytesseract and press Enter. pytesseract will be installed shortly.
  5. Code that imports pytesseract should now run. If it still cannot find Tesseract, put the following line in the code file after importing the library:
      pytesseract.pytesseract.tesseract_cmd = r'PATH TO tesseract.exe installed from step 1'
      Example:
      pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'
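
Rather than hard-coding the path straight away, the lookup in step 5 can be made a little more forgiving. The following is a minimal sketch, assuming the executable is named tesseract; resolve_tesseract_cmd is a hypothetical helper, not part of pytesseract. It first checks the Path variable from step 3 and only then tries explicit fallback locations.

```python
import os
import shutil


def resolve_tesseract_cmd(fallback_paths):
    """Return a path to the tesseract executable, or None if none is found."""
    # shutil.which searches the directories listed in the Path variable.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise try the explicit locations supplied by the caller, in order.
    for candidate in fallback_paths:
        if os.path.isfile(candidate):
            return candidate
    return None


# The fallback below is the example path from step 5; adjust it to your install.
cmd = resolve_tesseract_cmd(
    [r"C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe"]
)
print("tesseract found at:", cmd)
```

When cmd is not None, it can be assigned to pytesseract.pytesseract.tesseract_cmd exactly as in step 5.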

  • Now that both required libraries are installed, the following steps show how to create a basic OCR.

Steps to create a Basic OCR:

  1. Import the libraries and read any image. I have used 'test1.png' and read it as grayscale. (Lines 8-12)
  2. Configure the settings you want Tesseract to process the image with. (Line 30)
  3. Use the image_to_string or image_to_data function to convert the image into the form you want. I have used image_to_string since this is an OCR. (Line 32)
  4. Print the text or results. (Lines 34-35)
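
Step 3 mentions image_to_data as the alternative to image_to_string. As a rough sketch of what can be done with its result, assume the dictionary was obtained with pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT), which has parallel 'text' and 'conf' lists. The helper name confident_words and the threshold of 60 are assumptions made for this example only.

```python
def confident_words(data, min_conf=60.0):
    """Keep the recognised words whose confidence is at least `min_conf`."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf may arrive as a string or a number depending on the version;
        # -1 marks layout boxes that carry no word.
        if text.strip() and float(conf) >= min_conf:
            words.append(text)
    return words


# Hand-written sample in the same shape, so the sketch runs without Tesseract.
sample = {"text": ["", "Hello", "w0rld", " "], "conf": ["-1", "96", "41", "-1"]}
print(confident_words(sample))  # prints ['Hello']
```

Lowering min_conf keeps more words at the cost of letting through more misreads.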

Following is the code:
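
As a minimal sketch of the four steps above (not the exact listing from the repository, so the line numbers cited in the steps will not match it), the whole flow fits in one small function. The function name extract_text and the default configuration string "--oem 3 --psm 6" are assumptions chosen for illustration.

```python
import os


def extract_text(image_path, config="--oem 3 --psm 6"):
    """Read an image as grayscale and return the text Tesseract finds in it."""
    # cv2.imread returns None instead of raising on a bad path, so check first.
    if not os.path.isfile(image_path):
        raise FileNotFoundError(f"No such image: {image_path}")

    # Imported here so a missing install shows up only when OCR is actually run.
    import cv2
    import pytesseract

    # Step 1: read the image as grayscale.
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if image is None:
        raise ValueError(f"OpenCV could not decode: {image_path}")

    # Steps 2 and 3: pass the configuration string and run the OCR.
    return pytesseract.image_to_string(image, config=config)
```

Step 4 is then a single call such as print(extract_text('test1.png')), with test1.png in the working directory.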

Following is some theory for choosing Configurations:

  • Page Segmentation Mode (--psm): The PSM affects how Tesseract splits an image into lines of text and words. Pick the one that works best for your image. OSD stands for Orientation and Script Detection.
0 - Orientation and script detection (OSD) only.
1 - Automatic page segmentation with OSD.
2 - Automatic page segmentation, but no OSD or OCR.
3 - Fully automatic page segmentation, but no OSD. (Default)
4 - Assume a single column of text of variable sizes.
5 - Assume a single uniform block of vertically aligned text.
6 - Assume a single uniform block of text.
7 - Treat the image as a single text line.
8 - Treat the image as a single word.
9 - Treat the image as a single word in a circle.
10 - Treat the image as a single character.
  • Engine Mode (--oem): Tesseract has several engine modes that differ in accuracy and speed.
0 - Legacy engine only.
1 - Neural nets LSTM engine only.
2 - Legacy + LSTM engines.
3 - Default, based on what is available.
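
The two settings are passed to pytesseract together as one configuration string. The sketch below shows one way to build that string with a basic range check; build_config is a hypothetical helper, and the accepted ranges assume Tesseract 4 or later, where --psm runs from 0 to 13 (recent versions add modes 11 to 13 beyond the list above) and --oem from 0 to 3.

```python
PSM_MODES = range(0, 14)  # 0..13 in Tesseract 4 and later
OEM_MODES = range(0, 4)   # 0..3


def build_config(psm=3, oem=3):
    """Return a config string such as '--oem 3 --psm 6' for pytesseract."""
    if psm not in PSM_MODES:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    if oem not in OEM_MODES:
        raise ValueError(f"oem must be between 0 and 3, got {oem}")
    return f"--oem {oem} --psm {psm}"


# A single uniform block of text, read with the LSTM engine only.
print(build_config(psm=6, oem=1))  # prints --oem 1 --psm 6
```

The returned string is what goes into the config argument, for example pytesseract.image_to_string(image, config=build_config(psm=6)).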

My Result on the image used: 

Image Used



Result


