End-to-End OCR Pipeline with Keras and TensorFlow

Use Keras-OCR for simple yet powerful text recognition in Python.

What is the Keras-OCR API?

Keras-OCR is a high-level, open-source Python library for optical character recognition (OCR) built on Keras and TensorFlow. Instead of requiring you to configure and connect separate components, it provides an end-to-end pipeline with pre-trained models for text detection (using the CRAFT algorithm) and text recognition (via a CRNN model). This combination lets developers extract text from photos and scanned documents with just a few lines of code. The pre-trained recognizer targets printed text, so handwritten notes will generally need a model fine-tuned on handwriting data.

The library is optimized for real-world use cases, including:

Document digitization: Convert paper documents or PDFs into searchable text.
Automated data entry: Extract text from invoices, receipts, or forms.
Accessibility tools: Generate alt text for images in web applications.
Social media analysis: Process text embedded in memes or user-generated content.

With built-in support for batch processing and optional GPU acceleration, Keras-OCR balances ease of use with performance, making it ideal for both prototyping and production deployments.

Key Features of Keras-OCR

Pre-trained Models: Includes CRAFT (detector) and CRNN (recognizer) for immediate use.
Easy Setup: Minimal dependencies (Keras, TensorFlow, OpenCV).
Batch Processing: Pass a list of images to a single recognize call instead of looping over them yourself.
Custom Training: Fine-tune models on your own datasets.
No GPU Required: Runs on CPU and runs faster when a GPU is available.
Bounding Box Output: Returns text with coordinates for spatial analysis.
Open Source: Free, community-driven, and MIT-licensed.
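
As a rough illustration of what the bounding box output makes possible, the sketch below groups recognized words into lines of text. It assumes each prediction is a (word, box) tuple whose box holds four (x, y) corner points, which is the shape the pipeline returns. The `group_into_lines` helper, the `tolerance` heuristic, and the sample data are hypothetical and only serve the example.

```python
def group_into_lines(predictions, tolerance=10.0):
    """Group (word, box) predictions into lines of text, top to bottom."""
    # Summarize each box by the center of its four corner points.
    items = []
    for word, box in predictions:
        center_x = sum(float(point[0]) for point in box) / len(box)
        center_y = sum(float(point[1]) for point in box) / len(box)
        items.append((center_y, center_x, word))
    items.sort()  # top to bottom

    # Start a new line whenever a word sits too far below the current line.
    lines = []  # each entry: [reference_y, [(center_x, word), ...]]
    for center_y, center_x, word in items:
        if lines and abs(center_y - lines[-1][0]) <= tolerance:
            lines[-1][1].append((center_x, word))
        else:
            lines.append([center_y, [(center_x, word)]])

    # Within each line, order the words left to right.
    return [" ".join(word for _, word in sorted(words)) for _, words in lines]


# Made-up sample shaped like one entry of the pipeline output
sample = [
    ("world", [[60, 0], [110, 0], [110, 20], [60, 20]]),
    ("hello", [[0, 2], [50, 2], [50, 22], [0, 22]]),
    ("total", [[0, 40], [50, 40], [50, 60], [0, 60]]),
]
print(group_into_lines(sample))  # ['hello world', 'total']
```

The fixed pixel tolerance is a simplification; documents with varying font sizes or skewed scans would likely need a tolerance derived from the box heights.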

Installation

Install Keras-OCR via pip (requires Python 3.6+):

Install Keras-OCR


pip install keras-ocr

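For GPU support, install TensorFlow along with a compatible NVIDIA driver, CUDA, and cuDNN. Since TensorFlow 2.1, the standard tensorflow package includes GPU support, and the separate tensorflow-gpu package is deprecated:

Install TensorFlow


pip install tensorflow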
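
To confirm that TensorFlow can actually see a GPU after installation, a quick check along the following lines may help. It relies on `tf.config.list_physical_devices`, available in TensorFlow 2.1 and later; the `gpu_status` helper name is made up for this sketch.

```python
def gpu_status():
    """Return a short description of the GPUs TensorFlow can see."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed"

    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        return "No GPU detected; running on CPU"
    return f"{len(gpus)} GPU(s) detected: " + ", ".join(gpu.name for gpu in gpus)


print(gpu_status())
```

If no GPU is reported, Keras-OCR still works; inference simply runs on the CPU.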

Code Examples

Below are practical examples to extract text from images using Keras-OCR.

Example 1: Basic Text Detection and Recognition

This example shows how to use the pre-trained pipeline to extract text from an image:

Basic OCR Pipeline


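import keras_ocr

# Build the pipeline; the pre-trained weights are downloaded on first use
pipeline = keras_ocr.pipeline.Pipeline()

# recognize() accepts file paths or already-loaded image arrays
images = ["receipt.jpg"]

# The result holds one list of (word, box) tuples per input image
predictions = pipeline.recognize(images)
print(predictions)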
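
The raw output is a nested list of tuples containing arrays, which is awkward to store or pass to other tools. One possible way to flatten it into JSON-friendly records is sketched below. The `to_records` helper and its field names are illustrative choices, not part of Keras-OCR, and the sample data is invented so the snippet runs without loading any models.

```python
import json


def to_records(prediction_group):
    """Convert one image's (word, box) predictions into plain dictionaries."""
    records = []
    for word, box in prediction_group:
        xs = [float(point[0]) for point in box]
        ys = [float(point[1]) for point in box]
        records.append({
            "text": word,
            # Axis-aligned rectangle enclosing the four corner points
            "x_min": min(xs),
            "y_min": min(ys),
            "x_max": max(xs),
            "y_max": max(ys),
        })
    return records


# Invented sample shaped like predictions[0] from the pipeline
sample_group = [("total", [[10, 5], [60, 5], [60, 25], [10, 25]])]
records = to_records(sample_group)
print(json.dumps(records, indent=2))
```

With real results, `to_records(predictions[0])` would produce the records for the first image.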

Example 2: Batch Processing

Process multiple images at once for efficiency:

Batch Processing


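import keras_ocr

pipeline = keras_ocr.pipeline.Pipeline()
images = ["image1.jpg", "image2.jpg", "image3.jpg"]

# A single call returns one list of (word, box) tuples per image, in input order
batch_predictions = pipeline.recognize(images)
for path, prediction in zip(images, batch_predictions):
    # Print only the recognized words for each file
    print(path, [word for word, box in prediction])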
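
For larger collections, passing every image in one call can exhaust memory. A common workaround is to process the list in fixed-size chunks, sketched below. The `recognize_in_chunks` helper and the default chunk size are assumptions made for illustration, and `fake_recognize` is a stand-in so the snippet runs without loading any models.

```python
def recognize_in_chunks(recognize, image_paths, chunk_size=8):
    """Call `recognize` on successive slices of `image_paths` and merge the results."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    results = []
    for start in range(0, len(image_paths), chunk_size):
        chunk = image_paths[start:start + chunk_size]
        # Each call returns one prediction list per image in the chunk
        results.extend(recognize(chunk))
    return results


# Stand-in for pipeline.recognize, used only to demonstrate the control flow
def fake_recognize(paths):
    return [[(path, None)] for path in paths]


paths = [f"image{i}.jpg" for i in range(1, 6)]
results = recognize_in_chunks(fake_recognize, paths, chunk_size=2)
print(len(results))  # 5
```

In real use, `pipeline.recognize` would be passed as the first argument, and the chunk size would be tuned to the available memory.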

Example 3: Visualizing Bounding Boxes

Draw detected text boxes on the original image:

Visualize Results


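import matplotlib.pyplot as plt
import keras_ocr

pipeline = keras_ocr.pipeline.Pipeline()
image = keras_ocr.tools.read("document.jpg")
predictions = pipeline.recognize([image])

# drawAnnotations renders the image with its boxes and labels onto the given axis.
# Calling plt.imshow(image) afterwards would paint the unannotated image over the boxes.
fig, ax = plt.subplots(figsize=(10, 10))
keras_ocr.tools.drawAnnotations(image=image, predictions=predictions[0], ax=ax)
plt.show()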

Conclusion

Keras-OCR simplifies text extraction with its ready-to-use pipeline, making it a good choice for developers who need OCR without complex setup. Its integration with Keras and TensorFlow allows for customization, and passing several images in a single call keeps multi-image workloads simple.

Whether you're building document scanners, automating data entry, or analyzing social media content, Keras-OCR provides a lightweight yet powerful solution.