A package for extracting structured data from receipts.
Receipt OCR Engine
An efficient OCR engine for receipt image processing.
This repository provides a comprehensive solution for Optical Character Recognition (OCR) on receipt images, featuring both a dedicated Tesseract OCR module and a general receipt processing package using LLMs.
Project Structure
The project is organized into two main modules:
- src/receipt_ocr/: A new package that abstracts general receipt processing logic, including a CLI, a programmatic API, and a production FastAPI web service for LLM-powered structured data extraction from receipts.
- src/tesseract_ocr/: Contains the Tesseract OCR FastAPI application, CLI, utility functions, and Docker setup for performing raw OCR text extraction from images.
Prerequisites
- Python 3.x
- Docker and Docker Compose (for running as a service)
- Tesseract OCR (for local Tesseract CLI usage) - Installation Guide
Usage Examples
Receipt OCR Module
This module provides a higher-level abstraction for processing receipts, leveraging LLMs for parsing and extraction.
To use the receipt-ocr CLI, first install it:
pip install receipt-ocr
- Configure environment variables: Create a .env file in the project root or set environment variables directly. This module supports multiple LLM providers.

Example .env for OpenAI (get a key from http://platform.openai.com/api-keys):

    OPENAI_API_KEY="your_openai_api_key_here"
    OPENAI_MODEL="gpt-4.1"

Example .env for Gemini (get a key from https://aistudio.google.com/app/apikey):

    OPENAI_API_KEY="your_gemini_api_key_here"
    OPENAI_BASE_URL="https://generativelanguage.googleapis.com/v1beta/openai/"
    OPENAI_MODEL="gemini-2.5-pro"

Example .env for Groq (get a key from https://console.groq.com/keys):

    OPENAI_API_KEY="your_groq_api_key_here"
    OPENAI_BASE_URL="https://api.groq.com/openai/v1"
    OPENAI_MODEL="llama3-8b-8192"

- Process a receipt using the receipt-ocr CLI:

    receipt-ocr images/receipt.jpg

This command uses the configured LLM provider to extract structured data from the receipt image.
Sample output:

    {
      "merchant_name": "Saathimart.com",
      "merchant_address": "Narephat, Kathmandu",
      "transaction_date": "2024-05-07",
      "transaction_time": "09:09:00",
      "total_amount": 185.0,
      "line_items": [
        {
          "item_name": "COLGATE DENTAL",
          "item_quantity": 1,
          "item_price": 95.0,
          "item_total": 95.0
        },
        {
          "item_name": "PATANJALI ANTI",
          "item_quantity": 1,
          "item_price": 70.0,
          "item_total": 70.0
        },
        {
          "item_name": "GODREJ NO 1 SOAP",
          "item_quantity": 1,
          "item_price": 20.0,
          "item_total": 20.0
        }
      ]
    }
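Because these fields are produced by an LLM, it can be worth checking that they are internally consistent before using them downstream. The sketch below is not part of receipt-ocr: `check_receipt_totals` is a hypothetical helper, and it assumes the output has the shape of the sample output, with an `item_total` on every line item and a `total_amount` equal to the sum of the line items. That second assumption will not hold for receipts that include tax, tips, or discounts.

```python
import json
import math

# The sample output from this README, used as test data.
SAMPLE_OUTPUT = """
{
  "merchant_name": "Saathimart.com",
  "total_amount": 185.0,
  "line_items": [
    {"item_name": "COLGATE DENTAL", "item_quantity": 1, "item_price": 95.0, "item_total": 95.0},
    {"item_name": "PATANJALI ANTI", "item_quantity": 1, "item_price": 70.0, "item_total": 70.0},
    {"item_name": "GODREJ NO 1 SOAP", "item_quantity": 1, "item_price": 20.0, "item_total": 20.0}
  ]
}
"""


def check_receipt_totals(receipt: dict, tolerance: float = 0.01) -> list[str]:
    """Return human-readable inconsistencies; an empty list means none were found."""
    problems = []
    items = receipt.get("line_items", [])

    # Each line total should equal quantity * unit price.
    for item in items:
        expected = item["item_quantity"] * item["item_price"]
        if not math.isclose(expected, item["item_total"], abs_tol=tolerance):
            problems.append(
                f"{item['item_name']}: expected {expected}, got {item['item_total']}"
            )

    # Assumption: no tax or discount, so the line totals add up to the total.
    items_sum = sum(item["item_total"] for item in items)
    if not math.isclose(items_sum, receipt["total_amount"], abs_tol=tolerance):
        problems.append(
            f"line items sum to {items_sum}, but total_amount is {receipt['total_amount']}"
        )
    return problems


print(check_receipt_totals(json.loads(SAMPLE_OUTPUT)))  # prints []
```

A non-empty result does not necessarily mean the extraction is wrong; it only flags receipts that deserve a second look.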
- Using Receipt OCR Programmatically in Python:
You can also use the receipt-ocr library directly in your Python code:

    from receipt_ocr.processors import ReceiptProcessor
    from receipt_ocr.providers import OpenAIProvider

    # Initialize the provider
    provider = OpenAIProvider(api_key="your_api_key", base_url="your_base_url")

    # Initialize the processor
    processor = ReceiptProcessor(provider)

    # Define the JSON schema for extraction
    json_schema = {
        "merchant_name": "string",
        "merchant_address": "string",
        "transaction_date": "string",
        "transaction_time": "string",
        "total_amount": "number",
        "line_items": [
            {
                "item_name": "string",
                "item_quantity": "number",
                "item_price": "number",
            }
        ],
    }

    # Process the receipt
    result = processor.process_receipt("path/to/receipt.jpg", json_schema, "gpt-4.1")
    print(result)
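The schema passed to `process_receipt` is a plain mapping of field names to type names, so the same mapping can be reused to check what comes back. The following is a minimal sketch, not part of the library: `find_schema_mismatches` is a hypothetical helper, it assumes the result has already been parsed into a Python dict, and it assumes `"string"` and `"number"` are the only type names in use.

```python
from typing import Any

# Assumed mapping from the schema's type names to Python types.
_TYPE_NAMES = {"string": str, "number": (int, float)}


def find_schema_mismatches(value: Any, schema: Any, path: str = "$") -> list[str]:
    """Recursively compare a parsed result against a schema of type names."""
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        problems = []
        for key, sub_schema in schema.items():
            if key not in value:
                problems.append(f"{path}.{key}: missing")
            else:
                problems.extend(
                    find_schema_mismatches(value[key], sub_schema, f"{path}.{key}")
                )
        return problems

    if isinstance(schema, list):
        if not isinstance(value, list):
            return [f"{path}: expected list"]
        if not schema:  # an empty template accepts any list
            return []
        problems = []
        for index, element in enumerate(value):
            # The first schema element is the template for every list entry.
            problems.extend(
                find_schema_mismatches(element, schema[0], f"{path}[{index}]")
            )
        return problems

    expected = _TYPE_NAMES.get(schema)
    if expected is None:
        return [f"{path}: unknown schema type {schema!r}"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected):
        return [f"{path}: expected {schema}"]
    return []


schema = {
    "merchant_name": "string",
    "total_amount": "number",
    "line_items": [{"item_name": "string", "item_price": "number"}],
}
bad_result = {
    "merchant_name": "Saathimart.com",
    "total_amount": "185",
    "line_items": [{"item_name": "COLGATE DENTAL"}],
}
print(find_schema_mismatches(bad_result, schema))
# prints ['$.total_amount: expected number', '$.line_items[0].item_price: missing']
```

Extra fields in the result, such as the `item_total` that appears in the sample output, are ignored by this check.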
Advanced Usage with Response Format Types:
For compatibility with different LLM providers, you can specify the response format type:
    result = processor.process_receipt(
        "path/to/receipt.jpg",
        json_schema,
        "gpt-4.1",
        response_format_type="json_object",  # or "json_schema", "text"
    )

Supported response_format_type values:
- "json_object" (default) - Standard JSON object format
- "json_schema" - Structured JSON schema format (for newer OpenAI APIs)
- "text" - Plain text responses
This will output the same structured JSON as the CLI.
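When it is not known in advance which format a provider accepts, one option is to try the format types in order and keep the first one that yields usable output. This is only a sketch under assumptions: `extract_with_fallback` is a hypothetical helper, the README does not say which exceptions the library raises or whether `process_receipt` returns a dict or a JSON string, so the helper catches broadly and parses strings itself. A stub stands in for the real call so the example runs without an API key.

```python
import json
from typing import Any, Callable, Sequence


def extract_with_fallback(
    process: Callable[..., Any],
    format_types: Sequence[str] = ("json_schema", "json_object", "text"),
) -> tuple[str, Any]:
    """Call process(response_format_type=...) until one format type works.

    Returns the format type that succeeded together with the parsed result.
    """
    last_error = None
    for format_type in format_types:
        try:
            result = process(response_format_type=format_type)
            if isinstance(result, str):
                result = json.loads(result)  # raises if the text is not JSON
            return format_type, result
        except Exception as error:  # broad on purpose: provider errors vary
            last_error = error
    raise RuntimeError("no response format type produced usable output") from last_error


def fake_process(response_format_type: str) -> str:
    """Stub for a provider that rejects the "json_schema" format."""
    if response_format_type == "json_schema":
        raise ValueError("unsupported response format")
    return '{"total_amount": 185.0}'


print(extract_with_fallback(fake_process))
# prints ('json_object', {'total_amount': 185.0})
```

With the real library, `process` could be built with `functools.partial(processor.process_receipt, "path/to/receipt.jpg", json_schema, "gpt-4.1")`.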
- Run Receipt OCR as a Docker web service:
For a production-ready REST API, use the FastAPI web service:

    docker compose -f app/docker-compose.yml up

The service provides REST endpoints for receipt processing:
- GET /health - Health check
- POST /ocr/ - Process receipt images with optional custom JSON schemas

Example API usage:

    # Health check
    curl http://localhost:8000/health

    # Process receipt with default schema
    curl -X POST "http://localhost:8000/ocr/" \
      -F "file=@images/receipt.jpg"

    # Process with custom schema
    curl -X POST "http://localhost:8000/ocr/" \
      -F "file=@images/receipt.jpg" \
      -F 'json_schema={"merchant": "string", "total": "number"}'

For detailed API documentation, visit http://localhost:8000/docs when the service is running.
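The same requests can be sent from Python. The sketch below mirrors the curl examples using only the standard library; `build_ocr_request` and `process_receipt_via_api` are hypothetical helper names, the form field names `file` and `json_schema` are taken from the curl commands, and the response is assumed to be JSON.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Optional


def build_ocr_request(
    image_path: str,
    json_schema: Optional[dict] = None,
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build a multipart/form-data POST request for the /ocr/ endpoint."""
    path = Path(image_path)
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"

    # File part, equivalent to: -F "file=@images/receipt.jpg"
    parts = [
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        path.read_bytes(),
        b"\r\n",
    ]
    # Optional schema part, equivalent to: -F 'json_schema={...}'
    if json_schema is not None:
        parts += [
            f"--{boundary}\r\n".encode(),
            b'Content-Disposition: form-data; name="json_schema"\r\n\r\n',
            json.dumps(json_schema).encode(),
            b"\r\n",
        ]
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        f"{base_url}/ocr/",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def process_receipt_via_api(
    image_path: str,
    json_schema: Optional[dict] = None,
    base_url: str = "http://localhost:8000",
) -> dict:
    """Send the image to a running service and decode the (assumed) JSON reply."""
    request = build_ocr_request(image_path, json_schema, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `process_receipt_via_api("images/receipt.jpg", {"merchant": "string", "total": "number"})` corresponds to the custom-schema curl command.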
Tesseract OCR Module
This module provides direct OCR capabilities using Tesseract. For more detailed local setup and usage, refer to src/tesseract_ocr/README.md.
- Run Tesseract OCR locally via CLI:

    python src/tesseract_ocr/main.py -i images/receipt.jpg

Replace images/receipt.jpg with the path to your receipt image. Ensure that the image is well lit and that the edges of the receipt are clearly visible and detectable within the image.
- Run Tesseract OCR as a Docker service:

    docker compose -f src/tesseract_ocr/docker-compose.yml up

Once the service is up and running, you can perform OCR on receipt images by sending a POST request to http://localhost:8000/ocr/ with the image file.
API endpoint:
- POST /ocr/: Upload a receipt image file to perform OCR. The response contains the extracted text from the receipt.

Note: The Tesseract OCR API returns raw extracted text from the receipt image. For structured JSON output with parsed fields such as merchant name, line items, and totals, use the receipt-ocr package instead.
Example usage with cURL:

    curl -X 'POST' \
      'http://localhost:8000/ocr/' \
      -H 'accept: application/json' \
      -H 'Content-Type: multipart/form-data' \
      -F 'file=@images/paper-cash-sell-receipt-vector-23876532.jpg;type=image/jpeg'
Contributing
We welcome contributions to the Receipt OCR Engine! To contribute, please follow these steps:
- Fork the repository and clone it to your local machine.
- Create a new branch for your feature or bug fix.
- Set up your development environment:

    # Navigate to the project root
    cd receipt-ocr

    # Install uv
    curl -LsSf https://astral.sh/uv/install.sh | sh
    # OR
    pip install uv

    # Create and activate a virtual environment
    uv venv --python=3.12
    source .venv/bin/activate  # For Windows, use .venv\Scripts\activate

    # Install development and test dependencies
    uv sync --all-extras --dev
    uv pip install -r src/tesseract_ocr/requirements.txt
    uv pip install -e .

- Make your changes and ensure they adhere to the project's coding style.
- Run tests to ensure your changes haven't introduced any regressions:

    uv run pytest

- Run linting and formatting checks:

    uvx ruff check .
    uvx ruff format .

- Commit your changes with a clear and concise commit message.
- Push your branch to your forked repository.
- Open a Pull Request to the main branch of the upstream repository, describing your changes in detail.
LinkedIn Post
- Gemini Docs: https://ai.google.dev/tutorials/python_quickstart
- LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7145860319150505984/
License
This project is licensed under the terms of the MIT license.
Download files
File details
Details for the file receipt_ocr-0.3.1.tar.gz.
File metadata
- Download URL: receipt_ocr-0.3.1.tar.gz
- Upload date:
- Size: 602.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b20894d3dac0e4acf7b1f524c7180b352f3600db7151e7657a4c123a7c7f8fb7 |
| MD5 | 10c7a825c42270c5f29e1d6d46675a77 |
| BLAKE2b-256 | 34e60f5b063c19911d7d7f44695a870bf5d0b10ae924882672524964484f7789 |
File details
Details for the file receipt_ocr-0.3.1-py3-none-any.whl.
File metadata
- Download URL: receipt_ocr-0.3.1-py3-none-any.whl
- Upload date:
- Size: 12.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 318fc6af4797af050eeaf136530d37beb177b2f866357e9da4beeb05332004b3 |
| MD5 | 58610998c2e00d19385478678dcdf165 |
| BLAKE2b-256 | b1d9860d1f9fbc1d66630e2e23b142fd6b5c90e65e4d62b670276b164871272f |