Best Bangla/Bengali OCR - Transformer-based text recognition for handwritten & printed Bangla. Works on Colab, Kaggle, local. Zero dependency conflicts.

Bangla OCR 🇧🇩

The Best Bangla/Bengali OCR Package - Recognize Bangla text from images with zero dependency conflicts!

✨ Why Bangla OCR?

  • 🔤 Accurate: State-of-the-art Transformer architecture for Bangla text recognition
  • 🚀 Zero Conflicts: Works with any PyTorch ≥ 1.9.0 and NumPy ≥ 1.19.0, so it installs alongside whatever versions your environment already ships
  • ☁️ Cloud Ready: Works instantly on Google Colab, Kaggle, AWS, GCP, Azure
  • 💻 Cross-Platform: Windows, Linux, macOS, Apple Silicon (M1/M2/M3/M4)
  • 🎯 Smart Fallbacks: Works even without OpenCV (uses PIL as fallback)
  • 📥 Multiple Inputs: Accepts file paths, PIL Images, numpy arrays, URLs, and bytes
  • Fast: GPU acceleration with CUDA and Apple MPS support

📦 Installation

pip install bangla-ocr

That's it! Works everywhere - Colab, Kaggle, local, Docker, cloud!

Optional: With OpenCV (slightly better preprocessing)

pip install "bangla-ocr[cv-headless]"  # Recommended for servers/Colab/Kaggle
pip install "bangla-ocr[cv]"           # For local development with GUI
pip install "bangla-ocr[full]"         # Everything included

🚀 Quick Start

from bangla_ocr import BanglaOCR

# Initialize (automatically downloads model on first run)
ocr = BanglaOCR()

# Recognize text from an image
text = ocr.recognize('bangla_image.jpg')
print(text)  # Output: বাংলা টেক্সট

📖 Usage Examples

From File Path

text = ocr.recognize('path/to/image.jpg')

From PIL Image

from PIL import Image
img = Image.open('image.png')
text = ocr.recognize(img)

From NumPy Array

import numpy as np
from PIL import Image

# Load the image with PIL, then convert it to a NumPy array
arr = np.array(Image.open('image.jpg'))
text = ocr.recognize(arr)

From URL

text = ocr.recognize('https://example.com/bangla_image.jpg')

From Bytes

with open('image.jpg', 'rb') as f:
    img_bytes = f.read()
text = ocr.recognize(img_bytes)
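
The input types above could be told apart along these lines. This is only an illustrative sketch: `classify_input` is a hypothetical helper, not part of the bangla-ocr API, and the package's real dispatch logic may differ.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_input(source):
    """Return a label for the kind of image input that was passed in.

    Hypothetical helper for illustration; not part of bangla-ocr.
    """
    if isinstance(source, (bytes, bytearray)):
        return "bytes"
    if isinstance(source, Path):
        return "path"
    if isinstance(source, str):
        # Treat http(s) strings as URLs and everything else as a file path.
        scheme = urlparse(source).scheme.lower()
        return "url" if scheme in ("http", "https") else "path"
    # Duck-typing keeps this sketch free of hard PIL/NumPy imports.
    # PIL images expose convert(); NumPy arrays do not.
    if hasattr(source, "convert") and hasattr(source, "size"):
        return "pil"
    if hasattr(source, "__array_interface__"):
        return "numpy"
    raise TypeError(f"Unsupported input type: {type(source).__name__}")
```

Checking the type up front like this gives a clear `TypeError` for unsupported inputs instead of a failure deep inside image loading.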

Batch Prediction

texts = ocr.predict_batch(
    ['img1.jpg', 'img2.jpg', 'img3.jpg'],
    show_progress=True
)
for text in texts:
    print(text)
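
This page does not say how `predict_batch` behaves when one image in the batch cannot be read. If you need per-image error reporting, a small wrapper around `recognize` is one option. The sketch below assumes only that `recognize` is a callable taking one input; `recognize_many` is a hypothetical name.

```python
def recognize_many(recognize, sources):
    """Run `recognize` on each source, collecting errors instead of raising.

    Returns a list of (source, text_or_None, error_or_None) tuples.
    Hypothetical helper for illustration; not part of bangla-ocr.
    """
    results = []
    for source in sources:
        try:
            results.append((source, recognize(source), None))
        except Exception as exc:  # keep going; report the failure per item
            results.append((source, None, exc))
    return results
```

With an initialized `ocr` object this would be called as `recognize_many(ocr.recognize, ['img1.jpg', 'img2.jpg'])`.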

🔧 Advanced Configuration

Check System Status

from bangla_ocr import check_dependencies, print_info

# Print formatted info
print_info()

# Get dependency status
status = check_dependencies()
print(status)
# {'torch': True, 'cuda': True, 'mps': False, 'opencv': True, ...}
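
For a rough idea of how such a status dictionary can be built, the sketch below probes for importable modules with the standard library. It is not the package's implementation, and `dependency_status` is a hypothetical name; the real `check_dependencies()` also reports hardware flags such as `cuda` and `mps`.

```python
from importlib.util import find_spec


def dependency_status(modules=("torch", "torchvision", "numpy", "PIL", "cv2")):
    """Map each top-level module name to True if it can be found, else False.

    Hypothetical helper for illustration; not part of bangla-ocr.
    """
    # find_spec locates a module without actually importing it.
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print(dependency_status())
```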

Device Selection

# Auto-detect best device (default)
ocr = BanglaOCR()

# Force specific device
ocr = BanglaOCR(device='cpu')   # CPU
ocr = BanglaOCR(device='cuda')  # NVIDIA GPU
ocr = BanglaOCR(device='mps')   # Apple Silicon
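
Auto-detection typically prefers CUDA, then MPS, then CPU. The sketch below shows that order using public PyTorch calls; `pick_device` is a hypothetical helper, and the priority order inside bangla-ocr is an assumption rather than something this page confirms.

```python
def pick_device():
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch reports.

    Hypothetical helper for illustration; not part of bangla-ocr.
    """
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch available, so nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps exists only in newer PyTorch releases.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The result can be passed straight through, for example `BanglaOCR(device=pick_device())`.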

Custom Model Parameters

ocr = BanglaOCR(
    model_path='custom_model.pth',
    tokenizer_path='tokenizer.pk',
    d_model=256,           # Hidden dimension
    nheads=4,              # Attention heads
    num_decoder_layers=4   # Decoder layers
)
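
When overriding these values, keep in mind that standard multi-head attention splits the hidden dimension evenly across heads, so `d_model` normally has to be divisible by `nheads`. A quick check before loading a custom model might look like this; `check_attention_config` is a hypothetical helper, and whether bangla-ocr validates this itself is not stated here.

```python
def check_attention_config(d_model, nheads):
    """Return the per-head dimension, or raise if d_model does not split evenly.

    Hypothetical helper for illustration; not part of bangla-ocr.
    """
    if d_model <= 0 or nheads <= 0:
        raise ValueError("d_model and nheads must be positive")
    if d_model % nheads != 0:
        raise ValueError(f"d_model={d_model} is not divisible by nheads={nheads}")
    return d_model // nheads


# The values shown in the example above give 64 dimensions per head.
print(check_attention_config(256, 4))
```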

Get Model Info

ocr = BanglaOCR()
print(ocr.info)
# {'version': '1.0.0', 'device': 'cuda', 'd_model': 256, ...}

🏗️ Architecture

Input Image → ResNet-18 (CNN Encoder) → Transformer Decoder → Bangla Text
  • Encoder: ResNet-18 backbone for visual feature extraction
  • Positional Encoding: Sine encodings for spatial information
  • Decoder: Transformer with multi-head attention
  • Output: Character-level Bangla vocabulary
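
To make the positional-encoding step concrete, here is a minimal sketch of the standard 1D sinusoidal encoding in pure Python. It is an assumption that the model uses this exact formulation; an image model may well use a 2D variant over the feature map, and `sine_positional_encoding` is a hypothetical name.

```python
import math


def sine_positional_encoding(num_positions, d_model):
    """Build a num_positions x d_model table of sinusoidal encodings.

    Even columns hold sin(pos / 10000^(i / d_model)) and the following odd
    column holds the matching cosine, where i is the even column index.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row.extend((math.sin(angle), math.cos(angle)))
        table.append(row)
    return table
```

Each position gets a distinct pattern of sines and cosines at different frequencies, which lets attention layers recover where a feature came from.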

📋 Requirements

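| Package     | Version  | Notes                        |
|-------------|----------|------------------------------|
| Python      | ≥ 3.8    | 3.9-3.12 recommended         |
| PyTorch     | ≥ 1.9.0  | Any version from 1.9.0 up    |
| torchvision | ≥ 0.10.0 | Any version from 0.10.0 up   |
| NumPy       | ≥ 1.19.0 | Any version from 1.19.0 up   |
| Pillow      | ≥ 8.0.0  | Always used                  |
| OpenCV      | Optional | PIL fallback available       |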
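
To compare an existing environment against these minimums, a standard-library check along the following lines may help. It is a sketch under stated assumptions: the distribution names in `MINIMUMS` are the usual PyPI names, the version parsing is deliberately simplistic, and `parse_version` and `check_minimums` are hypothetical helpers.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimums copied from the requirements table.
MINIMUMS = {
    "torch": "1.9.0",
    "torchvision": "0.10.0",
    "numpy": "1.19.0",
    "Pillow": "8.0.0",
}


def parse_version(ver):
    """Turn '2.1.0+cu121' into (2, 1, 0), ignoring local and pre-release tags."""
    numbers = []
    for piece in ver.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad so '1.9' compares equal to '1.9.0'
        numbers.append(0)
    return tuple(numbers)


def check_minimums(minimums=MINIMUMS):
    """Return {package: 'ok' | 'too old' | 'missing'} for each requirement."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if ok else "too old"
    return report


if __name__ == "__main__":
    print(check_minimums())
```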

🌍 Platform Compatibility

| Platform             | Status     | Acceleration |
|----------------------|------------|--------------|
| Google Colab         | ✅ Perfect | CUDA         |
| Kaggle               | ✅ Perfect | CUDA/TPU     |
| Windows 10/11        | ✅ Perfect | CUDA         |
| Linux (Ubuntu, etc.) | ✅ Perfect | CUDA         |
| macOS Intel          | ✅ Perfect | CPU          |
| macOS Apple Silicon  | ✅ Perfect | MPS          |
| Docker               | ✅ Perfect | CUDA         |
| AWS/GCP/Azure        | ✅ Perfect | CUDA         |

🏋️ Training

Install training dependencies:

pip install "bangla-ocr[train]"

See the training guide for details.

📚 Citation

@inproceedings{ghosh2023bangla,
  title={Towards Full-page Offline Bangla Handwritten Text Recognition},
  author={Ghosh, A.},
  booktitle={IEEE SILCON},
  year={2023}
}

📄 License

MIT License - see LICENSE for details.

🤝 Contributing

Contributions welcome! Please submit a Pull Request.
