
High-accuracy OCR library with intelligent auto-optimization


📝 Yomito OCR

High-accuracy OCR library with intelligent auto-optimization for Python, powered by Tesseract OCR.


🚀 Features

  • 🌍 Intelligent Language Detection — automatic with multiple precision levels
  • 🔤 Multi-Language Support — works with all Tesseract-compatible languages
  • 🎚 Flexible Modes – all, auto, auto_fast, or manual language selection
  • 🎯 High Accuracy — preprocessing & optimization for better recognition
  • Easy Integration — simple Python API, ready to use

📦 Installation

pip install yomito

⚙️ System Requirements

You need Tesseract OCR installed:

🐧 Ubuntu/Debian

sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-eng tesseract-ocr-rus tesseract-ocr-jpn

🍎 macOS

brew install tesseract
brew install tesseract-lang

🪟 Windows

👉 Download installer: UB Mannheim builds
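
Before calling Yomito, it can help to confirm that the Tesseract binary is on PATH and to see which language packs it reports. The sketch below uses only the Python standard library and the `tesseract --list-langs` command. The helper names `find_tesseract`, `parse_list_langs`, and `installed_languages` are illustrative and are not part of Yomito.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the path to the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    codes = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        codes.append(line)
    return sorted(codes)


def installed_languages():
    """Ask the local Tesseract install for its languages; [] if it is missing."""
    exe = find_tesseract()
    if exe is None:
        return []
    proc = subprocess.run([exe, "--list-langs"], capture_output=True, text=True)
    # Read both streams, since the list may appear on either one
    # depending on the Tesseract version.
    return parse_list_langs(proc.stdout + "\n" + proc.stderr)
```

Calling `installed_languages()` after installation should list the packs you installed, such as `eng`, `rus`, and `jpn` on the Ubuntu example above. An empty list suggests Tesseract is not on PATH.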


🧪 Quick Start

import yomito

# Auto language detection
text = yomito.recognize_text('image.png')
print(text)

# Recommended: use specific language
text = yomito.recognize_text('image.png', lang='eng')
print(text)

🔤 Language Modes

Yomito supports 4 modes:

  1. all – use all installed languages

    text = yomito.recognize_text('image.png', lang='all')
    
  2. auto – precise auto-detection

    text = yomito.recognize_text('image.png', lang='auto')
    
  3. auto_fast – quick but less precise

    text = yomito.recognize_text('image.png', lang='auto_fast')
    
  4. Specific – one or more languages explicitly

    text = yomito.recognize_text('image.png', lang='eng+rus+jpn')
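
When language codes come from user input or configuration, a small helper can assemble the `lang` string used in the calls above. This is a minimal sketch: `build_lang` is a hypothetical name, and the rule that a mode name cannot be mixed with language codes is an assumption of the sketch, not documented Yomito behavior.

```python
MODES = ("all", "auto", "auto_fast")


def build_lang(codes):
    """Turn a mode name or an iterable of language codes into a `lang` string."""
    if isinstance(codes, str):
        codes = codes.split("+")
    seen = []
    for code in codes:
        code = code.strip()
        # Drop blanks and duplicates while keeping the original order.
        if code and code not in seen:
            seen.append(code)
    if not seen:
        raise ValueError("at least one language code or mode is required")
    # Assumption: a mode name is only meaningful on its own.
    if len(seen) > 1 and any(code in MODES for code in seen):
        raise ValueError(f"mode names {MODES} cannot be combined with other values")
    return "+".join(seen)
```

For example, `build_lang(['eng', 'rus', 'eng', 'jpn'])` yields `'eng+rus+jpn'`, which can be passed as `lang` to `yomito.recognize_text`.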
    

⚡ Advanced Usage

Custom Language List for Auto-Detection

from yomito import YomitoOCR

ocr = YomitoOCR(auto_languages=['eng', 'rus', 'deu'])
text = ocr.recognize('image.png', lang='auto')

Detailed Recognition with Metadata

result = ocr.recognize_detailed('image.png', lang='all')
print(result.text, result.mean_conf, result.tess_args.lang)
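
The `mean_conf` field makes it possible to start with a cheap mode and retry with a slower one only when the result looks weak. The sketch below relies only on `recognize_detailed` and the `mean_conf` attribute shown above. The function name `recognize_with_fallback`, the mode order, and the threshold of 60 are assumptions, as is the idea that `mean_conf` is a number on Tesseract's usual 0 to 100 scale.

```python
def recognize_with_fallback(ocr, image, modes=("auto_fast", "auto", "all"), min_conf=60.0):
    """Try each mode in order; return the first result with mean_conf >= min_conf.

    If no mode reaches min_conf, return the result with the highest mean_conf.
    """
    best = None
    for mode in modes:
        result = ocr.recognize_detailed(image, lang=mode)
        # Remember the strongest result seen so far as a fallback.
        if best is None or result.mean_conf > best.mean_conf:
            best = result
        if result.mean_conf >= min_conf:
            return result
    if best is None:
        raise ValueError("modes must not be empty")
    return best
```

With a `YomitoOCR` instance this would be called as `recognize_with_fallback(ocr, 'image.png')`. Tune `min_conf` against your own images rather than treating 60 as a recommended value.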

Custom Tesseract Config

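from yomito import YomitoOCR
from yomito.ocr import PSM, OEM

# Point Yomito at a specific Tesseract binary and tessdata directory
ocr = YomitoOCR(
    tesseract_path='/usr/bin/tesseract',
    tessdata_path='/usr/share/tessdata',
    default_lang='eng'
)

# PSM selects Tesseract's page segmentation mode
text = ocr.recognize('image.png', lang='eng', psm=PSM.SINGLE_BLOCK)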

📖 API Reference

🔹 recognize_text(image, lang='auto', **kwargs)

Quick OCR function.

  • image: path, PIL Image, or numpy array
  • lang: 'all', 'auto', 'auto_fast', or specific codes
  • returns: str

🔹 YomitoOCR class

Full control API.

  • recognize(image, lang=None, **kwargs) → str
  • recognize_detailed(image, lang=None, **kwargs) → result object
  • get_available_languages() → list of languages

⚙️ Performance Tips

  1. ✅ Use all if unsure about the language
  2. ⚡ Use specific languages for speed
  3. 🚀 Use auto_fast for real-time use cases
  4. 🖼 Preprocess images (resize, contrast) for better OCR
  5. 📥 Install the required language packs
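
Tip 4 can be sketched with Pillow, which is assumed to be installed separately. The function below converts to grayscale, stretches contrast, and upscales. The names `scaled_size` and `preprocess_for_ocr` and the default scale factor of 2.0 are illustrative choices, not Yomito settings, and Yomito may already apply preprocessing of its own.

```python
def scaled_size(size, scale):
    """Return (width, height) multiplied by scale, never smaller than 1 pixel."""
    width, height = size
    return max(1, round(width * scale)), max(1, round(height * scale))


def preprocess_for_ocr(image, scale=2.0):
    """Grayscale, stretch contrast, and upscale a PIL image before OCR."""
    # Imported lazily so scaled_size() works without Pillow installed.
    from PIL import Image, ImageOps

    gray = ImageOps.grayscale(image)          # drop color information
    contrasted = ImageOps.autocontrast(gray)  # stretch the histogram to the full range
    return contrasted.resize(scaled_size(contrasted.size, scale), Image.LANCZOS)
```

Because the API reference lists PIL images as accepted input, the returned image could be passed directly as the `image` argument, for example `yomito.recognize_text(preprocess_for_ocr(Image.open('image.png')), lang='eng')`.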

🌐 Language Support

Yomito works with all installed Tesseract languages, e.g.:

  • eng – English
  • rus – Russian
  • deu – German
  • fra – French
  • jpn – Japanese
  • chi_sim / chi_tra – Chinese
  • ara – Arabic

Check available languages:

from yomito import YomitoOCR

ocr = YomitoOCR()
print(ocr.get_available_languages()) # ['ces', 'eng', 'jpn', 'osd', 'rus']
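
The list returned by `get_available_languages()` can be compared against the languages you intend to request, so that a missing pack is reported before OCR runs. In this sketch `missing_languages` and `apt_package_name` are hypothetical helpers. The package naming follows the Debian/Ubuntu `tesseract-ocr-<code>` pattern used in the installation section, with underscores replaced by hyphens.

```python
def missing_languages(lang, available):
    """Return the codes in an 'eng+rus' style string that are not in `available`."""
    available_set = set(available)
    requested = [code for code in lang.split("+") if code]
    return [code for code in requested if code not in available_set]


def apt_package_name(code):
    """Debian/Ubuntu package name for a Tesseract language code."""
    # Debian replaces underscores with hyphens, e.g. chi_sim -> tesseract-ocr-chi-sim.
    return "tesseract-ocr-" + code.replace("_", "-")
```

With the sample output above, `missing_languages('eng+deu+jpn', ocr.get_available_languages())` would return `['deu']`. Pass only explicit language codes, not mode names such as `all` or `auto`.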

⚠️ Error Handling

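import yomito

try:
    text = yomito.recognize_text('image.png', lang='eng')
except FileNotFoundError:
    # Raised when the Tesseract executable cannot be located
    print('Tesseract not found. Install it first.')
except ValueError as e:
    # Raised for problems with the input image
    print(f'Image error: {e}')
except Exception as e:
    # Any other failure during recognition
    print(f'OCR error: {e}')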
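
For batches of images, the same exception handling can be wrapped so that one bad file does not stop the rest. This is a sketch under assumptions: `recognize_many` is a hypothetical helper, and the recognizer is passed in as a callable (for example `yomito.recognize_text`) so the function can be tried without Tesseract installed.

```python
def recognize_many(paths, recognize, lang="eng"):
    """Run recognize(path, lang=lang) on each path; return (texts, errors) dicts."""
    texts, errors = {}, {}
    for path in paths:
        try:
            texts[path] = recognize(path, lang=lang)
        except FileNotFoundError:
            # The example above treats this as "Tesseract not found": stop early,
            # because every remaining file would fail the same way.
            raise
        except ValueError as exc:
            errors[path] = f"image error: {exc}"
        except Exception as exc:
            errors[path] = f"OCR error: {exc}"
    return texts, errors
```

A call such as `recognize_many(['a.png', 'b.png'], yomito.recognize_text)` returns the recognized text keyed by path, plus a second dictionary describing which files failed and why.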
