Nougat OCR CLI
Simple, batteries-included CLI wrapper for Nougat OCR with GPU acceleration.
Features
- GPU acceleration (CUDA & Apple Metal)
- Simple command-line interface
- Batch processing support
- Clean Markdown output
- Automatic model downloading
- Python API with type hints
Installation
From PyPI
pip install nougat-ocr-cli
From GitHub
pip install git+https://github.com/rubenffuertes/nougat-ocr-cli.git
From source
git clone https://github.com/rubenffuertes/nougat-ocr-cli.git
cd nougat-ocr-cli
uv pip install -e .
CLI Usage
# Basic usage - outputs to current directory
nougat-ocr-cli document.pdf
# Specify output directory
nougat-ocr-cli document.pdf -o output/
# Process specific pages (zero-indexed)
nougat-ocr-cli document.pdf --pages 0-5
nougat-ocr-cli document.pdf --pages 1,3,5,7
# Use smaller model for faster processing
nougat-ocr-cli document.pdf --model 0.1.0-small
# Use full precision (FP32) for better accuracy
nougat-ocr-cli document.pdf --full-precision
# Set batch size manually
nougat-ocr-cli document.pdf --batch-size 4
CLI Options
| Option | Description |
|---|---|
| input | Input PDF file to process |
| -o, --output | Output directory (default: current directory) |
| --model | Model version (default: 0.1.0-base) |
| --batch-size | Batch size for processing (auto-detected) |
| --full-precision | Use FP32 instead of BF16 |
| --no-markdown | Disable markdown post-processing |
| --pages | Page range (e.g., '0-5' or '1,3,5') |
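To make the `--pages` syntax concrete, here is a minimal sketch of how a spec such as '0-5' or '1,3,5' could be expanded into zero-indexed page numbers. The helper `parse_pages` is hypothetical and is not taken from this package's source. It also assumes that ranges include both endpoints, which the option table does not state.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a page spec such as '0-5' or '1,3,5' into sorted, unique page indices."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as '1,,3'
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            # Assumption: both ends of the range are included.
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


print(parse_pages("0-5"))      # [0, 1, 2, 3, 4, 5]
print(parse_pages("1,3,5,7"))  # [1, 3, 5, 7]
```

A list produced this way has the same shape as the `pages=[0, 1, 2]` argument accepted by `extract_text` in the Python API.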
Python API
from nougat_wrapper import NougatOCR
from pathlib import Path
# Initialize (loads model to GPU automatically)
ocr = NougatOCR()
# Extract text from PDF
result = ocr.extract_text(Path("paper.pdf"))
print(f"Extracted {result.pages} pages")
print(f"Failed pages: {result.placeholder_pages}")
print(result.text) # Markdown output
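The API above handles one PDF per call. As a rough sketch, several files could be converted by looping over a directory and writing one Markdown file per PDF. The helper `convert_directory` and the directory names are hypothetical. The only things it relies on are the `extract_text(path)` call and the `result.text` attribute shown above.

```python
from pathlib import Path


def convert_directory(ocr, src_dir: Path, out_dir: Path) -> list[Path]:
    """Run ocr.extract_text on every PDF in src_dir and write one .md file per PDF."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in sorted(src_dir.glob("*.pdf")):
        result = ocr.extract_text(pdf)      # documented call
        target = out_dir / f"{pdf.stem}.md"
        target.write_text(result.text, encoding="utf-8")
        written.append(target)
    return written
```

With the package installed, this would be called as `convert_directory(NougatOCR(), Path("papers"), Path("output"))`. Creating the `NougatOCR` instance once and reusing it avoids reloading the model for every file.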
Advanced Usage
from pathlib import Path
from nougat_wrapper import NougatOCR

ocr = NougatOCR(
    model_tag="0.1.0-small",  # Use smaller model
    batch_size=4,             # Process 4 pages at once
    full_precision=True,      # Use FP32 instead of BF16
)
# Only OCR pages 0, 1, 2 (zero-indexed)
pdf_path = Path("paper.pdf")
result = ocr.extract_text(pdf_path, pages=[0, 1, 2])
Requirements
- Python 3.11 only (3.12+ not supported due to nougat-ocr dependencies)
- GPU recommended (CUDA or Apple Metal)
- ~1.3 GB for model weights (auto-downloaded)
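The requirements above can be checked before installing with a short script. This is a sketch under assumptions: the helpers `python_supported` and `pick_device` are hypothetical, and the device check assumes PyTorch is the backend. It is not necessarily how the wrapper selects its device.

```python
import sys


def python_supported(version: tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True only for Python 3.11, the single version listed as supported."""
    return tuple(version) == (3, 11)


def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed; nothing to accelerate with
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print("Python OK:", python_supported())
print("Device:", pick_device())
```

A result of `cpu` only means no GPU was detected. The GPU is listed as recommended rather than required.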
License
MIT