Simple CLI wrapper for Nougat OCR with GPU acceleration support
Nougat OCR CLI
Simple, batteries-included CLI wrapper for Nougat OCR with GPU acceleration.
Features
- GPU acceleration (CUDA & Apple Metal)
- Simple CLI interface
- Batch processing support
- Clean Markdown output
- Automatic model downloading
- Python API with type hints
Installation
From PyPI
pip install nougat-ocr-cli
From GitHub
pip install git+https://github.com/rubenffuertes/nougat-ocr-cli.git
From source
git clone https://github.com/rubenffuertes/nougat-ocr-cli.git
cd nougat-ocr-cli
uv pip install -e .
CLI Usage
# Basic usage - outputs to current directory
nougat-ocr-cli document.pdf
# Specify output directory
nougat-ocr-cli document.pdf -o output/
# Process specific pages (zero-indexed)
nougat-ocr-cli document.pdf --pages 0-5
nougat-ocr-cli document.pdf --pages 1,3,5,7
# Use smaller model for faster processing
nougat-ocr-cli document.pdf --model 0.1.0-small
# Use full precision (FP32) for better accuracy
nougat-ocr-cli document.pdf --full-precision
# Set batch size manually
nougat-ocr-cli document.pdf --batch-size 4
CLI Options
| Option | Description |
|---|---|
| input | Input PDF file to process |
| -o, --output | Output directory (default: current directory) |
| --model | Model version (default: 0.1.0-base) |
| --batch-size | Batch size for processing (auto-detected) |
| --full-precision | Use FP32 instead of BF16 |
| --no-markdown | Disable markdown post-processing |
| --pages | Page range (e.g., '0-5' or '1,3,5') |
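The `--pages` formats in the table can be converted into the list of zero-indexed page numbers that the Python API's `pages` argument expects. The sketch below is illustrative only: `parse_pages` is a hypothetical helper, not the CLI's actual parser. It assumes ranges are inclusive at both ends and that ranges and single pages may be mixed in one spec.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a spec such as '0-5' or '1,3,5' into sorted, unique page indices."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            # Assumption: ranges are inclusive, so '0-5' covers six pages.
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)
```

Under these assumptions, `parse_pages("0-5")` yields pages 0 through 5 and `parse_pages("1,3,5,7")` yields the four listed pages. Malformed input such as `"a-b"` raises `ValueError` from `int()`.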
Python API
from nougat_wrapper import NougatOCR
from pathlib import Path
# Initialize (loads model to GPU automatically)
ocr = NougatOCR()
# Extract text from PDF
result = ocr.extract_text(Path("paper.pdf"))
print(f"Extracted {result.pages} pages")
print(f"Failed pages: {result.placeholder_pages}")
print(result.text) # Markdown output
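Since `result.text` is a plain Markdown string, saving it to disk takes only a few lines. This is a minimal sketch: `save_markdown` is a hypothetical helper, and naming the output file after the PDF stem is an assumption, not necessarily what the CLI does.

```python
from pathlib import Path


def save_markdown(text: str, pdf_path: Path, output_dir: Path) -> Path:
    """Write extracted Markdown to <output_dir>/<pdf stem>.md and return the path."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: one .md file per PDF, named after the PDF's stem.
    target = output_dir / f"{pdf_path.stem}.md"
    target.write_text(text, encoding="utf-8")
    return target


# Usage with the API shown above (requires the package and model weights):
#   result = ocr.extract_text(Path("paper.pdf"))
#   save_markdown(result.text, Path("paper.pdf"), Path("output"))
```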
Advanced Usage
from pathlib import Path
from nougat_wrapper import NougatOCR
ocr = NougatOCR(
    model_tag="0.1.0-small",  # Use smaller model
    batch_size=4,  # Process 4 pages at once
    full_precision=True,  # Use FP32 instead of BF16
)
# Only OCR pages 0, 1, 2 (zero-indexed)
pdf_path = Path("paper.pdf")
result = ocr.extract_text(pdf_path, pages=[0, 1, 2])
Requirements
- Python 3.11+
- GPU recommended (CUDA or Apple Metal)
- ~1.3 GB for model weights (auto-downloaded)
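Before downloading the model weights, it can help to check the requirements above. The sketch below assumes the wrapper runs on PyTorch (this README does not say so explicitly); `pick_device` and `python_is_supported` are hypothetical helpers and may not match the package's own device selection.

```python
import sys


def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch reports."""
    try:
        import torch  # Assumption: the wrapper runs on PyTorch.
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def python_is_supported(version: tuple[int, int] = sys.version_info[:2]) -> bool:
    """Check the Python 3.11+ requirement listed above."""
    return version >= (3, 11)
```

A result of `"cpu"` means no supported GPU was detected (or PyTorch is not installed), in which case processing will likely be much slower.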
License
MIT