pdfembed
CLI/TUI tool to run OCR locally and overlay a searchable text layer on PDFs. By default, a Textual-based TUI launches; use --cli for the classic CLI.
License: BSD-3-Clause (see LICENSE).
Quickstart
- TUI (default): `python -m pdfembed.cli` or `pdfembed`
- CLI: `python -m pdfembed.cli --cli --file sample.pdf --dpi 300`
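For scripted use, the CLI invocation above can be wrapped from Python. The following is a minimal sketch, assuming pdfembed is installed in the current interpreter's environment. `build_command` and `run_ocr` are hypothetical helper names, not part of pdfembed, and only the `--cli`, `--file`, and `--dpi` flags documented here are used.

```python
import subprocess
import sys


def build_command(pdf_paths, dpi=300):
    """Build the argv list for the documented CLI mode (hypothetical helper)."""
    if not pdf_paths:
        raise ValueError("at least one PDF path is required")
    # sys.executable keeps the call inside the current Python environment.
    return [
        sys.executable, "-m", "pdfembed.cli", "--cli",
        "--file", *map(str, pdf_paths),
        "--dpi", str(dpi),
    ]


def run_ocr(pdf_paths, dpi=300):
    """Run the CLI; raises subprocess.CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(pdf_paths, dpi), check=True)
```

Calling `run_ocr(["sample.pdf"])` would then be equivalent to the CLI line above with the default DPI.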
TUI Controls
- f: select PDF file(s) (opens a file dialog; multiple selection allowed)
- o: select output folder (opens a folder dialog; defaults to the first PDF's directory)
- v: toggle overlay visibility (debug)
- s: start OCR
- q: quit
- DPI is fixed to the default in the TUI; change it via the CLI `--dpi` option if needed.
While OCR is running, a "Processing... please wait" indicator is shown and other keys are ignored until completion.
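The "keys are ignored until completion" behavior can be pictured as a simple busy flag. The sketch below is a toy model in plain Python, not pdfembed's actual Textual code; the `KeyGate` class and its method names are hypothetical.

```python
class KeyGate:
    """Toy model of the documented behavior: keys are ignored while OCR runs."""

    def __init__(self):
        self.busy = False
        self.handled = []

    def press(self, key):
        """Return True if the key was handled, False if it was ignored."""
        if self.busy:
            return False  # "Processing... please wait": drop the key press
        self.handled.append(key)
        if key == "s":
            self.busy = True  # OCR starts; later keys are ignored
        return True

    def finish(self):
        """Called when OCR completes; key handling resumes."""
        self.busy = False
```

In this model, pressing `s` and then `q` leaves the `q` unhandled until `finish()` has been called.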
CLI Options (key ones)
- `--file <pdf1> [pdf2 ...]` or `--dir <folder>`: input PDFs
- `--output <dir>`: output directory (default: input location)
- `--dpi <int>`: render DPI (default 300)
- `--visible`: make overlay text visible (debug)
- `--font <path>`: TTF font for overlay text
- `--log-level <LEVEL>`: logging level (INFO by default)
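To make the option semantics concrete, here is a sketch of an `argparse` parser that mirrors the options listed above. It is an illustration only and not pdfembed's actual parser: `make_parser` is a hypothetical name, and treating `--file` and `--dir` as mutually exclusive is an assumption based on the "or" in the list.

```python
import argparse


def make_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="pdfembed")
    parser.add_argument("--cli", action="store_true",
                        help="use the classic CLI instead of the TUI")
    # Assumption: the two input sources cannot be combined.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("--file", nargs="+", metavar="PDF",
                        help="one or more input PDF files")
    source.add_argument("--dir", metavar="FOLDER",
                        help="folder containing input PDFs")
    parser.add_argument("--output", metavar="DIR",
                        help="output directory (default: input location)")
    parser.add_argument("--dpi", type=int, default=300, help="render DPI")
    parser.add_argument("--visible", action="store_true",
                        help="make overlay text visible (debug)")
    parser.add_argument("--font", metavar="PATH",
                        help="TTF font for overlay text")
    parser.add_argument("--log-level", default="INFO", metavar="LEVEL",
                        help="logging level")
    return parser
```

With this sketch, `make_parser().parse_args(["--cli", "--file", "a.pdf", "b.pdf"])` yields `file == ["a.pdf", "b.pdf"]`, `dpi == 300`, and `log_level == "INFO"`.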
Dependencies
- Textual (TUI)
- tkinter (file dialogs, stdlib)
- onnxocr / pypdfium2 / pypdf / reportlab / opencv-python / numpy
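A quick way to see which of these dependencies are importable in a given environment is sketched below. The import names are assumptions: `opencv-python` is imported as `cv2`, and `onnxocr` is assumed to be importable under its distribution name. `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names are assumptions; see the note above.
MODULES = [
    "textual", "tkinter", "onnxocr", "pypdfium2",
    "pypdf", "reportlab", "cv2", "numpy",
]


def missing_modules(names=MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

Note that `tkinter` ships with the standard library but is packaged separately on some Linux distributions, so it can still show up as missing.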
Download files
File details
Details for the file pdfembed-0.1.5.tar.gz.
File metadata
- Download URL: pdfembed-0.1.5.tar.gz
- Upload date:
- Size: 4.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f7b0cf09343b879c8d3eaf7eddfef5107c41c95f287c97377bb2e3237d931568 |
| MD5 | 3ccb7f00c3eb43e6420e9539f1e12b0a |
| BLAKE2b-256 | fd14cb7d138737e4b6bbcfecbfebf9bba5f938007624606ae2a44b65db9b32bc |
File details
Details for the file pdfembed-0.1.5-py3-none-any.whl.
File metadata
- Download URL: pdfembed-0.1.5-py3-none-any.whl
- Upload date:
- Size: 4.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a1a256898793b17aed21f436ca41a3c35e1dac73adb1c1bcda8673802e4f878d |
| MD5 | 7df79c65d86fabb9903e81f4cdfacf9c |
| BLAKE2b-256 | fcd01f1f316c0fc58ecfe6ac4abf3a0ecbb01dbda687ba9bdfbc79d516b0f27e |
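The SHA256 digests listed for each file can be used to check a downloaded copy. The following is a minimal sketch using only the standard library; `sha256_of` and `matches` are hypothetical helper names, and the local file path is whatever location the file was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read until fh.read() returns b"" so large files are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest against a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("pdfembed-0.1.5.tar.gz", "f7b0cf09343b879c8d3eaf7eddfef5107c41c95f287c97377bb2e3237d931568")` should return `True` for an intact copy of the source distribution.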