Self-contained Swiss Army knife for document and media manipulation
Cropsmith
A cross-platform Swiss Army knife for document and media manipulation. Capture web page regions as PDF, compress PDFs and videos, combine PDFs, convert PDF to DOCX, and run OCR -- all from a single CLI command available system-wide.
Features
- Web Capture -- screenshot a web page within a user-defined bounding box and export as PDF
- Page Capture -- interactively draw a box over any reader, auto-turn pages, and OCR them into a searchable PDF
- PDF Compression -- reduce PDF file size while preserving quality
- PDF Combine -- merge multiple PDFs into a single file
- PDF to Word -- convert PDF files to editable Word (.docx) documents
- Video Compression -- compress video files using configurable quality settings
- Text Extraction (OCR) -- extract text from images or scanned PDFs
- Right-click integration -- run the file tools straight from the file manager's right-click menu (macOS + Windows)
Commands at a glance
Cropsmith uses plain-language command names. Older/alternative names still work as aliases.
| Command | Aliases | What it does |
|---|---|---|
| web-to-pdf | capture | Save a web page region as a PDF |
| capture-pages | scan, page-turner | Capture a screen region across pages into a (searchable) PDF |
| shrink-pdf | compress-pdf | Compress a PDF |
| merge-pdf | combine, merge | Combine PDFs into one |
| pdf-to-word | pdf2docx, pdf-to-docx | Convert a PDF to Word (.docx) |
| shrink-video | compress-video | Compress a video |
| extract-text | ocr | Pull text out of an image or scanned PDF |
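As a rough illustration of how aliases can map onto the plain-language names, here is a minimal sketch. The `ALIASES` dict and `resolve_command` helper are hypothetical names made up for this example; the real cli.py may organise alias handling differently.

```python
# Hypothetical alias table built from the "Commands at a glance" table.
# This is an illustration only, not Cropsmith's actual cli.py.
ALIASES = {
    "capture": "web-to-pdf",
    "scan": "capture-pages",
    "page-turner": "capture-pages",
    "compress-pdf": "shrink-pdf",
    "combine": "merge-pdf",
    "merge": "merge-pdf",
    "pdf2docx": "pdf-to-word",
    "pdf-to-docx": "pdf-to-word",
    "compress-video": "shrink-video",
    "ocr": "extract-text",
}
CANONICAL = frozenset(ALIASES.values())


def resolve_command(name: str) -> str:
    """Return the canonical command name for a canonical name or an alias."""
    if name in CANONICAL:
        return name
    try:
        return ALIASES[name]
    except KeyError:
        raise ValueError(f"unknown command: {name!r}") from None
```

With a table like this, `resolve_command("ocr")` and `resolve_command("extract-text")` both lead to the same handler, which is why the older names keep working.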
Requirements
- Python 3.11+
- pip (comes with Python)
- Platform: macOS, Linux, or Windows (WSL recommended on Windows for full feature parity)
System dependencies
None. Cropsmith is self-contained -- OCR, PDF compression and video
compression all run from bundled Python wheels (RapidOCR, PyMuPDF, and an ffmpeg
binary shipped inside imageio-ffmpeg). No ffmpeg, tesseract or
ghostscript install required.
The only exception is web-to-pdf, which downloads a Chromium browser once via
playwright install chromium the first time you use it.
Installation
Download the app (no setup)
The easiest way -- no Python, no terminal, nothing to install. Grab the build for your OS from the latest release:
| OS | Download |
|---|---|
| macOS | Cropsmith-macos.zip -- unzip and run |
| Windows | Cropsmith-windows.zip -- unzip and run |
| Linux | Cropsmith-linux.zip -- unzip and run |
These bundles include everything (Python runtime + all engines). Nothing else to
install. Polished installers (.dmg / .exe / .AppImage) and right-click menu
integration are on the roadmap.
For developers (pipx / pip)
pipx install cropsmith # works on macOS/Linux/Windows
See packaging/ for details.
From source
1. Clone the repo
git clone https://github.com/youruser/cropsmith.git
cd cropsmith
2. Create and activate a virtual environment
python3 -m venv .venv
# macOS / Linux
source .venv/bin/activate
# Windows (PowerShell)
.venv\Scripts\Activate.ps1
3. Install the package in editable mode
This step registers the cropsmith command inside the virtual environment, so it is callable from any directory while the venv is active (or while the venv's bin/ directory is on your PATH):
pip install -e .
4. Install Playwright browser
playwright install chromium
5. Verify
cropsmith --help
Usage
Web capture to PDF
Capture a bounding box region of a web page and save as PDF.
cropsmith web-to-pdf --url "https://example.com" --box 100,200,800,600 --output capture.pdf
--box format is x1,y1,x2,y2 in pixels relative to the rendered page.
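To make the x1,y1,x2,y2 convention concrete, the sketch below parses a --box value and converts it to an origin-plus-size rectangle. Playwright's screenshot API takes its `clip` option in that x/y/width/height shape; whether Cropsmith uses `clip` internally is an assumption, and `parse_box` and `box_to_clip` are hypothetical helper names.

```python
def parse_box(spec: str) -> tuple[int, int, int, int]:
    """Parse 'x1,y1,x2,y2' (pixels) into a tuple of four ints."""
    parts = spec.split(",")
    if len(parts) != 4:
        raise ValueError(f"expected x1,y1,x2,y2 but got {spec!r}")
    # int() raises ValueError on non-numeric input, which we let propagate.
    x1, y1, x2, y2 = (int(part.strip()) for part in parts)
    if x2 <= x1 or y2 <= y1:
        raise ValueError("x2 must exceed x1 and y2 must exceed y1")
    return x1, y1, x2, y2


def box_to_clip(box: tuple[int, int, int, int]) -> dict[str, int]:
    """Convert corner coordinates into an x/y/width/height rectangle."""
    x1, y1, x2, y2 = box
    return {"x": x1, "y": y1, "width": x2 - x1, "height": y2 - y1}
```

For the example command above, `--box 100,200,800,600` describes a region 700 pixels wide and 400 pixels tall whose top-left corner sits at (100, 200).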
Capture pages from your screen (interactive)
Draw a box over any on-screen reader (browser, Kindle, PDF viewer), then Cropsmith captures each page -- auto-pressing a key to turn pages -- and OCRs them into one searchable PDF.
cropsmith capture-pages --output book.pdf
It will let you drag a selection box, then prompt for which key turns the page, how many pages, and the interval. Or pass everything up front:
cropsmith capture-pages -o book.pdf --box 200,150,900,1200 --key right --pages 40 --interval 1.2
The selection overlay is dismissed before capture begins, so it never appears in the output. On macOS, grant your terminal Screen Recording and Accessibility permissions (System Settings > Privacy & Security) the first time.
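The capture loop itself can be pictured roughly as follows. This is a sketch under stated assumptions, not Cropsmith's implementation: `grab` and `turn_page` are hypothetical callables that a real run might back with mss (screen grab) and pynput (key press), both of which appear in the project's dependencies, and the interval is assumed to be in seconds.

```python
import time
from typing import Callable, List


def capture_pages(
    grab: Callable[[], bytes],
    turn_page: Callable[[], None],
    pages: int,
    interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bytes]:
    """Grab `pages` screenshots, turning the page between consecutive grabs."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    shots: List[bytes] = []
    for index in range(pages):
        shots.append(grab())
        # No page turn after the final capture.
        if index < pages - 1:
            turn_page()
            sleep(interval)  # give the reader time to render the next page
    return shots
```

Passing the screen grabber and key press in as callables keeps the loop easy to try out with fakes before pointing it at a real reader.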
Shrink (compress) a PDF
cropsmith shrink-pdf input.pdf --output compressed.pdf --level screen
Levels: screen (smallest), ebook, printer, prepress (highest quality)
Merge PDFs
cropsmith merge-pdf file1.pdf file2.pdf file3.pdf --output combined.pdf
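For readers curious what a merge might look like in code, here is a minimal sketch using pypdf, which is listed in the project's dependencies. Whether Cropsmith calls `PdfWriter.append` exactly like this is an assumption; `pdfs_in_folder` and `merge_pdfs` are hypothetical names, and the folder helper mirrors the right-click "merge every PDF in the folder" behaviour described elsewhere in this README.

```python
from pathlib import Path
from typing import Iterable, List


def pdfs_in_folder(folder: Path, output_name: str = "merged.pdf") -> List[Path]:
    """List the PDFs in a folder in sorted order, skipping a previous merge output."""
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file()
        and path.suffix.lower() == ".pdf"
        and path.name != output_name
    )


def merge_pdfs(inputs: Iterable[Path], output: Path) -> None:
    """Concatenate the input PDFs, in the order given, into one file."""
    from pypdf import PdfWriter  # third-party; imported lazily on purpose

    writer = PdfWriter()
    for path in inputs:
        writer.append(str(path))
    with open(output, "wb") as handle:
        writer.write(handle)
    writer.close()
```

Skipping the output name avoids folding an earlier merged.pdf back into the next merge.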
PDF to Word
cropsmith pdf-to-word input.pdf --output output.docx
Shrink (compress) a video
cropsmith shrink-video input.mp4 --output compressed.mp4 --crf 28
CRF range: 18 (high quality) to 51 (smallest file). Default: 28.
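The CRF setting maps naturally onto an ffmpeg invocation. The sketch below only builds the argument list and enforces the 18-51 range stated above. The choice of the libx264 encoder and of copying the audio stream are assumptions, and `build_ffmpeg_args` is a hypothetical helper; in a real run the executable path could come from `imageio_ffmpeg.get_ffmpeg_exe()`.

```python
from typing import List


def build_ffmpeg_args(ffmpeg_exe: str, src: str, dst: str, crf: int = 28) -> List[str]:
    """Build an ffmpeg command line that re-encodes video at the given CRF."""
    if not 18 <= crf <= 51:
        raise ValueError("crf must be between 18 and 51")
    return [
        ffmpeg_exe,
        "-y",               # overwrite the output file if it already exists
        "-i", src,
        "-c:v", "libx264",  # assumed encoder; Cropsmith's choice may differ
        "-crf", str(crf),   # higher CRF means a smaller file and lower quality
        "-c:a", "copy",     # assumed: leave the audio stream untouched
        dst,
    ]
```

The resulting list can be handed to `subprocess.run(args, check=True)` to perform the encode.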
Extract text (OCR)
Extract text from an image or scanned PDF:
cropsmith extract-text input.png --output extracted.txt
cropsmith extract-text scanned.pdf --output extracted.txt
Right-click menu (macOS & Windows)
Add Cropsmith's file tools to your file manager's right-click menu:
cropsmith install-menu # cropsmith uninstall-menu to remove
Then right-click a file:
| File | Actions |
|---|---|
| PDF | Shrink PDF, PDF → Word, Extract Text |
| Image | Extract Text |
| Video | Compress Video |
| Folder | Merge PDFs (every PDF in the folder → merged.pdf) |
Output is written next to the source (foo-min.pdf, foo.docx, foo.txt,
foo-compressed.mp4). Actions are scoped by file type, so PDF tools only show
on PDFs.
- macOS -- Finder > Quick Actions. Shows a notification when done. (Merge: select multiple PDFs.)
- Windows -- right-click a file directly (Windows 10) or under "Show more options" (Windows 11). Per-user, no admin. (Merge: right-click a folder, or inside it, to merge every PDF in that folder.)
- Linux -- on the roadmap (varies by desktop environment).
The interactive tools (web-to-pdf, capture-pages) stay CLI-only.
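The "written next to the source" naming used by the right-click actions can be sketched as a small path helper. `output_path` is a hypothetical function, and the action keys reuse the CLI command names purely for illustration; the folder merge case is left out because it takes a directory rather than a file.

```python
from pathlib import Path


def output_path(source: Path, action: str) -> Path:
    """Derive the output file that sits next to `source` for a given action."""
    if action == "shrink-pdf":
        return source.with_name(f"{source.stem}-min.pdf")
    if action == "pdf-to-word":
        return source.with_suffix(".docx")
    if action == "extract-text":
        return source.with_suffix(".txt")
    if action == "shrink-video":
        # Assumes the compressed video is always written as .mp4.
        return source.with_name(f"{source.stem}-compressed.mp4")
    raise ValueError(f"unsupported action: {action!r}")
```

Because every result is derived from the source path, the output always lands in the same folder as the file that was right-clicked.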
How the global command works
Cropsmith uses a pyproject.toml entry point to register the CLI command at install time:
[project.scripts]
cropsmith = "cropsmith.cli:main"
When you run pip install -e . inside your virtual environment, pip writes a cropsmith executable into the venv's bin/ (or Scripts/ on Windows) directory. If that venv is active or its bin path is on your PATH, you can call cropsmith from any directory.
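The `"cropsmith.cli:main"` string follows the usual `module:attribute` entry point form: the generated executable imports the module and calls the named object. The sketch below mimics that lookup with the standard library. `load_entry_point` is a hypothetical helper for illustration, not the code pip actually writes.

```python
import importlib
from typing import Any


def load_entry_point(spec: str) -> Any:
    """Resolve a 'package.module:attr' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    if not module_name or not attr_path:
        raise ValueError(f"expected 'module:attr' but got {spec!r}")
    target = importlib.import_module(module_name)
    # Dotted attribute paths such as 'Class.method' are walked one step at a time.
    for attr in attr_path.split("."):
        target = getattr(target, attr)
    return target
```

In an environment where the package is installed, `load_entry_point("cropsmith.cli:main")` would return the same `main` callable that the `cropsmith` executable invokes.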
For a permanent global install without activating the venv each time, use pipx:
pipx install .
pipx manages an isolated environment automatically and exposes the command from a directory on your PATH (run pipx ensurepath once if that directory is not on it yet).
Project structure
cropsmith/
  cropsmith/
    __init__.py
    cli.py          # Entry point, command parsing, friendly aliases
    capture.py      # Web capture via Playwright
    pdf_tools.py    # Compress, combine, convert PDF
    video_tools.py  # Video compression via the bundled ffmpeg
    ocr.py          # OCR via RapidOCR (+ PyMuPDF for PDFs)
  pyproject.toml
  README.md
pyproject.toml (starter)
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "cropsmith"
version = "0.1.0"
description = "Swiss Army knife for document and media manipulation"
requires-python = ">=3.11"
dependencies = [
"playwright",
"pypdf",
"pdf2docx",
"rapidocr-onnxruntime",
"imageio-ffmpeg",
"Pillow",
"PyMuPDF",
"click",
"mss",
"pynput",
]
[project.scripts]
cropsmith = "cropsmith.cli:main"
Recommended global install workflow (any platform)
# Install pipx if you don't have it
pip install pipx
pipx ensurepath
# Install cropsmith globally
cd /path/to/cropsmith
pipx install .
# Now callable from anywhere, no venv activation needed
cropsmith --help
Platform notes
| Platform | Status | Notes |
|---|---|---|
| macOS | Full support | No system deps -- everything is bundled |
| Linux | Full support | No system deps -- everything is bundled |
| Windows | Full support | No system deps -- everything is bundled |
License
MIT