Local-first PDF compressor: shrink PDFs below a target size without sending them to any third-party service.
slimpdf
Get a PDF under a size limit (say 3 MB for an upload form) without sending it to some website. slimpdf is a command-line tool and Python library that compresses PDFs entirely on your own machine — no uploads, no account, no network calls.
pip install slimpdf
slimpdf compress big-scan.pdf --target 3mb -o small.pdf
Why this exists
I kept hitting the same wall at work: a scanned hospital bill or KYC document would be 15–20 MB, the upload form capped at 3 MB, and the only quick fix was to drop the file into an online compressor. For medical and identity documents, handing them to a random website is exactly what you don't want to do.
Command-line options exist, but they have rough edges. Ghostscript's presets are
a blunt instrument — /screen either over-compresses into mush or doesn't get
small enough, and there's no "just get it under 3 MB" mode. It's also AGPL, so
you can't bundle it into a product. slimpdf is the tool I wanted instead:
- It runs locally. Your files never leave the machine. No telemetry either.
- It aims at a size, not a vague preset. Tell it --target 3mb and it works out how much compression is actually needed, so it doesn't throw away quality it didn't have to.
- It won't quietly wreck a file. Every result is re-opened and checked (page count, text still extractable) before it's kept, and if it can't beat the original it leaves the file alone instead of writing a bigger one.
- MIT, with permissive dependencies only. No AGPL anywhere, so you can drop it into closed-source software without a lawyer conversation.
On a real set of scanned documents it held its own against Ghostscript — similar size reduction at noticeably better visual fidelity. Numbers and method are in docs/benchmark-results.md.
Install
pip install slimpdf
# or, from source with uv:
uv sync --extra dev
CLI
# Compress below 3 MB using the claim-upload preset
slimpdf compress input.pdf --target 3mb -o output.pdf --report report.json
# Inspect a PDF (read-only): pages, images, text layer, encryption
slimpdf inspect input.pdf --json
# Batch a folder and emit a benchmark table
slimpdf batch ./samples --target 3mb --out ./compressed --csv bench.csv --md bench.md
# Compare slimpdf against other engines on size AND quality (SSIM)
pip install "slimpdf[benchmark]" # adds SSIM scoring
slimpdf compare ./samples --out ./out --min-mb 3 --csv compare.csv
compare picks up Ghostscript and mutool if they're on your PATH and scores
each engine on size, visual fidelity (SSIM) and text retention, so you're not
just trusting that smaller is better. There's a real run in
docs/benchmark-results.md.
Exit code is 2 when a target was requested but not achieved — handy in scripts.
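From a Python script, the exit-code contract can be wrapped roughly like this. It is a minimal sketch: only exit code 2 is documented here, so treating 0 as success and any other code as a hard failure is an assumption, and `compress_or_report` and its `runner` parameter are hypothetical names used for illustration.

```python
import subprocess
from typing import Callable, List

TARGET_MISSED = 2  # documented: a target was requested but not achieved


def compress_or_report(
    src: str,
    dst: str,
    target: str = "3mb",
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> str:
    """Run `slimpdf compress` and map its exit code to a status string."""
    cmd: List[str] = ["slimpdf", "compress", src, "--target", target, "-o", dst]
    proc = runner(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        # Assumption: 0 means the file was written and the target was met.
        return "ok"
    if proc.returncode == TARGET_MISSED:
        return "target-missed"
    # Assumption: any other non-zero code is an unexpected failure.
    raise RuntimeError(f"slimpdf failed ({proc.returncode}): {proc.stderr}")
```

The `runner` argument exists only so the function can be exercised without slimpdf installed; in normal use the default `subprocess.run` is what gets called.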
Presets
| preset | target | max DPI | quality (start→min) | rasterize |
|---|---|---|---|---|
| claim-upload | 3 MB | 150 | 75 → 55 | off |
| screen | none | 120 | 70 → 45 | off |
| archive | none | 200 | 85 → 70 | off |
Override any knob: --max-dpi, --quality, --min-quality, --target,
--allow-rasterize, --keep-metadata, --password, --force-output.
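One way to picture how a preset and the override flags combine: the preset supplies defaults, and any knob passed explicitly wins. The numbers below are copied from the presets table; the `Preset` dataclass, the `PRESETS` dict and `resolve` are hypothetical illustrations, not slimpdf's internals.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Preset:
    target_bytes: Optional[int]  # None means the preset sets no size target
    max_dpi: int
    quality: int       # starting JPEG quality
    min_quality: int   # floor the quality search will not go below
    allow_rasterize: bool = False


# Assumption: "3 MB" is taken as 3_000_000 bytes, matching the target_bytes
# value used in the Python API example.
PRESETS = {
    "claim-upload": Preset(3_000_000, 150, 75, 55),
    "screen": Preset(None, 120, 70, 45),
    "archive": Preset(None, 200, 85, 70),
}


def resolve(name: str, **overrides) -> Preset:
    """Start from a preset and apply only the overrides that were given."""
    given = {k: v for k, v in overrides.items() if v is not None}
    return replace(PRESETS[name], **given)


# "screen" has no target of its own; add one and keep its other defaults.
opts = resolve("screen", target_bytes=3_000_000, max_dpi=None)
print(opts)
```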
Python API
from slimpdf import compress, inspect, CompressOptions

info = inspect("input.pdf")
print(info.page_count, info.images_found, info.text_layer_detected)

result = compress(
    "input.pdf", "output.pdf",
    CompressOptions(preset="claim-upload", target_bytes=3_000_000),
)
print(result.compressed_size_bytes, result.target_achieved, result.mode_used)
How it works
There are three compression passes. slimpdf tries them least-destructive first and keeps the gentlest one that gets you under the target:
- Structural: recompress streams, pack objects, drop junk metadata. Fully lossless; sometimes enough on its own for a bloated-but-not-scanned PDF.
- Image rewrite: the main worker. Downsamples and re-encodes the oversized embedded images, binary-searching JPEG quality and DPI until it hits the target. Text, forms and vector graphics are left intact.
- Raster fallback: off unless you pass --allow-rasterize. Flattens each page to an image and rebuilds the PDF. It'll hit almost any size, but you lose selectable text, form fields and signatures, so it's a last resort.
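The cascade and the quality search can be sketched as follows. This illustrates the idea rather than slimpdf's actual code: `highest_quality_under` and `gentlest_pass` are hypothetical names, each pass is modelled as a callable that returns an output size in bytes, and the search assumes output size never shrinks as JPEG quality rises. The DPI dimension of the search is left out to keep the example short.

```python
from typing import Callable, Optional, Sequence, Tuple


def highest_quality_under(
    size_at: Callable[[int], int], target_bytes: int, min_q: int, start_q: int
) -> Optional[int]:
    """Binary-search the highest quality in [min_q, start_q] whose output fits.

    size_at(q) returns the encoded size at quality q and is assumed to be
    non-decreasing in q. Returns None if even min_q is too large.
    """
    lo, hi, best = min_q, start_q, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if size_at(mid) <= target_bytes:
            best = mid      # fits: remember it and try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1    # too big: try a lower quality
    return best


def gentlest_pass(
    passes: Sequence[Tuple[str, Callable[[], int]]], target_bytes: int
) -> Optional[str]:
    """Run passes least-destructive first; stop at the first one that fits."""
    for name, run in passes:
        if run() <= target_bytes:
            return name
    return None


# Toy size model: 1 MB of non-image data plus 30 kB per quality point.
def fake_size(q: int) -> int:
    return 1_000_000 + 30_000 * q


print(highest_quality_under(fake_size, 3_000_000, min_q=55, start_q=75))  # 66
```

Searching for the highest quality that still fits, rather than the first that fits, is what keeps the result from losing more quality than the target requires.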
Demo
The animation at the top (docs/demo.gif) is generated from a real slimpdf run
on a synthetic file — regenerate it any time with:
uv run python docs/make_demo.py # writes docs/demo.gif
(A VHS tape, docs/demo.tape, is also
included if you prefer recording a live terminal.)
Notes
It really is offline. slimpdf makes no network calls and writes nothing outside the paths you give it. Safe to run on documents you wouldn't upload.
What if it can't hit the target? It returns the smallest valid result it
found and reports target_achieved: false (CLI exit code 2). It never ships a
file larger than the input unless you pass --force-output.
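The fallback rule reads naturally as a small selection function. This is a sketch under assumptions: `Candidate`, `choose_output` and the `valid` flag (standing in for the re-open checks) are hypothetical names, and only the target_achieved result and the --force-output behaviour come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    mode: str
    size_bytes: int
    valid: bool  # stands in for the re-open checks (page count, text)


def choose_output(
    original_size: int,
    candidates: List[Candidate],  # ordered least-destructive first
    target_bytes: int,
    force_output: bool = False,
) -> Tuple[Optional[Candidate], bool]:
    """Return (candidate to write, or None to leave the file alone; target_achieved)."""
    usable = [c for c in candidates if c.valid]
    fitting = [c for c in usable if c.size_bytes <= target_bytes]
    if fitting:
        chosen = fitting[0]  # gentlest pass that meets the target
    elif usable:
        chosen = min(usable, key=lambda c: c.size_bytes)  # smallest valid result
    else:
        # Assumption: with nothing written, the input itself is judged
        # against the target.
        return None, original_size <= target_bytes
    if chosen.size_bytes >= original_size and not force_output:
        return None, original_size <= target_bytes
    return chosen, chosen.size_bytes <= target_bytes
```

For example, with a 10 MB input, a 3 MB target and valid candidates of 9 MB and 4 MB, the function returns the 4 MB candidate with target_achieved set to False, which corresponds to CLI exit code 2.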
Scanned PDFs with no text layer compress on image content alone; there's nothing to preserve text-wise, so they tend to shrink the most.
Licensing
slimpdf is MIT. Its runtime dependencies are permissive or, in the case of MPL-2.0, file-level weak copyleft; none are GPL or AGPL:
| dependency | license | role |
|---|---|---|
| pikepdf | MPL-2.0 | parsing + structural rewrite (bundles qpdf, Apache-2.0) |
| pypdfium2 | BSD-3/Apache | page rendering (bundles PDFium, BSD-3) |
| Pillow | MIT-CMU (HPND) | image encode/decode |
So you can use it inside a proprietary product without GPL-style copyleft obligations on your own code. PyMuPDF and Ghostscript would both have been easier in places, but they're AGPL, which is the whole reason they're not here.
Status
Alpha — works well on the documents I've thrown at it, but it hasn't seen the long tail yet. Two things to know: any rewrite invalidates a PDF's digital signature (that's unavoidable when you change the bytes), and a few unusual image filters/colorspaces are skipped with a warning rather than risk producing a broken file. Bug reports with a sample PDF are very welcome.