unsubsetter

Re-embed full (non-subset) versions of fonts in PDFs.

PDF generators often subset embedded fonts, keeping only the glyphs a document uses and prefixing the font's PostScript name with a six-letter tag such as ABCDEF+. Some preflight checkers treat subsetted fonts as "not embedded" even though they are. unsubsetter swaps each subset for the complete font found on disk. It was built to get a book past Amazon KDP's preflight check, but it applies to any PDF that needs non-subsetted embedded fonts.
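
As a rough illustration of the naming convention described above, the following Python sketch tests whether a font name carries a subset tag (six uppercase letters followed by a plus sign) and splits the tag from the base name. The helper name split_subset_tag is hypothetical and is not part of unsubsetter.

```python
import re
from typing import Optional, Tuple

# A subset tag is exactly six uppercase letters, then "+", then the
# font's PostScript name.
SUBSET_TAG = re.compile(r"^([A-Z]{6})\+(.+)$")


def split_subset_tag(font_name: str) -> Tuple[Optional[str], str]:
    """Return (tag, base_name); tag is None when the name has no subset tag."""
    match = SUBSET_TAG.match(font_name)
    if match is None:
        return None, font_name
    return match.group(1), match.group(2)


for name in ("ABCDEF+Garamond-Regular", "Helvetica", "abcdef+NotATag"):
    tag, base = split_subset_tag(name)
    print(f"{name!r}: subset={tag is not None}, base={base!r}")
```

Lowercase or shorter prefixes are deliberately not treated as subset tags here.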

It handles CID TrueType and CID CFF fonts. Other font types are detected and reported, but left unchanged — see Limitations.

Install

Install the unsubsetter command into an isolated environment on your PATH:

uv tool install unsubsetter

Or with pipx:

pipx install unsubsetter

From source

git clone https://github.com/saggingmeniscus/unsubsetter
cd unsubsetter
uv tool install .

Usage

Inspect (default — no writes):

unsubsetter book.pdf

Fix (writes book.unsubset.pdf by default):

unsubsetter --fix book.pdf

Filter to specific fonts:

unsubsetter --fix --only Garamond,Helvetica book.pdf

With visual verification (renders N random pages and pixel-diffs them):

unsubsetter --fix --verify-visual 10 book.pdf
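
The idea behind sampling and pixel-diffing can be sketched in Python as below. This is only an assumption about the general technique, not the tool's implementation: rendering is omitted, each page is assumed to be already rasterized to a bytes buffer of identical dimensions, and the names sample_pages and differing_fraction are hypothetical.

```python
import random
from typing import List, Optional


def sample_pages(page_count: int, n: int, seed: Optional[int] = None) -> List[int]:
    """Pick up to n distinct zero-based page indices, returned in sorted order."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(page_count), min(n, page_count)))


def differing_fraction(before: bytes, after: bytes) -> float:
    """Fraction of byte positions that differ between two same-sized rasters."""
    if len(before) != len(after):
        raise ValueError("rasters must have identical dimensions")
    if not before:
        return 0.0
    changed = sum(a != b for a, b in zip(before, after))
    return changed / len(before)


# Toy example: a 2x2 grayscale "page" in which one pixel changed.
original = bytes([0, 0, 255, 255])
rewritten = bytes([0, 0, 255, 250])
print(sample_pages(page_count=300, n=10, seed=1))
print(differing_fraction(original, rewritten))  # 0.25
```

In practice a small tolerance is usually needed, since anti-aliasing can shift a few pixel values even when the glyph outlines are identical.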

Example: preparing a PDF for Amazon KDP

KDP's preflight check is the use case this tool was built for — it can reject a PDF whose fonts look un-embedded, which subsetting may trigger. A careful pre-upload pass:

  1. Inspect (default mode, no writes):

    unsubsetter interior.pdf
    

    Confirm the plan covers the font KDP flagged. Before going further, resolve any unexpected SKIP lines, e.g. a font that can't be found on disk.

  2. Fix with visual sampling:

    unsubsetter --fix --verify-visual 10 interior.pdf
    

    This writes interior.unsubset.pdf and pixel-diffs 10 random pages against the original.

  3. Independent structural check:

    pdffonts interior.unsubset.pdf
    

    Confirm that the sub column reads no for every previously-subset CID TrueType or CID CFF font.

  4. Spot-check a few pages in a PDF viewer, paying attention to pages that use fonts the tool reported as skipped — those pass through unchanged and should look identical.

  5. Upload to KDP. If it flags a different font, re-run with --only THAT_FONT to test it in isolation, or report the issue.
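
The structural check in step 3 can be automated with a short Python sketch that parses pdffonts output and lists fonts still marked as subset. It assumes the usual pdffonts column layout (name, type, encoding, emb, sub, uni, object ID) and font names without spaces. The sample output and font names are illustrative, not taken from a real run.

```python
import subprocess
from typing import Dict, List


def run_pdffonts(pdf_path: str) -> str:
    """Run pdffonts (must be installed and on PATH) and return its stdout."""
    result = subprocess.run(
        ["pdffonts", pdf_path], check=True, capture_output=True, text=True
    )
    return result.stdout


def parse_pdffonts(output: str) -> List[Dict[str, object]]:
    """Parse font rows; header and separator lines are skipped."""
    fonts = []
    for line in output.splitlines():
        tokens = line.split()
        # A font row ends with: emb sub uni <object number> <generation>.
        if len(tokens) < 8 or tokens[-5] not in ("yes", "no"):
            continue
        emb, sub, _uni = tokens[-5:-2]
        fonts.append(
            {"name": tokens[0], "embedded": emb == "yes", "subset": sub == "yes"}
        )
    return fonts


def still_subset(output: str) -> List[str]:
    """Names of fonts whose sub column is yes."""
    return [str(f["name"]) for f in parse_pdffonts(output) if f["subset"]]


# Illustrative output only; the font names are made up.
SAMPLE = """\
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
EBGaramond-Regular                   CID TrueType      Identity-H       yes no  yes     12  0
ABCDEF+SourceSerif4-Italic           CID Type 0C       Identity-H       yes yes yes     19  0
"""

print(still_subset(SAMPLE))
```

Against a real file, still_subset(run_pdffonts("interior.unsubset.pdf")) should list only fonts the tool reported as skipped.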

Troubleshooting exit code 4

If unsubsetter --fix exits with code 4, then for one or more CFF fonts the font found on disk doesn't match the subset embedded in the PDF. The report names the offending fonts. Either:

  • Locate the matching font version and supply it via --font-path /path/to/font/dir; or
  • Re-run with --exclude FONT_NAME to leave that font alone (it'll stay subset in the output).
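
For scripted pipelines, a wrapper along the following lines could detect exit code 4 and point to the two remedies above. It is a minimal Python sketch: the flags and the meaning of code 4 come from this README, the helper names build_command and run_fix are hypothetical, and no assumption is made about what other nonzero exit codes mean.

```python
import subprocess
from typing import List, Optional

GLYPH_MISMATCH_EXIT = 4  # CFF glyph-correspondence failure, per this README


def build_command(
    pdf: str, font_path: Optional[str] = None, exclude: Optional[str] = None
) -> List[str]:
    """Assemble the unsubsetter --fix command line."""
    cmd = ["unsubsetter", "--fix"]
    if font_path is not None:
        cmd += ["--font-path", font_path]
    if exclude is not None:
        cmd += ["--exclude", exclude]
    cmd.append(pdf)
    return cmd


def run_fix(
    pdf: str, font_path: Optional[str] = None, exclude: Optional[str] = None
) -> int:
    """Run the fix (unsubsetter must be on PATH) and return its exit code."""
    result = subprocess.run(build_command(pdf, font_path, exclude))
    if result.returncode == GLYPH_MISMATCH_EXIT:
        print("Disk font does not match an embedded CFF subset; no output written.")
        print("Retry with font_path set to the matching version, or exclude the font.")
    return result.returncode


print(build_command("book.pdf", font_path="/path/to/font/dir"))
```

A caller might first try run_fix("book.pdf") and, on a return value of 4, retry with font_path or exclude set according to the fonts named in the report.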

Limitations

V2 handles CID TrueType (CIDFontType2) and CID CFF (CIDFontType0) subsetted fonts. Simple Type 1 (/FontFile) and Type 1C (/FontFile3 /Subtype Type1C) are detected and reported, but left unchanged. If a preflight checker flags one of those, outlining (converting the affected glyphs to vector paths) is the usual workaround until those types are supported.

For CFF fonts, unsubsetter runs a glyph-correspondence check between the embedded subset and the full font on disk. If the two don't agree on glyph identity (e.g., the font on disk is a different version from the one originally embedded), the run exits with code 4 and writes no output. Either supply the matching font via --font-path or skip that font with --exclude.

Development

Set up the project and run the test suite:

uv sync --extra dev
uv run pytest
