Detection-only PDF preflight engine

LintPDF

Detection-only PDF preflight engine — analyze packaging, label, and commercial-print PDFs against 500+ checks (image DPI, total area coverage, bleed, fonts, barcodes, color spaces, conformance, AI-assisted regulatory rules, …) and surface findings as structured JSON, HTML reports, or annotated PDFs.

License: AGPL v3 | Python: 3.11+

LintPDF is the OSS preflight engine that powers the hosted SaaS at lintpdf.com. The hosted product layers multi-tenancy, billing, white-label reports, and an admin console on top of this engine; the OSS package is the engine plus a narrow HTTP surface for submitting jobs and fetching results. You can self-host the OSS engine standalone.

This repository is licensed under the GNU Affero General Public License v3.0 or later — see the licensing notes below for what that means in practice when you embed or modify the engine.



What it does

LintPDF answers one question: "is this PDF print-ready?"

You POST a file. The engine analyzes it across 500+ checks grouped into categories (image quality, color, fonts, packaging, barcodes, regulatory compliance, conformance, …) and returns:

  • A verdict: pass, pass_with_warnings, or fail.
  • A list of findings — each with an inspection id, severity, page number, bounding box, and a human-readable message.
  • A rendered report — HTML, PDF, JSON, or annotated PDF.
  • A viewer payload — separations, TAC heatmap, font list, layer toggles for the embedded React viewer (@printwithsynergy/loupe-pdf).
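
As a rough illustration of how a client might consume the verdict and findings described above, here is a minimal sketch. The JSON key names (`inspection_id`, `severity`, `page`, `bbox`, `message`), the severity values, and the sample inspection ids are assumptions made for this example; the engine's actual response schema may differ, so check the OpenAPI description before relying on them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One finding, mirroring the fields listed above (key names are assumed)."""
    inspection_id: str
    severity: str
    page: int
    bbox: tuple[float, float, float, float]
    message: str


def parse_findings(payload: dict) -> list[Finding]:
    """Convert the (assumed) 'findings' array of a result payload into Finding objects."""
    return [
        Finding(
            inspection_id=item["inspection_id"],
            severity=item["severity"],
            page=item["page"],
            bbox=tuple(item["bbox"]),
            message=item["message"],
        )
        for item in payload.get("findings", [])
    ]


def count_by_severity(findings: list[Finding]) -> dict[str, int]:
    """Tally findings per severity level."""
    return dict(Counter(f.severity for f in findings))


# Hypothetical payload: ids, severities, and messages are invented for illustration.
SAMPLE_RESULT = {
    "verdict": "pass_with_warnings",
    "findings": [
        {"inspection_id": "image.low_dpi", "severity": "warning", "page": 1,
         "bbox": [72, 72, 300, 200], "message": "Image resolution below threshold"},
        {"inspection_id": "font.not_embedded", "severity": "warning", "page": 2,
         "bbox": [50, 600, 400, 620], "message": "Font is not embedded"},
        {"inspection_id": "bleed.narrow", "severity": "info", "page": 1,
         "bbox": [0, 0, 612, 9], "message": "Bleed is narrower than recommended"},
    ],
}
```

Calling `count_by_severity(parse_findings(SAMPLE_RESULT))` groups the three sample findings into two warnings and one info entry, which is the kind of summary a CI gate could act on.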

The engine ships built-in profiles for GWG 2022 (sheetfed + digital), PDF/X-4, and packaging — and supports custom rulesets authored as JSON. AI-assisted features (Claude-driven audit, explanations, dieline detection, regulatory checks) are optional and self-skip cleanly when no AI inference service is configured.

For the deeper architectural picture see docs/ARCHITECTURE.md.


Quick start (Docker)

# Clone + boot the full stack (engine + Postgres + Redis + ClamAV)
git clone https://github.com/printwithsynergy/lint-pdf.git
cd lint-pdf
docker compose up -d

# Wait for /ready to return 200
curl http://localhost:8000/ready
# {"status":"ok","database":"connected","redis":"connected"}

The compose stack is a single-node OSS deploy with Celery worker + beat + ClamAV sidecar. For production / HA topology see docs/DEPLOYMENT.md.
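
If you script the boot step, the "wait for /ready to return 200" check can be automated. The following is a minimal sketch using only the standard library; the function names, retry count, and delay are illustrative choices, not part of LintPDF.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a GET request, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a non-2xx status (e.g. 503 while booting).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, ...
        return 0


def wait_until_ready(
    url: str,
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example call (performs real network requests):
#   wait_until_ready("http://localhost:8000/ready")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running stack; in normal use the defaults are enough.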


Quick start (Python)

# 3.11+ required
pip install lintpdf

# Or pin to a specific git ref:
#   pip install "lintpdf @ git+https://github.com/printwithsynergy/lint-pdf.git@main"

# Minimum env (production refuses to boot without these)
export LINTPDF_SAAS_MODE=false
export LINTPDF_SECRET_KEY=$(openssl rand -hex 32)
export LINTPDF_DATABASE_URL=postgresql://user:pass@localhost/lintpdf
export LINTPDF_REDIS_URL=redis://localhost:6379/0

# Boot the API
uvicorn lintpdf.api.app:create_app --factory --host 0.0.0.0 --port 8000

A complete environment-variable reference and the OSS-mode hard fails (production secret key + CORS wildcard) live in docs/DEPLOYMENT.md.
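
Because a missing variable stops the engine from booting, it can help to check the environment in a wrapper script before starting uvicorn. This is a small sketch, not part of the package: it only verifies that the four variables shown above are set and non-empty, and does not reproduce whatever validation the engine itself performs.

```python
import os
from typing import Mapping

# The four variables from the quick start above.
REQUIRED_VARS = (
    "LINTPDF_SAAS_MODE",
    "LINTPDF_SECRET_KEY",
    "LINTPDF_DATABASE_URL",
    "LINTPDF_REDIS_URL",
)


def missing_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def check_env(env: Mapping[str, str] = os.environ) -> None:
    """Raise a descriptive error if any required variable is missing."""
    missing = missing_env(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `check_env()` at the top of a launcher script gives one clear error listing everything that is missing, rather than failing on the first variable.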


Submit your first PDF

# 1. Submit
curl -X POST http://localhost:8000/api/v1/jobs \
  -F "file=@artwork.pdf" \
  -F "profile_id=lintpdf-default"
# { "job_id": "job_abc…", "status": "queued" }

# 2. Poll
curl http://localhost:8000/api/v1/jobs/job_abc…
# { "id": "job_abc…", "status": "completed", "verdict": "pass_with_warnings", … }

# 3. One-call snapshot (job + reports + annotations + verdicts)
curl http://localhost:8000/api/v1/jobs/job_abc…/state | jq .

The OSS engine boots without multi-tenant auth out of the box — see docs/DEPLOYMENT.md#auth-in-oss-mode to wire in your own auth (single-user, OIDC, basic auth, or a custom tenant resolver).
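
The same submit-then-poll flow can be sketched in Python with only the standard library. The endpoint paths, form fields, and the `queued` / `completed` statuses come from the curl examples above; treating `failed` as a terminal status, the polling interval, and all function names are assumptions made for this sketch.

```python
import json
import os
import time
import urllib.request
import uuid
from typing import Callable

BASE_URL = "http://localhost:8000/api/v1"
# "completed" appears in the example above; "failed" is an assumed terminal status.
TERMINAL_STATUSES = {"completed", "failed"}


def encode_multipart(fields: dict[str, str], filename: str, data: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body with text fields plus one 'file' part."""
    boundary = uuid.uuid4().hex
    parts = [
        f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        for name, value in fields.items()
    ]
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/pdf\r\n\r\n'.encode()
    )
    parts.append(data)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def submit_job(pdf_path: str, profile_id: str = "lintpdf-default") -> dict:
    """POST the PDF to /jobs and return the decoded JSON response."""
    with open(pdf_path, "rb") as fh:
        body, content_type = encode_multipart(
            {"profile_id": profile_id}, os.path.basename(pdf_path), fh.read()
        )
    req = urllib.request.Request(
        f"{BASE_URL}/jobs", data=body,
        headers={"Content-Type": content_type}, method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def fetch_job(job_id: str) -> dict:
    """GET /jobs/{job_id} and return the decoded JSON response."""
    with urllib.request.urlopen(f"{BASE_URL}/jobs/{job_id}") as resp:
        return json.load(resp)


def poll_job(
    job_id: str,
    fetch: Callable[[str], dict] = fetch_job,
    interval: float = 2.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the job until it reaches a terminal status or max_polls is exhausted."""
    for _ in range(max_polls):
        job = fetch(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

A typical call would be `poll_job(submit_job("artwork.pdf")["job_id"])`. The `fetch` and `sleep` parameters are injectable so the polling loop can be tested without a running engine.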


Documentation

Doc | Covers
--- | ---
docs/ARCHITECTURE.md | Component layout, request flow, the three-scope toggle cascade, snapshots, AI tier model.
docs/DEPLOYMENT.md | Self-hosting reference: env vars, services, Docker / Railway / single-node, OSS-mode toggle, security gates, backups.
docs/EXTENDING.md | Service overrides (email / entitlements / billing / auth) and analyzer plugin authoring quick reference.
docs/plugin-api.md | Full plugin Protocol reference: manifest fields, AnalyzerContext, banned imports, capability providers.
docs/CONTRIBUTING.md | Dev environment setup, test conventions, commit / PR style, the engine-purity tripwire.
docs/audit-phase1.md | Engineering record of the Phase 1 plugin-protocol refactor (background reading).

The hosted product's customer-facing docs (workflows, rulesets, brand profiles, integrations) live at lintpdf.com/docs.


Licensing

LintPDF is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0+).

What that means in practice:

  • Self-host for any use (commercial or otherwise) — no fee, no per-tenant cap, no notify-us clause. Run it on your own infra and ship reports to your customers.
  • Modifications must be made available under the same AGPL-3.0+ license to anyone who interacts with your modified version, including over a network (the network-use trigger is the "A" in AGPL). If you patch the engine and run it as a hosted service, you must offer the patched source to that service's users under AGPL-3.0+.
  • Commercial / proprietary use without source disclosure: contact Think Neverland LLC about a commercial license. The hosted SaaS at lintpdf.com runs the same engine code under such a commercial arrangement; paying for the hosted / embedded option changes the licensing terms, not the code.

Copyright © 2024–2026 Think Neverland LLC.

Third-party dependencies retain their own licenses — see docs/CONTRIBUTING.md#third-party-licenses for the inventory.


Contributing

We accept patches via pull request. Before opening a PR:

  1. Read docs/CONTRIBUTING.md — covers the engine-purity tripwire (analyzers must not import tenant/billing/storage modules), the OpenAPI-description discipline (every Pydantic field needs description=…), and the test pyramid.
  2. Sign off your commits (git commit -s). LintPDF uses the Developer Certificate of Origin to track contributor licensing intent.
  3. Run the tripwires locally — bash scripts/check_engine_purity.sh and python scripts/check_openapi_descriptions.py — and the pytest suite (pytest --no-header).
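
To give a feel for what an engine-purity check involves, here is an illustrative sketch that scans Python source for banned imports using the standard `ast` module. It is not the repository's `scripts/check_engine_purity.sh`, and the banned module prefixes below are hypothetical placeholders; see docs/CONTRIBUTING.md for the real rules.

```python
import ast

# Hypothetical module prefixes, chosen only for this example.
BANNED_PREFIXES = ("lintpdf.tenant", "lintpdf.billing", "lintpdf.storage")


def banned_imports(source: str, banned: tuple[str, ...] = BANNED_PREFIXES) -> list[tuple[int, str]]:
    """Return (line number, module) pairs for absolute imports matching a banned prefix.

    Relative imports are skipped, and at most one hit is reported per import statement.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Cover both `from pkg.banned import x` and `from pkg import banned`.
            names = [node.module] + [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if any(name == prefix or name.startswith(prefix + ".") for prefix in banned):
                hits.append((node.lineno, name))
                break
    return sorted(hits)
```

Running this over every file in an analyzer package and failing the build on a non-empty result is one way such a tripwire could be wired into CI.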

For larger changes (a new analyzer category, schema migration, public-API addition) open a discussion issue first so we can align on shape before you write code.


Support

  • Hosted product: lintpdf.com (fully managed, white-label, billing + admin included).
  • OSS issues: GitHub Issues for bug reports, feature requests, and security disclosures (mark security issues with the security label and we'll triage off-list).
  • Commercial: dev@thinkneverland.com for commercial licenses, embedded deployments, or paid support contracts.

LintPDF is a Think Neverland LLC project, originally extracted from the production codebase of the hosted SaaS at lintpdf.com. The extraction itself is documented in docs/audit-phase1.md.
