
Objective file-comparison engine for the Print With Synergy stack. It reports measured plate ↔ 1-up differences as neutral facts (coverage and geometry deltas, separation presence, diff images). Tolerance and pass/fail policy live in lint.


collate

The objective file-comparison engine for the Print With Synergy stack.

collate answers one question — "do these two files match, and where exactly do they differ?" — and answers it with measured facts, never a verdict. Its first capability is plate ↔ 1-up comparison: align a set of decoded separation plates (1-bit TIFF / Esko LEN) against an approved 1-up PDF and report the per-ink coverage and geometry differences, which separations are missing or extra, and (optionally) a per-ink visual difference image and an AI visual note.
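
To make "coverage difference" concrete, here is a minimal sketch of how a per-ink coverage delta could be computed from two 1-bit rasters of the same separation. It is an illustration only, not collate's implementation: the function names, the nested-list raster representation, and the sign convention (plate minus PDF) are all assumptions.

```python
from typing import Sequence

Bitmap = Sequence[Sequence[int]]  # rows of 0/1 pixels; 1 means "ink" (assumed convention)


def coverage_percent(bitmap: Bitmap) -> float:
    """Percentage of pixels that carry ink in a 1-bit raster."""
    total = sum(len(row) for row in bitmap)
    if total == 0:
        return 0.0
    inked = sum(1 for row in bitmap for px in row if px)
    return 100.0 * inked / total


def coverage_delta(plate: Bitmap, pdf_separation: Bitmap) -> float:
    """Plate coverage minus PDF coverage, in percentage points (sign is an assumption)."""
    return coverage_percent(plate) - coverage_percent(pdf_separation)


plate = [[1, 1, 0, 0], [1, 0, 0, 0]]           # 3 of 8 pixels inked
pdf_separation = [[1, 1, 0, 0], [0, 0, 0, 0]]  # 2 of 8 pixels inked
print(coverage_delta(plate, pdf_separation))   # 12.5
```

The result is a number and nothing more; deciding whether 12.5 points is a problem is left to whoever consumes it.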

Where collate sits in the stack

The Print With Synergy engines split by who owns what:

| Engine | Owns |
| --- | --- |
| codex | Extraction: single-file facts (per-separation coverage, screen ruling/angle, Pantone, dieline…). |
| collate | Comparison: two-file difference facts (coverage/geometry deltas, presence, diff images). |
| lint | Policy: rules + verdicts (LPDF_PLATE_CMP_*, pass/fail, tolerances). |
| lens | Display: the visual inspection UX. |

collate is an objective-layer engine: it states the numbers. Whether a 3-point coverage delta or a 1.5 mm geometry shift is acceptable is a tolerance decision, and tolerances are policy — they live in lint, not here. collate ships no coverage_mismatch flag, no overall_match roll-up, and no tolerance constant. It measures; lint judges.
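
The split between measuring and judging can be sketched from the consumer's side. The example below is hypothetical: `InkFact` is a stand-in for a per-ink fact, `judge` stands in for a policy rule of the kind lint would own, and the tolerance values are invented for illustration. None of it is collate or lint code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class InkFact:
    """Hypothetical stand-in for one measured per-ink fact."""
    ink_name: str
    coverage_delta_percent: float


def judge(facts: Iterable[InkFact], max_abs_delta: float) -> List[str]:
    """Policy step (consumer side): names of inks whose |delta| exceeds the tolerance."""
    return [f.ink_name for f in facts if abs(f.coverage_delta_percent) > max_abs_delta]


facts = [InkFact("Cyan", 0.4), InkFact("Black", -3.0)]

# The same measured facts yield different verdicts under different policies.
print(judge(facts, max_abs_delta=2.0))  # ['Black']
print(judge(facts, max_abs_delta=5.0))  # []
```

Because the tolerance is an argument to the policy step rather than a constant baked into the measurement, changing the policy never changes the facts.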

collate owns no raster primitives of its own — it reuses codex-pdf for plate decode, the Ghostscript tiffsep separation render, ink normalization, and the Pantone catalogue. codex stays the extraction layer; collate is comparison built on top.

API

| Route | Purpose |
| --- | --- |
| GET /healthz | Liveness + capability flags (Ghostscript availability, codex version). |
| GET /readyz | Readiness (the PDF render path degrades gracefully without Ghostscript). |
| GET /v1/contract | Engine / version / schema / routes / capabilities. |
| POST /v1/compare/plates | Compare a plate set (1+ TIFF/LEN files) against a 1-up PDF → CollateCompareResult. |

POST /v1/compare/plates is multipart/form-data:

  • plates — one or more separation files (repeat the field per file).
  • pdf — the approved 1-up PDF.
  • Optional fields: page (default 1), dpi (default 150), ai (default false), diff_images (default false).

For example:
curl -sS https://<collate-host>/v1/compare/plates \
  -F plates=@cyan.tif -F plates=@magenta.tif -F plates=@black.tif \
  -F pdf=@approved-1up.pdf \
  -F dpi=150 -F diff_images=true
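
The same request can be sketched in Python. This is a hedged illustration, not a supported client: `build_form` and `post_compare` are hypothetical helper names, the `application/octet-stream` content type for plates and the lowercase `true`/`false` encoding of booleans are assumptions, and the third-party `httpx` package must be installed for the actual send.

```python
from typing import Dict, List, Sequence, Tuple

FilePart = Tuple[str, Tuple[str, bytes, str]]  # (field name, (filename, content, content type))


def build_form(
    plates: Sequence[Tuple[bytes, str]],
    pdf_bytes: bytes,
    *,
    pdf_name: str = "approved-1up.pdf",
    page: int = 1,
    dpi: int = 150,
    ai: bool = False,
    diff_images: bool = False,
) -> Tuple[List[FilePart], Dict[str, str]]:
    """Build multipart parts; the `plates` field is repeated once per file."""
    files: List[FilePart] = [
        ("plates", (name, data, "application/octet-stream")) for data, name in plates
    ]
    files.append(("pdf", (pdf_name, pdf_bytes, "application/pdf")))
    data = {
        "page": str(page),
        "dpi": str(dpi),
        "ai": str(ai).lower(),                    # assumed encoding: "true" / "false"
        "diff_images": str(diff_images).lower(),
    }
    return files, data


def post_compare(base_url: str, files: List[FilePart], data: Dict[str, str]) -> dict:
    """Send the request; httpx is imported lazily so build_form stays dependency-free."""
    import httpx

    response = httpx.post(
        f"{base_url}/v1/compare/plates", files=files, data=data, timeout=120.0
    )
    response.raise_for_status()
    return response.json()
```

Passing `files` as a list of tuples rather than a dict is what allows the `plates` field name to appear more than once.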

The response is neutral comparison facts — see CollateCompareResult. Errors are RFC 7807 Problem Details (application/problem+json).
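
A client can recognise such an error by its media type and read the standard RFC 7807 members. The sketch below uses only the standard library; the `Problem` class and `parse_problem` function are hypothetical names, and the sample body is invented for illustration rather than taken from collate.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Problem:
    """The standard RFC 7807 members; extension members are ignored in this sketch."""
    type: str
    title: Optional[str]
    status: Optional[int]
    detail: Optional[str]
    instance: Optional[str]


def parse_problem(content_type: str, body: bytes) -> Optional[Problem]:
    """Return a Problem if the response is application/problem+json, else None."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/problem+json":
        return None
    doc = json.loads(body)
    return Problem(
        type=doc.get("type", "about:blank"),  # RFC 7807 default when "type" is absent
        title=doc.get("title"),
        status=doc.get("status"),
        detail=doc.get("detail"),
        instance=doc.get("instance"),
    )


# Invented sample body, for illustration only.
sample = b'{"title": "Unprocessable Entity", "status": 422, "detail": "pdf field is missing"}'
problem = parse_problem("application/problem+json; charset=utf-8", sample)
print(problem.status, problem.detail)
```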

When Ghostscript is unavailable the PDF side self-skips: the result carries the plate-side facts, pdf_rendered: false, and a note. A consumer must not read that as a clean compare — lint floors it to INCONCLUSIVE.
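
A consumer-side guard for this case might look like the sketch below. It assumes the JSON result exposes `pdf_rendered` as a top-level boolean key; the function names and the `"COMPARED"` label are hypothetical, and only the `INCONCLUSIVE` outcome mirrors what the text says lint does.

```python
def comparison_is_usable(result: dict) -> bool:
    """True only when the PDF side was actually rendered.

    A missing key is treated the same as False, so an unexpected
    payload is never mistaken for a clean compare.
    """
    return result.get("pdf_rendered") is True


def outcome(result: dict) -> str:
    """Floor a result with no PDF side to INCONCLUSIVE instead of reading it as a match."""
    if not comparison_is_usable(result):
        return "INCONCLUSIVE"
    return "COMPARED"  # hypothetical label; real verdicts belong to the policy layer


print(outcome({"pdf_rendered": False, "inks": []}))  # INCONCLUSIVE
print(outcome({"pdf_rendered": True, "inks": []}))   # COMPARED
```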

Library use

The same comparison runs in process via the client (no HTTP), which is how lint consumes collate when they share a host:

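from pathlib import Path

from collate.client import CollateClient

# Load the separation plates and the approved 1-up PDF as raw bytes.
cyan_bytes = Path("cyan.tif").read_bytes()
black_bytes = Path("black.tif").read_bytes()
pdf_bytes = Path("approved-1up.pdf").read_bytes()

client = CollateClient()  # in-process; or CollateClient(base_url="https://…") for HTTP
result = client.compare_plates(
    [(cyan_bytes, "cyan.tif"), (black_bytes, "black.tif")],  # (bytes, filename) per plate
    pdf_bytes,
    dpi=150,
    diff_images=True,
)
# Each entry holds the measured facts for one ink.
for ink in result.inks:
    print(ink.ink_name, ink.presence, ink.coverage_delta_percent)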
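
Once the per-ink facts are in hand, summarising them is ordinary Python. The sketch below runs without collate installed by using `SimpleNamespace` stand-ins for the entries of `result.inks`. The attribute names come from the loop above, but the presence values (`"both"`, `"plate_only"`, `"pdf_only"`) and the idea that a delta may be `None` for an ink missing on one side are assumptions.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Dict, Iterable, List, Optional


def group_by_presence(inks: Iterable) -> Dict[str, List[str]]:
    """Map each presence value to the ink names that carry it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for ink in inks:
        groups[ink.presence].append(ink.ink_name)
    return dict(groups)


def largest_delta(inks: Iterable) -> Optional[object]:
    """Ink with the largest absolute coverage delta, skipping unmeasured inks."""
    measured = [i for i in inks if i.coverage_delta_percent is not None]
    return max(measured, key=lambda i: abs(i.coverage_delta_percent), default=None)


# Stand-in data; the presence strings are hypothetical.
inks = [
    SimpleNamespace(ink_name="Cyan", presence="both", coverage_delta_percent=0.4),
    SimpleNamespace(ink_name="Black", presence="both", coverage_delta_percent=-3.0),
    SimpleNamespace(ink_name="PANTONE 185 C", presence="pdf_only", coverage_delta_percent=None),
]

print(group_by_presence(inks))
print(largest_delta(inks).ink_name)  # Black
```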

Local dev

The distribution is published as collate-pdf (the bare collate name is taken on PyPI), matching the codex-pdf family; the import package is collate.

pip install -e . pytest httpx ruff   # pulls codex-pdf from the index
ruff check src tests
pytest                                # gs-free (the PDF side is monkeypatched)
uvicorn collate.api.main:app --reload --port 8080

The comparison's PDF-render path needs Ghostscript (gs) at runtime; the test suite is gs-free, but a real POST /v1/compare/plates against a PDF needs it installed (the Docker image ships it).
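
Before pointing a real request at a local server, it can help to confirm that `gs` is on the PATH. The helper below is a small standard-library sketch; `ghostscript_version` is a hypothetical name and is not part of collate. On Windows the console executable usually has a different name (for example `gswin64c`), so pass that name instead.

```python
import shutil
import subprocess
from typing import Optional


def ghostscript_version(executable: str = "gs") -> Optional[str]:
    """Return the Ghostscript version string, or None if the executable is not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    completed = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return completed.stdout.strip() or None


version = ghostscript_version()
if version is None:
    print("Ghostscript not found: the PDF side of a compare will be skipped")
else:
    print(f"Ghostscript {version} available")
```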

License

AGPL-3.0-or-later, matching the engine stack.
