Objective file-comparison engine for the Print With Synergy stack. It reports measured plate ↔ 1-up differences as neutral facts (coverage and geometry deltas, separation presence, and diff images). Tolerance and pass/fail policy live in lint.

collate

The objective file-comparison engine for the Print With Synergy stack.

collate answers one question — "do these two files match, and where exactly do they differ?" — and answers it with measured facts, never a verdict. Its first capability is plate ↔ 1-up comparison: align a set of decoded separation plates (1-bit TIFF / Esko LEN) against an approved 1-up PDF and report the per-ink coverage and geometry differences, which separations are missing or extra, and (optionally) a per-ink visual difference image and an AI visual note.
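To make "coverage delta" concrete, here is a minimal sketch. It assumes that coverage means the percentage of inked pixels in a 1-bit separation and that the delta is plate minus PDF, in percentage points. The function names, the 0/1 pixel convention, and the sign convention are illustrative assumptions, not collate's actual implementation.

```python
from typing import Sequence

Bitmap = Sequence[Sequence[int]]  # rows of 0/1 pixels; 1 = ink (assumed convention)


def coverage_percent(bitmap: Bitmap) -> float:
    """Percentage of pixels that carry ink in a 1-bit separation."""
    total = sum(len(row) for row in bitmap)
    if total == 0:
        raise ValueError("empty bitmap")
    inked = sum(sum(row) for row in bitmap)
    return 100.0 * inked / total


def coverage_delta_percent(plate: Bitmap, pdf: Bitmap) -> float:
    """Plate coverage minus PDF coverage, in percentage points (assumed sign)."""
    return coverage_percent(plate) - coverage_percent(pdf)


plate = [[1, 1, 0, 0], [1, 0, 0, 0]]  # 3 of 8 pixels inked
pdf = [[1, 1, 0, 0], [0, 0, 0, 0]]    # 2 of 8 pixels inked
delta = coverage_delta_percent(plate, pdf)
print(f"{delta:+.1f} points")  # prints "+12.5 points"
```

The number is reported as-is; nothing in this sketch decides whether 12.5 points is acceptable.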

Where collate sits in the stack

The Print With Synergy engines split by who owns what:

Engine    Owns
codex     Extraction: single-file facts (per-separation coverage, screen ruling/angle, Pantone, dieline…).
collate   Comparison: two-file difference facts (coverage/geometry deltas, presence, diff images).
lint      Policy: rules + verdicts (LPDF_PLATE_CMP_*, pass/fail, tolerances).
lens      Display: the visual inspection UX.

collate is an objective-layer engine: it states the numbers. Whether a 3-point coverage delta or a 1.5 mm geometry shift is acceptable is a tolerance decision, and tolerances are policy — they live in lint, not here. collate ships no coverage_mismatch flag, no overall_match roll-up, and no tolerance constant. It measures; lint judges.
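The split between measuring and judging can be sketched as follows. This is a hypothetical policy function of the kind that would live on the lint side, not lint's real rule set: the InkFact stand-in, the PASS/FAIL labels, and the 1.0-point tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InkFact:
    """Stand-in for one per-ink comparison fact; only the fields used here."""
    ink_name: str
    coverage_delta_percent: float


def judge_coverage(facts: list[InkFact], tolerance_points: float) -> dict[str, str]:
    """Policy side: turn neutral deltas into verdicts using a caller-owned tolerance."""
    return {
        fact.ink_name: "PASS" if abs(fact.coverage_delta_percent) <= tolerance_points else "FAIL"
        for fact in facts
    }


facts = [InkFact("Cyan", 0.4), InkFact("Black", -3.0)]
verdicts = judge_coverage(facts, tolerance_points=1.0)
print(verdicts)  # {'Cyan': 'PASS', 'Black': 'FAIL'}
```

The same facts produce different verdicts under a different tolerance, which is why the tolerance belongs to the policy layer rather than to the comparison result.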

collate owns no raster primitives of its own — it reuses codex-pdf for plate decode, the Ghostscript tiffsep separation render, ink normalization, and the Pantone catalogue. codex stays the extraction layer; collate is comparison built on top.

API

Route                     Purpose
GET /healthz              Liveness + capability flags (Ghostscript availability, codex version).
GET /readyz               Readiness (the PDF render path degrades gracefully without Ghostscript).
GET /v1/contract          Engine / version / schema / routes / capabilities.
POST /v1/compare/plates   Compare a plate set (one or more TIFF/LEN files) against a 1-up PDF; returns CollateCompareResult.

POST /v1/compare/plates is multipart/form-data:

  • plates — one or more separation files (repeat the field per file).
  • pdf — the approved 1-up PDF.
  • page (default 1), dpi (default 150), ai (default false), diff_images (default false) — all optional.

For example, with curl:

curl -sS https://<collate-host>/v1/compare/plates \
  -F plates=@cyan.tif -F plates=@magenta.tif -F plates=@black.tif \
  -F pdf=@approved-1up.pdf \
  -F dpi=150 -F diff_images=true
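
The same request can be sketched in Python. The helper names below are hypothetical, the application/octet-stream content type for plates is an assumption (the source does not say what the server expects), and httpx is used only because it already appears in the dev dependencies. The field names (plates, pdf, dpi, diff_images) come from the route description above.

```python
def build_multipart(plates, pdf_bytes, pdf_name="approved-1up.pdf", **options):
    """Return (files, data) in the shape httpx accepts for multipart uploads.

    plates is an iterable of (bytes, filename) pairs, one per separation.
    """
    # Repeating the "plates" field once per file mirrors the repeated -F flags.
    files = [("plates", (name, blob, "application/octet-stream")) for blob, name in plates]
    files.append(("pdf", (pdf_name, pdf_bytes, "application/pdf")))
    # Form values travel as strings; booleans become "true"/"false" as in the curl call.
    data = {
        key: str(value).lower() if isinstance(value, bool) else str(value)
        for key, value in options.items()
    }
    return files, data


def compare_plates_http(base_url, plates, pdf_bytes, **options):
    """POST the multipart request and return the decoded JSON body."""
    import httpx  # third-party; imported lazily so build_multipart stays stdlib-only

    files, data = build_multipart(plates, pdf_bytes, **options)
    response = httpx.post(f"{base_url}/v1/compare/plates", files=files, data=data)
    response.raise_for_status()
    return response.json()


files, data = build_multipart(
    [(b"...", "cyan.tif"), (b"...", "black.tif")],
    b"%PDF-",
    dpi=150,
    diff_images=True,
)
print([field for field, _ in files], data)
```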

The response is neutral comparison facts — see CollateCompareResult. Errors are RFC 7807 Problem Details (application/problem+json).
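
A client can recognise these error bodies by their media type. The sketch below parses the five standard RFC 7807 members; the Problem class and parse_problem helper are hypothetical names, and the sample payload is invented for the example rather than taken from collate's actual error output.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    type: str = "about:blank"  # RFC 7807 default when "type" is absent
    title: Optional[str] = None
    status: Optional[int] = None
    detail: Optional[str] = None
    instance: Optional[str] = None


def parse_problem(content_type: str, body: bytes) -> Optional[Problem]:
    """Return a Problem for application/problem+json bodies, else None."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/problem+json":
        return None
    payload = json.loads(body)
    members = ("type", "title", "status", "detail", "instance")
    # Extension members are ignored in this sketch.
    return Problem(**{key: payload[key] for key in members if key in payload})


problem = parse_problem(
    "application/problem+json; charset=utf-8",
    b'{"title": "Unprocessable Entity", "status": 422, "detail": "example detail"}',
)
print(problem)
```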

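When Ghostscript is unavailable, the PDF side is skipped rather than failing the request: the result carries only the plate-side facts, pdf_rendered: false, and a note explaining the skip. A consumer must not read that as a clean compare, because nothing was compared against the PDF; lint treats such a result as INCONCLUSIVE.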
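
A consumer-side guard for this case might look like the sketch below. It assumes the result has been decoded into a dict with a pdf_rendered key; the is_comparable helper, the note key name, and the sample payloads are illustrative assumptions.

```python
def is_comparable(result: dict) -> bool:
    """True only when the PDF side was actually rendered."""
    # A missing flag is treated like False: absence of evidence is not a match.
    return result.get("pdf_rendered") is True


skipped = {"pdf_rendered": False, "note": "example note", "inks": []}
rendered = {"pdf_rendered": True, "inks": []}

for result in (skipped, rendered):
    print("comparison facts usable" if is_comparable(result) else "INCONCLUSIVE")
```

Checking the flag before reading any per-ink deltas keeps a skipped PDF render from being mistaken for a zero-difference result.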

Library use

The same comparison runs in-process via the client (no HTTP), which is how lint consumes collate when the two share a host:

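from pathlib import Path

from collate.client import CollateClient

# Read the separation plates and the approved 1-up PDF as raw bytes.
cyan_bytes = Path("cyan.tif").read_bytes()
black_bytes = Path("black.tif").read_bytes()
pdf_bytes = Path("approved-1up.pdf").read_bytes()

client = CollateClient()  # in-process; or CollateClient(base_url="https://…") for HTTP
result = client.compare_plates(
    [(cyan_bytes, "cyan.tif"), (black_bytes, "black.tif")],
    pdf_bytes,
    dpi=150,
    diff_images=True,
)
# Each entry is a per-ink fact, not a verdict.
for ink in result.inks:
    print(ink.ink_name, ink.presence, ink.coverage_delta_percent)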
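
Once the per-ink facts are in hand, a caller will often want to summarise them. The sketch below uses stand-in objects instead of a real result so it runs without collate installed; the presence strings ("both", "pdf_only") and the None delta for an ink missing on one side are placeholders, not values the source documents.

```python
from collections import defaultdict
from types import SimpleNamespace

# Stand-in for result.inks; attribute names follow the loop above.
inks = [
    SimpleNamespace(ink_name="Cyan", presence="both", coverage_delta_percent=0.4),
    SimpleNamespace(ink_name="Black", presence="both", coverage_delta_percent=-3.0),
    SimpleNamespace(ink_name="PANTONE 185 C", presence="pdf_only", coverage_delta_percent=None),
]

# Group ink names by their presence value.
by_presence = defaultdict(list)
for ink in inks:
    by_presence[ink.presence].append(ink.ink_name)

# Largest absolute coverage delta among inks that have one.
measured = [ink for ink in inks if ink.coverage_delta_percent is not None]
worst = max(measured, key=lambda ink: abs(ink.coverage_delta_percent))

print(dict(by_presence))
print(worst.ink_name, worst.coverage_delta_percent)
```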

Local dev

The distribution is published as collate-pdf (the bare collate name is taken on PyPI), matching the codex-pdf family; the import package is collate.

pip install -e . pytest httpx ruff   # pulls codex-pdf from the index
ruff check src tests
pytest                                # gs-free (the PDF side is monkeypatched)
uvicorn collate.api.main:app --reload --port 8080

The comparison's PDF-render path needs Ghostscript (gs) at runtime; the test suite is gs-free, but a real POST /v1/compare/plates against a PDF needs it installed (the Docker image ships it).
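
Before sending a real comparison from a local machine, it can help to confirm that gs is on the PATH. This is a small stand-alone check using only the standard library; the ghostscript_version helper is a hypothetical name and is not part of collate, which reports Ghostscript availability itself through GET /healthz.

```python
import shutil
import subprocess
from typing import Optional


def ghostscript_version(executable: str = "gs") -> Optional[str]:
    """Return the Ghostscript version string, or None if it is not usable."""
    path = shutil.which(executable)
    if path is None:
        return None
    try:
        completed = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10, check=True
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return completed.stdout.strip() or None


version = ghostscript_version()
print(f"Ghostscript {version}" if version else "Ghostscript not found; the PDF side will be skipped")
```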

License

AGPL-3.0-or-later, matching the engine stack.

Download files


Source Distribution

collate_pdf-0.2.0.tar.gz (34.6 kB)

Uploaded Source

Built Distribution


collate_pdf-0.2.0-py3-none-any.whl (31.4 kB)

Uploaded Python 3

File details

Details for the file collate_pdf-0.2.0.tar.gz.

File metadata

  • Download URL: collate_pdf-0.2.0.tar.gz
  • Upload date:
  • Size: 34.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for collate_pdf-0.2.0.tar.gz
Algorithm     Hash digest
SHA256        ba8057679f59218f5860990df8897e828d6f7953dd944cb61a05b0cd0aa99806
MD5           5c296e393e78936d2991eb6de8caede9
BLAKE2b-256   be694fcb6f72000de14b4e8a1c33979ff16eb0bf59bf76acc4878bce9500d4d1


File details

Details for the file collate_pdf-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: collate_pdf-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 31.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for collate_pdf-0.2.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        5e3701bb65fd22302455b57bbffd913d2d325e25dbb37b23208ba1c348c2bb94
MD5           cda5be7035dbfd87e7d7be78fdbb7653
BLAKE2b-256   e47a4d8aaee729125c747c54bf2729d81547fbd1d9c1b05c72ed7bd7be8b55d8

