Authoritative, versioned PDF facts contract for Think Neverland tools.
codexPDF
codexPDF is Think Neverland's authoritative, read-only PDF facts reference.
Other engines consult codexPDF for canonical document facts instead of
re-parsing PDFs independently. The contract is versioned and schema-validated.
Status
Current baseline includes:
- Python package (`codex_pdf`) with typed models
- CLI (`codex-pdf extract|schema|validate|probe|parity`)
- Versioned schemas in `schemas/v1/`
- Golden output harness under `tests/golden/`
Quickstart
uv sync
uv run codex-pdf probe input.pdf --json
uv run codex-pdf extract input.pdf --pretty > out.json
uv run codex-pdf validate out.json
uv run codex-pdf parity --fixtures-root tests/fixtures --profile summary --max-files 5
uv run codex-pdf parity --fixtures-root tests/fixtures --profile inventory --max-files 5
uv run codex-pdf parity --fixtures-root tests/fixtures --profile deep --max-files 5
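The three parity runs above differ only in the profile name, so a script can generate them in a loop. The following is a minimal sketch, not part of codexPDF: `parity_commands` is a hypothetical helper that only builds argument lists, using the flags and profile names shown in the Quickstart.

```python
import shlex

# Profile names as they appear in the Quickstart commands.
PROFILES = ("summary", "inventory", "deep")


def parity_commands(fixtures_root, max_files=5, profiles=PROFILES):
    """Build one argv list per parity profile (hypothetical helper)."""
    return [
        [
            "uv", "run", "codex-pdf", "parity",
            "--fixtures-root", str(fixtures_root),
            "--profile", profile,
            "--max-files", str(max_files),
        ]
        for profile in profiles
    ]


if __name__ == "__main__":
    # Print the commands; nothing is executed here.
    for argv in parity_commands("tests/fixtures"):
        print(shlex.join(argv))
```

To run the commands rather than print them, each list can be passed to `subprocess.run(argv, check=True)`.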
Optional external baseline comparison (consumer-specific adapter provided at runtime):
uv run codex-pdf parity \
--fixtures-root /path/to/pdfs \
--profile summary \
--baseline-command "<your_command_with_{pdf}_placeholder>"
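The source does not say how codexPDF substitutes the `{pdf}` placeholder internally. As an illustration only, a consumer-side adapter could expand such a template as sketched below; `expand_baseline_command` and the `mytool` command are hypothetical names.

```python
import shlex


def expand_baseline_command(template, pdf):
    """Turn a command template containing {pdf} into an argv list.

    Assumption: the template is split into tokens first and the
    placeholder is substituted afterwards, so a path containing
    spaces stays a single argument.
    """
    if "{pdf}" not in template:
        raise ValueError("baseline command template must contain a {pdf} placeholder")
    return [token.replace("{pdf}", str(pdf)) for token in shlex.split(template)]


if __name__ == "__main__":
    # 'mytool' stands in for whatever baseline tool the consumer provides.
    print(expand_baseline_command("mytool --json {pdf}", "/path/to/pdfs/my file.pdf"))
```

Splitting before substituting avoids shell-quoting problems, because the path is never re-parsed as shell text.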
Contract
The public API is the JSON contract rooted at CodexDocument.
- Schema path: `schemas/v1/codex-document.schema.json`
- Runtime model: `codex_pdf.models.v1.CodexDocument`
- Stability policy: SemVer (`major` for breaking contract changes)
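Because breaking contract changes bump the major version, a consumer can guard against incompatible producers by comparing major versions. The sketch below is one way to do that under stated assumptions: both function names are hypothetical, and where the version string comes from (package metadata, a field in the output, or elsewhere) is not specified by the source.

```python
def parse_semver(version):
    """Return (major, minor, patch), ignoring pre-release and build metadata."""
    core = version.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def is_contract_compatible(producer_version, supported_major):
    """Under SemVer, only a change in the major version signals a breaking change."""
    return parse_semver(producer_version)[0] == supported_major


if __name__ == "__main__":
    # Example version strings, chosen for illustration.
    print(is_contract_compatible("1.4.0", supported_major=1))
    print(is_contract_compatible("2.0.0", supported_major=1))
```

A stricter consumer could also require a minimum minor version when it depends on fields added in a later backward-compatible release.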
Documentation
| Topic | Doc |
|---|---|
| Architecture and boundaries | docs/architecture.md |
| CLI commands and usage patterns | docs/cli.md |
| Contract and schema versioning | docs/contract.md |
| Parity profiles and baselines | docs/parity.md |
| Preflight ingest adapters | docs/preflight-ingest.md |
| Migration sequencing | docs/migration-plan.md |
| Legacy discovery audit | docs/discovery-audit.md |
| Backward compatibility requirements | docs/backward-compatibility.md |
| Cleanup stop-gates policy | docs/cleanup-stop-gates.md |
License
AGPL-3.0-or-later.