Judgment-first grounded extraction engine. Returns ACCEPT with evidence or STOP with proof. Nothing in between.
AJT Grounded Extract
STOP Is Not Failure
STOP is a judgment. STOP is an audit artifact. STOP is how this system succeeds when evidence is insufficient.
Most systems explain answers. This one explains why it stopped.
Status
v2.1.0 — Audit-ready | Constitution: Frozen | Attack Tests: 10/10 blocked
Installation
pip install ajt-grounded-extract
Zero dependencies. Pure Python stdlib.
Core Principle
Extract structured data only when it can be proven; otherwise stop—and prove that you stopped.
Philosophy: STOP-first
- This project does not aim to extract everything.
- Extraction occurs only when evidence is sufficient.
- When evidence is insufficient, the system stops and proves why.
- Evidence Integrity > Recall: Only extract values with verifiable document evidence
- Default: STOP: When evidence is insufficient, conflicting, or missing → stop extraction
- Negative Proof: Every STOP includes explicit reason + preserved artifacts
- No Fine-tuning: Rule-based + LLM extraction without training pipelines
- Local Execution: Runs entirely on local machine
What This Is NOT
This system is blocked-by-design, not secure-by-claim.
- ❌ Multi-domain rule engine
- ❌ Enterprise extraction with thresholds
- ❌ Training/fine-tuning pipeline
- ❌ High-recall extraction system
- ❌ "Secure" or "safe" (we demonstrate how attacks are blocked, not claim safety)
What we guarantee:
- ✅ Stoppability (DEFAULT: STOP)
- ✅ Traceability (decision_maker required)
- ✅ Audit trail (write-once logs)
Architecture
Document → Ingest → Extract    → Ground   → Judge → Archive
           ↓        ↓            ↓          ↓       ↓
           Hash     Candidates   Evidence   STOP?   Artifacts
Pipeline Stages
- Ingest: Load document, compute hash, build line index
- Extract: Find candidate values (rule-based or LLM)
- Ground: Map each value to exact document span (quote + offsets)
- Judge: STOP-first decision: ACCEPT | STOP | NEED_REVIEW
- Archive: Write-once artifacts with timestamps + integrity hashes
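The Ingest and Ground stages can be illustrated with a minimal sketch. It is not the code in engine/ingest.py or engine/ground.py: the names IngestedDocument, ingest, and ground are hypothetical, and the sketch assumes SHA-256 as the document hash, UTF-8 text, byte offsets with an exclusive end, and 1-based line numbers.

```python
import bisect
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IngestedDocument:
    """Hypothetical container for what the Ingest stage produces."""
    data: bytes        # raw document bytes
    sha256: str        # hex digest of the raw bytes (hash choice is an assumption)
    line_starts: list  # byte offset at which each line begins


def ingest(data: bytes) -> IngestedDocument:
    """Hash the document and record where every line starts."""
    line_starts = [0]
    for offset, byte in enumerate(data):
        if byte == 0x0A:  # newline byte
            line_starts.append(offset + 1)
    return IngestedDocument(data, hashlib.sha256(data).hexdigest(), line_starts)


def ground(doc: IngestedDocument, value: str) -> Optional[dict]:
    """Map a candidate value to its first exact byte span, or None if absent."""
    needle = value.encode("utf-8")
    start = doc.data.find(needle)
    if start == -1:
        return None  # no verbatim quote, so there is nothing to ground
    # bisect_right yields the 1-based number of the line containing `start`.
    line = bisect.bisect_right(doc.line_starts, start)
    return {"quote": value, "start": start, "end": start + len(needle), "line": line}


doc = ingest(b"Service Agreement\nEffective Date: 01/15/2025\n")
print(ground(doc, "01/15/2025"))  # span on line 2
print(ground(doc, "02/01/2025"))  # None: value is not in the document
```

A candidate that cannot be grounded returns None rather than a guessed span, which leaves the refusal to the Judge stage.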
Decision Taxonomy
- ACCEPT: Evidence found, confidence sufficient, integrity verified
- STOP: No candidates, conflict, low confidence, or integrity failure
- NEED_REVIEW: Edge cases requiring human judgment
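As a rough sketch of how this taxonomy could translate into a STOP-first decision function: the code below is illustrative only and is not taken from engine/judge.py. The Candidate class, the 0.8 threshold, the review_margin band that triggers NEED_REVIEW, and every reason string except no_candidates_found are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(str, Enum):
    ACCEPT = "ACCEPT"
    STOP = "STOP"
    NEED_REVIEW = "NEED_REVIEW"


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate record handed to the judge."""
    value: str
    confidence: float
    grounded: bool  # True when the value maps to an exact document span


def judge(candidates, min_confidence=0.8, review_margin=0.05):
    """Return (decision, stop_reason). Only the final branch accepts."""
    if not candidates:
        return Decision.STOP, "no_candidates_found"
    if len({c.value for c in candidates}) > 1:
        return Decision.STOP, "conflicting_values"  # assumed reason string
    best = max(candidates, key=lambda c: c.confidence)
    if not best.grounded:
        return Decision.STOP, "integrity_failure"  # assumed reason string
    if best.confidence < min_confidence - review_margin:
        return Decision.STOP, "low_confidence"  # assumed reason string
    if best.confidence < min_confidence:
        # Assumed edge case: just under the threshold goes to a human.
        return Decision.NEED_REVIEW, None
    return Decision.ACCEPT, None


print(judge([]))
print(judge([Candidate("01/15/2025", 0.9, True)]))
print(judge([Candidate("01/15/2025", 0.9, True), Candidate("02/01/2025", 0.9, True)]))
```

Ordering the checks so that ACCEPT is reachable only after every refusal condition has been ruled out is one way to make STOP the default rather than the exception.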
Quick Start
Run Extraction
# ACCEPT case (has clear "Effective Date: 01/15/2025")
python run.py examples/accept_example.txt
# STOP case (no explicit effective date)
python run.py examples/stop_example.txt
View Results
Open the generated HTML viewer (open is the macOS command; use xdg-open on Linux):
open viewer/accept_example_viewer.html
open viewer/stop_example_viewer.html
Output Format
JSON Result
{
  "field_name": "effective_date",
  "decision": "ACCEPT",
  "value": "01/15/2025",
  "evidence": {
    "quote": "01/15/2025",
    "start": 245,
    "end": 255,
    "line": 12,
    "context": "...Effective Date: 01/15/2025..."
  },
  "confidence": 0.9
}
STOP Event
{
  "field_name": "effective_date",
  "decision": "STOP",
  "value": null,
  "stop_reason": "no_candidates_found",
  "stop_proof": {
    "searched": true,
    "candidates_found": 0
  }
}
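One way a STOP event could be preserved as a write-once artifact is sketched below. This is an assumption-laden illustration, not the behavior of engine/archive.py: the function name archive_event, the archived_at and payload_sha256 fields, and the one-JSON-file-per-event layout are all hypothetical (the project itself describes JSONL files with manifests).

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def archive_event(event: dict, directory: Path, name: str) -> Path:
    """Write one event as a JSON artifact that cannot be silently overwritten."""
    payload = json.dumps(event, sort_keys=True)
    record = {
        "archived_at": datetime.now(timezone.utc).isoformat(),
        "payload_sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
        "event": event,
    }
    path = directory / f"{name}.json"
    # Mode "x" is exclusive creation: open() raises FileExistsError when the
    # artifact already exists, which is what makes the write happen only once.
    with open(path, "x", encoding="utf-8") as handle:
        json.dump(record, handle, sort_keys=True)
    return path


stop_event = {
    "field_name": "effective_date",
    "decision": "STOP",
    "value": None,
    "stop_reason": "no_candidates_found",
    "stop_proof": {"searched": True, "candidates_found": 0},
}

with tempfile.TemporaryDirectory() as tmp:
    written = archive_event(stop_event, Path(tmp), "effective_date")
    try:
        archive_event(stop_event, Path(tmp), "effective_date")
    except FileExistsError:
        print("second write refused:", written.name)
```

Hashing the canonical (sorted-key) JSON lets a later reader recompute the digest and detect edits to the stored event. Exclusive creation only guards against overwrites made through this function; it does not make the file immutable on disk.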
HTML Viewer Features
- Evidence Highlighting: Green (ACCEPT) / Red (STOP)
- Navigation Sidebar: Jump to extracted fields
- "Why Stopped" Panel: Explicit reasons with proof artifacts
- Offset Mapping: Click evidence span → see exact document location
Directory Structure
ajt-grounded-extract/
├── schema/ # Field definitions
├── engine/ # Core extraction modules
│ ├── ingest.py
│ ├── extract.py
│ ├── ground.py
│ ├── judge.py
│ └── archive.py
├── viewer/ # HTML viewer generator
├── evidence/ # Write-once artifacts (JSONL + manifests)
├── examples/ # Demo documents
└── run.py # CLI entry point
Evidence Requirements
All extractions must satisfy:
- ✅ require_exact_quote: Value must appear verbatim in document
- ✅ require_offset_mapping: Quote mapped to byte offsets
- ✅ stop_on_conflict: Multiple conflicting values → STOP
- ✅ min_confidence: Below threshold → STOP
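The first two requirements can be re-checked by an auditor against a stored result. The sketch below assumes the JSON result shape shown under Output Format, UTF-8 text, and byte offsets with an exclusive end; verify_evidence is a hypothetical helper, not part of the package's API.

```python
def verify_evidence(document: bytes, result: dict) -> list:
    """Return the names of the evidence requirements that the record violates."""
    violations = []
    value = result.get("value")
    evidence = result.get("evidence") or {}
    quote = evidence.get("quote")
    start, end = evidence.get("start"), evidence.get("end")

    # require_exact_quote: the extracted value must appear verbatim.
    if not isinstance(value, str) or value.encode("utf-8") not in document:
        violations.append("require_exact_quote")

    # require_offset_mapping: the byte span must reproduce the quote exactly.
    if (
        not isinstance(quote, str)
        or not isinstance(start, int)
        or not isinstance(end, int)
        or document[start:end] != quote.encode("utf-8")
    ):
        violations.append("require_offset_mapping")
    return violations


document = b"Effective Date: 01/15/2025\n"
good = {
    "value": "01/15/2025",
    "evidence": {"quote": "01/15/2025", "start": 16, "end": 26},
}
shifted = {
    "value": "01/15/2025",
    "evidence": {"quote": "01/15/2025", "start": 15, "end": 25},
}
print(verify_evidence(document, good))     # no violations
print(verify_evidence(document, shifted))  # offsets no longer match the quote
```

Any non-empty return value would be grounds for STOP under a STOP-first policy, since the evidence can no longer be reproduced from the document.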
Acceptance Criteria
- Demo shows at least one ACCEPT and one STOP
- STOP includes explicit reason and preserved artifacts
- Viewer navigates evidence spans correctly
- Non-goals stated explicitly
Regulatory Mapping & Review
This system includes industry-specific regulatory risk mappings for:
- Financial Services — Authorization scope, customer isolation, advisory vs execution separation
- Healthcare — Patient data isolation, complete clinical evidence requirements, clinician traceability
- Legal Practice — Attorney responsibility, client-matter isolation, conflict-of-interest prevention
Navigation: See REGULATORY_REVIEW_GUIDE.md for audience-specific entry points.
Key documents:
- REGULATORY_META_MAP.md — Cross-industry risk-control mappings
- docs/REG_MAP_FINANCE.md — Financial services mapping
- docs/REG_MAP_HEALTHCARE.md — Healthcare mapping
- docs/REG_MAP_LEGAL.md — Legal practice mapping
- COMPLIANCE_GUIDE.md — Audit artifact generation
- ATTACK_TEST.md — Adversarial verification results
Principle: This project demonstrates how specified risks are blocked. It does not claim regulatory compliance.
Reference
Normative Specification
This implementation follows the AJT (Adjudicative Judgment Trace) constitutional framework:
- Spec Repository: ajt-spec — Normative rules and judgment structure
- Reference Implementation: This repository (ajt-grounded-extract) — Executable proof of concept
Relationship:
- ajt-spec: Constitutional rules (what must be proven)
- ajt-grounded-extract: Execution + case law (how it's proven in practice)
Motivation
Motivated by ajt-negative-proof-sim (sealed reference).
Core principle: Prove extraction succeeded OR prove why you stopped.
License
MIT