Git pre-commit hook + web dashboard for detecting LLM-induced document corruption
Document Integrity Layer
Detect LLM-induced document corruption before it ships.
What is this?
Document Integrity Layer is a Git pre-commit hook and web dashboard that catches document corruption introduced by AI assistants in real time. It scans Word, PDF, and Markdown files for hallucinated citations, broken cross-references, malformed tables, and formatting inconsistencies, then generates audit trails that record what changed and why. It is built for developers and technical writers who delegate writing to Claude, ChatGPT, or similar tools and need to verify that the AI did not silently break their work.
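To make "broken cross-references" concrete, here is a minimal sketch of such a check for Markdown. It is an illustration only, not the project's implementation: the helper names (`slugify`, `broken_anchor_links`) are hypothetical, the slug rule only approximates GitHub-style anchors, and headings inside code blocks are not excluded.

```python
import re

# Hypothetical helpers for illustration; not part of the dil package.
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.*?)[ \t]*$", re.MULTILINE)
LINK_RE = re.compile(r"\[[^\]]*\]\(#([^)\s]+)\)")


def slugify(heading: str) -> str:
    """Approximate a GitHub-style anchor: lowercase, drop punctuation, hyphenate spaces."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)
    return re.sub(r"\s+", "-", slug)


def broken_anchor_links(markdown: str) -> list[str]:
    """Return in-document link targets that match no heading anchor."""
    anchors = {slugify(h) for h in HEADING_RE.findall(markdown)}
    return [t for t in LINK_RE.findall(markdown) if t.lower() not in anchors]


doc = "# Quick Start\n\nSee [setup](#quick-start) and [usage](#usage-guide).\n"
print(broken_anchor_links(doc))
```

Here `#quick-start` resolves to the heading while `#usage-guide` does not, so only the latter is reported.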
Features
- Git pre-commit scanning — Automatically checks staged documents before commits
- Multi-format support — Analyzes .docx, .pdf, and .md files with semantic understanding
- Corruption detection — Identifies hallucinated URLs, broken internal links, table structure corruption, and citation inconsistencies
- Web dashboard — Visual history of all integrity checks across your repository
- Audit trails — Export compliance-ready reports showing what changed and when
- Slack/Discord alerts — Real-time notifications when corruption is detected
- Custom rule configuration — Define project-specific validation rules in .dil.toml
- Docker-ready — Run locally or containerized; no external dependencies required
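One way a table structure check could work for Markdown pipe tables is to compare each row's cell count with the header row's. The sketch below assumes that approach; the function names are hypothetical, and it handles neither escaped pipes nor rows written without a leading pipe.

```python
def split_row(line: str) -> list[str]:
    """Split a pipe-table row into cells, dropping one outer pipe on each side."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    return [cell.strip() for cell in text.split("|")]


def inconsistent_table_rows(markdown: str) -> list[int]:
    """Return 1-based line numbers of rows whose cell count differs from the header's."""
    problems = []
    expected = None  # header cell count of the current table; None outside a table
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if line.lstrip().startswith("|"):
            cells = len(split_row(line))
            if expected is None:
                expected = cells  # first row of a table sets the expected width
            elif cells != expected:
                problems.append(lineno)
        else:
            expected = None  # any non-table line ends the current table
    return problems


table = "| Algorithm | Hash |\n|---|---|\n| SHA256 | abc |\n| MD5 |\n"
print(inconsistent_table_rows(table))
```

The last row has one cell where the header has two, so line 4 is flagged.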
Quick Start
Installation
pip install document-integrity-layer
Setup
Initialize in your repository:
dil init
This creates .dil.toml with default configuration. Install the Git hook:
dil install-hook
Configuration
Edit .dil.toml to customize detection rules:
[scanner]
check_citations = true
check_cross_references = true
check_table_integrity = true
check_formatting = true
[alerts]
slack_webhook = "https://hooks.slack.com/services/YOUR/WEBHOOK"
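For scripts that need to read the same file, the configuration can be parsed with a standard TOML reader. The sketch below is an assumption about how a consumer might do this, not how `dil` itself loads its settings. `enabled_checks` is a hypothetical helper, and on Python versions before 3.11 it assumes the third-party `tomli` package is installed.

```python
try:
    import tomllib  # standard library from Python 3.11
except ImportError:  # older interpreters: assumes the third-party tomli package
    import tomli as tomllib


def enabled_checks(toml_text: str) -> list[str]:
    """Return the names of [scanner] options that are set to true, sorted."""
    config = tomllib.loads(toml_text)
    scanner = config.get("scanner", {})
    return sorted(name for name, value in scanner.items() if value is True)


example = """
[scanner]
check_citations = true
check_cross_references = true
check_table_integrity = true
check_formatting = false
"""
print(enabled_checks(example))
```

To read the file from disk instead, open `.dil.toml` in binary mode and pass the handle to `tomllib.load`.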
Usage
CLI
Run a one-time scan:
dil scan document.docx
Scan an entire directory:
dil scan ./docs --recursive
View integrity history:
dil history
Web Dashboard
Start the dashboard server:
dil server --port 8000
Navigate to http://localhost:8000 to explore:
- Scan history across commits
- Corruption reports with side-by-side diffs
- Citation and link validation results
- Custom audit exports
Pre-commit Hook
Once installed, the hook runs automatically:
git add my-document.docx
git commit -m "Update docs"
# → Pre-commit hook scans my-document.docx
# → Reports corruption if found
# → Blocks commit if severity threshold exceeded
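The first step of a hook like this is finding which staged files are documents. The sketch below shows one plausible way to do that with `git diff --cached`. The helper names and the suffix set are illustrative assumptions based on the formats listed above, not the hook's actual code.

```python
import subprocess
from pathlib import PurePosixPath

# Assumed from the formats this README lists; not taken from the dil source.
DOCUMENT_SUFFIXES = {".docx", ".pdf", ".md"}


def filter_documents(paths: list[str]) -> list[str]:
    """Keep only paths whose extension marks them as a scannable document."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in DOCUMENT_SUFFIXES]


def staged_documents() -> list[str]:
    """List staged (added, copied, or modified) documents in the current repository."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "-z", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    # -z separates paths with NUL bytes, so unusual file names are not quoted.
    return filter_documents(result.stdout.split("\0"))


print(filter_documents(["docs/a.md", "src/main.py", "Report.DOCX"]))
```

A hook script would then scan each returned path and exit with a non-zero status to block the commit.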
Tech Stack
- Python 3.9+ — Core language
- Flask — Web dashboard and API
- python-docx — DOCX parsing and analysis
- PyPDF2 — PDF text extraction
- Markdown — Native MD support
- SQLite — Audit trail storage
- Docker — Containerized deployment
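As a rough picture of what SQLite-backed audit storage can look like, the sketch below records findings in a single table. The schema, column names, and functions are hypothetical and chosen for illustration; the project's real database layout is not documented here.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema; the real dil database layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scan_results (
    id INTEGER PRIMARY KEY,
    scanned_at TEXT NOT NULL,
    commit_sha TEXT,
    path TEXT NOT NULL,
    rule TEXT NOT NULL,
    severity TEXT NOT NULL,
    message TEXT NOT NULL
)
"""


def open_audit_db(location: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the audit database and make sure the table exists."""
    conn = sqlite3.connect(location)
    conn.execute(SCHEMA)
    return conn


def record_finding(conn, path, rule, severity, message, commit_sha=None):
    """Append one finding with a UTC timestamp; the with-block commits the insert."""
    with conn:
        conn.execute(
            "INSERT INTO scan_results "
            "(scanned_at, commit_sha, path, rule, severity, message) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), commit_sha, path, rule, severity, message),
        )


def findings_for(conn, path):
    """Return (rule, severity, message) tuples for one file, oldest first."""
    rows = conn.execute(
        "SELECT rule, severity, message FROM scan_results WHERE path = ? ORDER BY id",
        (path,),
    )
    return rows.fetchall()


conn = open_audit_db()
record_finding(conn, "docs/a.md", "check_cross_references", "error", "No heading for #usage-guide")
print(findings_for(conn, "docs/a.md"))
```

An append-only table like this keeps every scan result, which is what makes exporting a history per file or per commit straightforward.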
License
MIT
File details
Details for the file document_integrity_layer-0.1.0.tar.gz.
File metadata
- Download URL: document_integrity_layer-0.1.0.tar.gz
- Upload date:
- Size: 14.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3caab38c50956ef5722a49250689a9891e1d2219c80b517f8087bb6a5bad89e9 |
| MD5 | ccc87118bbc5d8b5575574d249604e22 |
| BLAKE2b-256 | 38c905a982cb9536b06186cdab20c995ff405cae01887e6164869eae722aeec9 |
File details
Details for the file document_integrity_layer-0.1.0-py3-none-any.whl.
File metadata
- Download URL: document_integrity_layer-0.1.0-py3-none-any.whl
- Upload date:
- Size: 17.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | af19aa71e6b506277b75ec590f2d25e5c3d307600507335974d70cd8df62cef1 |
| MD5 | 1ea215d563321de0e8e64ed93c6a7221 |
| BLAKE2b-256 | 759e7c16a17412e0956355f13633b1d5b80bc9b0e3ae7d90cb7bdaf5a2b68e99 |