Knowledge-base readiness validation for RAG & AI agents — config-driven, deterministic pipeline integrity checks.

Maintainers

TestAutomationArchitect

Project description

COMPASS — Knowledge-Base Readiness Validation for RAG & AI Agents

COMPASS — COMprehensive Pipeline Analysis & Structure Search

A config-driven, deterministic framework that proves an AI/RAG knowledge base is correctly built, intact, and retrievable — before AI agents consume it. It's the integrity gate beneath answer-quality evaluation: not "are the answers good?" but "is the knowledge base itself sound?"

What COMPASS Does

COMPASS validates a document's journey through your ingestion pipeline and verifies the resulting knowledge base is agent-ready. The pipeline is declarative — you choose which steps run, in what order, in YAML — and ships with these built-in steps:

API Upload → Object Storage → Knowledge Graph → Vector Store → Retrieval → UI Visibility

Enable only the steps your setup has (no object store? no graph? drop them), reorder them, or register your own. For each enabled step, COMPASS runs deterministic checks, for example:

Ingestion — the document was accepted and assigned an id
Object storage — it persisted with the right content type, size, and integrity (S3, MinIO, R2, Azure Blob, GCS…)
Knowledge graph — required metadata predicates are present and well-formed (GraphDB, Neptune, Neo4j…)
Vector store — chunks exist with valid embeddings, correct dimensions/metric, sane ordering/overlap, no cross-document contamination (Redis, Qdrant, Pinecone, pgvector…) — 28 toggleable checks
Retrieval — the document's own content is actually retrievable (recall@K, MRR — informational)
UI visibility — it surfaces in the end-user application
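
As an illustration of what a deterministic vector-store check can look like, here is a minimal sketch. It is not COMPASS's implementation: the function names `check_embedding` and `check_chunk_order` are hypothetical, and the rules (exact dimension match, finite values, non-zero vector, contiguous chunk indices) are assumptions about what "valid embeddings" and "sane ordering" could mean.

```python
import math
from typing import Sequence


def check_embedding(vector: Sequence[float], expected_dim: int) -> list[str]:
    """Return the problems found in one embedding; an empty list means it passed."""
    problems = []
    if len(vector) != expected_dim:
        problems.append(f"dimension {len(vector)} != expected {expected_dim}")
    if any(not math.isfinite(x) for x in vector):
        problems.append("contains NaN or infinite values")
    elif all(x == 0.0 for x in vector):
        problems.append("all-zero vector")
    return problems


def check_chunk_order(indices: Sequence[int]) -> list[str]:
    """Assumed rule: one document's chunk indices form 0..n-1 with no gaps or duplicates."""
    if sorted(indices) != list(range(len(indices))):
        return ["chunk indices are not a contiguous 0..n-1 sequence"]
    return []
```

Checks of this kind need no model call, which is what makes them cheap enough to run on every document in CI.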

Two modes: live validation as documents are ingested, and post-hoc discovery of a corpus that's already loaded.

Scope note: COMPASS validates pipeline & data integrity. It does not judge answer quality — that's the evaluation layer's job. COMPASS is the gate that runs first.
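
The retrieval step's informational metrics (recall@K and MRR) are themselves deterministic once a ranked result list is available. The sketch below uses the standard definitions. The function names and input shapes are illustrative assumptions, not COMPASS's API.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of the relevant ids that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = relevant.intersection(ranked_ids[:k])
    return len(hits) / len(relevant)


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Iterable[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries: Sequence[tuple]) -> float:
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    ranks = [reciprocal_rank(ranked, relevant) for ranked, relevant in queries]
    return sum(ranks) / len(ranks) if ranks else 0.0
```

For a document's own content, the "relevant" set would typically be that document's chunk ids, so low recall points at an indexing problem rather than at answer quality.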

Key Features

✅ Declarative pipeline — choose/reorder/extend steps in YAML; no fixed shape
✅ Config-Driven — zero code changes per project; all customization in YAML
✅ Pluggable backends — storage / vector store / knowledge graph via a provider registry (one file to add a vendor)
✅ Pluggable steps — register custom validators in a step registry; the report adapts automatically
✅ Deterministic & cheap — structural/byte/schema checks, no LLM, CI-gateable
✅ Interactive reporting — self-contained HTML dashboard + JSON + JUnit XML
✅ Trend Tracking — SQLite historical performance analysis
✅ Notifications — Slack/Teams webhooks for alerts
✅ Parallel Execution — process many documents concurrently
✅ Graceful Degradation — non-critical failures don't crash the pipeline
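
To make the graceful-degradation idea concrete, the following sketch shows one way an orchestrator could keep going after a non-critical failure and halt after a critical one. It is a simplified illustration under assumptions: `Outcome` and `run_pipeline` are hypothetical names, and the real orchestrator's halting rules may differ.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Outcome:
    step: str
    status: str  # "passed", "failed", or "skipped"
    message: str = ""


def run_pipeline(steps: list[tuple[str, Callable[[dict], bool]]],
                 critical: set[str]) -> list[Outcome]:
    """Run steps in order; only a failed critical step stops the remaining ones."""
    context: dict = {}  # shared state passed from step to step
    outcomes: list[Outcome] = []
    halted = False
    for name, check in steps:
        if halted:
            outcomes.append(Outcome(name, "skipped", "halted by earlier critical failure"))
            continue
        try:
            outcome = Outcome(name, "passed" if check(context) else "failed")
        except Exception as exc:  # an exception in a step is recorded, not re-raised
            outcome = Outcome(name, "failed", str(exc))
        outcomes.append(outcome)
        if outcome.status == "failed" and name in critical:
            halted = True
    return outcomes
```

Recording skipped steps explicitly, instead of dropping them, keeps the report shape stable from run to run.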

Quick Start

Install the package (core deps only; add extras per backend — see below):

pip install compass-kb-validation
# backends/features on demand, e.g.:
pip install "compass-kb-validation[qdrant,pgvector]"

This installs two commands: compass-validate (live) and compass-discover (post-hoc). Create a project config (config/projects/<name>.yaml) and, for live validation, a test_data/<name>.yaml listing the documents to push. Then:

# Live ingestion — validate documents as they're uploaded through your API
compass-validate --project my-kb --output-format all

# Post-hoc discovery — validate a knowledge base that's already populated
compass-discover --project my-kb --include-retrieval

# Validate connectivity/config only (no backends hit)
compass-validate --project my-kb --dry-run

To install from source instead of PyPI: git clone … && cd compass && pip install -e ".[all]". Releases follow the process in PUBLISHING.md.

Reports are written to reports/ (HTML dashboard + JSON + JUnit XML). Add --live to open a real-time progress dashboard, --trends to record history, --parallel for concurrency.
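
JUnit XML is what lets most CI systems show per-step results. As a rough illustration of the format, the sketch below builds a minimal `testsuite` document with the standard library. The function name `to_junit_xml` and the tuple layout are assumptions, and the files COMPASS actually writes may carry more attributes.

```python
import xml.etree.ElementTree as ET


def to_junit_xml(suite_name: str, results: list[tuple[str, bool, str]]) -> str:
    """Serialize (step_name, passed, message) tuples as a minimal JUnit testsuite."""
    failures = sum(1 for _, passed, _ in results if not passed)
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(failures),
    )
    for step_name, passed, message in results:
        case = ET.SubElement(suite, "testcase", classname=suite_name, name=step_name)
        if not passed:
            # A <failure> child marks the test case as failed for CI parsers.
            failure = ET.SubElement(case, "failure", message=message)
            failure.text = message
    return ET.tostring(suite, encoding="unicode")
```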

Try it locally (no cloud needed)

A self-contained Docker stack runs the full pipeline against MinIO + Redis Stack + a mock ingestion API — see integration/README.md.

Project Structure

compass/
├── run_validation.py              # Entry point — live ingestion validation
├── discover_and_validate.py       # Entry point — post-hoc discovery
├── ingestion_validation/          # Main package
│   ├── models/                    # StepResult / PipelineResult (stdlib dataclasses)
│   ├── config/                    # Layered YAML config + typed Settings
│   ├── utils/                     # Run-id, logging, polling w/ backoff
│   ├── providers/                 # Pluggable backends via a registry:
│   │   ├── storage.py             #   object storage (S3-family…)
│   │   ├── vectorstore.py         #   vector store (Redis/RediSearch…)
│   │   ├── graph.py               #   knowledge graph (SPARQL, Neptune…)
│   │   └── source.py              #   document discovery (storage|index|manifest)
│   ├── validators/                # BaseValidator + step registry + the steps
│   │   └── registry.py            #   declarative pipeline (StepSpec registry)
│   ├── pipeline/                  # Orchestrator (context propagation, halting)
│   ├── report/                    # Self-contained HTML dashboard + JSON/JUnit
│   ├── corpus.py                  # Corpus-level KB readiness analysis
│   ├── live_dashboard.py          # FastAPI + SSE real-time dashboard
│   ├── notifications.py           # Slack / Teams webhooks
│   └── trend_tracker.py           # SQLite historical trends
├── config/{base.yaml, projects/_template.yaml}
├── test_data/_template.yaml       # Documents to validate (live mode)
├── integration/                   # Local Docker integration stack
├── tests/                         # Unit tests (backends faked — no live services)
└── reports/                       # Generated reports (gitignored)

Documentation

ARCHITECTURE.md — how COMPASS is built: layers, data flow, the registries, the two run modes, extension points
EXTENDING-BACKENDS.md — add a new backend (storage/vector/graph/source) or a new pipeline step
integration/README.md — run the full pipeline locally on Docker
CONTRIBUTING.md · SECURITY.md

Configuration

All behaviour lives in YAML; credentials are referenced by environment-variable name and resolved at runtime (never stored in config). config/base.yaml holds shared defaults; each config/projects/<name>.yaml overlays only what differs and selects a dev/staging/prod environment block.

# config/projects/my-kb.yaml
display_name: "My Knowledge Base"

# Declarative pipeline — choose which steps run, in what order (registry keys).
# Omit to use the full default pipeline.
pipeline:
  steps: [api_upload, s3_storage, redis_chunks, retrieval_quality]
  critical_steps: ["API Upload", "S3 Storage"]   # failure here halts the run

# Where post-hoc discovery finds documents: storage | vectorstore | manifest
discovery:
  provider: vectorstore           # enumerate the KB straight from the index

environments:
  dev:
    api:
      base_url_env: API_BASE_URL          # env-var NAME, not the value
      graphql_mutation: |
        mutation ($file: Upload!, $metadata: JSON) { uploadDocument(file:$file, metadata:$metadata){ status document_id } }
    s3:
      provider: s3                        # s3 | minio | r2 | azure_blob | gcs …
      bucket_name: my-documents
      prefix: knowledge-base/
    redis:
      provider: redis                     # redis | qdrant | pinecone | pgvector …
      search_index: my_embeddings
      expected_embedding_dim: 1536
      expected_distance_metric: COSINE
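
The layering and env-var indirection described above can be pictured with a small sketch. It assumes the YAML files are already parsed into dictionaries and that any key ending in `_env` names an environment variable, as `base_url_env` does in the example. `deep_merge` and `resolve_env` are hypothetical helpers, not COMPASS functions.

```python
import os
from typing import Mapping, Optional


def deep_merge(base: dict, overlay: dict) -> dict:
    """Return a new dict where overlay values win; nested dicts merge recursively."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_env(config: dict, environ: Optional[Mapping[str, str]] = None) -> dict:
    """Replace each `<name>_env` key with `<name>` holding that variable's value."""
    environ = os.environ if environ is None else environ
    resolved = {}
    for key, value in config.items():
        if isinstance(value, dict):
            resolved[key] = resolve_env(value, environ)
        elif key.endswith("_env") and isinstance(value, str):
            # The config stores only the variable NAME; the value is read at runtime.
            resolved[key[: -len("_env")]] = environ.get(value)
        else:
            resolved[key] = value
    return resolved
```

With this shape, a project file only needs to state the keys that differ from the shared defaults, and secrets never appear in the merged YAML.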

Advanced Usage

Add a custom pipeline step

Steps are declarative — register a validator and reference it in pipeline.steps:

from ingestion_validation.validators.base import BaseValidator
from ingestion_validation.validators.registry import register_step, StepSpec
from ingestion_validation.models import StepResult, StepStatus


class PiiRedactionValidator(BaseValidator):
    """Example custom step; replace the body of validate() with a real check."""

    step_name = "PII Redaction"

    def __init__(self, config):
        self.config = config

    def validate(self, context: dict) -> StepResult:
        # Placeholder that always passes; a real step would inspect `context`.
        return StepResult(self.step_name, StepStatus.PASSED, "no PII leaked")


# Register the step under a key that pipeline.steps can reference.
register_step(StepSpec(
    key="pii_redaction",
    name="PII Redaction",
    factory=lambda settings, shared: PiiRedactionValidator(settings.s3),
    abbr="PII",
    order=25,
))
# then:  pipeline.steps: [api_upload, pii_redaction, redis_chunks]

The orchestrator, live dashboard, and HTML report adapt automatically. See EXTENDING-BACKENDS.md for adding backends too.

CI/CD integration — the readiness gate

COMPASS is built to run before your agents or eval consume the knowledge base: wire it into the pipeline that publishes the KB and fail the build when the KB is not retrieval-ready. Every entry point returns a CI-friendly exit code — 0 = gate passed, 1 = gate failed, 2 = config error — and emits JUnit XML for any CI system.

Gate strictness (discovery mode) is yours to choose:

# Strictest: any failed document fails the build (default)
compass-discover --project my-kb

# Tolerate a few failures: require >= 95% of documents to pass
compass-discover --project my-kb --fail-under 95

# Also require the corpus to be READY (completeness/coverage/dedup verdict)
compass-discover --project my-kb --require-ready
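
The three strictness levels above reduce to a small decision rule. The sketch below is one plausible reading of it, not the CLI's actual logic: `gate_exit_code` is a hypothetical function, and treating an empty corpus as a failure is an assumption.

```python
from typing import Optional


def gate_exit_code(passed: int, total: int,
                   fail_under: Optional[float] = None,
                   corpus_ready: bool = True,
                   require_ready: bool = False) -> int:
    """Return 0 when the gate passes and 1 when it fails."""
    if total == 0:
        return 1  # assumption: nothing to validate counts as a failed gate
    if fail_under is None:
        docs_ok = passed == total  # strictest: every document must pass
    else:
        docs_ok = (100.0 * passed / total) >= fail_under
    ready_ok = corpus_ready or not require_ready
    return 0 if docs_ok and ready_ok else 1
```

Exit code 2 (config error) is left out here because it would be raised before any documents are counted.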

GitHub Actions — drop the reusable action into your KB pipeline:

# .github/workflows/kb-readiness.yml
jobs:
  kb-readiness:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4              # your repo holds config/projects/<name>.yaml
      - uses: TestAutomationArchitect/compass@v1
        with:
          project: my-kb
          require-ready: "true"                # block promotion unless READY
          # fail-under: "95"                   # ...or tolerate a bounded failure rate
        env:
          REDIS_HOST: ${{ secrets.REDIS_HOST }}   # config references env-var names; supply them here

A full example you can copy is in examples/kb-readiness.yml.

Container — no Python setup needed; works in any CI or a self-hosted runner:

docker build -t compass:latest .
docker run --rm -v "$PWD:/work" --env-file .env \
  compass:latest compass-discover --project my-kb --require-ready

Applicable Domains

Enterprise document management · legal/compliance · healthcare · financial research · customer-support KBs · internal wikis · e-commerce catalogs · academic repositories — any RAG/agent knowledge base built from a document pipeline.

License

MIT License — see LICENSE.

Release history

1.7.0

Jun 4, 2026

1.6.1

Jun 3, 2026

This version

1.2.0

Jun 3, 2026

1.1.0

Jun 2, 2026

1.0.0

Jun 2, 2026

Download files

Source Distribution

compass_kb_validation-1.2.0.tar.gz (122.4 kB view details)

Uploaded Jun 3, 2026 Source

Built Distribution

compass_kb_validation-1.2.0-py3-none-any.whl (115.2 kB view details)

Uploaded Jun 3, 2026 Python 3

File details

Details for the file compass_kb_validation-1.2.0.tar.gz.

File metadata

Download URL: compass_kb_validation-1.2.0.tar.gz
Upload date: Jun 3, 2026
Size: 122.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for compass_kb_validation-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`257ddbcdb029535c8bd43121bdf164900c94e1d3f4a9bab911cafd8692c2846b`
MD5	`1ce519d949998deca32dcdb2013b925b`
BLAKE2b-256	`879260cdf85054848452978b95d0e8f43fb53b2eb0acc47a7a6e15bf1eba581a`

Provenance

The following attestation bundles were made for compass_kb_validation-1.2.0.tar.gz:

Publisher: publish.yml on TestAutomationArchitect/compass

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: compass_kb_validation-1.2.0.tar.gz
- Subject digest: 257ddbcdb029535c8bd43121bdf164900c94e1d3f4a9bab911cafd8692c2846b
- Sigstore transparency entry: 1706588875
- Sigstore integration time: Jun 3, 2026
Source repository:
- Permalink: TestAutomationArchitect/compass@73418f00c41404a0c3385a7b2f9deab27880f007
- Branch / Tag: refs/tags/v1.2.0
- Owner: https://github.com/TestAutomationArchitect
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@73418f00c41404a0c3385a7b2f9deab27880f007
- Trigger Event: release

File details

Details for the file compass_kb_validation-1.2.0-py3-none-any.whl.

File metadata

Download URL: compass_kb_validation-1.2.0-py3-none-any.whl
Upload date: Jun 3, 2026
Size: 115.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for compass_kb_validation-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ed2b36eb324f5c6514a8f490b2434b71f1dccd4c9f502ff73c5263ae18b695ef`
MD5	`273b89740f09e8826202da9790774609`
BLAKE2b-256	`def52d47c595385e52c8c07cd93c16a9b114925af7aa0fa25e76501889479b4a`

Provenance

The following attestation bundles were made for compass_kb_validation-1.2.0-py3-none-any.whl:

Publisher: publish.yml on TestAutomationArchitect/compass

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: compass_kb_validation-1.2.0-py3-none-any.whl
- Subject digest: ed2b36eb324f5c6514a8f490b2434b71f1dccd4c9f502ff73c5263ae18b695ef
- Sigstore transparency entry: 1706589004
- Sigstore integration time: Jun 3, 2026
Source repository:
- Permalink: TestAutomationArchitect/compass@73418f00c41404a0c3385a7b2f9deab27880f007
- Branch / Tag: refs/tags/v1.2.0
- Owner: https://github.com/TestAutomationArchitect
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@73418f00c41404a0c3385a7b2f9deab27880f007
- Trigger Event: release

compass-kb-validation 1.2.0
