
Schema validation pipeline for LLM-generated structured Markdown. CLI · Python API · REST API.


AI Knowledge Filler

Validation pipeline for LLM-generated structured Markdown

Badges: Tests · Lint · Validate · PyPI · Python 3.10+ · Coverage · License: MIT


The Problem

LLMs generate text. You need structured, schema-compliant files.

Without a validation layer, AI-generated Markdown produces:

Error              Raw LLM output         What you need
Enum violation     level: expert          beginner | intermediate | advanced
Domain violation   domain: Technology     domain: system-design
Type mismatch      tags: security         tags: [security, api, auth]
Date format        created: 12-02-2026    created: 2026-02-12

One file? Fixable manually. A hundred files? The schema collapses.

AKF enforces the contract at generation time, not review time.


How It Works

Prompt
  → LLM                  (only non-deterministic component)
  → Validation Engine    (binary: VALID or INVALID + typed E-codes)
  → Error Normalizer     (deterministic repair instructions from E-codes)
  → Retry Controller     (max 3 attempts — aborts on identical failure hash)
  → Commit Gate          (atomic write — only VALID output reaches disk)

No silent failures. No partial commits. No guessing.

Retry = ontology signal. When a domain triggers elevated retries, the taxonomy has a boundary problem — not the model. Telemetry captures this.
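
The same control flow, sketched in Python. This is illustrative only: generate_fn, validate_fn and normalize_fn are hypothetical stand-ins for the LLM call, the validation engine and the error normalizer, not AKF internals.

import hashlib
from pathlib import Path
from typing import Callable

def generate_with_retries(
    prompt: str,
    generate_fn: Callable[[str], str],         # stand-in for the LLM call
    validate_fn: Callable[[str], list[str]],   # stand-in: returns E-codes, empty list = VALID
    normalize_fn: Callable[[list[str]], str],  # stand-in: E-codes -> repair instructions
    out_path: Path,
    max_attempts: int = 3,
) -> bool:
    last_hash, hint = None, ""
    for _ in range(max_attempts):
        draft = generate_fn(prompt + hint)              # only non-deterministic step
        errors = validate_fn(draft)
        if not errors:
            out_path.write_text(draft)                  # commit gate: only VALID output reaches disk
            return True
        failure_hash = hashlib.sha256("|".join(errors).encode()).hexdigest()
        if failure_hash == last_hash:
            return False                                # identical failure twice: abort early
        last_hash = failure_hash
        hint = "\n" + normalize_fn(errors)              # deterministic repair instructions
    return False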


Quick Start

pip install ai-knowledge-filler

export GROQ_API_KEY="gsk_..."   # free tier, fastest

# Generate new file
akf generate "Create a Docker networking guide"
# → Docker_Networking_Guide.md (validated, schema-compliant)

# Enrich existing files — add YAML to files that have none
akf enrich docs/

# Validate an entire directory
akf validate --path docs/

AKF Documents Itself

This repo uses AKF to validate its own documentation on every PR.

Setup:

# 1. Define your taxonomy
cat akf.yaml
schema_version: "1.0.0"
vault_path: "./docs"
taxonomy:
  domains:
    - akf-core
    - akf-docs
    - akf-ops
    - akf-spec
# 2. Enrich existing docs — AKF adds frontmatter via LLM
akf enrich docs/ --model groq

# 3. Validate
akf validate --path docs/
# ✅ docs/cli-reference.md
# ✅ docs/user-guide.md
# → Total: 2 | OK: 2 | Errors: 0

CI gate (.github/workflows/validate.yml):

- name: Validate docs/
  run: akf validate --path docs/

Every PR that introduces invalid metadata fails the check. The Validate badge above is AKF validating AKF's own docs.


akf enrich

Add YAML frontmatter to existing Markdown files — bulk or single.

akf enrich docs/                    # enrich all .md files
akf enrich docs/ --dry-run          # preview only, no writes
akf enrich docs/ --force            # overwrite valid frontmatter
akf enrich docs/ --output enriched/ # copy to output dir

File state               Default                        --force
No frontmatter           Generate + validate + write    Same
Incomplete frontmatter   Fill missing fields only       Regenerate all
Valid frontmatter        Skip                           Regenerate all
Empty file               Skip with warning              Skip

Enrich runs through the same validation pipeline as generate — retry loop, commit gate, telemetry.
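
For bulk runs the Python API (documented below) exposes the same behaviour. A small sketch that tallies outcomes per file, assuming each item returned by enrich_dir carries the same status field as a single enrich call:

from collections import Counter
from akf import Pipeline

pipeline = Pipeline(output="./vault/", model="groq")

# Tally outcomes per file; assumes enrich_dir() items expose the same
# .status field ("enriched" | "skipped" | "failed") as enrich().
results = pipeline.enrich_dir("docs/")
print(Counter(r.status for r in results))
# e.g. Counter({'enriched': 5, 'skipped': 2})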


Python API

from akf import Pipeline

pipeline = Pipeline(output="./vault/", model="groq")

# Generate new file
result = pipeline.generate("Create API rate limiting guide")
print(result.success)        # True
print(result.path)           # PosixPath('vault/API_Rate_Limiting_Guide.md')
print(result.attempts)       # 1 (retried if schema violation)

# Enrich existing file
result = pipeline.enrich("docs/old-note.md")
print(result.status)         # "enriched" | "skipped" | "failed"

# Enrich directory
results = pipeline.enrich_dir("docs/")

# Batch generate
results = pipeline.batch_generate([
    "Docker deployment best practices",
    "Kubernetes security hardening",
    "API authentication strategies",
])

# Validate
v = pipeline.validate("vault/my_file.md")
print(v.valid, v.errors)
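
One way to act on batch results; this sketch assumes batch_generate returns results in prompt order, each with the same fields as a single generate() result:

from akf import Pipeline

pipeline = Pipeline(output="./vault/", model="groq")
prompts = [
    "Docker deployment best practices",
    "Kubernetes security hardening",
]
results = pipeline.batch_generate(prompts)

# Assumes results come back in prompt order and expose the same fields
# (.success, .path, .attempts) shown above for a single generate() call.
for prompt, result in zip(prompts, results):
    if result.success:
        print(f"OK      {result.path} (attempts: {result.attempts})")
    else:
        print(f"FAILED  {prompt!r}")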

REST API

akf serve --port 8000

curl -X POST http://localhost:8000/v1/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Create Docker security checklist", "model": "groq"}'

curl -X POST http://localhost:8000/v1/batch \
  -H "Content-Type: application/json" \
  -d '{"prompts": ["Docker guide", "Kubernetes guide"]}'

curl -X POST http://localhost:8000/v1/validate \
  -H "Content-Type: application/json" \
  -d '{"content": "---\ntitle: Test\n..."}'

Endpoints: POST /v1/generate · POST /v1/enrich · POST /v1/validate · POST /v1/batch · GET /v1/models · GET /health

Swagger UI: http://localhost:8000/docs
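
The same endpoints are easy to drive from Python. A minimal client sketch using requests, with payload fields taken from the curl examples above; the response shape depends on the running AKF version:

import requests

resp = requests.post(
    "http://localhost:8000/v1/generate",
    json={"prompt": "Create Docker security checklist", "model": "groq"},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())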


What Every Committed File Guarantees

  • Required fields: title, type, domain, level, status, tags, created, updated
  • Valid enums: type, level, status from controlled sets
  • Domain from configured taxonomy (akf.yaml) — not hardcoded
  • ISO 8601 dates with created ≤ updated
  • tags as array (≥3), title as string
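
To make the contract concrete, here is a spot check that mirrors these guarantees. It is illustrative only, not AKF's validator, and it skips the enum checks for brevity:

from datetime import date

REQUIRED = {"title", "type", "domain", "level", "status", "tags", "created", "updated"}

def spot_check(frontmatter: dict, domains: set[str]) -> list[str]:
    """Illustrative re-statement of the guarantees above, not AKF's validator."""
    problems = [f"missing field: {f}" for f in REQUIRED - frontmatter.keys()]
    if not isinstance(frontmatter.get("title"), str):
        problems.append("title must be a string")
    tags = frontmatter.get("tags")
    if not isinstance(tags, list) or len(tags) < 3:
        problems.append("tags must be an array with at least 3 entries")
    if frontmatter.get("domain") not in domains:
        problems.append("domain not in the configured taxonomy")
    try:
        created = date.fromisoformat(str(frontmatter.get("created")))
        updated = date.fromisoformat(str(frontmatter.get("updated")))
        if created > updated:
            problems.append("created must be <= updated")
    except ValueError:
        problems.append("created/updated must be ISO 8601 dates")
    return problems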

Error Codes

Code   Field                   Meaning
E001   type / level / status   Invalid enum value
E002   any                     Required field missing
E003   created / updated       Date not ISO 8601
E004   title / tags            Type mismatch
E005   frontmatter             General schema violation
E006   domain                  Not in taxonomy
E007   created / updated       created > updated
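
A sketch of what the error-normalizer step can look like for these codes; the repair strings are examples, not AKF's actual normalizer output:

REPAIR_HINTS = {
    "E001": "Use only the allowed enum values for type, level and status.",
    "E002": "Add every required frontmatter field.",
    "E003": "Write created/updated as ISO 8601 dates (YYYY-MM-DD).",
    "E004": "title must be a string; tags must be a YAML array.",
    "E005": "Emit a single valid YAML frontmatter block.",
    "E006": "Pick a domain from the taxonomy in akf.yaml.",
    "E007": "created must not be later than updated.",
}

def normalize(errors: list[str]) -> str:
    # Deterministic: same E-codes always produce the same repair instructions.
    return "\n".join(REPAIR_HINTS[code] for code in errors if code in REPAIR_HINTS)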

Configuration

# akf.yaml
schema_version: "1.0.0"
vault_path: "./vault"

taxonomy:
  domains:
    - ai-system
    - api-design
    - devops
    - security
    - system-design
    # add your own

enums:
  type: [concept, guide, reference, checklist, project, roadmap, template, audit]
  level: [beginner, intermediate, advanced]
  status: [draft, active, completed, archived]

akf init          # creates akf.yaml in current directory
akf init --force  # overwrite existing
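
Because the taxonomy lives in plain YAML, other tooling can read it too. A sketch using PyYAML (an assumption about tooling on your side; it says nothing about how AKF parses the file internally):

import yaml  # PyYAML

with open("akf.yaml", encoding="utf-8") as fh:
    config = yaml.safe_load(fh)

domains = set(config["taxonomy"]["domains"])
print("system-design" in domains)  # True with the example config above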

CLI Reference

# Generate
akf generate "prompt" [--model groq|claude|gemini|gpt4|ollama] [--output PATH]

# Enrich
akf enrich PATH [--dry-run] [--force] [--model MODEL] [--output DIR]

# Validate
akf validate [--file FILE] [--path PATH] [--strict]

# Server
akf serve [--host HOST] [--port PORT]

# Models / Init
akf models
akf init [--path DIR] [--force]

Model Selection

Model    Key                 Speed     Cost        Notes
Groq     GROQ_API_KEY        Fastest   Free tier   Recommended for CI, high volume
Claude   ANTHROPIC_API_KEY   Medium    $$$         Technical docs, architecture
Gemini   GOOGLE_API_KEY      Fast      $           Quick drafts
GPT-4    OPENAI_API_KEY      Medium    $$          General purpose
Grok     XAI_API_KEY         Fast      $$          General purpose
Ollama   (no key)            Fast      Free        Local / offline / private

Auto-selection order: Groq → Grok → Claude → Gemini → GPT-4 → Ollama.
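
A sketch of that fallback order as plain environment-variable checks; the env var names come from the table above, while the model identifiers and the selection logic itself are illustrative:

import os

ORDER = [
    ("groq", "GROQ_API_KEY"),
    ("grok", "XAI_API_KEY"),
    ("claude", "ANTHROPIC_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
    ("gpt4", "OPENAI_API_KEY"),
]

def pick_model() -> str:
    # First provider with a configured key wins; Ollama needs no key.
    for name, env_var in ORDER:
        if os.environ.get(env_var):
            return name
    return "ollama"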


Telemetry

Each generation appends a structured event to telemetry/events.jsonl:

{
  "generation_id": "uuid-v4",
  "document_id": "abc123",
  "schema_version": "1.0.0",
  "attempt": 1,
  "converged": true,
  "timestamp": "2026-02-27T14:22:01Z",
  "model": "groq",
  "temperature": 0
}

Append-only. Never influences the pipeline at runtime.
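
Because the log is plain JSONL, the retry analysis described above can be done offline in a few lines; field names are taken from the example event, and the aggregation itself is just a sketch:

import json
from collections import Counter

# Count how often generations needed 1, 2 or 3 attempts.
# A fat tail of retries points at taxonomy friction, not at the model.
attempts = Counter()
with open("telemetry/events.jsonl", encoding="utf-8") as fh:
    for line in fh:
        event = json.loads(line)
        attempts[event["attempt"]] += 1

print(attempts)  # e.g. Counter({1: 37, 2: 4, 3: 1})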


Security

export AKF_API_KEY="your-secret"          # optional — unset = dev mode
export AKF_CORS_ORIGINS="https://app.com"

Rate limits: POST /v1/generate 10/min · POST /v1/validate 30/min · POST /v1/batch 3/min


Quality

  • 542 tests, 93.74% coverage
  • CI green on Python 3.10 / 3.11 / 3.12
  • Type hints: 100%
  • Pylint: 9.55/10

Roadmap

Shipped

  • akf generate, akf enrich, akf validate, akf serve, akf init
  • Validation pipeline — E001–E007, retry loop, commit gate
  • Telemetry — append-only JSONL, ontology friction metrics
  • Config layer — external akf.yaml, no code changes for taxonomy
  • Pipeline API — from akf import Pipeline
  • REST API — FastAPI, rate limiting, optional auth
  • Self-documentation — AKF validates its own docs/ on every PR

Planned

  • akf generate --batch topics.txt
  • Graph extraction layer
  • n8n / Make integration templates



License

MIT — Free for commercial and personal use.


PyPI: https://pypi.org/project/ai-knowledge-filler/ | Version: 0.6.1



Download files


Source Distribution

ai_knowledge_filler-0.6.1.tar.gz (73.8 kB)


Built Distribution


ai_knowledge_filler-0.6.1-py3-none-any.whl (56.4 kB)


File details

Details for the file ai_knowledge_filler-0.6.1.tar.gz.

File metadata

  • Download URL: ai_knowledge_filler-0.6.1.tar.gz
  • Upload date:
  • Size: 73.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for ai_knowledge_filler-0.6.1.tar.gz
Algorithm Hash digest
SHA256 2957fcece4e55edca24c5f63bd9741c9e8831c89e39f3f173f9784d3c58bae08
MD5 5a4894dddd55acc32e5f11e81c20d244
BLAKE2b-256 f69a9ec6fd6a325436850c03b8442da153b046ffa16a96328181a5c14abb4914


File details

Details for the file ai_knowledge_filler-0.6.1-py3-none-any.whl.


File hashes

Hashes for ai_knowledge_filler-0.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ed9df08e7cd757db88a55a530a1883e2a039cb3e3e1c3d6496bb082e4d3eb2f0
MD5 37f6c36a420240a3536e4761de29f539
BLAKE2b-256 90df3070ff11c0281751a54f36dedf2dfa17a3ed00826b96b471992fb5bd8b95

