Deterministic knowledge compiler for LLM output
AI Knowledge Filler
What is AKF
AKF (AI Knowledge Filler) is an AI-Native Cognitive Operating System — a deterministic validation pipeline that turns LLM output into schema-compliant, ontology-governed knowledge files.
LLMs generate text. Text is not knowledge.
What it is not: a note-taking app, a chat assistant, a markdown generator, an Obsidian plugin. What it is: the operating contract for a knowledge base that scales.
The Problem
Without a validation layer, AI-generated content produces:
| Error | Example |
|---|---|
| Domain violation | domain: Technology → valid: domain: system-design |
| Enum violation | level: expert → valid: beginner / intermediate / advanced |
| Type mismatch | tags: security → valid: tags: [security, api, auth] |
| Date format | created: 12-02-2026 → valid: created: 2026-02-12 |
Each error is trivial on its own. Across hundreds of files, they make a vault unsearchable: Dataview queries return nothing, and the knowledge graph becomes noise.
AKF solves this at the generation layer, not the review layer.
What Every Committed File Guarantees
- Required fields: `title`, `type`, `domain`, `level`, `status`, `tags`, `created`, `updated`
- Valid enums: `type`, `level`, `status` from controlled sets
- Domain from configured taxonomy (`akf.yaml`), not hardcoded
- ISO 8601 dates with `created ≤ updated`
- `tags` as array (≥3), `title` as string, so no type mismatches
Violations produce error codes E001–E007. Retry instructions are derived from those codes, not from free-form prompts.
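The guarantees above can be sketched as a small, deterministic check over parsed frontmatter. This is a minimal illustration, not AKF's actual validator: the `status` values, the short `DOMAINS` set, and the error labels are assumptions, and the mapping to E001–E007 is not reproduced here because this README only ties E004 to the `title` type check and E007 to date ordering.

```python
import re
from datetime import date

REQUIRED = ("title", "type", "domain", "level", "status", "tags", "created", "updated")
ENUMS = {
    "type": {"concept", "guide", "reference", "checklist", "project", "template"},
    "level": {"beginner", "intermediate", "advanced"},
    "status": {"draft", "active", "archived"},  # assumed set, not from the README
}
# Assumed subset of the taxonomy; the real list is configured in akf.yaml.
DOMAINS = {"api-design", "system-design", "devops", "security", "data-engineering"}
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_date(value):
    """Return a date for a strict YYYY-MM-DD string (or a date object), else None."""
    if isinstance(value, date):
        return value
    if isinstance(value, str) and ISO_DATE.fullmatch(value):
        try:
            return date.fromisoformat(value)
        except ValueError:
            return None
    return None


def validate(meta, domains=DOMAINS):
    """Collect (label, detail) pairs; an empty list means every check passed."""
    errors = []
    for field in REQUIRED:
        if field not in meta:
            errors.append(("missing_field", field))
    for field, allowed in ENUMS.items():
        value = meta.get(field)
        if field in meta and (not isinstance(value, str) or value not in allowed):
            errors.append(("invalid_enum", f"{field}: {value}"))
    domain = meta.get("domain")
    if "domain" in meta and (not isinstance(domain, str) or domain not in domains):
        errors.append(("invalid_domain", str(domain)))
    if "title" in meta and not isinstance(meta["title"], str):
        errors.append(("type_mismatch", "title must be a string"))
    tags = meta.get("tags")
    if "tags" in meta and (not isinstance(tags, list) or len(tags) < 3):
        errors.append(("type_mismatch", "tags must be a list of at least 3 items"))
    created = parse_date(meta.get("created"))
    updated = parse_date(meta.get("updated"))
    for field, parsed in (("created", created), ("updated", updated)):
        if field in meta and parsed is None:
            errors.append(("invalid_date", field))
    if created and updated and created > updated:
        errors.append(("date_order", "created must not be after updated"))
    return errors
```

Because the function returns structured labels rather than prose, a retry instruction can be built from the labels alone, which is the property the paragraph above describes.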
Retry = Ontology Signal
Retry pressure is not a failure metric. When a domain triggers elevated retries, the taxonomy has a boundary problem — not the model. Telemetry captures this signal. Ontology improves from data, not intuition.
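One way to read that signal from telemetry is to average retries per domain. The sketch below assumes each JSONL record carries a `domain` and an `attempts` count; those field names are hypothetical, since this README only states that telemetry is append-only JSONL with a `generation_id` and convergence metrics.

```python
import json
from collections import defaultdict


def retry_pressure(jsonl_lines):
    """Return {domain: mean retries per generation} from JSONL telemetry lines."""
    totals = defaultdict(lambda: [0, 0])  # domain -> [generations, retries]
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        bucket = totals[event["domain"]]
        bucket[0] += 1
        bucket[1] += event["attempts"] - 1  # the first attempt is not a retry
    return {domain: retries / gens for domain, (gens, retries) in totals.items()}


def flag_domains(pressure, threshold=1.0):
    """List domains whose mean retries exceed the (assumed) threshold."""
    return sorted(d for d, value in pressure.items() if value > threshold)


sample = [
    '{"generation_id": "a1", "domain": "api-design", "attempts": 1}',
    '{"generation_id": "a2", "domain": "api-design", "attempts": 1}',
    '{"generation_id": "b1", "domain": "security", "attempts": 3}',
    '{"generation_id": "b2", "domain": "security", "attempts": 2}',
]
pressure = retry_pressure(sample)
```

A domain that stays flagged across many generations is a candidate for splitting or renaming in the taxonomy, rather than a reason to switch models.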
⚡ Quick Start (60 seconds)
Option 1: pip install (Recommended)
pip install ai-knowledge-filler
# Set at least one API key — Groq is free and fastest to start
export GROQ_API_KEY="gsk_..." # free at console.groq.com (recommended)
# export ANTHROPIC_API_KEY="sk-ant-..." # or any other provider
# Generate
akf generate "Create Docker security checklist"
Output: outputs/Docker_Security_Checklist.md — production-ready, validated.
Option 2: Claude Projects (No CLI)
1. Open Claude.ai → Create new Project
2. Project Knowledge → Upload akf/system_prompt.md
3. Custom Instructions → Paste akf/system_prompt.md
4. Prompt: "Create guide on API authentication"
5. Done. Claude generates structured files.
What You Get
Core System
- System Prompt — Transforms LLM from chat to file generator
- Metadata Standard — YAML structure specification
- Domain Taxonomy — 30+ classification domains
- Update Protocol — File merge rules
- Validation Script — Automated quality gates
- CLI — Multi-LLM interface (Claude, Gemini, GPT-4, Ollama)
Quality Assurance
- ✅ 97% test coverage (165 tests)
- ✅ Automated YAML validation
- ✅ CI/CD pipelines (GitHub Actions)
- ✅ Type hints (100% coverage)
- ✅ Linting (Pylint 9.55/10)
CLI Commands
Generate Files
# Auto-select first available LLM
akf generate "Create Kubernetes deployment guide"
# Specific model
akf generate "Create API checklist" --model claude
akf generate "Create Docker guide" --model gemini
akf generate "Create REST concept" --model gpt4
akf generate "Create microservices reference" --model ollama
Validate Files
# Single file
akf validate --file outputs/Guide.md
# All files in outputs/
akf validate
List Available Models
akf models
# Output:
# ✅ groq Groq — llama-3.3-70b-versatile
# ❌ grok Grok (xAI) — Set XAI_API_KEY
# ✅ claude Claude (Anthropic) — claude-sonnet-4-20250514
# ✅ gemini Gemini (Google) — gemini-3-flash-preview
# ❌ gpt4 GPT-3.5 (OpenAI) — Set OPENAI_API_KEY
# ✅ ollama Ollama — llama3.2:3b
Example Output
Input:
Create guide on API rate limiting
Output:
---
title: "API Rate Limiting Strategy"
type: guide
domain: api-design
level: intermediate
status: active
version: v1.0
tags: [api, rate-limiting, performance]
related:
- "[[API Design Principles]]"
- "[[System Scalability]]"
created: 2026-02-12
updated: 2026-02-12
---
## Purpose
Comprehensive strategy for implementing API rate limits...
## Core Principles
[Structured content with sections, code examples]
## Implementation
[Step-by-step technical guidance]
## Conclusion
[Summary and next steps]
Every file. Same structure. Validated automatically.
Architecture
User Prompt
↓
System Prompt (behavior definition)
↓
LLM Provider (Claude/Gemini/GPT-4/Ollama)
↓
Structured Markdown + YAML
↓
Automated Validation
↓
Production-Ready File
Key Insight: System prompt is the source of truth. Same prompt works across all LLMs.
Validation Pipeline (Phase 2.1)
LLM Output
↓
Validation Engine ← deterministic
↓
Error Normalizer ← deterministic
↓
Retry Controller ← non-deterministic (LLM, max 3 attempts)
↓
Commit Gate ← deterministic (schema_version + atomic write)
↓
Vault File
Determinism boundary: LLM is the only non-deterministic component.
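The control flow of that pipeline can be sketched in a few lines. This is an illustration under stated assumptions, not AKF's code: `generate` and `validate` are hypothetical callables supplied by the caller, error normalization is folded into `validate`, and the `schema_version` stamp applied by the commit gate is omitted.

```python
import os
import tempfile
from pathlib import Path

MAX_ATTEMPTS = 3  # matches the "max 3 attempts" limit in the diagram


def commit(text, target):
    """Atomic write: write a temp file in the target directory, then rename it."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp, target)  # readers see the old file or the new one, never half
    except Exception:
        Path(tmp).unlink(missing_ok=True)
        raise


def run_pipeline(prompt, generate, validate, target):
    """Generate, validate, and retry; only validated text reaches the vault."""
    feedback = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        text = generate(prompt, feedback)  # the only non-deterministic step
        feedback = validate(text)          # deterministic: list of error strings
        if not feedback:
            commit(text, target)
            return attempt
    raise RuntimeError(f"no valid output after {MAX_ATTEMPTS} attempts: {feedback}")
```

Passing the previous attempt's errors back into `generate` is what keeps the retry prompt derived from validation results instead of free-form wording.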
Model Selection
| Model | Key | Speed | Cost | Best For |
|---|---|---|---|---|
| Groq | `GROQ_API_KEY` | ⚡ Fastest | Free tier | First installs, CI, high volume |
| Grok | `XAI_API_KEY` | Fast | $$ | General purpose |
| Claude | `ANTHROPIC_API_KEY` | Medium | $$$ | Technical docs, architecture |
| Gemini | `GOOGLE_API_KEY` | Fast | $ | Quick drafts, summaries |
| GPT-3.5 | `OPENAI_API_KEY` | Medium | $$ | Versatile content |
| Ollama | — | Very Fast | Free | Privacy, offline, local |
Auto-selection: CLI tries providers in order: Groq → Grok → Claude → Gemini → GPT-4 → Ollama (first available).
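The first-available rule can be illustrated with a short lookup over environment variables. This is a sketch, not the CLI's implementation: the provider names and key variables come from this README, while treating Ollama as always available is an assumption (a real check would likely probe the local server).

```python
import os

# Order follows the auto-selection line above; None means no API key is required.
PROVIDER_KEYS = [
    ("groq", "GROQ_API_KEY"),
    ("grok", "XAI_API_KEY"),
    ("claude", "ANTHROPIC_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
    ("gpt4", "OPENAI_API_KEY"),
    ("ollama", None),  # assumption: treated as available without any check
]


def select_provider(env=None):
    """Return the name of the first provider whose key is set and non-empty."""
    env = os.environ if env is None else env
    for name, key in PROVIDER_KEYS:
        if key is None or env.get(key):
            return name
    return None
```

For example, `select_provider({"ANTHROPIC_API_KEY": "sk-ant-...", "OPENAI_API_KEY": "sk-..."})` picks `claude`, because it comes earlier in the order than `gpt4`.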
Installation
Via pip (Recommended)
pip install ai-knowledge-filler
From Source
git clone https://github.com/petrnzrnk-creator/ai-knowledge-filler.git
cd ai-knowledge-filler
pip install -r requirements.txt
API Keys
# Set at least one (Groq recommended — free tier available)
export GROQ_API_KEY="gsk_..." # console.groq.com
export XAI_API_KEY="xai-..." # console.x.ai
export ANTHROPIC_API_KEY="sk-ant-..." # console.anthropic.com
export GOOGLE_API_KEY="AIza..." # aistudio.google.com
export OPENAI_API_KEY="sk-..." # platform.openai.com
# Or add to ~/.bashrc / ~/.zshrc to persist across sessions:
# export GROQ_API_KEY="gsk_..."
Testing
# Run all tests
pytest --cov=. --cov-report=term-missing -v
# Run validation
akf validate
# Run linting
pylint *.py tests/
- Coverage: 97% (165 tests)
- Linting: Pylint 9.55/10
- CI/CD: All checks passing
Use Cases
1. Technical Documentation: Generate API docs, architecture decisions, deployment guides.
2. Knowledge Management: Structure meeting notes, research findings, learning content.
3. Consulting Deliverables: Create frameworks, methodologies, client reports.
4. Batch Processing: Generate multiple files programmatically via CLI or API.
File Types
type: concept # Theoretical entity, definition
type: guide # Step-by-step process
type: reference # Specification, standard
type: checklist # Validation criteria
type: project # Project description
type: template # Reusable template
30+ domains: api-design, system-design, devops, security, data-engineering, etc.
Documentation
- System Prompt — LLM behavior definition
- User Guide — Installation, quick start, troubleshooting
- CLI Reference — All commands, flags, env vars, exit codes
- Architecture — Module map, data flow, extension points
- Contributing — Dev setup, quality gates, adding providers
Advanced Usage
Programmatic Generation
from llm_providers import get_provider

# Auto-select provider
provider = get_provider("auto")

# Load system prompt
with open('akf/system_prompt.md') as f:
    system_prompt = f.read()

# Generate
content = provider.generate(
    prompt="Create API security checklist",
    system_prompt=system_prompt
)

# Save
with open('outputs/Security_Checklist.md', 'w') as f:
    f.write(content)
Batch Processing
cat > topics.txt << 'EOF'
Docker deployment best practices
Kubernetes security hardening
API authentication strategies
EOF
# IFS= and -r keep leading whitespace and backslashes in each topic intact
while IFS= read -r topic; do
    akf generate "Create guide on $topic" --model gemini
done < topics.txt
Validation
Automated checks:
- ✅ YAML frontmatter present
- ✅ Required fields (title, type, domain, level, status, tags, created, updated)
- ✅ Valid enum values (type, level, status)
- ✅ Domain in taxonomy
- ✅ ISO 8601 dates (YYYY-MM-DD)
- ✅ Tags array (3+ items)
Output:
✅ outputs/Guide.md
❌ drafts/incomplete.md
ERROR: Missing field: domain
ERROR: Invalid type: document
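The first check in the list, that YAML frontmatter is present, comes before any field-level validation. A minimal sketch of that step follows; `split_frontmatter` is a hypothetical helper name, and parsing the returned block into fields would need a YAML library such as PyYAML, which is not shown here.

```python
def split_frontmatter(text):
    """Return (frontmatter, body); raise ValueError if the block is missing or unclosed."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file does not start with a '---' frontmatter fence")
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    raise ValueError("frontmatter block is never closed")


doc = '---\ntitle: "API Rate Limiting Strategy"\ntype: guide\n---\n## Purpose\nText'
frontmatter, body = split_frontmatter(doc)
```

Rejecting a file at this stage gives a single, unambiguous error instead of a cascade of missing-field errors.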
Roadmap
v0.1.x ✅ (Shipped)
- System Prompt (universal LLM compatibility)
- YAML Metadata Standard
- Domain Taxonomy (30+ domains)
- Validation Script (96% test coverage, 104 tests)
- Multi-LLM CLI (Claude, Gemini, GPT-4, Ollama)
- CI/CD Pipelines (GitHub Actions)
- PyPI package (`pip install ai-knowledge-filler`)
v0.2.x ✅ (Shipped)
- Validation pipeline (Phase 2.1 — ValidationError, Error Normalizer, Retry Controller, Commit Gate)
- Hard enum enforcement — E001–E006, 97% coverage (Phase 2.2)
- Telemetry layer — append-only JSONL, generation_id, convergence metrics (Phase 2.3)
v0.3.0 🔄 (Current)
- Config layer: external `akf.yaml`, taxonomy configurable without code changes (Phase 2.4)
- `akf init`: generates `akf.yaml` for a new vault
- Validator Model D: `created ≤ updated` (E007), `title` isinstance str (E004)
- PyPI publish pending tag
v1.0.0 (Planned — Phase 2.5)
- Onboarding & public announcement
License
MIT License — Free for commercial and personal use.
Philosophy
This is knowledge engineering, not chat enhancement.
LLMs are infrastructure to be wrapped in deterministic checks, not conversational toys.
Before: "AI helps me write notes"
After: "AI compiles my knowledge base"

- Created by: Petr, AI Solutions Architect
- PyPI: https://pypi.org/project/ai-knowledge-filler/
- Repository: https://github.com/petrnzrnk-creator/ai-knowledge-filler
- Version: 0.3.0
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Quick Links: Quick Start | CLI Commands | Documentation | Examples