Knowledge engineering system — transforms LLMs into deterministic Obsidian file generators

AI Knowledge Filler

Transform any LLM into a deterministic knowledge base generator

Python 3.10+ | License: MIT


Problem → Solution

Problem: LLMs generate inconsistent, unstructured responses that require manual formatting.

Solution: A system prompt that turns any LLM into a deterministic file generator: the same input yields the same structure, every time.

Result: Production-ready Markdown files with validated YAML metadata. Zero manual post-processing.


⚡ Quick Start (60 seconds)

Option 1: pip install (Recommended)

pip install ai-knowledge-filler

# Set at least one API key — Groq is free and fastest to start
export GROQ_API_KEY="gsk_..."          # free at console.groq.com (recommended)
# export ANTHROPIC_API_KEY="sk-ant-..." # or any other provider

# Generate
akf generate "Create Docker security checklist"

Output: outputs/Docker_Security_Checklist.md — production-ready, validated.

Option 2: Claude Projects (No CLI)

1. Open Claude.ai → Create new Project
2. Project Knowledge → Upload akf/system_prompt.md
3. Custom Instructions → Paste akf/system_prompt.md
4. Prompt: "Create guide on API authentication"
5. Done. Claude generates structured files.

What You Get

Core System

  • System Prompt: Transforms the LLM from chat assistant to file generator
  • Metadata Standard: YAML structure specification
  • Domain Taxonomy: 30+ classification domains
  • Update Protocol: File merge rules
  • Validation Script: Automated quality gates
  • CLI: Multi-LLM interface (Groq, Grok, Claude, Gemini, GPT-4, Ollama)

Quality Assurance

  • ✅ 97% test coverage (165 tests)
  • ✅ Automated YAML validation
  • ✅ CI/CD pipelines (GitHub Actions)
  • ✅ Type hints (100% coverage)
  • ✅ Linting (Pylint 9.55/10)

CLI Commands

Generate Files

# Auto-select first available LLM
akf generate "Create Kubernetes deployment guide"

# Specific model
akf generate "Create API checklist" --model claude
akf generate "Create Docker guide" --model gemini
akf generate "Create REST concept" --model gpt4
akf generate "Create microservices reference" --model ollama

Validate Files

# Single file
akf validate --file outputs/Guide.md

# All files in outputs/
akf validate

List Available Models

akf models

# Output:
# ✅ groq      Groq — llama-3.3-70b-versatile
# ❌ grok      Grok (xAI) — Set XAI_API_KEY
# ✅ claude    Claude (Anthropic) — claude-sonnet-4-20250514
# ✅ gemini    Gemini (Google) — gemini-3-flash-preview
# ❌ gpt4      GPT-3.5 (OpenAI) — Set OPENAI_API_KEY
# ✅ ollama    Ollama — llama3.2:3b

Example Output

Input:

Create guide on API rate limiting

Output:

---
title: "API Rate Limiting Strategy"
type: guide
domain: api-design
level: intermediate
status: active
version: v1.0
tags: [api, rate-limiting, performance]
related:
  - "[[API Design Principles]]"
  - "[[System Scalability]]"
created: 2026-02-12
updated: 2026-02-12
---

## Purpose
Comprehensive strategy for implementing API rate limits...

## Core Principles
[Structured content with sections, code examples]

## Implementation
[Step-by-step technical guidance]

## Conclusion
[Summary and next steps]

Every file. Same structure. Validated automatically.


Architecture

User Prompt
    ↓
System Prompt (behavior definition)
    ↓
LLM Provider (Groq/Grok/Claude/Gemini/GPT-4/Ollama)
    ↓
Structured Markdown + YAML
    ↓
Automated Validation
    ↓
Production-Ready File

Key Insight: The system prompt is the source of truth. The same prompt works across all supported LLMs.

Validation Pipeline (Phase 2.1)

LLM Output
    ↓
Validation Engine  ← deterministic
    ↓
Error Normalizer   ← deterministic
    ↓
Retry Controller   ← non-deterministic (LLM, max 3 attempts)
    ↓
Commit Gate        ← deterministic (schema_version + atomic write)
    ↓
Vault File

Determinism boundary: LLM is the only non-deterministic component.
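
The stages above can be sketched as a small retry loop. This is only an illustration under stated assumptions, not the package's implementation: the function names (`validate`, `normalize`, `atomic_write`, `generate_with_retry`), the two toy validation rules, and the `llm(prompt, feedback)` callable signature are all hypothetical, and the `schema_version` check of the Commit Gate is left out because its rules are not described here.

```python
import os
import tempfile

MAX_ATTEMPTS = 3  # "max 3 attempts" from the pipeline description


def validate(text):
    """Validation Engine stand-in: return raw error strings (toy rules only)."""
    errors = []
    if not text.startswith("---\n"):
        errors.append("Missing YAML frontmatter")
    if "\ntitle:" not in text:
        errors.append("Missing field: title")
    return errors


def normalize(errors):
    """Error Normalizer stand-in: deduplicate and sort so feedback is stable."""
    return sorted(set(errors))


def atomic_write(path, text):
    """Commit Gate stand-in: write to a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_path)  # do not leave a partial temp file behind
        raise


def generate_with_retry(llm, prompt, path):
    """Retry Controller stand-in: the LLM call is the only non-deterministic step."""
    feedback = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        text = llm(prompt, feedback)
        feedback = normalize(validate(text))
        if not feedback:
            atomic_write(path, text)
            return attempt  # number of attempts that were needed
    raise ValueError(f"validation failed after {MAX_ATTEMPTS} attempts: {feedback}")


if __name__ == "__main__":
    def fake_llm(prompt, feedback):
        # Hypothetical stub: invalid on the first call, valid once it gets feedback.
        return '---\ntitle: "Demo"\n---\nBody\n' if feedback else "no frontmatter"

    target = os.path.join(tempfile.mkdtemp(), "Demo.md")
    print(generate_with_retry(fake_llm, "Create demo", target))
```

Everything except the `llm` call is a pure function of its input, which is what keeps the non-determinism confined to one place.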


Model Selection

Model    Key                Speed       Cost       Best For
Groq     GROQ_API_KEY       ⚡ Fastest  Free tier  First installs, CI, high volume
Grok     XAI_API_KEY        Fast        $$         General purpose
Claude   ANTHROPIC_API_KEY  Medium      $$$        Technical docs, architecture
Gemini   GOOGLE_API_KEY     Fast        $          Quick drafts, summaries
GPT-3.5  OPENAI_API_KEY     Medium      $$         Versatile content
Ollama   (none)             Very Fast   Free       Privacy, offline, local

Auto-selection: CLI tries providers in order: Groq → Grok → Claude → Gemini → GPT-4 → Ollama (first available).
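
The fallback order above can be pictured as a lookup over environment variables. The sketch below is an assumption-laden illustration rather than the CLI's real code: `pick_provider` is a hypothetical helper, and treating Ollama as always available (it has no API key in the table) is a simplification; the actual CLI may check for a running local server instead.

```python
import os

# Order and environment variable names come from the auto-selection note and
# the API Keys section. Ollama has no key, so this sketch uses None for it.
PROVIDER_ORDER = [
    ("groq", "GROQ_API_KEY"),
    ("grok", "XAI_API_KEY"),
    ("claude", "ANTHROPIC_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
    ("gpt4", "OPENAI_API_KEY"),
    ("ollama", None),
]


def pick_provider(env=None):
    """Return the name of the first provider whose key is set (hypothetical helper)."""
    env = os.environ if env is None else env
    for name, key_var in PROVIDER_ORDER:
        # An empty string counts as "not set".
        if key_var is None or env.get(key_var):
            return name
    # Unreachable while a key-less provider is listed; kept as a guard.
    raise RuntimeError("no provider available")


if __name__ == "__main__":
    print(pick_provider({"GOOGLE_API_KEY": "AIza-example"}))
    print(pick_provider())
```

Passing a dictionary instead of reading `os.environ` directly makes the selection rule easy to test without touching real credentials.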


Installation

Via pip (Recommended)

pip install ai-knowledge-filler

From Source

git clone https://github.com/petrnzrnk-creator/ai-knowledge-filler.git
cd ai-knowledge-filler
pip install -r requirements.txt

API Keys

# Set at least one (Groq recommended — free tier available)
export GROQ_API_KEY="gsk_..."          # console.groq.com
export XAI_API_KEY="xai-..."           # console.x.ai
export ANTHROPIC_API_KEY="sk-ant-..."  # console.anthropic.com
export GOOGLE_API_KEY="AIza..."        # aistudio.google.com
export OPENAI_API_KEY="sk-..."         # platform.openai.com

# Or add to ~/.bashrc / ~/.zshrc to persist across sessions:
# export GROQ_API_KEY="gsk_..."

Testing

# Run all tests
pytest --cov=. --cov-report=term-missing -v

# Run validation
akf validate

# Run linting
pylint *.py tests/

Coverage: 97% (165 tests)
Linting: Pylint 9.55/10
CI/CD: All checks passing


Use Cases

1. Technical Documentation: Generate API docs, architecture decisions, deployment guides.

2. Knowledge Management: Structure meeting notes, research findings, learning content.

3. Consulting Deliverables: Create frameworks, methodologies, client reports.

4. Batch Processing: Generate multiple files programmatically via CLI or API.


File Types

type: concept      # Theoretical entity, definition
type: guide        # Step-by-step process
type: reference    # Specification, standard
type: checklist    # Validation criteria
type: project      # Project description
type: template     # Reusable template

30+ domains: api-design, system-design, devops, security, data-engineering, etc.


Documentation


Advanced Usage

Programmatic Generation

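from pathlib import Path

from llm_providers import get_provider

# Auto-select the first available provider
provider = get_provider("auto")

# Load the system prompt
system_prompt = Path("akf/system_prompt.md").read_text(encoding="utf-8")

# Generate the file content
content = provider.generate(
    prompt="Create API security checklist",
    system_prompt=system_prompt,
)

# Save, creating outputs/ first if it does not exist yet
output_path = Path("outputs/Security_Checklist.md")
output_path.parent.mkdir(parents=True, exist_ok=True)
output_path.write_text(content, encoding="utf-8")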

Batch Processing

cat > topics.txt << 'EOF'
Docker deployment best practices
Kubernetes security hardening
API authentication strategies
EOF

# IFS= and -r keep leading whitespace and backslashes in each topic intact
while IFS= read -r topic; do
    akf generate "Create guide on $topic" --model gemini
done < topics.txt
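
The shell loop above can also be driven from Python, which makes it easier to skip blank lines and collect failures. This is a rough sketch: `build_commands` and `run_batch` are hypothetical helper names, only the `akf generate ... --model` form shown in this README is used, and it assumes `akf` signals failure with a non-zero exit code, which is not confirmed here.

```python
import subprocess
from pathlib import Path


def build_commands(topics_text, model="gemini"):
    """Turn one topic per line into `akf generate` argument lists, skipping blanks."""
    return [
        ["akf", "generate", f"Create guide on {topic.strip()}", "--model", model]
        for topic in topics_text.splitlines()
        if topic.strip()
    ]


def run_batch(topics_file, model="gemini"):
    """Run each command in turn and return the prompts whose command failed.

    Assumes a non-zero exit code means the generation failed.
    """
    failed = []
    topics_text = Path(topics_file).read_text(encoding="utf-8")
    for command in build_commands(topics_text, model):
        result = subprocess.run(command, check=False)
        if result.returncode != 0:
            failed.append(command[2])
    return failed
```

Calling `run_batch("topics.txt")` would process the same file as the shell version. Passing arguments as a list avoids shell quoting problems when a topic contains quotes or `$`.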

Validation

Automated checks:

  • ✅ YAML frontmatter present
  • ✅ Required fields (title, type, domain, level, status, created, updated)
  • ✅ Valid enum values (type, level, status)
  • ✅ Domain in taxonomy
  • ✅ ISO 8601 dates (YYYY-MM-DD)
  • ✅ Tags array (3+ items)

Output:

✅ outputs/Guide.md
❌ drafts/incomplete.md
   ERROR: Missing field: domain
   ERROR: Invalid type: document
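
To make the checks concrete, here is a minimal standard-library sketch of some of them. It is not the packaged validator: `parse_frontmatter` and `validate_note` are hypothetical names, the parser handles only flat `key: value` lines rather than full YAML, and the `level`, `status`, and domain checks are omitted because their allowed values are not listed in this README. The allowed `type` values are taken from the File Types section.

```python
import re
from datetime import date

REQUIRED = ["title", "type", "domain", "level", "status", "created", "updated"]
TYPES = {"concept", "guide", "reference", "checklist", "project", "template"}
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_frontmatter(text):
    """Tiny stand-in for a YAML parser: reads top-level `key: value` lines only."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        match = re.match(r"^([A-Za-z_]+):\s*(.*)$", line)
        if match:  # indented list items such as `  - "[[...]]"` are skipped
            fields[match.group(1)] = match.group(2).strip().strip('"')
    return None  # closing delimiter never found


def is_iso_date(value):
    """True for a real calendar date written as YYYY-MM-DD."""
    if not DATE_PATTERN.fullmatch(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def validate_note(text):
    """Return a list of error strings; an empty list means the note passed."""
    fields = parse_frontmatter(text)
    if fields is None:
        return ["Missing YAML frontmatter"]
    errors = [f"Missing field: {name}" for name in REQUIRED if not fields.get(name)]
    if fields.get("type") and fields["type"] not in TYPES:
        errors.append(f"Invalid type: {fields['type']}")
    for name in ("created", "updated"):
        if fields.get(name) and not is_iso_date(fields[name]):
            errors.append(f"Invalid date in {name}: {fields[name]}")
    raw_tags = fields.get("tags", "").strip("[]")
    tags = [tag.strip() for tag in raw_tags.split(",") if tag.strip()]
    if len(tags) < 3:
        errors.append("Tags: need at least 3 items")
    return errors


if __name__ == "__main__":
    sample = (
        "---\ntitle: Draft\ntype: document\nlevel: intermediate\nstatus: active\n"
        "tags: [a, b, c]\ncreated: 2026-02-12\nupdated: 2026-02-12\n---\nBody\n"
    )
    for error in validate_note(sample):
        print("ERROR:", error)
```

The sample note lacks `domain` and uses an unknown `type`, so it triggers the same two kinds of error as the output above.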

Roadmap

v0.1.x ✅ (Shipped)

  • System Prompt (universal LLM compatibility)
  • YAML Metadata Standard
  • Domain Taxonomy (30+ domains)
  • Validation Script (96% test coverage, 104 tests)
  • Multi-LLM CLI (Claude, Gemini, GPT-4, Ollama)
  • CI/CD Pipelines (GitHub Actions)
  • PyPI package (pip install ai-knowledge-filler)

v0.2.x ✅ (Current)

  • Validation pipeline (Phase 2.1 — ValidationError, Error Normalizer, Retry Controller, Commit Gate)
  • Obsidian vault auto-routing
  • Local model support (llama.cpp endpoint)
  • Enhanced documentation
  • VSCode extension (YAML validation)

License

MIT License — Free for commercial and personal use.


Philosophy

This is knowledge engineering, not chat enhancement.

LLMs are deterministic infrastructure, not conversational toys.

Before: "AI helps me write notes"
After: "AI compiles my knowledge base"


Created by: Petr, AI Solutions Architect
PyPI: https://pypi.org/project/ai-knowledge-filler/
Repository: https://github.com/petrnzrnk-creator/ai-knowledge-filler
Version: 0.2.0


Support

