General-purpose LLM-Wiki CLI and Python library - Build persistent, LLM-maintained knowledge bases

llmwikify

Build persistent, LLM-maintained knowledge bases — Based on Karpathy's LLM Wiki Principles

PyPI version | Python 3.10+ | License: MIT | Tests: 879+ passing


⚠️ Beta Release — You may encounter bugs or breaking changes. Please report issues on GitHub.


🎯 What is llmwikify?

llmwikify is a general-purpose LLM-Wiki management tool that helps you build and maintain a persistent knowledge base. Unlike RAG systems that rediscover knowledge from scratch on every query, llmwikify incrementally builds and maintains a structured, interlinked wiki that compounds over time.

Core Philosophy

The wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read.

Based on Karpathy's LLM Wiki Principles:

  • 📚 Raw sources — Your immutable source documents in raw/
  • 📝 The wiki — LLM-maintained markdown pages with cross-references
  • ⚙️ The schema — wiki.md, which tells the LLM how to maintain the wiki

✨ Features

Core

  • SQLite FTS5 search — Porter stemmer, BM25 ranking, 0.06s for 157 pages
  • Bidirectional references — Automatic [[wikilink]] detection with section-level granularity
  • Query compounding — Save query answers as persistent wiki pages (wiki_synthesize)
  • Query sink — Buffer pending updates for later review with urgency tracking
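
The search feature above names three SQLite FTS5 building blocks: the Porter stemmer, BM25 ranking, and snippets. The sketch below shows how they fit together using only Python's standard `sqlite3` module. It is not llmwikify's actual schema: the table name `pages`, its two columns, and the helper names `build_index` and `search_pages` are assumptions made for illustration.

```python
import sqlite3


def build_index(pages: dict[str, str]) -> sqlite3.Connection:
    """Create an in-memory FTS5 index with Porter stemming and load the pages."""
    conn = sqlite3.connect(":memory:")
    # tokenize='porter' wraps the default tokenizer and stems every term,
    # so "running" and "runs" are indexed under the same stem.
    conn.execute(
        "CREATE VIRTUAL TABLE pages USING fts5(title, body, tokenize='porter')"
    )
    conn.executemany(
        "INSERT INTO pages (title, body) VALUES (?, ?)", list(pages.items())
    )
    conn.commit()
    return conn


def search_pages(
    conn: sqlite3.Connection, query: str, limit: int = 10
) -> list[tuple[str, str]]:
    """Return (title, snippet) pairs, best BM25 match first."""
    sql = (
        "SELECT title, snippet(pages, 1, '[', ']', '...', 8) "
        "FROM pages WHERE pages MATCH ? "
        # bm25() returns lower (more negative) values for better matches,
        # so ascending order puts the most relevant page first.
        "ORDER BY bm25(pages) LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()
```

Calling `search_pages(conn, "runs")` on an index whose body text contains "running" returns that page, because both words reduce to the same stem.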

Source Analysis (v0.26.0+)

  • analyze-source CLI with --all and --force support
  • Caches LLM extraction results (entities, relations, suggested pages)
  • Powers schema-aware lint gap detection
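
One plausible way to cache extraction results and still honor a `--force` flag is to key the cache on a hash of the source content. The sketch below assumes exactly that; how llmwikify actually stores its cache is not stated here. The function name `analyze_cached`, the plain `dict` standing in for persistent storage, and the `extract` callback standing in for the LLM call are all hypothetical.

```python
import hashlib
from typing import Callable


def analyze_cached(
    text: str,
    cache: dict[str, dict],
    extract: Callable[[str], dict],
    force: bool = False,
) -> dict:
    """Return extraction results for `text`, reusing a cached result when allowed."""
    # Keying on the content hash means an edited source is re-analyzed
    # automatically, while an unchanged one is served from the cache.
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key in cache and not force:
        return cache[key]
    result = extract(text)  # stand-in for the expensive LLM extraction
    cache[key] = result
    return result
```

With this shape, `force=True` always re-runs the extraction and overwrites the stored entry.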

Cross-Source Synthesis (v0.28.0+)

  • Detects reinforced claims, contradictions, knowledge gaps across sources
  • Returns suggestions only — human decides what to do with them
  • CLI: llmwikify suggest-synthesis [source]
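
To make "reinforced claims" and "contradictions" concrete, here is a minimal sketch that assumes claims have already been extracted into simple records. The `Claim` record, its fields, and `compare_claims` are hypothetical; the real feature works on LLM extraction results and also reports knowledge gaps, which this sketch does not attempt. Like the feature itself, it only returns suggestions.

```python
from collections import defaultdict
from typing import NamedTuple


class Claim(NamedTuple):
    source: str     # file the claim came from
    subject: str    # what the claim is about
    attribute: str  # which property is asserted
    value: str      # the asserted value


def compare_claims(claims: list[Claim]) -> list[dict]:
    """Group claims by (subject, attribute) and label cross-source agreement."""
    grouped: dict[tuple[str, str], list[Claim]] = defaultdict(list)
    for claim in claims:
        grouped[(claim.subject, claim.attribute)].append(claim)

    suggestions = []
    for (subject, attribute), group in grouped.items():
        sources = sorted({c.source for c in group})
        values = sorted({c.value for c in group})
        if len(sources) < 2:
            continue  # one source alone is neither reinforced nor contradicted
        kind = "reinforced" if len(values) == 1 else "contradiction"
        suggestions.append(
            {
                "kind": kind,
                "subject": subject,
                "attribute": attribute,
                "values": values,
                "sources": sources,
            }
        )
    return suggestions
```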

Smart Lint 2.0 (v0.28.0+)

  • Detects broken links, orphan pages, contradictions, data gaps
  • New: outdated pages, knowledge gaps, redundancy alerts
  • CLI: llmwikify lint [--format=full|brief|recommendations|json]
  • CLI: llmwikify knowledge-gaps
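
Two of the lint checks above, broken links and orphan pages, follow directly from [[wikilink]] parsing. The sketch below is an illustration under assumptions, not the tool's implementation: the link grammar (`[[Page]]`, `[[Page#Section]]`, `[[Page|label]]`) and the names `extract_links` and `lint_pages` are guesses at a reasonable shape.

```python
from __future__ import annotations

import re

# Captures the target page and an optional #section; a |label part is ignored.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#([^\[\]|]+))?(?:\|[^\[\]]*)?\]\]")


def extract_links(text: str) -> list[tuple[str, str | None]]:
    """Return (page, section) pairs for every wikilink found in `text`."""
    links = []
    for match in WIKILINK.finditer(text):
        section = match.group(2)
        links.append((match.group(1).strip(), section.strip() if section else None))
    return links


def lint_pages(pages: dict[str, str]) -> dict[str, list]:
    """Report links to missing pages and pages with no inbound links."""
    broken: list[tuple[str, str]] = []
    linked: set[str] = set()
    for name, text in pages.items():
        for target, _section in extract_links(text):
            if target not in pages:
                broken.append((name, target))
            elif target != name:
                linked.add(target)  # self-links do not rescue an orphan
    return {"broken_links": broken, "orphans": sorted(set(pages) - linked)}
```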

Knowledge Graph (v0.22.0+)

  • LLM auto-extracts concept relationships (8 relation types, 3 confidence levels)
  • Graph queries: neighbors, shortest path, statistics, context
  • Community detection via Leiden/Louvain algorithms
  • Surprise Score reports for unexpected connections
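
The neighbor and shortest-path queries listed above can be illustrated with a breadth-first search over an adjacency map. This is a generic sketch, not the project's relation engine: the graph representation (a dict of sets, treated as directed) and the function names are assumptions, and relation types and confidence levels are left out.

```python
from __future__ import annotations

from collections import deque


def neighbors(graph: dict[str, set[str]], node: str) -> set[str]:
    """Return the concepts directly linked from `node`."""
    return set(graph.get(node, set()))


def shortest_path(
    graph: dict[str, set[str]], start: str, goal: str
) -> list[str] | None:
    """Breadth-first search; returns the node list of a shortest path, or None."""
    if start == goal:
        return [start]
    parents: dict[str, str | None] = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # Sorting makes the result deterministic when several paths tie.
        for nxt in sorted(graph.get(current, ())):
            if nxt in parents:
                continue
            parents[nxt] = current
            if nxt == goal:
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None
```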

Graph Analyzer (v0.28.0+)

  • PageRank centrality scoring — identify core concepts
  • Hub/Authority analysis — find highly connected pages
  • Community auto-labeling and bridge node detection
  • Suggested page generation for orphan concepts
  • CLI: llmwikify graph-analyze [--json] [--report]
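
For readers unfamiliar with PageRank centrality, the following is a small power-iteration sketch over a page-link map. It illustrates the general algorithm only. The damping factor of 0.85, the fixed iteration count, and the function name are conventional defaults chosen here, not values taken from llmwikify.

```python
def pagerank(
    links: dict[str, list[str]], damping: float = 0.85, iterations: int = 50
) -> dict[str, float]:
    """Compute PageRank scores by power iteration; scores sum to 1."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

Pages that many other pages point to end up with the highest scores, which is what makes the measure useful for spotting core concepts.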

Graph Visualization (v0.23.0+)

  • Interactive HTML (pyvis), SVG (graphviz), GraphML (Gephi)

Additional

  • File extraction — PDF, Word, Excel, PowerPoint, images, audio, YouTube, web URLs via MarkItDown
  • File watcher — Watch raw/ for new files, optional auto-ingest
  • MCP server — 20 tools for LLM/Agent integration
  • Performance — Batch inserts, PRAGMA optimizations, 10-20x faster than naive implementation
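
The idea behind the file watcher can be sketched with a simple polling loop from the standard library: take a snapshot of the directory, compare it with the previous one, and hand new files to a callback. This swaps in polling for whatever mechanism the `watch` extra actually uses, and every name below (`snapshot`, `new_files`, `watch`, `on_new`) is hypothetical.

```python
from __future__ import annotations

import time
from pathlib import Path
from typing import Callable


def snapshot(directory: Path) -> dict[str, float]:
    """Map each file under `directory` (relative path) to its modification time."""
    return {
        str(path.relative_to(directory)): path.stat().st_mtime
        for path in directory.rglob("*")
        if path.is_file()
    }


def new_files(before: dict[str, float], after: dict[str, float]) -> list[str]:
    """Return paths present in `after` but not in `before`, sorted by name."""
    return sorted(set(after) - set(before))


def watch(
    directory: Path,
    on_new: Callable[[Path], None],
    interval: float = 1.0,
    max_polls: int | None = None,
) -> None:
    """Poll `directory` and call `on_new` for each file that appears."""
    seen = snapshot(directory)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = snapshot(directory)
        for name in new_files(seen, current):
            on_new(directory / name)  # e.g. trigger an ingest here
        seen = current
        polls += 1
```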

📦 Installation

# Basic (zero dependencies)
pip install llmwikify

# Full (all features)
pip install "llmwikify[all]"

# Development
git clone https://github.com/sn0wfree/llmwikify.git
cd llmwikify
pip install -e ".[dev]"

Optional Extras

| Extra | Purpose |
|---|---|
| extractors | Enhanced file extraction (PDF, Office, images, audio) |
| mcp | MCP server support |
| watch | File system watching |
| graph | Graph visualization + community detection |
| web | Web UI support |
| all | Everything above |

🚀 Quick Start

1. Initialize

llmwikify init
# Creates: raw/, wiki/, wiki.md, .llmwikify.db

2. Ingest Sources

llmwikify ingest document.pdf           # Extract content
llmwikify ingest document.pdf --self-create  # Auto-create wiki pages
llmwikify ingest https://example.com/article
llmwikify ingest https://youtube.com/watch?v=abc123
llmwikify batch raw/pdfs/ --self-create  # Batch ingest

3. Search and Query

llmwikify search "topic" -l 10
llmwikify references "Page Name" --detail
llmwikify lint --format=brief

4. Analyze Knowledge Graph

llmwikify graph-analyze              # PageRank, communities, suggestions
llmwikify graph-analyze --json       # Programmatic output
llmwikify graph-analyze --report     # Detailed suggested pages report
llmwikify suggest-synthesis          # Cross-source synthesis suggestions
llmwikify knowledge-gaps             # Knowledge gap analysis

5. MCP Server for Agents

llmwikify mcp                        # STDIO (default)
llmwikify mcp --transport http       # HTTP
llmwikify serve --web                # MCP + Web UI

💻 Python API

from llmwikify import Wiki
from pathlib import Path

wiki = Wiki(Path("/path/to/wiki"))
wiki.init()

# Ingest source
result = wiki.ingest_source("document.pdf")

# Create pages
wiki.write_page("Test Page", "# Title\n\nContent with [[Link]]", page_type="Concept")

# Search
results = wiki.search("topic", limit=10)

# Synthesize query answers (knowledge compounding)
wiki.synthesize_query(query="Q?", answer="A...", source_pages=["Page1", "Page2"])

# Knowledge graph
engine = wiki.get_relation_engine()
engine.get_neighbors("Concept")
engine.get_path("A", "B")

# Health check
lint_result = wiki.lint(generate_investigations=True)

# Cross-source synthesis
wiki.suggest_synthesis()

# Graph analysis
graph_result = wiki.graph_analyze()

🗄️ MCP Server (20 Tools)

| Tool | Description |
|---|---|
| wiki_init | Initialize wiki structure |
| wiki_ingest | Ingest a source file |
| wiki_write_page | Write/update a wiki page |
| wiki_read_page | Read a wiki page |
| wiki_search | Full-text search with snippets |
| wiki_lint | Health check |
| wiki_status | Status overview |
| wiki_log | Append log entry |
| wiki_recommend | Get recommendations |
| wiki_build_index | Build reference index |
| wiki_read_schema | Read wiki.md (schema) |
| wiki_update_schema | Update wiki.md |
| wiki_synthesize | Save query answer as wiki page |
| wiki_sink_status | Sink buffer overview |
| wiki_references | Page references |
| wiki_graph | Graph query/modify |
| wiki_graph_analyze | Graph export/detect/report/analyze |
| wiki_analyze_source | Analyze raw source file |
| wiki_suggest_synthesis | Cross-source synthesis suggestions |
| wiki_knowledge_gaps | Knowledge gap + outdated + redundancy |

⚙️ Configuration

Create .wiki-config.yaml in your wiki root:

orphan_detection:
  exclude_patterns:
    - '^\d{4}-\d{2}-\d{2}$'  # Date pages
    - '^meeting-.*'           # Meeting notes
  archive_directories:
    - 'archive'
    - 'logs'

llm:
  provider: "openai"
  model: "gpt-4o"
  api_key: "env:OPENAI_API_KEY"

mcp:
  host: "127.0.0.1"
  port: 8765
  transport: "stdio"
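
As a rough illustration of how two of these settings might be interpreted, the sketch below applies the `exclude_patterns` regexes to page names and resolves an `env:`-prefixed value from the environment. Both behaviors are assumptions inferred from the config shown above, not confirmed semantics, and `is_excluded` and `resolve_secret` are hypothetical helper names.

```python
from __future__ import annotations

import os
import re
from typing import Mapping

# Mirrors orphan_detection.exclude_patterns from the YAML above.
EXCLUDE_PATTERNS = [r"^\d{4}-\d{2}-\d{2}$", r"^meeting-.*"]


def is_excluded(page_name: str, patterns: list[str] = EXCLUDE_PATTERNS) -> bool:
    """True if the page name matches any exclusion regex (assumed semantics)."""
    return any(re.search(pattern, page_name) for pattern in patterns)


def resolve_secret(value: str, environ: Mapping[str, str] = os.environ) -> str | None:
    """Resolve 'env:NAME' to the value of NAME; return other values unchanged."""
    prefix = "env:"
    if value.startswith(prefix):
        return environ.get(value[len(prefix):])
    return value
```

Keeping the key as an `env:` reference means the config file can be committed without exposing the secret itself.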

See Configuration Guide for full options.


📊 CLI Commands

| Command | Description | Command | Description |
|---|---|---|---|
| init | Initialize wiki | lint | Health check |
| ingest | Ingest source | status | Status overview |
| analyze-source | Analyze source file | log | Record log |
| write_page | Create page | references | Show references |
| read_page | Read page | build-index | Build index |
| search | Full-text search | batch | Batch ingest |
| synthesize | Save query as page | suggest-synthesis | Cross-source analysis |
| sink-status | Sink overview | knowledge-gaps | Gap analysis |
| watch | Watch for files | graph-query | Graph queries |
| graph-analyze | Graph analysis | export-graph | Export visualization |
| community-detect | Detect communities | report | Surprise report |
| mcp | Start MCP server | serve | MCP + Web UI |

📖 Documentation


🧪 Testing

pytest                           # All 879 Python tests
pytest --cov=src/llmwikify       # With coverage
pytest tests/test_p1_features.py # Specific module

🤝 Contributing

Contributions welcome! See CONTRIBUTING.md for development setup, coding standards, and contribution workflow.


🙏 Acknowledgments


📄 License

MIT License — See LICENSE file.

📬 Contact
