General-purpose LLM-Wiki CLI and Python library - Build persistent, LLM-maintained knowledge bases
llmwikify
Build persistent, LLM-maintained knowledge bases — Based on Karpathy's LLM Wiki Principles
⚠️ Beta Release — You may encounter bugs or breaking changes. Please report issues on GitHub.
🎯 What is llmwikify?
llmwikify is a general-purpose LLM-Wiki management tool that helps you build and maintain a persistent knowledge base. Unlike RAG systems that rediscover knowledge from scratch on every query, llmwikify incrementally builds and maintains a structured, interlinked wiki that compounds over time.
Core Philosophy
The wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read.
Based on Karpathy's LLM Wiki Principles:
- 📚 Raw sources — Your immutable source documents in `raw/`
- 📝 The wiki — LLM-maintained markdown pages with cross-references
- ⚙️ The schema — `wiki.md` that tells the LLM how to maintain the wiki
✨ Features
Core
- SQLite FTS5 search — Porter stemmer, BM25 ranking, 0.06s for 157 pages
- Bidirectional references — Automatic `[[wikilink]]` detection with section-level granularity
- Query compounding — Save query answers as persistent wiki pages (`wiki_synthesize`)
- Query sink — Buffer pending updates for later review with urgency tracking
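As a rough illustration of the search feature, the sketch below builds an FTS5 index with the Porter stemmer and ranks matches with BM25 using only the standard `sqlite3` module. The table name `pages`, its columns, and the sample rows are assumptions made for this example; the README does not show llmwikify's actual schema.

```python
import sqlite3

# Hypothetical in-memory index; llmwikify's real schema is not shown in this README.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(title, body, tokenize='porter')")
conn.executemany(
    "INSERT INTO pages (title, body) VALUES (?, ?)",
    [
        ("Attention", "Transformers rely on attention layers for mixing tokens."),
        ("Optimizers", "Adam adapts the learning rate while training runs."),
        ("Tokenizers", "Byte pair encoding splits text into tokens."),
    ],
)


def search(query: str, limit: int = 10) -> list[tuple[str, float]]:
    """Return (title, score) pairs, best match first.

    FTS5's bm25() returns lower values for better matches,
    so an ascending sort puts the most relevant page on top.
    """
    rows = conn.execute(
        "SELECT title, bm25(pages) FROM pages WHERE pages MATCH ? "
        "ORDER BY bm25(pages) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


# The Porter stemmer reduces both "running" and "runs" to the same stem,
# so this query finds the page whose body contains "runs".
hits = search("running")
```

Stemming is what lets a query term match its inflected forms; without `tokenize='porter'` the same query would return nothing here.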
Source Analysis (v0.26.0+)
- `analyze-source` CLI with `--all` and `--force` support
- Caches LLM extraction results (entities, relations, suggested pages)
- Powers schema-aware lint gap detection
Cross-Source Synthesis (v0.28.0+)
- Detects reinforced claims, contradictions, knowledge gaps across sources
- Returns suggestions only — human decides what to do with them
- CLI: `llmwikify suggest-synthesis [source]`
Smart Lint 2.0 (v0.28.0+)
- Detects broken links, orphan pages, contradictions, data gaps
- New: outdated pages, knowledge gaps, redundancy alerts
- CLI: `llmwikify lint [--format=full|brief|recommendations|json]`
- CLI: `llmwikify knowledge-gaps`
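Two of the checks listed above, broken links and orphan pages, can be sketched in a few lines. This is not llmwikify's lint implementation: the `[[Page#Section|alias]]` link syntax is assumed to follow the Obsidian convention, and the `lint` helper and sample pages are hypothetical.

```python
import re

# Captures the page name in [[Page]], [[Page#Section]] and [[Page|alias]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(markdown: str) -> set[str]:
    """Return the set of page names referenced by wikilinks in the text."""
    return {m.group(1).strip() for m in WIKILINK.finditer(markdown)}


def lint(pages: dict[str, str]) -> dict[str, list]:
    """Report links to missing pages and pages with no inbound links."""
    broken = []
    linked = set()
    for name, body in pages.items():
        for target in sorted(extract_links(body)):
            if target not in pages:
                broken.append((name, target))
            elif target != name:  # a self-link does not count as inbound
                linked.add(target)
    return {"broken_links": broken, "orphans": sorted(set(pages) - linked)}


pages = {
    "Index": "See [[Attention]] and [[Optimizers#Adam|Adam]].",
    "Attention": "Used by [[Transformers]].",
    "Optimizers": "Adam and SGD.",
    "Scratch": "Unlinked notes.",
}
report = lint(pages)
```

In this sample, `Index` is reported as an orphan because nothing links to it; a real tool would usually exempt entry pages like this through configuration.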
Knowledge Graph (v0.22.0+)
- LLM auto-extracts concept relationships (8 relation types, 3 confidence levels)
- Graph queries: neighbors, shortest path, statistics, context
- Community detection via Leiden/Louvain algorithms
- Surprise Score reports for unexpected connections
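The neighbor and shortest-path queries can be pictured with a plain breadth-first search over relation triples. This is a minimal sketch under stated assumptions: the relation names and the helper functions are illustrative and not taken from llmwikify, and edges are treated as undirected for path finding.

```python
from collections import deque

# Hypothetical (source, relation, target) triples; relation names are made up.
edges = [
    ("Transformer", "uses", "Attention"),
    ("Attention", "related_to", "Softmax"),
    ("Transformer", "trained_with", "Adam"),
    ("Adam", "extends", "SGD"),
]


def build_adjacency(triples):
    """Build an undirected adjacency map from relation triples."""
    adj = {}
    for source, _relation, target in triples:
        adj.setdefault(source, set()).add(target)
        adj.setdefault(target, set()).add(source)
    return adj


def neighbors(adj, node):
    """Return the directly connected concepts in sorted order."""
    return sorted(adj.get(node, ()))


def shortest_path(adj, start, goal):
    """Breadth-first search; returns the node list or None if unreachable."""
    if start not in adj or goal not in adj:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


adj = build_adjacency(edges)
path = shortest_path(adj, "Softmax", "SGD")
```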
Graph Analyzer (v0.28.0+)
- PageRank centrality scoring — identify core concepts
- Hub/Authority analysis — find highly connected pages
- Community auto-labeling and bridge node detection
- Suggested page generation for orphan concepts
- CLI: `llmwikify graph-analyze [--json] [--report]`
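To give a feel for how PageRank surfaces core concepts, here is a small power-iteration sketch over a link map. The damping factor, iteration count, function name, and sample graph are assumptions for illustration; the README does not say how llmwikify computes its scores.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a {page: [linked pages]} map."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Pages without outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical wiki link structure.
links = {
    "Index": ["Attention", "Optimizers"],
    "Attention": ["Transformer"],
    "Optimizers": ["Transformer"],
    "Transformer": ["Attention"],
}
scores = pagerank(links)
core = max(scores, key=scores.get)
```

Pages that receive links from other well-linked pages accumulate the highest scores, which is why a concept page can outrank the index page that merely points at it.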
Graph Visualization (v0.23.0+)
- Interactive HTML (pyvis), SVG (graphviz), GraphML (Gephi)
Additional
- File extraction — PDF, Word, Excel, PowerPoint, images, audio, YouTube, web URLs via MarkItDown
- File watcher — Watch `raw/` for new files, optional auto-ingest
- MCP server — 20 tools for LLM/Agent integration
- Performance — Batch inserts, PRAGMA optimizations, 10-20x faster than a naive implementation
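The file watcher idea can be approximated with simple polling: take a snapshot of `raw/`, take another one later, and diff them. This sketch deliberately swaps in polling for whatever event mechanism llmwikify uses (the README does not say), and `snapshot` and `new_files` are hypothetical helper names.

```python
import tempfile
from pathlib import Path


def snapshot(raw_dir: Path) -> set[Path]:
    """Collect every regular file currently under raw_dir."""
    return {p for p in raw_dir.rglob("*") if p.is_file()}


def new_files(before: set[Path], after: set[Path]) -> list[Path]:
    """Return files present in the later snapshot but not the earlier one."""
    return sorted(after - before)


# Demonstration in a temporary directory standing in for a wiki root.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    raw.mkdir()
    before = snapshot(raw)
    (raw / "paper.pdf").write_bytes(b"%PDF-")
    added = new_files(before, snapshot(raw))
    added_names = [p.name for p in added]
```

A watcher loop would repeat the snapshot on a timer and hand each new path to the ingest step when auto-ingest is enabled.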
📦 Installation
# Basic (zero dependencies)
pip install llmwikify
# Full (all features)
pip install "llmwikify[all]"   # quotes keep shells like zsh from globbing the brackets
# Development
git clone https://github.com/sn0wfree/llmwikify.git
cd llmwikify
pip install -e ".[dev]"
Optional Extras
| Extra | Purpose |
|---|---|
| `extractors` | Enhanced file extraction (PDF, Office, images, audio) |
| `mcp` | MCP server support |
| `watch` | File system watching |
| `graph` | Graph visualization + community detection |
| `web` | Web UI support |
| `all` | Everything above |
🚀 Quick Start
1. Initialize
llmwikify init
# Creates: raw/, wiki/, wiki.md, .llmwikify.db
2. Ingest Sources
llmwikify ingest document.pdf # Extract content
llmwikify ingest document.pdf --self-create # Auto-create wiki pages
llmwikify ingest https://example.com/article
llmwikify ingest https://youtube.com/watch?v=abc123
llmwikify batch raw/pdfs/ --self-create # Batch ingest
3. Search and Query
llmwikify search "topic" -l 10
llmwikify references "Page Name" --detail
llmwikify lint --format=brief
4. Analyze Knowledge Graph
llmwikify graph-analyze # PageRank, communities, suggestions
llmwikify graph-analyze --json # Programmatic output
llmwikify graph-analyze --report # Detailed suggested pages report
llmwikify suggest-synthesis # Cross-source synthesis suggestions
llmwikify knowledge-gaps # Knowledge gap analysis
5. MCP Server for Agents
llmwikify mcp # STDIO (default)
llmwikify mcp --transport http # HTTP
llmwikify serve --web # MCP + Web UI
💻 Python API
from llmwikify import Wiki
from pathlib import Path
wiki = Wiki(Path("/path/to/wiki"))
wiki.init()
# Ingest source
result = wiki.ingest_source("document.pdf")
# Create pages
wiki.write_page("Test Page", "# Title\n\nContent with [[Link]]", page_type="Concept")
# Search
results = wiki.search("topic", limit=10)
# Synthesize query answers (knowledge compounding)
wiki.synthesize_query(query="Q?", answer="A...", source_pages=["Page1", "Page2"])
# Knowledge graph
engine = wiki.get_relation_engine()
engine.get_neighbors("Concept")
engine.get_path("A", "B")
# Health check
lint_result = wiki.lint(generate_investigations=True)
# Cross-source synthesis
wiki.suggest_synthesis()
# Graph analysis
graph_result = wiki.graph_analyze()
🗄️ MCP Server (20 Tools)
| Tool | Description |
|---|---|
| `wiki_init` | Initialize wiki structure |
| `wiki_ingest` | Ingest a source file |
| `wiki_write_page` | Write/update a wiki page |
| `wiki_read_page` | Read a wiki page |
| `wiki_search` | Full-text search with snippets |
| `wiki_lint` | Health check |
| `wiki_status` | Status overview |
| `wiki_log` | Append log entry |
| `wiki_recommend` | Get recommendations |
| `wiki_build_index` | Build reference index |
| `wiki_read_schema` | Read wiki.md (schema) |
| `wiki_update_schema` | Update wiki.md |
| `wiki_synthesize` | Save query answer as wiki page |
| `wiki_sink_status` | Sink buffer overview |
| `wiki_references` | Page references |
| `wiki_graph` | Graph query/modify |
| `wiki_graph_analyze` | Graph export/detect/report/analyze |
| `wiki_analyze_source` | Analyze raw source file |
| `wiki_suggest_synthesis` | Cross-source synthesis suggestions |
| `wiki_knowledge_gaps` | Knowledge gap + outdated + redundancy |
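For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message for `wiki_search`; the argument names `query` and `limit` are assumptions, since the tool's input schema is not listed here, and the `tool_call` helper is hypothetical.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


# Argument names are illustrative; check the tool schema the server advertises.
message = tool_call(1, "wiki_search", {"query": "attention", "limit": 5})
```

In practice an MCP client library handles the initialization handshake and message framing, so agents rarely construct these payloads by hand.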
⚙️ Configuration
Create .wiki-config.yaml in your wiki root:
orphan_detection:
  exclude_patterns:
    - '^\d{4}-\d{2}-\d{2}$'   # Date pages
    - '^meeting-.*'           # Meeting notes
  archive_directories:
    - 'archive'
    - 'logs'
llm:
  provider: "openai"
  model: "gpt-4o"
  api_key: "env:OPENAI_API_KEY"
mcp:
  host: "127.0.0.1"
  port: 8765
  transport: "stdio"
See Configuration Guide for full options.
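To show what the two `exclude_patterns` entries would match, the following sketch applies them to a few page names with Python's `re` module. It assumes the patterns are tested against the bare page name; how llmwikify actually applies them is not stated here, and `is_excluded` is a hypothetical helper.

```python
import re

# The two patterns from the sample config above.
exclude_patterns = [r"^\d{4}-\d{2}-\d{2}$", r"^meeting-.*"]
compiled = [re.compile(pattern) for pattern in exclude_patterns]


def is_excluded(page_name: str) -> bool:
    """True if any exclude pattern matches the page name."""
    return any(rx.search(page_name) for rx in compiled)


candidates = ["2024-01-15", "meeting-standup", "Attention", "2024-01-15-notes"]
# Pages that would still be eligible for orphan reporting.
reported = [name for name in candidates if not is_excluded(name)]
```

Note that the date pattern is anchored at both ends, so a name like `2024-01-15-notes` is not treated as a date page.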
📊 CLI Commands
| Command | Description | Command | Description |
|---|---|---|---|
| `init` | Initialize wiki | `lint` | Health check |
| `ingest` | Ingest source | `status` | Status overview |
| `analyze-source` | Analyze source file | `log` | Record log |
| `write_page` | Create page | `references` | Show references |
| `read_page` | Read page | `build-index` | Build index |
| `search` | Full-text search | `batch` | Batch ingest |
| `synthesize` | Save query as page | `suggest-synthesis` | Cross-source analysis |
| `sink-status` | Sink overview | `knowledge-gaps` | Gap analysis |
| `watch` | Watch for files | `graph-query` | Graph queries |
| `graph-analyze` | Graph analysis | `export-graph` | Export visualization |
| `community-detect` | Detect communities | `report` | Surprise report |
| `mcp` | Start MCP server | `serve` | MCP + Web UI |
📖 Documentation
- Architecture — Technical architecture, data flows, components
- Configuration Guide — Detailed config options
- LLM Wiki Principles — Karpathy's original vision
- Migration Guide — Version migration notes
- Contributing — Development workflow
- Known Issues — Known issues and planned fixes
🧪 Testing
pytest # All 879 Python tests
pytest --cov=src/llmwikify # With coverage
pytest tests/test_p1_features.py # Specific module
🤝 Contributing
Contributions welcome! See CONTRIBUTING.md for development setup, coding standards, and contribution workflow.
🙏 Acknowledgments
- llm-wiki-kit — Original inspiration
- Andrej Karpathy — LLM Wiki Principles
- Obsidian — Markdown wiki platform
- MCP — Model Context Protocol
📄 License
MIT License — See LICENSE file.
📬 Contact
- GitHub: @sn0wfree
- Email: linlu1234567@sina.com
- Discussions: GitHub Discussions
File details
Details for the file llmwikify-0.30.0.tar.gz.
File metadata
- Download URL: llmwikify-0.30.0.tar.gz
- Upload date:
- Size: 275.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a7e6d4275d1c01c50bdc82da6852a300117bcdbcca440ab88818f0edf3a98333 |
| MD5 | 8906f7c7a336411fdbb5cf0e23bb44c6 |
| BLAKE2b-256 | fa8b665dbaf5128938ee15730f908637d2121ab5ecf3306b3e9ff393e60c61ff |
File details
Details for the file llmwikify-0.30.0-py3-none-any.whl.
File metadata
- Download URL: llmwikify-0.30.0-py3-none-any.whl
- Upload date:
- Size: 142.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 91e989c3b182ab88db0ffb6eff74cb1a893d47150fa6182654414096b1121632 |
| MD5 | 0a918da53b6569a7e9f3801fa0a067c4 |
| BLAKE2b-256 | b6447900f6ba4386575db3f57c57d67203edf20e6fa77adcf42afd62b4ccec71 |