LLM Wiki — Auto-generate knowledge base from code & docs. CGC code intelligence, Q&A chat, knowledge graph, drift detection.
llm-wiki
A Karpathy-style LLM Wiki compiler, packaged as a proper Python library.
It ingests raw markdown sources into a structured, interlinked wiki with:
- Sources / Entities / Concepts hierarchy
- 4-tier fuzzy dedup for concept pages
- Async Gemini client with retry + rate-limit handling
- Robust JSON parser for LLM output (3-tier recovery)
- Post-processing fix_links for broken wikilinks
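To make the "3-tier recovery" idea concrete, here is a minimal sketch of what tiered JSON parsing for LLM output typically looks like. The function name and exact tier order are assumptions for illustration, not the library's actual parser:

```python
import json
import re

def parse_llm_json(text: str):
    """Best-effort JSON recovery for LLM output (illustrative three-tier sketch)."""
    # Tier 1: the output is already valid JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Tier 2: the model wrapped the JSON in a markdown code fence.
    fenced = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", text, re.DOTALL)
    if fenced:
        try:
            return json.loads(fenced.group(1))
        except json.JSONDecodeError:
            pass
    # Tier 3: grab the outermost {...} or [...] span and try that.
    span = re.search(r"[\[{].*[\]}]", text, re.DOTALL)
    if span:
        return json.loads(span.group(0))
    raise ValueError("no JSON found in LLM output")
```

Each tier only runs when the stricter one before it fails, so well-formed output pays no extra cost.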
Install
pip install -e ".[dev]"
Quick start
export GEMINI_API_KEY=your-key
# Init vault
mkdir my-wiki && cd my-wiki
mkdir raw wiki
echo "# SkyJoy is a loyalty program" > raw/intro.md
# Ingest
llm-wiki ingest raw/intro.md --vault .
# Query
llm-wiki query "What is SkyJoy?" --vault .
# Fix broken wikilinks
llm-wiki fix-links --vault . --apply
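The idea behind `fix-links` is to resolve each broken `[[wikilink]]` to the closest existing page title. A minimal sketch of that idea, assuming fuzzy matching via `difflib` rather than the library's own matcher:

```python
import difflib
import re

WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

def fix_wikilinks(text: str, page_titles: list[str], cutoff: float = 0.8) -> str:
    """Rewrite broken [[wikilinks]] to the closest existing page title."""
    def repair(match: re.Match) -> str:
        target = match.group(1)
        if target in page_titles:
            return match.group(0)  # link already resolves; leave it alone
        close = difflib.get_close_matches(target, page_titles, n=1, cutoff=cutoff)
        # No sufficiently close page: keep the broken link for a human to review.
        return f"[[{close[0]}]]" if close else match.group(0)
    return WIKILINK.sub(repair, text)
```

A `--apply` flag like the one above would write the repaired text back; without it, a tool can just report the proposed rewrites.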
Code-to-Doc (ingest-code)
Generate wiki documentation directly from source code:
# Basic: Python-only call graph
llm-wiki ingest-code ./my-project --vault .
# With CGC: 19-language call graph + operational params extraction
pip install codegraphcontext kuzu
llm-wiki ingest-code ./my-project --vault . --cgc
# Incremental update (only changed modules)
llm-wiki ingest-code ./my-project --vault . --cgc --update
# Serve wiki with Q&A chat
llm-wiki serve --vault . --port 5757
CGC augmentation (--cgc)
When enabled, CodeGraphContext (MIT license) provides:
| Feature | Without --cgc | With --cgc |
|---|---|---|
| Call graph | Python-only (~200 edges) | 19 languages, 1400+ edges |
| Source budget | 8,000 chars/module | 16,000 chars/module |
| Operational params | Not extracted | Cron schedules, timeouts, constants, env vars |
| Class hierarchy | Not available | Parents, children, methods |
CGC auto-indexes on the first run (~2 minutes for 300 files); the index is cached in .cgc_index/.
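To give a feel for what "operational params extraction" scans for, here is a hedged sketch of the kind of pattern matching involved. The patterns and function name are illustrative assumptions, not CGC's actual extractors:

```python
import re

# Illustrative patterns only; CGC's real extractors are not shown here.
CRON = re.compile(r"""["']([\d*]+(?: [\d*/,-]+){4})["']""")
ENV_VAR = re.compile(r"""os\.(?:environ\.get|getenv)\(\s*["'](\w+)["']""")

def extract_operational_params(source: str) -> dict[str, list[str]]:
    """Scan source text for cron-like schedules and environment-variable reads."""
    return {
        "cron_schedules": CRON.findall(source),
        "env_vars": ENV_VAR.findall(source),
    }
```

Surfacing these values on a module's wiki page is what lets drift detection flag, say, a cron schedule that changed without a matching doc update.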
Programmatic API
import asyncio
from pathlib import Path

from llm_wiki import GeminiClient, Settings, Vault, ingest_source

async def main():
    settings = Settings(vault_root=Path("./my-wiki"))
    vault = Vault(settings.vault_root)
    llm = GeminiClient(api_key=settings.gemini_api_key)
    try:
        result = await ingest_source(
            source=vault.raw / "intro.md",
            vault=vault,
            llm=llm,
            settings=settings,
        )
        print(f"Ingested: {result.title}")
        print(f"Tokens: {result.input_tokens} in / {result.output_tokens} out")
    finally:
        await llm.close()

asyncio.run(main())
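The "retry + rate-limit handling" that the async client advertises usually amounts to exponential backoff with jitter. A generic sketch of that pattern, not GeminiClient's actual retry policy:

```python
import asyncio
import random

async def with_retries(call, *, attempts: int = 5, base_delay: float = 1.0):
    """Retry an async call with exponential backoff and jitter (illustrative)."""
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error
            # Exponential backoff with jitter, the usual remedy for 429s.
            await asyncio.sleep(base_delay * 2**attempt * (1 + random.random()))
```

The jitter term spreads out retries so that many concurrent ingests do not hammer the API in lockstep.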
Testing
make install # pip install -e ".[dev]"
make test # unit tests (fast, no network)
make test-integration # real LLM calls (requires GEMINI_API_KEY)
make lint # ruff + mypy
Architecture
src/llm_wiki/
├── config.py Settings via pydantic-settings
├── models.py Pydantic models (IngestResult, Concept, Entity)
├── exceptions.py Custom exceptions
├── dedup.py 4-tier fuzzy matching
├── json_parser.py 3-tier robust JSON parser
├── fix_links.py Post-process broken wikilinks
├── llm/
│ ├── base.py LLMProvider ABC
│ └── gemini.py Async Gemini client
├── vault/
│ ├── paths.py Vault path resolution
│ ├── frontmatter.py YAML frontmatter parsing
│ └── wikilinks.py Wikilink extraction
├── pipeline/
│ ├── ingest.py ingest_source() — 1-shot ingest
│ ├── query.py query_wiki() — Q&A against wiki
│ └── lint.py lint_vault() — structural + semantic
└── cli.py Typer CLI
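As a rough illustration of what tiered fuzzy matching in dedup.py might involve, here is a sketch of comparing concept titles at progressively looser normalizations. The specific tiers and cutoff are assumptions, not the library's actual logic:

```python
import difflib
import re

def title_tiers(title: str) -> list[str]:
    """Progressively looser normalizations of a concept title (illustrative four tiers)."""
    exact = title
    folded = title.casefold().strip()
    alnum = re.sub(r"[^a-z0-9]+", " ", folded).strip()  # drop punctuation
    singular = re.sub(r"s\b", "", alnum)                # crude de-pluralization
    return [exact, folded, alnum, singular]

def is_duplicate(a: str, b: str, fuzzy_cutoff: float = 0.9) -> bool:
    """Two titles collide if they match at the same tier, or fuzzily at the loosest tier."""
    tiers_a, tiers_b = title_tiers(a), title_tiers(b)
    if any(x == y for x, y in zip(tiers_a, tiers_b)):
        return True
    return difflib.SequenceMatcher(None, tiers_a[-1], tiers_b[-1]).ratio() >= fuzzy_cutoff
```

Cheap exact comparisons run first, so the relatively expensive fuzzy ratio is only computed for titles that survive every normalization tier.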
License
MIT
File details
Details for the file wiki_forge-1.0.0.tar.gz.
File metadata
- Download URL: wiki_forge-1.0.0.tar.gz
- Upload date:
- Size: 74.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d0b0618d2234b3948361a6a4af399bdd8a6100857cdf2283110e6387711848f |
| MD5 | 01affb714dbf95455d4be235fa4d767d |
| BLAKE2b-256 | aa3e9448fafcf976d6718d9b0b4721d8052dc52ca197aad345043f6c80d1660b |
File details
Details for the file wiki_forge-1.0.0-py3-none-any.whl.
File metadata
- Download URL: wiki_forge-1.0.0-py3-none-any.whl
- Upload date:
- Size: 2.4 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 154e2426c0cde6857aa35d3faa904ef0ad9d9dd4fa5b09e7362d15615ef13aaf |
| MD5 | 1f7307ce4e09813f5eaef5f74fd1aaf3 |
| BLAKE2b-256 | a9c618e6c1360c4f14157387e81d4f600aa1c6788907e1bd80ad2a74be19d051 |