Toolkit for building agent-maintained Obsidian-style wikis — link management, linting, document conversion, and agent coordination
agent-wiki
Toolkit for building LLM-maintained wikis. Handles the plumbing — link management, linting, file operations, document conversion, and agent coordination — so LLMs focus on content.
Based on the LLM Wiki pattern by Andrej Karpathy: instead of RAG (re-deriving knowledge on every query), the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files. The wiki is a compounding artifact: cross-references are already there, contradictions already flagged, synthesis already current. You curate sources and ask questions; the LLM does the bookkeeping.
Install
pip install agent-wiki
Quick Start
from agent_wiki import WikiRoot
wiki = WikiRoot("/path/to/wiki")
# Health check — find broken links, missing backlinks, orphan pages
issues = wiki.lint()
# Move a page and auto-update every [[link]] across the wiki
wiki.move("topics/Old Name.md", "topics/New Name.md")
# Convert a PDF to structured markdown with images
wiki.convert_pdf("paper.pdf", "processed/paper.md", max_dpi=150)
# Find all pages linking to a topic
wiki.backlinks("Sand Injectites")
# Search for a concept across all pages
wiki.find_references("polygonal fault")
# Wiki statistics
wiki.stats()
CLI
Every operation is also available from the command line — designed for AI agents calling via Bash.
# Lint
agent-wiki --root wiki/ lint
agent-wiki --root wiki/ lint --json # structured output for agents
# File operations (auto-updates all links)
agent-wiki --root wiki/ move old.md new.md
agent-wiki --root wiki/ merge source.md target.md
agent-wiki --root wiki/ rename "Old Name" "New Name"
# Document conversion
agent-wiki --root wiki/ convert pdf paper.pdf processed/paper.md --max-dpi 150
# Search
agent-wiki --root wiki/ backlinks "Page Name"
agent-wiki --root wiki/ find-references "some concept"
agent-wiki --root wiki/ stats
Use --json on any command for machine-readable output.
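An agent that drives the CLI from Python rather than Bash might wrap it as in the sketch below. The helper names `build_command` and `run_json` are hypothetical, and the shape of the JSON output is not documented here, so the parsed result is returned untyped. The sketch also assumes `--json` goes after the subcommand and its arguments, as in the `lint --json` example above.

```python
import json
import subprocess
from typing import Any, Callable


def build_command(root: str, *args: str, as_json: bool = True) -> list[str]:
    """Build an argv list such as: agent-wiki --root wiki/ lint --json."""
    cmd = ["agent-wiki", "--root", root, *args]
    if as_json:
        cmd.append("--json")
    return cmd


def run_json(cmd: list[str], runner: Callable[..., Any] = subprocess.run) -> Any:
    """Run the command and parse stdout as JSON.

    check=False because this sketch makes no assumption about which exit
    code the CLI uses when, for example, lint finds issues.
    """
    result = runner(cmd, capture_output=True, text=True, check=False)
    return json.loads(result.stdout)


lint_cmd = build_command("wiki/", "lint")
backlinks_cmd = build_command("wiki/", "backlinks", "Page Name")
```

Passing the argv as a list avoids shell quoting problems with page names that contain spaces.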
Features
Link Management
Obsidian-style [[wiki-links]] with full support for [[target|display text]]. The library builds a link graph, resolves links by filename stem (Obsidian's shortest-unique-path matching), and rewrites links automatically when files move.
from agent_wiki.links import parse_links, find_backlinks, rewrite_links
links = parse_links("See [[Topic A]] and [[Topic B|related topic]]")
# → [WikiLink(target="Topic A", ...), WikiLink(target="Topic B", display="related topic", ...)]
rewrite_links(text, "Old Name", "New Name")
# → updates [[Old Name]] → [[New Name]], preserves [[Old Name|display]] → [[New Name|display]]
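To make the parsing and rewriting behaviour concrete, here is a minimal standalone sketch using only the standard library. It is not the library's implementation: `WIKILINK_RE`, `SketchLink`, `sketch_parse_links`, and `sketch_rewrite_links` are hypothetical names, and the pattern ignores Obsidian extras such as `[[Page#Heading]]` anchors and embeds.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches [[target]] and [[target|display]] (hypothetical pattern, for illustration).
WIKILINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


@dataclass
class SketchLink:
    target: str
    display: Optional[str] = None


def sketch_parse_links(text: str) -> list[SketchLink]:
    """Return every wiki-link in the text, in order of appearance."""
    return [
        SketchLink(m.group(1).strip(), m.group(2))
        for m in WIKILINK_RE.finditer(text)
    ]


def sketch_rewrite_links(text: str, old: str, new: str) -> str:
    """Point links at `new` instead of `old`, keeping any display text."""

    def swap(m: re.Match) -> str:
        if m.group(1).strip() != old:
            return m.group(0)  # a different target: leave untouched
        display = m.group(2)
        return f"[[{new}|{display}]]" if display is not None else f"[[{new}]]"

    return WIKILINK_RE.sub(swap, text)


text = "See [[Topic A]] and [[Topic B|related topic]]"
links = sketch_parse_links(text)
renamed = sketch_rewrite_links(text, "Topic B", "Topic C")
# renamed == "See [[Topic A]] and [[Topic C|related topic]]"
```

Matching on the whole target, rather than doing a plain string replace, keeps a rename of "Topic" from touching links to "Topic A".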
Linting
Automated wiki health checks that catch real problems:
| Check | Severity | What it detects |
|---|---|---|
| Broken links | error | [[Target]] where no file matches |
| Missing backlinks | warning | Topic lists source but source doesn't link back |
| Orphan pages | warning | Pages with zero inbound links |
| Broken breadcrumbs | error | Navigation chain has dead links |
| Missing frontmatter | error | Required YAML fields missing per page type |
| Missing sections | warning | Topic without ## Sources, source without ## Topics |
| Dispute chronology | warning | Disputed claims with dates out of order |
| Split candidates | info | Pages exceeding 500 lines |
issues = wiki.lint()
for issue in issues:
    print(f"[{issue.severity.value}] {issue.file}: {issue.message}")
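As a rough illustration of what the broken-link and orphan checks involve, the sketch below runs both over an in-memory wiki. It is an assumption-laden simplification, not the library's linter: `sketch_lint` is a hypothetical name, links are resolved by filename stem only, and stem collisions are ignored, whereas Obsidian's shortest-unique-path matching handles them.

```python
import re
from pathlib import PurePosixPath

# Captures the target of [[target]], [[target|display]], or [[target#heading]].
LINK_RE = re.compile(r"\[\[([^\[\]|#]+)")


def sketch_lint(pages: dict[str, str]) -> dict[str, list]:
    """pages maps a relative path to its markdown text (hypothetical input)."""
    stems = {PurePosixPath(path).stem: path for path in pages}
    inbound = {path: 0 for path in pages}
    broken = []
    for path, text in pages.items():
        for match in LINK_RE.finditer(text):
            target = match.group(1).strip()
            resolved = stems.get(target)
            if resolved is None:
                broken.append((path, target))  # no file with that stem
            elif resolved != path:
                inbound[resolved] += 1  # self-links do not count
    orphans = sorted(path for path, count in inbound.items() if count == 0)
    return {"broken": broken, "orphans": orphans}


pages = {
    "index.md": "Start at [[Topic A]].",
    "topics/Topic A.md": "Derived from [[Paper 1]] and [[Missing Page]].",
    "sources/Paper 1.md": "Feeds [[Topic A]].",
    "topics/Lonely.md": "Nothing links here.",
}
report = sketch_lint(pages)
# report["broken"]  == [("topics/Topic A.md", "Missing Page")]
# report["orphans"] == ["index.md", "topics/Lonely.md"]
```

Note that `index.md` shows up as an orphan in this sketch because nothing links to it; a real linter would likely exempt entry pages.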
File Operations
Move, rename, or merge pages — all [[wiki-links]] across the entire wiki are updated automatically.
# Rename a page — finds it by name, renames file, updates all references
wiki.rename("Old Topic Name", "New Topic Name")
# Merge two pages — appends content, redirects all links, deletes source
wiki.merge("sources/duplicate.md", "sources/canonical.md")
Document Conversion
PDF to structured markdown using pymupdf4llm — proper headings, paragraphs, tables, and extracted images. Not flat text.
wiki.convert_pdf(
    "paper.pdf",
    "processed/paper.md",
    max_dpi=150,           # image resolution cap
    extract_images=True,   # images saved to img/ subfolder
)
Stubs for .docx, .pptx, .xlsx conversion are included for future implementation.
Kanban Agent Pipeline
A filesystem-based kanban system for coordinating multiple AI agents. Agents communicate through task cards — lightweight markdown files that move between columns.
from agent_wiki.kanban import create_card, claim, complete, list_cards
# Create a task card by hand (kanban_process normally creates these for you)
create_card(
    source_file="raw/paper.pdf",
    processed_file="processed/paper.md",
    kanban_dir="kanban/backlog/",
    agent="reader",
)
# Agent claims work (atomic move — prevents race conditions)
card = claim("kanban/backlog/paper.md", "kanban/processing/")
# Agent finishes — move to next stage
complete("kanban/processing/paper.md", "kanban/review/", agent="writer")
Batch processing with kanban_process:
# Scan for new files, convert, create task cards — one call
cards = wiki.kanban_process(
    input_dir="raw/",
    output_dir="processed/",
    completed_dir="./completed",  # relative to input_dir
    kanban_dir="kanban/backlog/",
)
# → converts new PDFs, moves originals to raw/completed/, creates task cards
Pipeline pattern:
raw/paper.pdf
  → kanban_process() converts + creates card
  → reader agent claims → writes source page → passes to writer
  → writer agent claims → writes topic pages → passes to orchestrator
  → orchestrator reviews → approves to wiki or sends back with actions
No database. The filesystem is the state. Agents coordinate by moving files.
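The "atomic move" idea behind claiming a card can be sketched with a plain `os.rename`. This is an illustration under stated assumptions, not the library's `claim`: `sketch_claim` is a hypothetical name, and the atomicity only holds when both columns live on the same filesystem.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def sketch_claim(card: Path, dest_dir: Path) -> Optional[Path]:
    """Try to move a card into dest_dir; return its new path, or None if lost.

    If two agents race for the same card, only one rename can succeed.
    The loser finds the source gone and gets FileNotFoundError.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / card.name
    try:
        os.rename(card, dest)
    except FileNotFoundError:
        return None  # another agent claimed it first
    return dest


with tempfile.TemporaryDirectory() as tmp:
    backlog = Path(tmp) / "kanban" / "backlog"
    processing = Path(tmp) / "kanban" / "processing"
    backlog.mkdir(parents=True)
    card = backlog / "paper.md"
    card.write_text("source: raw/paper.pdf\n")

    first = sketch_claim(card, processing)   # succeeds
    second = sketch_claim(card, processing)  # card already moved
    first_claimed = first is not None and first.exists()
    second_claimed = second is not None
```

Because the move itself is the claim, no lock file or database is needed: whichever agent completes the rename owns the card.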
Project Layout
agent-wiki works with any Obsidian-compatible wiki. A typical project:
my-wiki-project/
├── raw/ # source documents (immutable)
│ └── completed/ # originals moved here after processing
├── wiki/ # ← this is your WikiRoot
│ ├── processed/ # converted markdown (auto-generated)
│ ├── sources/ # source pages (one per paper)
│ ├── topics/ # topic pages (synthesized knowledge)
│ ├── kanban/ # task cards for agent coordination
│ │ ├── backlog/
│ │ ├── processing/
│ │ ├── review/
│ │ └── done/
│ ├── index.md
│ └── log.md
├── instructions/ # agent prompts and workflow docs
└── agent-wiki.yaml # optional config
Configuration
Optional agent-wiki.yaml at the project root:
root: wiki/
kanban: wiki/kanban/
processed: wiki/processed/
raw: raw/
completed: raw/completed/
Or configure programmatically:
wiki = WikiRoot("wiki/", kanban_dir="wiki/kanban/", processed_dir="wiki/processed/")
The Pattern
The LLM Wiki pattern has three layers:
- Raw sources — your curated documents. Immutable. The LLM reads but never modifies.
- The wiki — LLM-generated markdown. Summaries, topic pages, cross-references. The LLM owns this entirely.
- The schema — instructions that tell the LLM how the wiki is structured and what workflows to follow.
Three operations:
- Ingest — process a new source into the wiki. Creates a source page, updates topic pages, maintains cross-references.
- Query — answer questions from the wiki. Good answers get filed back as new pages.
- Lint — health-check the wiki. Find broken links, orphan pages, contradictions, missing cross-references.
The human curates sources and asks questions. The LLM does the bookkeeping.
For the full pattern description, see Andrej Karpathy's LLM Wiki gist.
Requirements
- Python 3.12+
- Dependencies: pymupdf, pymupdf4llm, Pillow
License
MIT