cite-citadel — an LLM-maintained, fully-cited personal wiki in Google OKF, fed by a coding-agent CLI (claude/copilot/gemini), with an MCP search server. Every fact is cited to its source; nothing is invented.
Project description
cite-citadel
A fortress of cited knowledge. An LLM-maintained, fully-cited personal wiki — every fact is attested to its source, nothing is invented.
An LLM-maintained personal wiki in Google's Open Knowledge Format (OKF), with an MCP server so an AI can search and read it — a KISS, pure-Python 3.12 take on Andrej Karpathy's LLM-Wiki pattern.
Drop arbitrary files into raw/ (markdown, code, JSON/CSV, PDF, PowerPoint/Word/Excel —
.pptx/.docx/.xlsx and legacy .ppt/.doc/.xls — even images, in any sub-folder). One
agentic CLI session per source folds it into a cross-linked OKF wiki under wiki/ — routing each
fact to the page it best fits and splitting/merging pages as the corpus grows, rather than making
one page per file. Office files have their text extracted automatically; images are read visually;
a file too big for one context window is folded in over several passes; the same document in two
formats (report.pdf + report.pptx) is ingested once; and any source that can't be ingested is
recorded (with the reason) in wiki/sources/index.md. Every fact is cited back to its raw/
source, and the model uses only what is in raw/. An AI client then queries the synthesized
wiki over MCP instead of re-reading your notes.
The CLI is citadel; the PyPI package is cite-citadel. The wiki/ directory is the
database — no SQLite, no vector store. Ingest runs through a coding-agent CLI you already have
(claude, copilot, or gemini), so it uses your existing subscription and needs no API key.
Three guarantees that hold as the wiki grows (full rules in
citadel/rules/schema.md):
- Stays organized: ingest merges, splits, and deletes pages by fit; it never piles up one page per raw file.
- Links keep working: merges/renames repoint inbound cross-links; any dangling link fails citadel lint/citadel check.
- Honest provenance: raw facts are restated faithfully and cite their source as [^sN]. A fact the model adds from its own knowledge must be labeled [^llmN], never disguised as a raw citation.
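The "links keep working" guarantee comes down to checking that every relative link in a page resolves to an existing file. The following is a minimal sketch of such a check, not citadel's actual lint implementation; the function name `find_dangling_links`, the simplified link pattern, and the demo file names are assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path

# Simplified, hypothetical pattern for inline markdown links: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_dangling_links(wiki_root: Path) -> list[tuple[Path, str]]:
    """Return (page, target) pairs whose relative link target does not exist."""
    dangling = []
    for page in sorted(wiki_root.rglob("*.md")):
        for target in LINK_RE.findall(page.read_text(encoding="utf-8")):
            # Skip external URLs and pure in-page anchors.
            if "://" in target or target.startswith(("#", "mailto:")):
                continue
            path_part = target.split("#", 1)[0]
            if not (page.parent / path_part).exists():
                dangling.append((page.relative_to(wiki_root), target))
    return dangling


def demo() -> list[tuple[Path, str]]:
    """Build a throwaway wiki with one valid and one broken link."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "concepts").mkdir()
        (root / "concepts" / "caffeine.md").write_text("# Caffeine\n", encoding="utf-8")
        (root / "index.md").write_text(
            "[ok](concepts/caffeine.md) [broken](concepts/theine.md)\n",
            encoding="utf-8",
        )
        return find_dangling_links(root)
```

Calling `demo()` reports only the link from index.md to the missing concepts/theine.md page. A real gate would also need to handle reference-style links and links inside code spans, which this sketch ignores.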
Install
uv sync # creates .venv, installs deps (just mcp + pyyaml) + the citadel CLI
Run commands with the portable invocation that works the same on Linux/macOS/Windows:
uv run python -m citadel <subcommand>
(uv run citadel … is a shorthand but can be blocked by antivirus on Windows; the python -m form
and the bundled .\citadel wrapper need no .exe. Prefer pip? pip install -e '.[dev]' works too.)
Quickstart
Ingest shells out to a coding-agent CLI, so install one and log in to it (the default is claude: run claude
once, then /login). Everything else (search, tags, check, lint, view, MCP) needs no agent CLI.
cp ~/notes/*.md raw/ # drop in any text-bearing files
uv run python -m citadel ingest # fold new/changed sources into wiki/
uv run python -m citadel search "caffeine" # ranked keyword search (--tag to filter)
uv run python -m citadel view # open the offline, single-file HTML viewer
uv run python -m citadel serve # run the MCP server (stdio)
Two health checks, both offline and CI-friendly:
uv run python -m citadel check # strict per-page gate (fields, citations, links); ingest runs it too
uv run python -m citadel lint # health report (contradictions, orphans, fabricated sources, …)
Ingest is idempotent — a committed wiki/.citadel_ingested.json manifest tracks each source's
hash and the model that imported it — and keeps the wiki in sync when a raw file is edited,
deleted, or moved. Configure the backend in .env (citadel init scaffolds it from the
packaged template, or copy citadel/templates/env.example):
CITADEL_LLM_CLI=claude # claude | copilot | gemini
CITADEL_INGEST_MODEL=sonnet # claude model alias/id
citadel/templates/env.example documents every knob — timeouts,
verbose/transcript debugging, an out-of-workspace wiki/ or raw/ on a network drive, ingesting a
whole git repo as one source, the wiki's target language (CITADEL_WIKI_LANG, default en),
PDF figure reading (CITADEL_PDF_MODE=text|images), and opt-in persona/style capture
(CITADEL_STYLE_PROFILES).
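The idempotent ingest described above rests on comparing each source's current hash with the one recorded in the manifest. A minimal sketch of that comparison follows. The manifest layout used here (relative path mapped to a `sha256` and a `model` entry) and the function names are hypothetical; the real wiki/.citadel_ingested.json may be structured differently.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file's bytes in chunks so large sources are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_ingest(raw_root: Path, manifest: dict[str, dict]) -> dict[str, list[str]]:
    """Classify sources as new, changed, or deleted relative to the manifest.

    Assumed (hypothetical) manifest shape: {relative path: {"sha256": ..., "model": ...}}.
    """
    current = {
        p.relative_to(raw_root).as_posix(): file_sha256(p)
        for p in raw_root.rglob("*")
        if p.is_file()
    }
    return {
        "new": sorted(k for k in current if k not in manifest),
        "changed": sorted(
            k for k, h in current.items()
            if k in manifest and manifest[k]["sha256"] != h
        ),
        "deleted": sorted(k for k in manifest if k not in current),
    }


def demo() -> dict[str, list[str]]:
    """Run the plan against a throwaway raw/ folder and a hand-built manifest."""
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp)
        (raw / "a.md").write_text("unchanged", encoding="utf-8")
        (raw / "b.md").write_text("edited", encoding="utf-8")
        (raw / "c.md").write_text("brand new", encoding="utf-8")
        manifest = {
            "a.md": {"sha256": file_sha256(raw / "a.md"), "model": "sonnet"},
            "b.md": {"sha256": "0" * 64, "model": "sonnet"},
            "gone.md": {"sha256": "1" * 64, "model": "sonnet"},
        }
        return plan_ingest(raw, manifest)
```

An unchanged file appears in none of the three lists, which is what makes a repeated run a no-op. In this sketch a moved file shows up as one deleted path plus one new path; matching the two by hash would be one way to recognize the move.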
How it works
Three layers (Karpathy's split; citadel/rules/schema.md has the
authoritative rules, which the ingest agent reads — referenced by path — every run):
- raw/: immutable sources; ingest reads but never edits them.
- wiki/: the LLM-owned OKF bundle. Markdown pages with YAML frontmatter, routed by kind into concepts/, objects/, systems/, persons/, organizations/, projects/, abbreviations/, misc/, densely cross-linked, each fact carrying a citation. The reserved index.md, log.md, and sources/index.md are generated, not authored.
- citadel/rules/: the schema/rules layer, made up of schema.md (the format contract), core.md (agent behavior), per-lifecycle tasks/, per-file-type formats/, and agent-judged genres/ briefs. Editing them changes how the wiki is built with no code change. The rules live in the package so a pip install carries them; the repo-root SCHEMA.md/AGENT_INGEST.md are just pointer stubs.
Per-fact provenance is the load-bearing rule. Every factual sentence ends with a GitHub-Flavored
Markdown footnote, defined in a trailing ## Sources section that links to the originating raw/
file:
Robusta has about twice the caffeine of Arabica.[^s1]
## Sources
[^s1]: [raw/coffee-guide.md](../../raw/coffee-guide.md) — coffee guide (ingested 2026-06-30)
This renders on GitHub, is trivially greppable, and needs zero custom tooling. A claim that can't be
cited is dropped, never invented; conflicting sources produce a > [!CONTRADICTION] callout. The
wiki/ folder also opens as-is as an Obsidian vault.
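Because the citations are plain footnotes, a consistency check needs only a couple of regular expressions. The sketch below is an illustration under stated assumptions, not the project's own checker: it flags footnote labels that are referenced but never defined, and definitions that are never referenced. The function name `check_footnotes` and the sample page (including raw/other-notes.md) are hypothetical.

```python
import re

# A reference is [^label] not followed by a colon; a definition starts a line
# with [^label]: (GitHub-Flavored Markdown footnote syntax).
REF_RE = re.compile(r"\[\^([A-Za-z0-9_-]+)\](?!:)")
DEF_RE = re.compile(r"^\[\^([A-Za-z0-9_-]+)\]:", re.MULTILINE)


def check_footnotes(page: str) -> dict[str, set[str]]:
    """Return footnote labels that are undefined or unused in one page."""
    defined = set(DEF_RE.findall(page))
    referenced = set(REF_RE.findall(page))
    return {
        "undefined": referenced - defined,
        "unused": defined - referenced,
    }


# Hypothetical page: s2 is cited without a definition, s3 is defined but unused.
PAGE = """\
Robusta has about twice the caffeine of Arabica.[^s1]
A second claim whose source definition is missing.[^s2]

## Sources
[^s1]: [raw/coffee-guide.md](../../raw/coffee-guide.md)
[^s3]: [raw/other-notes.md](../../raw/other-notes.md)
"""

if __name__ == "__main__":
    print(check_footnotes(PAGE))
```

For the sample page the function reports `s2` as undefined and `s3` as unused. It does not verify that every factual sentence carries a citation, which requires judging what counts as a factual sentence.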
Example corpus
The bundled raw/ is a deliberately overlapping coffee + tea corpus — 10 files in mixed styles
(reference, prose, lab notes, FAQ, brand blog) with facts that repeat, contradict, and hide in one
place, plus one deliberately-false sourced claim. Run uv run python -m citadel ingest and watch the
wiki reorganize itself. The verify-example skill (.claude/skills/verify-example/) ingests it and
grades the result against a ground-truth answer key — an end-to-end test of the three guarantees.
See the result without running anything. Browse the generated demo wiki on GitHub at
wiki/index.md — GitHub renders the OKF pages natively, so the [^sN] citations,
cross-links, glossary, and > [!CONTRADICTION] callouts all show inline. For the richer, interactive
view — the cross-link graph, tags, and the cited raw sources embedded — open the live demo at
markusneusinger.github.io/cite-citadel, the
offline single-file viewer regenerated from the wiki on every push.
MCP server
citadel serve exposes seven tools over stdio: six read-only tools (wiki_search, wiki_read, wiki_index,
wiki_sources, wiki_tags, wiki_validate) and wiki_ingest (the only mutating one).
Wire it into an MCP client (e.g. Claude Desktop):
{
  "mcpServers": {
    "citadel": {
      "command": "uv",
      "args": ["run", "python", "-m", "citadel", "serve"],
      "env": { "CITADEL_LLM_CLI": "claude", "CITADEL_INGEST_MODEL": "sonnet" }
    }
  }
}
An AI can then wiki_index() to orient, wiki_search(...) to find pages, and wiki_read(...) to
pull full cited context — answering from your synthesized wiki instead of re-retrieving documents.
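The same server can be reached from a script. The sketch below uses the MCP Python SDK's stdio client to start citadel serve with the command line from the configuration above and list the tools it advertises. It deliberately stops at listing: the argument names each tool accepts are not documented here, so they are best read from the server's own tool schemas rather than assumed. The function names `list_citadel_tools` and `main` are illustrative.

```python
import asyncio

# Same command line as the MCP client configuration above.
SERVER_COMMAND = "uv"
SERVER_ARGS = ["run", "python", "-m", "citadel", "serve"]


async def list_citadel_tools() -> list[str]:
    """Start `citadel serve` over stdio and return the advertised tool names."""
    # Imported here so this module still loads where the MCP SDK is absent.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command=SERVER_COMMAND, args=SERVER_ARGS)
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


def main() -> None:
    """Print the tool names; run this from the project directory."""
    print(asyncio.run(list_citadel_tools()))
```

Calling `main()` from the project directory should print the seven tool names listed earlier, assuming uv and the project's dependencies are installed.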
Reference
- citadel/rules/README.md: index of the rules tree the ingest agent follows, namely schema.md (structure, routing, and provenance rules), core.md (operational behavior), plus the tasks/, formats/, and genres/ briefs.
- citadel/templates/env.example: every configuration knob (the citadel init .env template; the repo-root .env.example is a pointer stub).
- docs/karpathy-llm-wiki.md · docs/okf-reference.md: the pattern and the format.
- CLAUDE.md: architecture notes for contributors.
Download files
File details
Details for the file cite_citadel-0.1.0.tar.gz.
File metadata
- Download URL: cite_citadel-0.1.0.tar.gz
- Upload date:
- Size: 406.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ad21f5642d04b8a8c8eb8187650c9360d78680edbbfbbb8b97a7e703acd06c90 |
| MD5 | d3ff2729e22a96d6b4ae62426d002c32 |
| BLAKE2b-256 | 8096fb0e6bdf1656e7e281d0eab5b5d48ab2c9088c1ff326c0bee3532d62858b |
Provenance
The following attestation bundles were made for cite_citadel-0.1.0.tar.gz:
Publisher: release.yml on MarkusNeusinger/cite-citadel
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: cite_citadel-0.1.0.tar.gz
  - Subject digest: ad21f5642d04b8a8c8eb8187650c9360d78680edbbfbbb8b97a7e703acd06c90
- Sigstore transparency entry: 2050146103
- Sigstore integration time:
- Permalink: MarkusNeusinger/cite-citadel@145079c35444f8260025c1449639545fc6576523
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/MarkusNeusinger
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@145079c35444f8260025c1449639545fc6576523
- Trigger Event: push
File details
Details for the file cite_citadel-0.1.0-py3-none-any.whl.
File metadata
- Download URL: cite_citadel-0.1.0-py3-none-any.whl
- Upload date:
- Size: 184.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 686ada7c3b676a000db2110ee3228dd4c804a7bc3a56aa3327cfda815ce55199 |
| MD5 | bad37297c160822a03a36c4b829b9732 |
| BLAKE2b-256 | 3f4f92ce8708f1369c48453ecc65ab46e047e1711ecc317adc363daa9ea13651 |
Provenance
The following attestation bundles were made for cite_citadel-0.1.0-py3-none-any.whl:
Publisher: release.yml on MarkusNeusinger/cite-citadel
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: cite_citadel-0.1.0-py3-none-any.whl
  - Subject digest: 686ada7c3b676a000db2110ee3228dd4c804a7bc3a56aa3327cfda815ce55199
- Sigstore transparency entry: 2050146723
- Sigstore integration time:
- Permalink: MarkusNeusinger/cite-citadel@145079c35444f8260025c1449639545fc6576523
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/MarkusNeusinger
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@145079c35444f8260025c1449639545fc6576523
- Trigger Event: push