cite-citadel — an LLM-maintained, fully-cited personal wiki in Google OKF, fed by a coding-agent CLI (claude/copilot/gemini), with an MCP search server. Every fact is cited to its source; nothing is invented.

cite-citadel

A fortress of cited knowledge. An LLM-maintained, fully-cited personal wiki — every fact is attested to its source, nothing is invented.

An LLM-maintained personal wiki in Google's Open Knowledge Format (OKF), with an MCP server so an AI can search and read it — a KISS, pure-Python 3.12 take on Andrej Karpathy's LLM-Wiki pattern.

Drop arbitrary files into raw/, in any sub-folder: markdown, code, JSON/CSV, PDF, PowerPoint/Word/Excel (.pptx/.docx/.xlsx and legacy .ppt/.doc/.xls), even images. One agentic CLI session per source folds it into a cross-linked OKF wiki under wiki/, routing each fact to the page it best fits and splitting or merging pages as the corpus grows, rather than making one page per file. Office files have their text extracted automatically; images are read visually; a file too big for one context window is folded in over several passes; the same document in two formats (report.pdf + report.pptx) is ingested once; and any source that can't be ingested is recorded, with the reason, in wiki/sources/index.md. Every fact is cited back to its raw/ source, and the model uses only what is in raw/. An AI client then queries the synthesized wiki over MCP instead of re-reading your notes.

The CLI is citadel; the PyPI package is cite-citadel. The wiki/ directory is the database — no SQLite, no vector store. Ingest runs through a coding-agent CLI you already have (claude, copilot, or gemini), so it uses your existing subscription and needs no API key.

Three guarantees that hold as the wiki grows (full rules in citadel/rules/schema.md):

  • Stays organized — ingest merges, splits, and deletes pages by fit; it never piles up one page per raw file.
  • Links keep working — merges/renames repoint inbound cross-links; any dangling link fails citadel lint / citadel check.
  • Honest provenance — raw facts are restated faithfully and cite their source as [^sN]. A fact the model adds from its own knowledge must be labeled [^llmN], never disguised as a raw citation.
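As a rough illustration of the second guarantee, a dangling-link check can be sketched with the standard library alone. This is not citadel's actual implementation: `find_dangling_links` and the regex are hypothetical, and the sketch assumes cross-links are ordinary relative markdown links between files.

```python
import re
from pathlib import Path

# Captures the target of inline markdown links such as [text](../concepts/caffeine.md).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_dangling_links(wiki_root: Path) -> list[tuple[Path, str]]:
    """Return (page, target) pairs whose relative link target does not exist."""
    dangling = []
    for page in sorted(wiki_root.rglob("*.md")):
        text = page.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # Skip external URLs, mail links, and pure in-page anchors.
            if "://" in target or target.startswith(("#", "mailto:")):
                continue
            # Drop any "#section" suffix before resolving the file path.
            path_part = target.split("#", 1)[0]
            if not (page.parent / path_part).resolve().exists():
                dangling.append((page, target))
    return dangling
```

A CI job could fail whenever this list is non-empty, which is the behavior the guarantee describes for citadel lint and citadel check.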

Install

uv sync   # creates .venv, installs deps (just mcp + pyyaml) + the citadel CLI

Run commands with the portable invocation that works the same on Linux/macOS/Windows:

uv run python -m citadel <subcommand>

(uv run citadel … is a shorthand but can be blocked by antivirus on Windows; the python -m form and the bundled .\citadel wrapper need no .exe. Prefer pip? pip install -e '.[dev]' works too.)

Quickstart

Ingest shells out to a coding-agent CLI, so install and log into one first (the default is claude: run claude once and /login). Everything else (search, tags, check, lint, view, MCP) needs no coding-agent CLI.

cp ~/notes/*.md raw/                          # drop in any text-bearing files
uv run python -m citadel ingest               # fold new/changed sources into wiki/
uv run python -m citadel search "caffeine"    # ranked keyword search (--tag to filter)
uv run python -m citadel view                 # open the offline, single-file HTML viewer
uv run python -m citadel serve                # run the MCP server (stdio)

Two health checks, both offline and CI-friendly:

uv run python -m citadel check    # strict per-page gate (fields, citations, links); ingest runs it too
uv run python -m citadel lint     # health report (contradictions, orphans, fabricated sources, …)

Ingest is idempotent — a committed wiki/.citadel_ingested.json manifest tracks each source's hash and the model that imported it — and keeps the wiki in sync when a raw file is edited, deleted, or moved. Configure the backend in .env (citadel init scaffolds it from the packaged template, or copy citadel/templates/env.example):

CITADEL_LLM_CLI=claude        # claude | copilot | gemini
CITADEL_INGEST_MODEL=sonnet   # claude model alias/id

citadel/templates/env.example documents every knob: timeouts, verbose/transcript debugging, an out-of-workspace wiki/ or raw/ on a network drive, ingesting a whole git repo as one source, the wiki's target language (CITADEL_WIKI_LANG, default en), PDF figure reading (CITADEL_PDF_MODE=text|images), and opt-in persona/style capture (CITADEL_STYLE_PROFILES).
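The idempotent ingest described above comes down to comparing content hashes against a stored manifest. Here is a minimal sketch, assuming a manifest that maps each source path to a SHA-256 digest. The real layout of wiki/.citadel_ingested.json is not documented here, and `file_sha256` and `plan_ingest` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file's bytes so edits are detected regardless of timestamps."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_ingest(raw_root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Classify sources as new, changed, unchanged, or deleted.

    `manifest` maps a raw-relative POSIX path to the hash recorded at the
    last ingest (an assumed shape, not citadel's actual schema).
    """
    current = {
        p.relative_to(raw_root).as_posix(): file_sha256(p)
        for p in sorted(raw_root.rglob("*"))
        if p.is_file()
    }
    plan = {"new": [], "changed": [], "unchanged": [], "deleted": []}
    for rel, digest in current.items():
        if rel not in manifest:
            plan["new"].append(rel)
        elif manifest[rel] != digest:
            plan["changed"].append(rel)
        else:
            plan["unchanged"].append(rel)
    # Anything recorded earlier but no longer on disk was deleted or moved.
    plan["deleted"] = sorted(set(manifest) - set(current))
    return plan
```

With an up-to-date manifest every source lands in "unchanged", so a second run has nothing to do. In this sketch a moved file appears as one "deleted" and one "new" entry with the same hash, which a fuller implementation could pair up.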

How it works

Three layers (Karpathy's split; citadel/rules/schema.md has the authoritative rules, which the ingest agent reads — referenced by path — every run):

  1. raw/ — immutable sources; ingest reads but never edits them.
  2. wiki/ — the LLM-owned OKF bundle: markdown pages with YAML frontmatter, routed by kind into concepts/, objects/, systems/, persons/, organizations/, projects/, abbreviations/, misc/, densely cross-linked, each fact carrying a citation. The reserved index.md, log.md, and sources/index.md are generated, not authored.
  3. citadel/rules/ — the schema/rules layer: schema.md (the format contract) + core.md (agent behavior) + per-lifecycle tasks/, per-file-type formats/, and agent-judged genres/ briefs. Editing them changes how the wiki is built with no code change. The rules live in the package so a pip install carries them; the repo-root SCHEMA.md/AGENT_INGEST.md are just pointer stubs.

Per-fact provenance is the load-bearing rule. Every factual sentence ends with a GitHub-Flavored Markdown footnote, defined in a trailing ## Sources section that links to the originating raw/ file:

Robusta has about twice the caffeine of Arabica.[^s1]

## Sources

[^s1]: [raw/coffee-guide.md](../../raw/coffee-guide.md) — coffee guide (ingested 2026-06-30)

This renders on GitHub, is trivially greppable, and needs zero custom tooling. A claim that can't be cited is dropped, never invented; conflicting sources produce a > [!CONTRADICTION] callout. The wiki/ folder also opens as-is as an Obsidian vault.
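Because the citations are plain footnotes, a basic provenance check needs nothing beyond regular expressions. The sketch below is an illustration only, not citadel's own check logic: `check_footnotes` is a hypothetical helper, and the second sentence in the sample page is made up to trigger a failure.

```python
import re

# A definition starts a line ("[^s1]: ..."); a reference appears inline ("...[^s1]").
DEF_RE = re.compile(r"^\[\^([A-Za-z0-9_-]+)\]:", re.MULTILINE)
REF_RE = re.compile(r"\[\^([A-Za-z0-9_-]+)\](?!:)")


def check_footnotes(markdown: str) -> dict[str, set[str]]:
    """Report footnote labels that are referenced but undefined, or defined but unused."""
    defined = set(DEF_RE.findall(markdown))
    referenced = set(REF_RE.findall(markdown))
    return {
        "undefined": referenced - defined,
        "unused": defined - referenced,
    }


page = """Robusta has about twice the caffeine of Arabica.[^s1]
A made-up sentence whose citation has no definition.[^s2]

## Sources

[^s1]: [raw/coffee-guide.md](../../raw/coffee-guide.md)
"""

print(check_footnotes(page))
```

Here s2 is reported as undefined because the Sources section never defines it, while s1 passes.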

Example corpus

The bundled raw/ is a deliberately overlapping coffee + tea corpus — 10 files in mixed styles (reference, prose, lab notes, FAQ, brand blog) with facts that repeat, contradict, and hide in one place, plus one deliberately-false sourced claim. Run uv run python -m citadel ingest and watch the wiki reorganize itself. The verify-example skill (.claude/skills/verify-example/) ingests it and grades the result against a ground-truth answer key — an end-to-end test of the three guarantees.

See the result without running anything. Browse the generated demo wiki on GitHub at wiki/index.md — GitHub renders the OKF pages natively, so the [^sN] citations, cross-links, glossary, and > [!CONTRADICTION] callouts all show inline. For the richer, interactive view — the cross-link graph, tags, and the cited raw sources embedded — open the live demo at markusneusinger.github.io/cite-citadel, the offline single-file viewer regenerated from the wiki on every push.

MCP server

citadel serve exposes seven tools over stdio: six read-only ones (wiki_search, wiki_read, wiki_index, wiki_sources, wiki_tags, wiki_validate) and one mutating one (wiki_ingest). Wire it into an MCP client (e.g. Claude Desktop):

{
  "mcpServers": {
    "citadel": {
      "command": "uv",
      "args": ["run", "python", "-m", "citadel", "serve"],
      "env": { "CITADEL_LLM_CLI": "claude", "CITADEL_INGEST_MODEL": "sonnet" }
    }
  }
}

An AI can then wiki_index() to orient, wiki_search(...) to find pages, and wiki_read(...) to pull full cited context — answering from your synthesized wiki instead of re-retrieving documents.
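To make the search step concrete, here is a minimal ranked keyword search over in-memory pages. It is a sketch under assumptions: citadel's real ranking is not described in this README, and `tokenize`, `rank_pages`, and the title weighting are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_pages(
    pages: dict[str, str], query: str, title_weight: int = 3
) -> list[tuple[str, int]]:
    """Score pages by query-term frequency, counting file-name hits extra.

    `pages` maps a page path to its markdown body. Pages that score zero
    are omitted from the result.
    """
    terms = tokenize(query)
    scored = []
    for path, body in pages.items():
        body_counts = Counter(tokenize(body))
        title_tokens = set(tokenize(path.rsplit("/", 1)[-1]))
        score = sum(
            body_counts[t] + (title_weight if t in title_tokens else 0)
            for t in terms
        )
        if score > 0:
            scored.append((path, score))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

A client would typically read only the top few results in full, keeping the context small while still getting cited text.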

Download files


Source Distribution

cite_citadel-0.1.0.tar.gz (406.4 kB view details)

Uploaded Source

Built Distribution


cite_citadel-0.1.0-py3-none-any.whl (184.6 kB view details)

Uploaded Python 3

File details

Details for the file cite_citadel-0.1.0.tar.gz.

File metadata

  • Download URL: cite_citadel-0.1.0.tar.gz
  • Upload date:
  • Size: 406.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cite_citadel-0.1.0.tar.gz:

  • SHA256: ad21f5642d04b8a8c8eb8187650c9360d78680edbbfbbb8b97a7e703acd06c90
  • MD5: d3ff2729e22a96d6b4ae62426d002c32
  • BLAKE2b-256: 8096fb0e6bdf1656e7e281d0eab5b5d48ab2c9088c1ff326c0bee3532d62858b


Provenance

The following attestation bundles were made for cite_citadel-0.1.0.tar.gz:

Publisher: release.yml on MarkusNeusinger/cite-citadel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cite_citadel-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: cite_citadel-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 184.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cite_citadel-0.1.0-py3-none-any.whl:

  • SHA256: 686ada7c3b676a000db2110ee3228dd4c804a7bc3a56aa3327cfda815ce55199
  • MD5: bad37297c160822a03a36c4b829b9732
  • BLAKE2b-256: 3f4f92ce8708f1369c48453ecc65ab46e047e1711ecc317adc363daa9ea13651


Provenance

The following attestation bundles were made for cite_citadel-0.1.0-py3-none-any.whl:

Publisher: release.yml on MarkusNeusinger/cite-citadel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
