Open source knowledge management pipeline — append-only intake, tiered compilation, configurable schemas

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

TriKro

These details have not been verified by PyPI

Project description

Athenaeum

Athena with her owl companion, holding an open book showing a knowledge graph

Production-grade agentic memory for teams deploying multiple AI agents: append-only intake, a tiered librarian that compiles raw observations into a trustworthy wiki, and a sidecar that makes recall happen passively on every turn.

Athenaeum follows trunk-style development, with develop as the active branch and main as the released-revision pointer.

Is this for me? If you're running more than one agent on shared knowledge — or if you want agents and humans reading and writing the same institutional memory — yes. If you're building a single-user chatbot, mem0 or Letta may be a better fit.

Why Athenaeum

Four design choices separate a production memory system from a single-user markdown file. Each one fixes something that quietly breaks when a team scales past one agent:

Sources as first-class objects — every claim carries provenance, the way Wikipedia does. An unfootnoted fact is an assertion.
The librarian — a tiered compilation pipeline — agents can only append to raw intake. A separate compiler is the only writer to the wiki. Safety from structure, not trust.
Passive recall — a hybrid FTS5+vector search fires on every turn and injects breadcrumbs into context. The agent doesn't have to remember to look.
An editable observation filter — what the agent saves is governed by a prompt you can read, edit, and audit. Not a black box.

Full rationale, comparison with alternatives (Claude memory, Anthropic's memory tool, RAG, Karpathy's gist, mem0/Letta/Zep/Cognee), and the lessons from running it on our own operations live in docs/why-athenaeum.md. For the companion blog post: What We Learned Running Our Own Operations on Agentic Memory.

Installation

pip install athenaeum

Quick start

# Initialize a knowledge directory
athenaeum init                  # default: ~/knowledge
athenaeum init --path ~/my-knowledge

# Run the librarian (compile raw intake → wiki entities)
athenaeum run
athenaeum run --dry-run         # inspect without writing

# Check status
athenaeum status

Full run with custom paths and budgets:

athenaeum run \
  --raw-root ~/knowledge/raw \
  --wiki-root ~/knowledge/wiki \
  --knowledge-root ~/knowledge \
  --max-files 50 \
  --max-api-calls 200 \
  --verbose

MCP memory server

Athenaeum ships an MCP server exposing remember and recall tools so AI agents can write to raw intake and search the compiled wiki:

pip install athenaeum[mcp]
athenaeum serve --path ~/knowledge

# Smoke test the round-trip without a live session
athenaeum test-mcp

Claude Code integration. Add to your MCP config and it auto-starts with every session:

claude mcp add --scope user athenaeum -- athenaeum serve --path ~/knowledge

Example round-trip:

User: Tristan's partner is Amanda; they met at Stanford GSB.

(Claude calls remember(content="Tristan's partner is Amanda; they met at Stanford GSB.", source="claude-session"))

A raw observation lands in raw/claude-session/20260417T…-…md. On the next athenaeum run, the pipeline compiles it into Tristan's wiki entity (under "Key Contacts") and Amanda's own entity if she doesn't exist yet. Later sessions can ask "who is Amanda?" and recall returns the compiled page.

Answering pending questions

When Tier 3 can't resolve an ambiguity or a principled contradiction, the librarian escalates to wiki/_pending_questions.md. Each escalation lands as a block like:

## [2026-04-20] Entity: "Acme Corp" (from sessions/20240406T120000Z-aabb0011.md)
- [ ] Is Acme still Series A after the 2026 recapitalisation?
**Conflict type**: principled
**Description**: Prior wiki says Series A; the 2026-04 raw file implies Series B.

Resolving a question takes two steps. Step 1 is to record your answer, which you can do in one of two ways; pick whichever fits your workflow:

Option 1 — Edit the file directly

Flip [ ] to [x] on the checkbox line and type your answer below the checkbox (above or below the conflict-type / description lines — either works; the parser strips those metadata lines when extracting the answer):

## [2026-04-20] Entity: "Acme Corp" (from sessions/20240406T120000Z-aabb0011.md)
- [x] Is Acme still Series A after the 2026 recapitalisation?

They closed Series B on 2026-03-12, led by Acme Growth Partners.
The 2026-04 raw file is correct; the prior wiki entry is stale.

**Conflict type**: principled
**Description**: Prior wiki says Series A; the 2026-04 raw file implies Series B.
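
The extraction rule described above (checkbox flipped, metadata lines stripped, everything else taken as the answer) can be sketched roughly as follows. This is an illustration only, not Athenaeum's actual parser: the function name `extract_answer` and both regular expressions are assumptions made for the example.

```python
import re

# Hypothetical patterns; the real parser may recognise these lines differently.
CHECKED = re.compile(r"^- \[[xX]\] (?P<question>.+)$")
METADATA = re.compile(r"^\*\*(Conflict type|Description)\*\*:")


def extract_answer(block):
    """Return (question, answer) for an answered block, or None if unanswered."""
    question = None
    answer_lines = []
    for line in block.splitlines():
        if line.startswith("## "):
            continue  # block header, not part of the answer
        match = CHECKED.match(line)
        if match and question is None:
            question = match.group("question")
            continue
        if METADATA.match(line):
            continue  # strip conflict-type / description metadata
        if question is not None:
            answer_lines.append(line)
    if question is None:
        return None  # checkbox still "[ ]", so nothing to ingest
    return question, "\n".join(answer_lines).strip()
```

Because metadata lines are skipped wherever they appear, the answer can sit above or below them and the extracted text is the same.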

Option 2 — Use the MCP tool

For containerized agents that can't touch the filesystem, athenaeum serve exposes two tools:

list_pending_questions() returns unanswered blocks as JSON — each item carries a stable id derived from the header + question text.
resolve_question(id, answer) flips the checkbox and writes the answer body under it. It does not archive on its own — archival runs on the next ingest-answers pass.
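
One plausible way to derive a stable id from the header and question text is to hash them. The sketch below is an assumption for illustration: the source does not say which hash, normalisation, or id length Athenaeum uses, and `question_id` is a hypothetical name.

```python
import hashlib


def question_id(header, question):
    """Derive a short, deterministic id from the block header and question text."""
    # Assumption: SHA-256 over whitespace-trimmed text, truncated to 12 hex digits.
    normalised = f"{header.strip()}\n{question.strip()}"
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:12]
```

An id built this way survives reordering of the file, since it depends only on the block's own text, but it changes if the header or question is edited.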

Step 2 — ingest the answers

Either way, run:

athenaeum ingest-answers --path ~/knowledge

Each [x] block is rewritten as a raw intake file under raw/answers/{timestamp}-{entity-slug}.md with frontmatter linking back to the original source, then moved into wiki/_pending_questions_archive.md (newest-first, append-only — answered blocks are never deleted, only moved). The next athenaeum run picks the raw file up like any other intake and folds the answer into the wiki entity.

Re-running with no new [x] blocks is a no-op. Malformed blocks are preserved in place and logged to stderr, so a corrupt single entry cannot poison the rest of the file.
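
The move-to-archive and no-op behaviour can be modelled in a few lines. This sketch covers only the text shuffling between the two files; it does not write the raw intake file or handle malformed blocks, and the names `split_blocks` and `ingest` are hypothetical.

```python
import re

# A block counts as answered when its checkbox line reads "- [x] ...".
ANSWERED = re.compile(r"(?m)^- \[[xX]\] ")


def split_blocks(text):
    """Split a pending-questions file into blocks that each start with '## '."""
    return [part for part in re.split(r"(?m)^(?=## )", text) if part.strip()]


def ingest(pending, archive):
    """Return (new_pending, new_archive) after moving answered blocks."""
    blocks = split_blocks(pending)
    answered = [block for block in blocks if ANSWERED.search(block)]
    if not answered:
        return pending, archive  # nothing new: re-running is a no-op
    remaining = [block for block in blocks if not ANSWERED.search(block)]
    # Prepend so the archive stays newest-first; existing entries are untouched.
    moved = "".join(block.rstrip("\n") + "\n\n" for block in answered)
    return "".join(remaining), moved + archive
```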

Transparent sidecar (Claude Code hooks)

For a fully passive experience where Claude auto-recalls relevant context on every prompt and saves observations without explicit commands, configure Claude Code hooks:

Copy the example hooks from examples/claude-code/ to your scripts directory.
Add hook entries to ~/.claude/settings.json (see examples/claude-code/settings-snippet.json).
Add CLAUDE.md instructions for proactive memory (see examples/claude-code/CLAUDE.md.example).

This gives you:

Auto-recall — an FTS5 index is built at session start (~300ms); each user message triggers a <50ms search that injects relevant wiki pages into context.
Auto-remember — Claude proactively saves important facts without being asked.
Context checkpointing — observations are saved before context-window compaction.
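
To make the auto-recall step concrete, here is a minimal FTS5 sketch using Python's built-in sqlite3 module. It assumes your SQLite build includes the FTS5 extension; the table layout and the function names `build_index` and `recall` are illustrative and are not taken from the example hooks.

```python
import sqlite3


def build_index(pages):
    """Build an in-memory FTS5 index from a {page name: page text} mapping."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE wiki USING fts5(name, body)")
    conn.executemany("INSERT INTO wiki (name, body) VALUES (?, ?)", pages.items())
    return conn


def recall(conn, query, limit=3):
    """Return the names of the best-matching pages for a user message."""
    words = query.split()
    if not words:
        return []
    # Quote each word so punctuation in the prompt is not parsed as FTS5 syntax.
    terms = " OR ".join('"' + word.replace('"', '""') + '"' for word in words)
    rows = conn.execute(
        "SELECT name FROM wiki WHERE wiki MATCH ? ORDER BY bm25(wiki) LIMIT ?",
        (terms, limit),
    )
    return [name for (name,) in rows]
```

Building the index once per session and querying it per message mirrors the split described above: the expensive step runs at session start, the cheap one on every turn.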

Full setup guide, smoke test, and environment-variable reference: examples/claude-code/README.md.

Integrations

Claude Code auto-memory — bridge ~/.claude/projects/<scope>/memory/ into Athenaeum's raw/ intake so the librarian can cluster, merge, and contradiction-check Claude Code's durable memory alongside other sources. See docs/integrations/claude-code.md.
Contradiction detection — pipeline overview, cross-scope modes, source-precedence taxonomy, configuration reference, and cost model for the auto-memory contradiction path. See docs/contradiction-detection.md.

Vector search (optional)

Athenaeum supports a vector search backend (chromadb + all-MiniLM-L6-v2) for semantic recall alongside the default FTS5 keyword backend. When vector is configured, the recall hook runs a hybrid FTS5 + vector merge. Each backend covers a failure class of the other: FTS5 rescues the short-query proper-noun collisions that trip up vector search, and vector search rescues semantic queries that share no words with the target page, which FTS5 cannot match.

pip install athenaeum[vector]

Enable it in athenaeum.yaml:

search_backend: vector

Full walkthrough and the four invariants a future simplification must not remove: docs/recall-architecture.md.
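
This README does not say how the two result lists are merged; see docs/recall-architecture.md for the actual design. As a generic illustration of a hybrid merge, the sketch below uses reciprocal rank fusion, a common technique that is swapped in here as an assumption. The function name and the constant k=60 are not from Athenaeum.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of page names into one ranked list."""
    scores = {}
    for ranking in rankings:
        for rank, page in enumerate(ranking, start=1):
            # Pages near the top of any list earn the most credit.
            scores[page] = scores.get(page, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda page: (-scores[page], page))
```

A page returned by both backends outranks a page returned by only one, which is the property a hybrid merge needs in order to let each backend rescue the other.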

Query-topic extraction (optional)

athenaeum query-topics "your prompt" runs a Haiku classifier that returns substantive topics and ignores meta-instructions:

$ athenaeum query-topics "Without calling any tools, quote the block about Return Path verbatim"
Return Path

The naive regex+stopword fallback returns block,calling,quote,return,tools,verbatim,without — burying "Return Path" behind meta-instruction tokens. The example recall hook uses query-topics to rescue named-entity recall on instruction-heavy prompts and falls back silently to the regex extractor if the API key or CLI is unavailable.
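
For comparison, a regex+stopword extractor of the kind described can be sketched as below. The minimum word length and the stopword list are assumptions chosen so the sketch reproduces the output quoted above; the real fallback's rules may differ.

```python
import re

# Assumed rules: drop short words and a small stopword list.
STOPWORDS = {"about", "after", "their", "there", "these", "those", "which", "would"}
MIN_LENGTH = 5


def naive_topics(prompt):
    """Return sorted, de-duplicated candidate topic words from a prompt."""
    words = re.findall(r"[a-z]+", prompt.lower())
    kept = {w for w in words if len(w) >= MIN_LENGTH and w not in STOPWORDS}
    return sorted(kept)
```

The sketch shows why the named entity gets lost: "Return Path" is split into unrelated lowercase tokens, and instruction words such as "calling" and "verbatim" survive the filter with equal weight.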

Environment variables

| Variable | Required | Description |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | Yes (unless `--dry-run`) | API key for Tier 2/3 LLM calls |
| `ATHENAEUM_CLASSIFY_MODEL` | No | Override Tier 2 model (default: `claude-haiku-4-5-20251001`) |
| `ATHENAEUM_WRITE_MODEL` | No | Override Tier 3 model (default: `claude-sonnet-4-6`) |
| `ATHENAEUM_TOPIC_MODEL` | No | Override query-topic model (default: `claude-haiku-4-5-20251001`) |
| `ATHENAEUM_OP_KEY_PATH` | No | 1Password path for the session-start `ANTHROPIC_API_KEY` bootstrap (default: `op://Agent Tools/Anthropic API Key/credential`) |
| `AUTO_RECALL` | No | Per-turn recall on/off (hook shell env; overrides `athenaeum.yaml`'s `auto_recall`). Default: `true` |
| `SEARCH_BACKEND` | No | `fts5` or `vector` (hook shell env; overrides `athenaeum.yaml`'s `search_backend`). Default: `fts5` |
| `ATHENAEUM_HOOK_DEBUG` | No | Set to `1` to log vector-backend errors from `user-prompt-recall.sh` to stderr |

Shell-env overrides. AUTO_RECALL and SEARCH_BACKEND are read from the shell environment after the hook sources ~/.cache/athenaeum/config.env, so exports in your shell profile beat the cached config. Intentional (lets you A/B-test a backend without editing athenaeum.yaml), but it's the first thing to check when the hook "ignores" a config change.
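
The precedence described here can be modelled as a simple lookup. This sketch represents the outcome only, not the hook's shell code; `resolve_setting` is a hypothetical name, and the two dictionaries stand in for the shell environment and the cached config.env.

```python
def resolve_setting(name, shell_env, cached_config, default):
    """Shell environment beats the cached config, which beats the default."""
    if name in shell_env:
        return shell_env[name]
    return cached_config.get(name, default)
```

When a config change appears to be ignored, the first branch is the one to suspect: a leftover export in a shell profile wins over whatever athenaeum.yaml says.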

Claude Code auth caveat. Claude Code's own CLAUDE_CODE_OAUTH_TOKEN is scoped to its inference endpoint, and the Anthropic Messages API rejects it with a 401 error: "OAuth authentication is currently not supported." The pipeline and example hooks need a separate console API key; see docs/recall-architecture.md for the 1Password bootstrap pattern.

Data formats

Raw intake lives in raw/{source}/*.md with the naming convention {timestamp}-{uuid8}.md (e.g., 20240406T120000Z-aabb0011.md). Each file is a plain markdown document containing observations, notes, or session transcripts. The {source} directory identifies the origin (e.g., sessions, imports).
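
As a rough sketch of the naming convention, the helpers below build a raw/{source}/{timestamp}-{uuid8}.md path and write a file to it without overwriting anything. The timestamp format is inferred from the example filename, and the function names are hypothetical; this is not Athenaeum's own writer.

```python
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def raw_intake_path(raw_root: Path, source: str, now: Optional[datetime] = None) -> Path:
    """Build raw/{source}/{timestamp}-{uuid8}.md for a new observation."""
    now = now or datetime.now(timezone.utc)
    # Assumed format, matching the example 20240406T120000Z.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return raw_root / source / f"{stamp}-{uuid.uuid4().hex[:8]}.md"


def append_observation(raw_root: Path, source: str, content: str) -> Path:
    """Write one observation as a new file and return its path."""
    path = raw_intake_path(raw_root, source)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" fails if the file exists, so existing intake is never modified.
    with path.open("x", encoding="utf-8") as handle:
        handle.write(content.rstrip("\n") + "\n")
    return path
```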

Wiki entity pages live in wiki/ with YAML frontmatter:

---
uid: a1b2c3d4
type: person
name: Alice Zhang
aliases: [Alice]
access: internal
tags: [active]
created: '2024-04-06'
updated: '2024-04-06'
---

Entities are indexed in wiki/_index.md grouped by type. Conflicts requiring human review are appended to wiki/_pending_questions.md. Each run logs token usage and estimated costs at the end.
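
To illustrate how pages with this frontmatter could be grouped by type for an index, here is a small sketch. It parses only flat key: value lines and inline lists, which is enough for the example above; real code would more likely use a YAML library. The function names are hypothetical.

```python
from collections import defaultdict


def parse_frontmatter(page):
    """Parse flat `key: value` frontmatter delimited by '---' lines."""
    lines = page.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [Alice] or [a, b]
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value.strip("'\"")
    return meta


def group_by_type(pages):
    """Group entity names by their frontmatter type."""
    groups = defaultdict(list)
    for page in pages:
        meta = parse_frontmatter(page)
        groups[meta.get("type", "unknown")].append(meta.get("name", ""))
    return {kind: sorted(names) for kind, names in groups.items()}
```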

Known limitations

Athenaeum is pre-1.0. These trade-offs are intentional for the current release line:

No retrieval benchmarks yet. The hybrid-search claim rests on concrete failure modes (proper-noun collision, no-overlap semantic queries) and production use — not a published eval against mem0 / Letta / Zep / Cognee. If you need benchmarked recall@k on a closed corpus, pick a tool that publishes numbers. If you want a knowledge base that survives your tool choices, this is for you. PRs adding an eval harness are very welcome.
FTS5 index rebuilds are non-atomic and unlocked. A shell hook and the librarian run rebuilding simultaneously can race; the window is small and single-user wikis do not hit it in practice, but multi-writer safety is v0.3 work. Workaround: don't invoke athenaeum rebuild-index and athenaeum run concurrently on the same $KNOWLEDGE_ROOT.
The keyword search backend is a scan-on-query fallback. It reads every wiki page on every query; fine under ~1,000 entities, painful past that. Use search_backend: fts5 (default in the CLI and hooks) for any non-trivial wiki. The keyword backend exists as a zero-dependency baseline for tests and bootstrap.
Tier 4 (human escalation) is a file, not a workflow. Conflicts land in wiki/_pending_questions.md; you read it and decide. No PR-opening, no Slack integration, no UI — on purpose, for now.

Development

git clone https://github.com/Kromatic-Innovation/athenaeum.git
cd athenaeum
pip install -e ".[dev]"

pytest tests/ -v
ruff check src/ tests/

Branch flow

Athenaeum uses a trunk-style branch model:

develop is the active development branch and the GitHub default. All pull requests target develop.
main carries the most recent released revision. Release tags (vX.Y.Z) live on main and trigger the PyPI release workflow.

Most users should install via pip install athenaeum (above). To work from source against the latest released revision instead of the active branch, clone and check out the latest tag:

git clone https://github.com/Kromatic-Innovation/athenaeum.git
cd athenaeum
git checkout "$(git describe --tags --abbrev=0)"

See CONTRIBUTING.md for the full promotion flow.

Getting help

Rolling this out on a team? Open an issue or reach out via kromatic.com. We talk to teams working through agent-memory rollouts often and are happy to point at whatever's useful.

License

Apache 2.0 — see LICENSE.

Release history

0.4.1 (this version): May 22, 2026
0.4.0: May 11, 2026
0.3.1: Apr 21, 2026
0.3.0: Apr 21, 2026
0.2.3: Apr 21, 2026
0.2.2: Apr 17, 2026
0.2.1: Apr 17, 2026
0.2.0: Apr 17, 2026

Download files

Source Distribution

athenaeum-0.4.1.tar.gz (1.0 MB)

Uploaded May 22, 2026 Source

Built Distribution

athenaeum-0.4.1-py3-none-any.whl (139.7 kB)

Uploaded May 22, 2026 Python 3

File details

Details for the file athenaeum-0.4.1.tar.gz.

File metadata

Download URL: athenaeum-0.4.1.tar.gz
Upload date: May 22, 2026
Size: 1.0 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for athenaeum-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`8d4dd86a13785ac2689f7f19a649098efd9e9789e55c69b92cf04fe1067d88f0`
MD5	`673298f44853fb939b00f8393e97312a`
BLAKE2b-256	`2505483c13ae5459649d7bacef22982521716a74c803f7945457052a6850b9a6`
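
If you download the file manually, a digest like the SHA256 value above can be checked with Python's standard hashlib module. This is a generic sketch and not part of Athenaeum; `sha256_of` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string with the published SHA256 value; any difference means the download should not be trusted.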

Provenance

The following attestation bundles were made for athenaeum-0.4.1.tar.gz:

Publisher: release.yml on Kromatic-Innovation/athenaeum

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: athenaeum-0.4.1.tar.gz
- Subject digest: 8d4dd86a13785ac2689f7f19a649098efd9e9789e55c69b92cf04fe1067d88f0
- Sigstore transparency entry: 1608268373
- Sigstore integration time: May 22, 2026
Source repository:
- Permalink: Kromatic-Innovation/athenaeum@6ae914d4b0245cc69f47711cbfdf157fc1636b28
- Branch / Tag: refs/tags/v0.4.1
- Owner: https://github.com/Kromatic-Innovation
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@6ae914d4b0245cc69f47711cbfdf157fc1636b28
- Trigger Event: push

File details

Details for the file athenaeum-0.4.1-py3-none-any.whl.

File metadata

Download URL: athenaeum-0.4.1-py3-none-any.whl
Upload date: May 22, 2026
Size: 139.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for athenaeum-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`da27276de4c89d20d6ecff2a3e30f8809e533ac50545116d5518c5c9fb2ca43b`
MD5	`6079a7f7b8769ca61d94e2c820f4fcfc`
BLAKE2b-256	`018f7ae20274883d12cbb9e561989162a091843853ca31772f296b703f2acaa3`

Provenance

The following attestation bundles were made for athenaeum-0.4.1-py3-none-any.whl:

Publisher: release.yml on Kromatic-Innovation/athenaeum

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: athenaeum-0.4.1-py3-none-any.whl
- Subject digest: da27276de4c89d20d6ecff2a3e30f8809e533ac50545116d5518c5c9fb2ca43b
- Sigstore transparency entry: 1608268440
- Sigstore integration time: May 22, 2026
Source repository:
- Permalink: Kromatic-Innovation/athenaeum@6ae914d4b0245cc69f47711cbfdf157fc1636b28
- Branch / Tag: refs/tags/v0.4.1
- Owner: https://github.com/Kromatic-Innovation
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@6ae914d4b0245cc69f47711cbfdf157fc1636b28
- Trigger Event: push
