domesday-book

A shared knowledge base that keeps AI tools informed of your team's project-specific information.

The natives call this book "Domesday" ... concerning the matters contained in the book, its word cannot be denied or set aside.
wikipedia.org/wiki/Domesday_Book

Status

🚧 Ongoing development! 🚧

Core functionality is implemented as a working prototype. See Roadmap for next steps.

Why this exists

Research teams accumulate critical tacit knowledge (processing caveats, data access optimizations, troubleshooting tips) that lives in Teams conversations, scattered notes, and people's heads. AI tools (and other people) need access to this information.

We need a system where:

  • Adding knowledge is as easy as pasting a text snippet into a box
  • The system automatically processes new entries
  • Multiple team members can contribute and curate entries
  • The knowledge base is queryable by AI tools like Claude, giving answers with citations to the original snippets

Comparison to alternatives

| Approach | Pros | Cons |
|---|---|---|
| Wiki / shared docs (Confluence, Notion) | Familiar, collaborative | Keyword search only, full-page retrieval adds noise and cost |
| Single document in context (CLAUDE.md, system prompt) | Simple, no infra, fast | Doesn't scale, context bloat increases cost and noise |
| Built-in knowledge bases (Claude Projects) | Integrated, zero setup | Vendor lock-in, not programmable, not open |
| Generic RAG (LangChain + vector DB) | Flexible, open | High setup and maintenance cost, custom glue code |
| domesday-book | Easy setup, multi-project, works with any AI tool | Requires API keys for faster embedding/LLM |

Quickstart

Note: in this early-stage prototype, interaction with the system is via the CLI. The CLI is not the intended end-user interface.

# Install
uv tool install "domesday[voyage,mcp]"   # quoted so shells like zsh do not glob the brackets

# Set API keys
export VOYAGE_API_KEY=voy-...
export ANTHROPIC_API_KEY=sk-ant-...

# Add a snippet to a project
domes -p vbo add "The VBO dataset has an off-by-one error in timestamps before 2023-06-01."

# Bulk ingest a folder
domes -p vbo ingest ./project-notes/ --author ben

# Semantic search within a project to find matching snippets (retrieval only)
domes -p vbo search "VBO timestamp issues"

# Ask a question (retrieve matching snippets → LLM generates answer with citations)
domes -p vbo ask "What are the known caveats with the dataset?"

# Actual answer from Claude Sonnet 4.6
# **VBO Dataset Timestamp Error**: The VBO dataset has an **off-by-one error in timestamps** for any data dated **before 2023-06-01**. [snippet-1fffb1]  

# Search across all projects
domes search "timestamp bugs" --all-projects

# Browse and inspect
domes projects             # list all projects with snippet counts
domes -p vbo list          # recent snippets in a project
domes stats --all-projects # stats across everything

Development

# clone repo and install:
uv sync --all-extras

# run tests:
uv run task test

How it works

Add snippet (paste/CLI/MCP)
  → Store raw text + metadata (SQLite)
  → Chunk (prose/code-aware, ~400 tokens)
  → Embed (Voyage / OpenAI / local model)
  → Index (ChromaDB vector store)
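
The chunking step above can be illustrated with a minimal sliding-window sketch. It is not the project's prose/code-aware chunker: it assumes whitespace-separated words stand in for tokens, and the function names `chunk_tokens` and `chunk_text` are hypothetical. The defaults mirror the `max_tokens = 400` and `overlap_tokens = 50` values from the configuration section.

```python
def chunk_tokens(tokens, max_tokens=400, overlap_tokens=50):
    """Split tokens into windows of at most max_tokens.

    Each window repeats the last overlap_tokens of the previous one,
    so a sentence cut at a boundary still appears whole in one chunk.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_tokens < max_tokens:
        raise ValueError("overlap_tokens must be in [0, max_tokens)")

    step = max_tokens - overlap_tokens
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the window already reached the end of the input
        start += step
    return chunks


def chunk_text(text, max_tokens=400, overlap_tokens=50):
    # Assumption: whitespace splitting stands in for a real tokenizer.
    windows = chunk_tokens(text.split(), max_tokens, overlap_tokens)
    return [" ".join(window) for window in windows]
```

With the defaults, a 1000-word snippet yields three chunks of 400, 400, and 300 words, and each chunk begins with the final 50 words of the one before it.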

Ask a question (CLI/MCP/API)
  → Embed query
  → Vector similarity search (cosine, with score threshold)
  → [Optional] LLM reranker filters irrelevant results
  → Format context with author, date, tags
  → Generate answer via Claude with inline citations
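
The retrieval and context-formatting steps can be sketched in plain Python. This is an illustration under stated assumptions, not the project's code: the real search runs in ChromaDB, the `Snippet` dataclass here is a hypothetical stand-in for the models in `domesday/core/models.py`, and the `[snippet-<id>]` tag format is inferred from the example answer in the Quickstart.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Snippet:
    # Hypothetical record used only for this sketch.
    id: str
    text: str
    author: str = "unknown"
    date: str = ""
    tags: list = field(default_factory=list)
    vector: list = field(default_factory=list)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, snippets, min_score=0.3, k=5):
    """Return up to k (score, snippet) pairs scoring at least min_score, best first."""
    scored = [(cosine(query_vec, s.vector), s) for s in snippets]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:k]


def format_context(results):
    """Render results as citation-tagged blocks for the generator prompt."""
    blocks = []
    for _score, s in results:
        tags = ", ".join(s.tags) or "none"
        header = f"[snippet-{s.id}] author={s.author} date={s.date} tags={tags}"
        blocks.append(f"{header}\n{s.text}")
    return "\n\n".join(blocks)
```

Dropping results below `min_score` before generation keeps unrelated snippets out of the prompt, which is the role the score threshold plays in the flow above.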

Every backend is behind a Protocol interface: swap storage, embedding, or generation by changing config. See Architecture for details.
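
As a rough illustration of the Protocol approach, the sketch below defines one structural interface and selects an implementation by name. The `Embedder` protocol, its `embed` method signature, the two toy classes, and the `make_embedder` factory are all assumptions made for this example; the actual interfaces are in `domesday/core/protocols.py`.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Embedder(Protocol):
    # Hypothetical interface: any class with a matching embed() satisfies it.
    def embed(self, texts: list) -> list: ...


class VowelEmbedder:
    """Toy backend: one vowel-count vector per text, no API key needed."""

    def embed(self, texts):
        return [[float(t.lower().count(ch)) for ch in "aeiou"] for t in texts]


class ConstantEmbedder:
    """Toy backend that maps every text to the same vector."""

    def embed(self, texts):
        return [[1.0, 0.0] for _ in texts]


# Backend names are illustrative, not the project's real backend names.
BACKENDS = {"vowels": VowelEmbedder, "constant": ConstantEmbedder}


def make_embedder(backend: str) -> Embedder:
    """Pick an implementation from a config string."""
    try:
        return BACKENDS[backend]()
    except KeyError:
        raise ValueError(f"unknown embedder backend: {backend!r}") from None
```

Because `Protocol` uses structural typing, neither toy class inherits from `Embedder`; calling code depends only on the method shape, so changing the backend is a one-line config change.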

Projects

A single domesday instance can hold multiple projects. Each snippet belongs to exactly one project. Queries are scoped to a project by default, preventing cross-contamination between unrelated knowledge bases.

# Set a default project in config
# domesday.toml: default_project = "vbo"

# Or specify per-command (--project / -p goes before the subcommand)
domes -p vbo add "some caveat"
domes -p ephys-rig add "different caveat"

# Search within a project
domes -p vbo search "timing issues"

# Search across everything
domes search "timing issues" --all-projects

# See what projects exist
domes projects

# Rename a project
domes rename-project old-name new-name
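
The project scoping described above can be sketched with an in-memory SQLite table. The schema and function names are hypothetical, and a keyword `LIKE` match stands in for semantic search. The real `rename-project` command updates every store, while this sketch covers only one table.

```python
import sqlite3


def open_store():
    """Create an in-memory store; every snippet row carries exactly one project."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snippets ("
        "id INTEGER PRIMARY KEY, project TEXT NOT NULL, text TEXT NOT NULL)"
    )
    return conn


def add_snippet(conn, project, text):
    conn.execute("INSERT INTO snippets (project, text) VALUES (?, ?)", (project, text))


def find(conn, term, project=None):
    """Keyword match scoped to one project, or all projects when project is None."""
    sql = "SELECT project, text FROM snippets WHERE text LIKE ?"
    params = [f"%{term}%"]
    if project is not None:
        sql += " AND project = ?"
        params.append(project)
    return conn.execute(sql + " ORDER BY id", params).fetchall()


def project_counts(conn):
    """Map each project name to its snippet count."""
    rows = conn.execute("SELECT project, COUNT(*) FROM snippets GROUP BY project")
    return dict(rows)


def rename_project(conn, old, new):
    conn.execute("UPDATE snippets SET project = ? WHERE project = ?", (new, old))
```

Filtering on the project column by default is what keeps a query about one dataset from returning caveats recorded for an unrelated one.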

The --project flag (or -p) can also be set at the top level, applying to all subcommands:

domes -p vbo add "some caveat"
domes -p vbo search "timing"
domes -p vbo ask "what are the known issues?"

For MCP, pass the project in tool arguments, or set DOMESDAY_DEFAULT_PROJECT in the server environment.

Configuration

Place domesday.toml in your project root:

default_project = "main"      # used when --project is not specified

[embedder]
backend = "voyage"             # voyage | openai | local
model = "voyage-4-large"

[generator]
backend = "claude"
model = "claude-sonnet-4-6"

[chunker]
max_tokens = 400
overlap_tokens = 50

[retrieval]
min_score = 0.3               # cosine similarity threshold

[reranker]
enabled = false               # LLM-based relevance filtering (adds latency)
model = "claude-haiku-4-5"
relevance_threshold = 0.5

Environment variables override config: DOMESDAY_EMBEDDER__BACKEND, DOMESDAY_EMBEDDER__MODEL, DOMESDAY_GENERATOR__MODEL.

CLI reference

All commands accept --project / -p to scope to a specific project. This can also be set at the top level: domes -p myproject <command>.

Use --verbose / -v for INFO-level logs or --debug / -d for full DEBUG output:

domes -v search "timestamp issues"     # see search flow
domes -d ingest ./notes/               # see every chunk and embedding call

| Command | Description |
|---|---|
| domes add "text" | Add a snippet (also accepts --file, stdin, or opens $EDITOR) |
| domes add --author ben --tags "vbo,bug" | Add with metadata |
| domes -p myproject ingest ./folder/ | Bulk ingest files into a project |
| domes search "query" | Semantic search within the current project |
| domes search "query" --all-projects | Search across all projects |
| domes ask "question" | Retrieve relevant snippets then generate an answer with citations |
| domes ask "question" --show-sources | Also print which snippets were used |
| domes list | Show recent snippets in current project |
| domes list --all-projects | Show recent snippets across all projects |
| domes projects | List all projects with snippet counts |
| domes rename-project old new | Rename a project across all stores |
| domes stats | Show stats for current project |
| domes stats --all-projects | Show stats across all projects |

MCP integration

domesday exposes itself as an MCP server, making the knowledge base available from Claude Desktop, Cursor, VS Code, or any MCP-compatible client.

Local (stdio). Add to claude_desktop_config.json:

{
  "mcpServers": {
    "domesday": {
      "command": "python",
      "args": ["-m", "domesday.mcp_server"],
      "env": {
        "DOMESDAY_DATA_DIR": "/absolute/path/to/data",
        "DOMESDAY_DEFAULT_PROJECT": "vbo",
        "VOYAGE_API_KEY": "voy-...",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Remote (SSE), for team access:

{
  "mcpServers": {
    "domesday": {
      "url": "https://your-server.internal:8080/mcp/sse"
    }
  }
}

Available MCP tools:

| Tool | Description |
|---|---|
| search_knowledge(query, project?, n_results?, tags?) | Semantic search over snippets |
| add_snippet(text, project?, author?, tags?) | Add new knowledge from any client |
| get_snippet(snippet_id) | Retrieve a snippet by full or short (8-char) ID |
| list_recent(n?, project?, author?) | Browse recent additions |
| list_projects() | List all projects with snippet counts |
| rename_project(old_name, new_name) | Rename a project across all stores |
| ask(question, project?, n_context?) | Retrieve relevant context and generate an answer with citations |

All tools accept an optional project parameter. Pass "all" to search across all projects.
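
Two conventions from the tool descriptions, the optional project argument and lookup by full or short ID, can be sketched as small resolver functions. Both are hypothetical helpers, not the server's code: they assume an omitted project falls back to the configured default, that "all" means no project filter, and that a short ID is a unique prefix of a full ID.

```python
def resolve_project(project, default_project):
    """Return the project to filter on, or None to search every project."""
    if project is None:
        return default_project  # fall back to the server default
    if project == "all":
        return None  # no filter: search across all projects
    return project


def resolve_snippet_id(query, known_ids):
    """Resolve a full ID or a unique prefix (short ID) to a full ID."""
    if query in known_ids:
        return query
    matches = [full_id for full_id in known_ids if full_id.startswith(query)]
    if not matches:
        raise KeyError(f"no snippet matches {query!r}")
    if len(matches) > 1:
        raise ValueError(f"short ID {query!r} is ambiguous: {len(matches)} matches")
    return matches[0]
```

Rejecting an ambiguous prefix instead of returning the first match avoids silently citing the wrong snippet.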

Evaluation

domesday includes an evaluation framework for measuring retrieval quality and generation faithfulness. See Evaluation for full details.

# Run retrieval eval against test corpus
python -m domesday.eval.runner

# Also judge generation quality with Haiku
python -m domesday.eval.runner --judge

# Parameter sweep (min_score, k, chunk size, overlap)
python -m domesday.eval.runner --sweep --quick

# Interactive: inspect individual queries and results
python -m domesday.eval.runner -i
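
For readers unfamiliar with the retrieval metrics named in the project structure (precision, recall, MRR), the standard definitions can be sketched as follows. These are textbook formulas with hypothetical function names, not the contents of `domesday/eval/models.py`.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved IDs that are relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant IDs that appear in the top-k."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant ID, or 0.0 if none was retrieved."""
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)
```

Precision penalizes noise in the results, recall penalizes missed snippets, and MRR rewards placing the first relevant snippet near the top, so a parameter sweep over `min_score` or `k` typically trades one against another.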

Project structure

domesday/
├── pyproject.toml
├── domesday.toml
├── domesday/
│   ├── core/
│   │   ├── models.py           # Snippet, Chunk, SearchResult, RAGResponse
│   │   ├── protocols.py        # Swappable interfaces for all backends
│   │   └── pipeline.py         # Orchestrator: add, ingest, search, ask
│   ├── stores/
│   │   ├── sqlite_store.py     # SnippetStore → SQLite
│   │   └── chroma_store.py     # VectorStore → ChromaDB
│   ├── embedders.py            # Voyage, OpenAI, sentence-transformers
│   ├── generators.py           # Claude via Anthropic API
│   ├── chunking.py             # Prose/code-aware text splitting
│   ├── config.py               # defaults + parsing from file/env
│   ├── cli.py                  # CLI commands
│   ├── mcp_server.py           # MCP tool definitions
│   └── eval/
│       ├── models.py           # Eval metrics (precision, recall, MRR)
│       ├── runner.py           # Eval runner + parameter sweeps
│       └── llm_judge.py        # Haiku-based quality scoring + reranker
├── tests/
│   └── fixtures/
│       └── test_corpus.py      # 30 synthetic snippets + 21 eval queries
└── docs/
    ├── architecture.md
    └── evaluation.md
