domesday-book

A shared knowledge base that keeps AI tools informed of your team's project-specific information.

The natives call this book "Domesday" ... concerning the matters contained in the book, its word cannot be denied or set aside.
wikipedia.org/wiki/Domesday_Book

Status

🚧 Ongoing development! 🚧

Core functionality is implemented as a working prototype. See Roadmap for next steps.

Why this exists

Research teams accumulate critical tacit knowledge (processing caveats, data access optimizations, troubleshooting tips) that lives in Teams conversations, scattered notes, and people's heads. This is information AI tools (and other people) need access to.

We need a system where:

  • Adding knowledge is as easy as pasting a text snippet into a box
  • The system automatically processes new entries
  • Multiple team members can contribute and curate entries
  • The knowledge base is queryable by AI tools like Claude, giving answers with citations to the original snippets

Quickstart

# Install
uv tool install domesday[voyage,mcp]

# Set API keys
export VOYAGE_API_KEY=voy-...
export ANTHROPIC_API_KEY=sk-ant-...

# Add a snippet to a project
domes -p vbo add "The VBO dataset has an off-by-one error in timestamps before 2023-06-01."

# Bulk ingest a folder
domes -p vbo ingest ./project-notes/ --author ben

# Semantic search within a project to find matching snippets (retrieval only)
domes -p vbo search "VBO timestamp issues"

# Ask a question (retrieve matching snippets → LLM generates answer with citations)
domes -p vbo ask "What are the known caveats with the VBO dataset?"

# Actual answer from Claude Sonnet 4.6
# **VBO Dataset Timestamp Error**: The VBO dataset has an **off-by-one error in timestamps** for any data dated **before 2023-06-01**. [snippet-1fffb1]  

# Search across all projects
domes search "timestamp bugs" --all-projects

# Browse and inspect
domes projects             # list all projects with snippet counts
domes -p vbo list          # recent snippets in a project
domes stats --all-projects # stats across everything

Development

# clone repo and install:
uv sync --all-extras

How it works

Add snippet (paste/CLI/MCP)
  → Store raw text + metadata (SQLite)
  → Chunk (prose/code-aware, ~400 tokens)
  → Embed (Voyage / OpenAI / local model)
  → Index (ChromaDB vector store)

Ask a question (CLI/MCP/API)
  → Embed query
  → Vector similarity search (cosine, with score threshold)
  → [Optional] LLM reranker filters irrelevant results
  → Format context with author, date, tags
  → Generate answer via Claude with inline citations
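
The retrieval step in the flow above (cosine similarity with a score threshold) can be sketched in a few lines. This is a minimal illustration, not the project's code: the function names, the toy two-dimensional vectors, and the default `k` are assumptions, and only the `min_score = 0.3` default mirrors the sample configuration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, indexed, min_score=0.3, k=5):
    """Return up to k (chunk_id, score) pairs scoring at least min_score, best first."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in indexed.items()]
    hits = sorted((s for s in scored if s[0] >= min_score), reverse=True)
    return [(chunk_id, round(score, 3)) for score, chunk_id in hits[:k]]


# Hypothetical toy index; real embeddings have hundreds of dimensions.
index = {
    "chunk-a": [1.0, 0.0],
    "chunk-b": [0.6, 0.8],
    "chunk-c": [0.0, 1.0],
}
results = search([1.0, 0.0], index)
print(results)
```

Here `chunk-c` is orthogonal to the query, so it falls below the threshold and is dropped before anything reaches the reranker or generator.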

Every backend is behind a Protocol interface: swap storage, embedding, or generation by changing config. See Architecture for details.
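
To make the Protocol idea concrete, here is a hedged sketch of how a structural interface lets a config value select the backend. Everything in it is hypothetical: the `Embedder` protocol, the two toy backends, and `make_embedder` are illustrations and are not taken from `domesday/core/protocols.py`.

```python
from typing import Protocol


class Embedder(Protocol):
    """Anything with a matching embed() method satisfies this interface."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class LengthEmbedder:
    """Toy backend: a one-dimensional 'embedding' equal to the text length."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(t))] for t in texts]


class VowelEmbedder:
    """Toy backend: a one-dimensional 'embedding' equal to the vowel count."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        return [[float(sum(ch in "aeiou" for ch in t.lower()))] for t in texts]


# Hypothetical registry keyed by a config string such as embedder.backend.
BACKENDS = {"length": LengthEmbedder, "vowels": VowelEmbedder}


def make_embedder(backend: str) -> Embedder:
    return BACKENDS[backend]()


def index_snippet(text: str, embedder: Embedder) -> list[float]:
    """Caller code depends only on the protocol, never on a concrete class."""
    return embedder.embed([text])[0]


vec_len = index_snippet("off-by-one", make_embedder("length"))
vec_vow = index_snippet("off-by-one", make_embedder("vowels"))
print(vec_len, vec_vow)
```

Neither backend inherits from `Embedder`; matching the method signature is enough, which is what makes swapping by configuration cheap.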

Projects

A single domesday instance can hold multiple projects. Each snippet belongs to exactly one project. Queries are scoped to a project by default, preventing cross-contamination between unrelated knowledge bases.
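
As a rough illustration of project scoping, the sketch below stores snippets in an in-memory SQLite table and filters by project unless the caller asks for everything. The schema and the `list_snippets` helper are assumptions made for this example; the actual layout of the SQLite store is not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: each snippet row carries exactly one project name.
conn.execute(
    "CREATE TABLE snippets ("
    "id INTEGER PRIMARY KEY, project TEXT NOT NULL, text TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO snippets (project, text) VALUES (?, ?)",
    [("vbo", "some caveat"), ("ephys-rig", "different caveat")],
)


def list_snippets(conn, project=None):
    """Return (project, text) rows; project=None means all projects."""
    if project is None:
        cursor = conn.execute("SELECT project, text FROM snippets ORDER BY id")
    else:
        cursor = conn.execute(
            "SELECT project, text FROM snippets WHERE project = ? ORDER BY id",
            (project,),
        )
    return cursor.fetchall()


print(list_snippets(conn, "vbo"))
print(list_snippets(conn))
```

Making the filter the default and the cross-project view opt-in is one way to get the "no cross-contamination" behaviour described above.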

# Set a default project in config
# domesday.toml: default_project = "vbo"

# Or specify per-command (--project / -p goes before the subcommand)
domes -p vbo add "some caveat"
domes -p ephys-rig add "different caveat"

# Search within a project
domes -p vbo search "timing issues"

# Search across everything
domes search "timing issues" --all-projects

# See what projects exist
domes projects

# Rename a project
domes rename-project old-name new-name

The --project flag (or -p) can also be set at the top level, applying to all subcommands:

domes -p vbo add "some caveat"
domes -p vbo search "timing"
domes -p vbo ask "what are the known issues?"

For MCP, pass the project in tool arguments, or set DOMESDAY_DEFAULT_PROJECT in the server environment.

Configuration

Place domesday.toml in your project root:

data_dir = "./data"
default_project = "main"      # used when --project is not specified

[embedder]
backend = "voyage"             # voyage | openai | local
model = "voyage-4-large"

[generator]
backend = "claude"
model = "claude-sonnet-4-6"

[chunker]
max_tokens = 400
overlap_tokens = 50

[retrieval]
min_score = 0.3               # cosine similarity threshold

[reranker]
enabled = false               # LLM-based relevance filtering (adds latency)
model = "claude-haiku-4-5"
relevance_threshold = 0.5
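
One way to read the `[chunker]` settings is as a sliding window: each chunk holds at most `max_tokens` tokens and repeats the last `overlap_tokens` of the previous chunk. The sketch below assumes exactly that and uses a plain token list. It is not the project's prose/code-aware splitter, and `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(tokens, max_tokens=400, overlap_tokens=50):
    """Split tokens into windows of max_tokens that overlap by overlap_tokens."""
    if not 0 <= overlap_tokens < max_tokens:
        raise ValueError("overlap_tokens must be in [0, max_tokens)")
    step = max_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the window already reached the end of the input
    return chunks


# 1000 placeholder tokens stand in for a tokenized document.
tokens = [f"w{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])
```

With the sample settings the window advances 350 tokens at a time, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.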

Environment variables override config: DOMESDAY_DATA_DIR, DOMESDAY_EMBEDDER_BACKEND, DOMESDAY_EMBEDDER_MODEL, DOMESDAY_GENERATOR_MODEL.
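
The override behaviour can be pictured as a small merge step applied after the file is parsed. In this sketch a plain dict stands in for the parsed `domesday.toml`, and the mapping from variable names to config keys is inferred from the names alone, so treat both the mapping and the `apply_env_overrides` helper as assumptions.

```python
import copy
import os

# Assumed mapping from environment variable to config key path.
ENV_OVERRIDES = {
    "DOMESDAY_DATA_DIR": ("data_dir",),
    "DOMESDAY_EMBEDDER_BACKEND": ("embedder", "backend"),
    "DOMESDAY_EMBEDDER_MODEL": ("embedder", "model"),
    "DOMESDAY_GENERATOR_MODEL": ("generator", "model"),
}


def apply_env_overrides(file_config, environ=os.environ):
    """Return a copy of file_config with any set environment variables applied."""
    config = copy.deepcopy(file_config)
    for var, path in ENV_OVERRIDES.items():
        if var not in environ:
            continue
        node = config
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = environ[var]
    return config


# Stand-in for the parsed sample configuration.
file_config = {
    "data_dir": "./data",
    "embedder": {"backend": "voyage", "model": "voyage-4-large"},
    "generator": {"backend": "claude", "model": "claude-sonnet-4-6"},
}
config = apply_env_overrides(file_config, {"DOMESDAY_EMBEDDER_BACKEND": "local"})
print(config["embedder"])
```

Passing the environment in as an argument keeps the function easy to test without mutating `os.environ`.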

CLI reference

All commands accept --project / -p to scope to a specific project. This can also be set at the top level: domes -p myproject <command>.

Use --verbose / -v for INFO-level logs or --debug / -d for full DEBUG output:

domes -v search "timestamp issues"     # see search flow
domes -d ingest ./notes/               # see every chunk and embedding call

Command                                   Description
domes add "text"                          Add a snippet (also accepts --file, stdin, or opens $EDITOR)
domes add --author ben --tags "vbo,bug"   Add with metadata
domes -p myproject ingest ./folder/       Bulk ingest files into a project
domes search "query"                      Semantic search within the current project
domes search "query" --all-projects       Search across all projects
domes ask "question"                      Retrieve relevant snippets then generate an answer with citations
domes ask "question" --show-sources       Also print which snippets were used
domes list                                Show recent snippets in current project
domes list --all-projects                 Show recent snippets across all projects
domes projects                            List all projects with snippet counts
domes rename-project old new              Rename a project across all stores
domes stats                               Show stats for current project
domes stats --all-projects                Show stats across all projects

MCP integration

domesday exposes itself as an MCP server, making the knowledge base available from Claude Desktop, Cursor, VS Code, or any MCP-compatible client.

Local (stdio). Add to claude_desktop_config.json:

{
  "mcpServers": {
    "domesday": {
      "command": "python",
      "args": ["-m", "domesday.mcp_server"],
      "env": {
        "DOMESDAY_DATA_DIR": "/absolute/path/to/data",
        "DOMESDAY_DEFAULT_PROJECT": "vbo",
        "VOYAGE_API_KEY": "voy-...",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Remote (SSE), for team access:

{
  "mcpServers": {
    "domesday": {
      "url": "https://your-server.internal:8080/mcp/sse"
    }
  }
}

Available MCP tools:

Tool                                                  Description
search_knowledge(query, project?, n_results?, tags?)  Semantic search over snippets
add_snippet(text, project?, author?, tags?)           Add new knowledge from any client
get_snippet(snippet_id)                               Retrieve a snippet by full or short (8-char) ID
list_recent(n?, project?, author?)                    Browse recent additions
list_projects()                                       List all projects with snippet counts
rename_project(old_name, new_name)                    Rename a project across all stores
ask(question, project?, n_context?)                   Retrieve relevant context and generate an answer with citations

Tools whose signature lists project? accept an optional project parameter. Pass "__all__" to search across all projects.
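
A server-side helper for picking the project might look like the sketch below. The precedence (tool argument first, then `DOMESDAY_DEFAULT_PROJECT`, then a configured fallback) is an assumption, as are the `resolve_project` name and the choice to represent "all projects" as `None`. Only the `"__all__"` sentinel and the environment variable name come from the text above.

```python
import os

ALL_PROJECTS = "__all__"


def resolve_project(requested=None, environ=os.environ, fallback="main"):
    """Pick the project for a tool call; None means 'do not filter by project'."""
    project = requested or environ.get("DOMESDAY_DEFAULT_PROJECT") or fallback
    return None if project == ALL_PROJECTS else project


server_env = {"DOMESDAY_DEFAULT_PROJECT": "vbo"}
print(resolve_project("ephys-rig", server_env))  # explicit argument wins
print(resolve_project(None, server_env))         # falls back to the server default
print(resolve_project("__all__", server_env))    # sentinel disables the filter
```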

Evaluation

domesday includes an evaluation framework for measuring retrieval quality and generation faithfulness. See Evaluation for full details.

# Run retrieval eval against test corpus
python -m domesday.eval.runner

# Also judge generation quality with Haiku
python -m domesday.eval.runner --judge

# Parameter sweep (min_score, k, chunk size, overlap)
python -m domesday.eval.runner --sweep --quick

# Interactive: inspect individual queries and results
python -m domesday.eval.runner -i
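
For readers unfamiliar with the retrieval metrics involved, the sketch below uses the standard textbook definitions of precision@k, recall@k, and mean reciprocal rank. Whether the eval runner computes them exactly this way is an assumption; the function names and the two toy queries are made up for illustration.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved IDs that are relevant."""
    top = retrieved[:k]
    return sum(1 for r in top if r in relevant) / len(top) if top else 0.0


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant IDs that appear in the top-k retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant ID, or 0.0 if none was retrieved."""
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


# Toy eval set: ranked snippet IDs per query, plus the IDs judged relevant.
runs = [
    (["s2", "s1", "s3"], {"s1"}),        # first relevant hit at rank 2
    (["s4", "s5", "s6"], {"s4", "s6"}),  # first relevant hit at rank 1
]
mrr = mean_reciprocal_rank(runs)
print(mrr)
```

Sweeping `min_score` or `k` trades these off against each other: a stricter threshold tends to raise precision and lower recall.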

Project structure

domesday/
├── pyproject.toml
├── domesday.toml
├── domesday/
│   ├── core/
│   │   ├── models.py           # Snippet, Chunk, SearchResult, RAGResponse
│   │   ├── protocols.py        # Swappable interfaces for all backends
│   │   └── pipeline.py         # Orchestrator: add, ingest, search, ask
│   ├── stores/
│   │   ├── sqlite_store.py     # DocumentStore → SQLite
│   │   └── chroma_store.py     # VectorStore → ChromaDB
│   ├── embedders.py            # Voyage, OpenAI, sentence-transformers
│   ├── generators.py           # Claude via Anthropic API
│   ├── chunking.py             # Prose/code-aware text splitting
│   ├── config.py               # defaults + parsing from file/env
│   ├── cli.py                  # CLI commands
│   ├── mcp_server.py           # MCP tool definitions
│   └── eval/
│       ├── models.py           # Eval metrics (precision, recall, MRR)
│       ├── runner.py           # Eval runner + parameter sweeps
│       └── llm_judge.py        # Haiku-based quality scoring + reranker
├── tests/
│   └── fixtures/
│       └── test_corpus.py      # 30 synthetic snippets + 21 eval queries
└── docs/
    ├── architecture.md
    └── evaluation.md
