A shared knowledge base that keeps AI tools informed of your team's project-specific information.
domesday-book
The natives call this book "Domesday" ... concerning the matters contained in the book, its word cannot be denied or set aside.
wikipedia.org/wiki/Domesday_Book
Status
🚧 Ongoing development! 🚧
Core functionality is implemented as a working prototype. See Roadmap for next steps.
Why this exists
Research teams accumulate critical tacit knowledge (processing caveats, data access optimizations, troubleshooting tips) that lives in Teams conversations, scattered notes, and people's heads. AI tools (and other people) need access to this information.
We need a system where:
- Adding knowledge is as easy as pasting a text snippet into a box
- The system automatically processes new entries
- Multiple team members can contribute and curate entries
- The knowledge base is queryable by AI tools like Claude, giving answers with citations to the original snippets
Quickstart
# Install
uv tool install domesday[voyage,mcp]
# Set API keys
export VOYAGE_API_KEY=voy-...
export ANTHROPIC_API_KEY=sk-ant-...
# Add a snippet to a project
domes -p vbo add "The VBO dataset has an off-by-one error in timestamps before 2023-06-01."
# Bulk ingest a folder
domes -p vbo ingest ./project-notes/ --author ben
# Semantic search within a project to find matching snippets (retrieval only)
domes -p vbo search "VBO timestamp issues"
# Ask a question (retrieve matching snippets → LLM generates answer with citations)
domes -p vbo ask "What are the known caveats with the dataset?"
# Actual answer from Claude Sonnet 4.6
# **VBO Dataset Timestamp Error**: The VBO dataset has an **off-by-one error in timestamps** for any data dated **before 2023-06-01**. [snippet-1fffb1]
# Search across all projects
domes search "timestamp bugs" --all-projects
# Browse and inspect
domes projects # list all projects with snippet counts
domes -p vbo list # recent snippets in a project
domes stats --all-projects # stats across everything
Development
# clone repo and install:
uv sync --all-extras
How it works
Add snippet (paste/CLI/MCP)
  → Store raw text + metadata (SQLite)
  → Chunk (prose/code-aware, ~400 tokens)
  → Embed (Voyage / OpenAI / local model)
  → Index (ChromaDB vector store)
Ask a question (CLI/MCP/API)
  → Embed query
  → Vector similarity search (cosine, with score threshold)
  → [Optional] LLM reranker filters irrelevant results
  → Format context with author, date, tags
  → Generate answer via Claude with inline citations
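The two flows above can be sketched end to end with toy components. This is a minimal illustration, not domesday's implementation: `chunk_tokens`, `toy_embed`, `cosine`, and `search` are hypothetical names, whitespace-separated words stand in for tokens, and a bag-of-words counter stands in for a real embedding model.

```python
import math
from collections import Counter


def chunk_tokens(text: str, max_tokens: int = 400, overlap_tokens: int = 50) -> list[str]:
    """Split text into overlapping windows of whitespace-separated words.

    Assumption: words approximate tokens; a real chunker would use a tokenizer
    and respect prose/code boundaries.
    """
    if overlap_tokens >= max_tokens:
        raise ValueError("overlap_tokens must be smaller than max_tokens")
    tokens = text.split()
    step = max_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words vector."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, chunks: list[str], min_score: float = 0.3) -> list[tuple[float, str]]:
    """Score every chunk against the query and keep those at or above min_score."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(c)), c) for c in chunks]
    return sorted((s for s in scored if s[0] >= min_score), reverse=True)
```

The score threshold is what keeps unrelated snippets out of the context passed to the generator: a chunk sharing no vocabulary with the query scores 0.0 here and is dropped.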
Every backend is behind a Protocol interface, so you can swap storage, embedding, or generation by changing config. See Architecture for details.
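As a rough sketch of what a Protocol-based backend could look like, the example below uses Python's `typing.Protocol` for structural typing. The `Embedder` protocol, the two toy backends, and `make_embedder` are hypothetical names for illustration; the real interfaces live in `domesday/core/protocols.py` and may differ.

```python
from typing import Protocol


class Embedder(Protocol):
    """Hypothetical interface: anything with a matching embed() method qualifies."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class LengthEmbedder:
    """Toy backend: embeds each text as [character count, word count]."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(t)), float(len(t.split()))] for t in texts]


class ConstantEmbedder:
    """Toy backend: every text maps to the same all-ones vector."""

    def __init__(self, dim: int = 3) -> None:
        self.dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        return [[1.0] * self.dim for _ in texts]


# Assumed registry keyed by the config's backend name (names are illustrative).
BACKENDS = {"length": LengthEmbedder, "constant": ConstantEmbedder}


def make_embedder(backend: str) -> Embedder:
    """Pick a backend by name; callers only depend on the Embedder protocol."""
    try:
        return BACKENDS[backend]()
    except KeyError:
        raise ValueError(f"unknown embedder backend: {backend!r}") from None
```

Because protocols are structural, neither toy backend inherits from `Embedder`; any class with a compatible `embed` method can be dropped in.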
Projects
A single domesday instance can hold multiple projects. Each snippet belongs to exactly one project. Queries are scoped to a project by default, preventing cross-contamination between unrelated knowledge bases.
# Set a default project in config
# domesday.toml: default_project = "vbo"
# Or specify per-command (--project / -p goes before the subcommand)
domes -p vbo add "some caveat"
domes -p ephys-rig add "different caveat"
# Search within a project
domes -p vbo search "timing issues"
# Search across everything
domes search "timing issues" --all-projects
# See what projects exist
domes projects
# Rename a project
domes rename-project old-name new-name
The --project flag (or -p) can also be set at the top level, applying to all subcommands:
domes -p vbo add "some caveat"
domes -p vbo search "timing"
domes -p vbo ask "what are the known issues?"
For MCP, pass the project in tool arguments, or set DOMESDAY_DEFAULT_PROJECT in the server environment.
Configuration
Place domesday.toml in your project root:
data_dir = "./data"
default_project = "main" # used when --project is not specified
[embedder]
backend = "voyage" # voyage | openai | local
model = "voyage-4-large"
[generator]
backend = "claude"
model = "claude-sonnet-4-6"
[chunker]
max_tokens = 400
overlap_tokens = 50
[retrieval]
min_score = 0.3 # cosine similarity threshold
[reranker]
enabled = false # LLM-based relevance filtering (adds latency)
model = "claude-haiku-4-5"
relevance_threshold = 0.5
Environment variables override config: DOMESDAY_DATA_DIR, DOMESDAY_EMBEDDER_BACKEND, DOMESDAY_EMBEDDER_MODEL, DOMESDAY_GENERATOR_MODEL.
CLI reference
All commands accept --project / -p to scope to a specific project. This can also be set at the top level: domes -p myproject <command>.
Use --verbose / -v for INFO-level logs or --debug / -d for full DEBUG output:
domes -v search "timestamp issues" # see search flow
domes -d ingest ./notes/ # see every chunk and embedding call
| Command | Description |
|---|---|
| domes add "text" | Add a snippet (also accepts --file, stdin, or opens $EDITOR) |
| domes add --author ben --tags "vbo,bug" | Add with metadata |
| domes -p myproject ingest ./folder/ | Bulk ingest files into a project |
| domes search "query" | Semantic search within the current project |
| domes search "query" --all-projects | Search across all projects |
| domes ask "question" | Retrieve relevant snippets, then generate an answer with citations |
| domes ask "question" --show-sources | Also print which snippets were used |
| domes list | Show recent snippets in current project |
| domes list --all-projects | Show recent snippets across all projects |
| domes projects | List all projects with snippet counts |
| domes rename-project old new | Rename a project across all stores |
| domes stats | Show stats for current project |
| domes stats --all-projects | Show stats across all projects |
MCP integration
domesday exposes itself as an MCP server, making the knowledge base available from Claude Desktop, Cursor, VS Code, or any MCP-compatible client.
Local (stdio): add to claude_desktop_config.json:
{
"mcpServers": {
"domesday": {
"command": "python",
"args": ["-m", "domesday.mcp_server"],
"env": {
"DOMESDAY_DATA_DIR": "/absolute/path/to/data",
"DOMESDAY_DEFAULT_PROJECT": "vbo",
"VOYAGE_API_KEY": "voy-...",
"ANTHROPIC_API_KEY": "sk-ant-..."
}
}
}
}
Remote (SSE), for team access:
{
"mcpServers": {
"domesday": {
"url": "https://your-server.internal:8080/mcp/sse"
}
}
}
Available MCP tools:
| Tool | Description |
|---|---|
| search_knowledge(query, project?, n_results?, tags?) | Semantic search over snippets |
| add_snippet(text, project?, author?, tags?) | Add new knowledge from any client |
| get_snippet(snippet_id) | Retrieve a snippet by full or short (8-char) ID |
| list_recent(n?, project?, author?) | Browse recent additions |
| list_projects() | List all projects with snippet counts |
| rename_project(old_name, new_name) | Rename a project across all stores |
| ask(question, project?, n_context?) | Retrieve relevant context and generate an answer with citations |
All tools accept an optional project parameter. Pass "all" to search across all projects.
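Accepting either a full or a short ID implies some lookup step. The sketch below shows one plausible approach, assuming a short ID is simply a prefix of the full ID; that assumption, and the name `resolve_snippet_id`, are illustrative and not taken from the domesday source.

```python
def resolve_snippet_id(candidate: str, known_ids: list[str]) -> str:
    """Map a full or shortened ID to exactly one known full ID.

    Assumption: a short ID is a prefix of the full ID.
    """
    if candidate in known_ids:
        return candidate  # exact match on the full ID
    matches = [full for full in known_ids if full.startswith(candidate)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise KeyError(f"no snippet matches {candidate!r}")
    raise ValueError(f"{candidate!r} is ambiguous: {len(matches)} snippets match")
```

Raising on ambiguity rather than picking the first match avoids silently returning the wrong snippet when two IDs share a prefix.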
Evaluation
domesday includes an evaluation framework for measuring retrieval quality and generation faithfulness. See Evaluation for full details.
# Run retrieval eval against test corpus
python -m domesday.eval.runner
# Also judge generation quality with Haiku
python -m domesday.eval.runner --judge
# Parameter sweep (min_score, k, chunk size, overlap)
python -m domesday.eval.runner --sweep --quick
# Interactive: inspect individual queries and results
python -m domesday.eval.runner -i
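For readers unfamiliar with the retrieval metrics named in this README (precision, recall, MRR), here is a small sketch of their standard definitions. The function names are hypothetical and the code is independent of `domesday.eval`; it assumes each retrieved list contains unique snippet IDs in ranked order.

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant (0.0 when undefined)."""
    hits = sum(1 for sid in retrieved if sid in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, sid in enumerate(retrieved, start=1):
        if sid in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)
```

Precision and recall move in opposite directions as the score threshold changes, which is why a parameter sweep over `min_score` and `k` is useful.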
Project structure
domesday/
├── pyproject.toml
├── domesday.toml
├── domesday/
│   ├── core/
│   │   ├── models.py        # Snippet, Chunk, SearchResult, RAGResponse
│   │   ├── protocols.py     # Swappable interfaces for all backends
│   │   └── pipeline.py      # Orchestrator: add, ingest, search, ask
│   ├── stores/
│   │   ├── sqlite_store.py  # DocumentStore → SQLite
│   │   └── chroma_store.py  # VectorStore → ChromaDB
│   ├── embedders.py         # Voyage, OpenAI, sentence-transformers
│   ├── generators.py        # Claude via Anthropic API
│   ├── chunking.py          # Prose/code-aware text splitting
│   ├── config.py            # defaults + parsing from file/env
│   ├── cli.py               # CLI commands
│   ├── mcp_server.py        # MCP tool definitions
│   └── eval/
│       ├── models.py        # Eval metrics (precision, recall, MRR)
│       ├── runner.py        # Eval runner + parameter sweeps
│       └── llm_judge.py     # Haiku-based quality scoring + reranker
├── tests/
│   └── fixtures/
│       └── test_corpus.py   # 30 synthetic snippets + 21 eval queries
└── docs/
    ├── architecture.md
    └── evaluation.md
Further reading