AI sommelier for your wine cellar — ETL pipeline, DuckDB query layer, Markdown dossiers, and MCP server for wine cellar CSV exports
Project description
Cellarbrain
AI sommelier for your wine cellar. Transforms Vinocell CSV exports into normalised Parquet tables, per-wine Markdown dossiers, and an in-process DuckDB query layer that AI agents can use via the Model Context Protocol.
Quick start
# Clone and install (Python 3.11+)
git clone https://github.com/urban-buss/cellarbrain.git
cd cellarbrain
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate
pip install -e .
Usage
1. Run the ETL pipeline
Export your cellar from cellarbrain (File → Export → CSV) and place the files
in raw/:
raw/
├── export-wines.csv
├── export-bottles-stored.csv
└── export-bottles-gone.csv # optional
Then run:
# Full load (first time)
cellarbrain etl raw/export-wines.csv raw/export-bottles-stored.csv raw/export-bottles-gone.csv -o output
# Incremental sync (subsequent runs — detects changes, preserves IDs)
cellarbrain etl raw/export-wines.csv raw/export-bottles-stored.csv raw/export-bottles-gone.csv -o output --sync
Output goes to output/ — 12 Parquet entity files and per-wine Markdown
dossiers under output/wines/.
2. Query your cellar
# SQL query (DuckDB syntax)
cellarbrain query "SELECT w.name, wy.name AS winery, w.vintage FROM wine w JOIN winery wy ON w.winery_id = wy.winery_id LIMIT 10"
# Output as CSV or JSON
cellarbrain query "SELECT * FROM wine LIMIT 5" --format csv
cellarbrain query "SELECT * FROM bottle WHERE status = 'stored'" --format json
# SQL from a file
cellarbrain query -f my_query.sql
3. Cellar statistics
cellarbrain stats # Overall summary
cellarbrain stats --by country # Grouped by country
cellarbrain stats --by grape # Grouped by grape variety
# Also: region, category, vintage, winery, cellar, provider, status
4. Wine dossiers
cellarbrain dossier 42 # Read dossier for wine #42
cellarbrain dossier --search Barolo # Search wines by name, grape, region…
cellarbrain dossier --pending # Wines with pending agent research
5. Validate output
cellarbrain validate # Check Parquet integrity
6. Start the MCP server
cellarbrain mcp # stdio transport (default)
cellarbrain mcp --transport sse # SSE transport for HTTP clients
All subcommands accept -d <path> to point at a different data directory
(default: output).
Legacy mode: The old
cellarbrain <wines.csv> <bottles.csv>syntax still works but emits a deprecation warning.
Web Explorer
A local web dashboard for browsing your cellar, observability data, and running queries interactively.
cellarbrain dashboard # opens at http://localhost:8017
Pages: overview, tool usage, errors, sessions, latency charts, live tail (SSE), cellar browser, bottles, drinking window, tracked wines, SQL playground, statistics, and workbench. Requires a prior ETL run and MCP log store.
MCP server
The MCP server exposes 7 read/write tools for AI agents (Claude, OpenClaw, Copilot, etc.). Tools are thin data primitives — all reasoning stays in the agent.
Tools
| Tool | Description |
|---|---|
query_cellar |
Run read-only SQL against the cellar (DuckDB over Parquet) |
cellar_stats |
Summary statistics, optionally grouped by 9 dimensions |
find_wine |
Text search across name, winery, region, grape, vintage |
read_dossier |
Read a wine's full Markdown dossier |
update_dossier |
Write to agent-owned dossier sections (ETL sections protected) |
reload_data |
Re-run the ETL pipeline in-process |
pending_research |
List wines with empty agent sections, sorted by priority |
Resources
| URI | Description |
|---|---|
wine://list |
All wines with basic metadata |
wine://cellar |
Wines currently in the cellar |
wine://favorites |
Favorite wines |
wine://{wine_id} |
Full dossier for a specific wine |
cellar://stats |
Current cellar statistics |
cellar://drinking-now |
Wines in their optimal drinking window |
etl://last-run |
Last ETL run metadata |
etl://changes |
Change log from the last ETL run |
Prompts
| Prompt | Description |
|---|---|
cellar_qa |
System prompt for cellar Q&A (embeds live stats) |
food_pairing |
Food pairing workflow for a given dish |
wine_research |
Deep research workflow for a single wine |
batch_research |
Batch research across pending wines |
Configure with Claude Desktop
Add to claude_desktop_config.json:
{
"mcpServers": {
"cellarbrain": {
"command": "cellarbrain",
"args": ["mcp"],
"env": {}
}
}
}
If cellarbrain is installed in a virtualenv, use the full path:
{
"mcpServers": {
"cellarbrain": {
"command": "/path/to/cellarbrain/.venv/bin/cellarbrain",
"args": ["mcp"],
"env": {}
}
}
}
To point at a different data directory:
{
"mcpServers": {
"cellarbrain": {
"command": "cellarbrain",
"args": ["-d", "/path/to/output", "mcp"],
"env": {}
}
}
}
Configure with VS Code (Copilot)
Add to .vscode/mcp.json in your workspace:
{
"servers": {
"cellarbrain": {
"command": "cellarbrain",
"args": ["mcp"],
"env": {}
}
}
}
Or with a virtualenv on Windows:
{
"servers": {
"cellarbrain": {
"command": ".venv\\Scripts\\cellarbrain.exe",
"args": ["-d", "output", "mcp"],
"env": {}
}
}
}
Configure with OpenClaw
See .docs/design/openclaw-skill.md for
the full skill integration design.
Data model
The ETL produces 12 normalised Parquet tables:
| Table | Description |
|---|---|
wine |
Central wine catalog with all attributes |
bottle |
Individual bottles (stored + consumed) |
winery |
Producer lookup |
appellation |
Country / region / subregion / classification |
grape |
Grape variety lookup |
wine_grape |
Wine–grape junction with blend percentages |
tasting |
Personal tasting notes and scores |
pro_rating |
Professional critic scores |
cellar |
Physical storage locations |
provider |
Retailers / sources |
etl_run |
Pipeline run history |
change_log |
Row-level insert / update / delete audit trail |
Plus per-wine Markdown dossiers in output/wines/ with:
- ETL-owned sections (identity, origin, inventory, tastings, etc.)
- Agent-owned sections (producer profile, vintage report, food pairings, etc.)
- YAML frontmatter tracking which agent sections are populated vs pending
See .docs/data-model/source/ and
.docs/data-model/target/ for detailed field documentation.
Project structure
src/cellarbrain/
├── cli.py # CLI entry point with subcommands
├── reader.py # CSV readers
├── parsers.py # Field-level parsers
├── transform.py # Normalisation and entity building
├── writer.py # Parquet writer with Arrow schemas
├── validate.py # Post-ETL validation
├── incremental.py # Change detection and sync
├── markdown.py # Dossier generation with agent section preservation
├── query.py # DuckDB query layer (stats, search, SQL)
├── dossier_ops.py # Dossier read/write/pending operations
└── mcp_server.py # FastMCP server (7 tools, 8 resources, 4 prompts)
Development
Note: Unit tests and smoke tests require a source checkout of the repository. They are not included in the PyPI package.
# Clone and set up for development
git clone https://github.com/urban-buss/cellarbrain.git
cd cellarbrain
python -m venv .venv
.venv\Scripts\activate # Windows
source .venv/bin/activate # macOS / Linux
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Run only unit tests (fast, no CSV files needed)
pytest tests/ -v --ignore=tests/test_integration.py
# Run integration tests (requires raw/*.csv files)
pytest tests/test_integration.py -v
License
Private — not for redistribution.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cellarbrain-0.3.3.tar.gz.
File metadata
- Download URL: cellarbrain-0.3.3.tar.gz
- Upload date:
- Size: 1.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e4607d76322340f8d3db1ee2f05a7f0001bd88fbe27e671445a603b11524c922
|
|
| MD5 |
a164fabaea6ab8a9cdfce690888ff7b2
|
|
| BLAKE2b-256 |
b981e261ac2e24fa8bfa6dc2c44bf153e991d11d2db49fc8d60a0e7edb718192
|
Provenance
The following attestation bundles were made for cellarbrain-0.3.3.tar.gz:
Publisher:
publish.yml on urban-buss/cellarbrain
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
cellarbrain-0.3.3.tar.gz -
Subject digest:
e4607d76322340f8d3db1ee2f05a7f0001bd88fbe27e671445a603b11524c922 - Sigstore transparency entry: 1591993704
- Sigstore integration time:
-
Permalink:
urban-buss/cellarbrain@823e3dc3cd4b9e6a0be400c969ddef138a94fa6a -
Branch / Tag:
refs/tags/v0.3.3 - Owner: https://github.com/urban-buss
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@823e3dc3cd4b9e6a0be400c969ddef138a94fa6a -
Trigger Event:
push
-
Statement type:
File details
Details for the file cellarbrain-0.3.3-py3-none-any.whl.
File metadata
- Download URL: cellarbrain-0.3.3-py3-none-any.whl
- Upload date:
- Size: 868.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
97258da026a74dab54330647016da4e88d56f10dfb35823fc0c37364b50f69f6
|
|
| MD5 |
af76a058f9793cd96e9321546a8eee03
|
|
| BLAKE2b-256 |
1b325a727f89a174f5a510e921c76e8568b07a6597990fae8240424aae21da82
|
Provenance
The following attestation bundles were made for cellarbrain-0.3.3-py3-none-any.whl:
Publisher:
publish.yml on urban-buss/cellarbrain
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
cellarbrain-0.3.3-py3-none-any.whl -
Subject digest:
97258da026a74dab54330647016da4e88d56f10dfb35823fc0c37364b50f69f6 - Sigstore transparency entry: 1591993712
- Sigstore integration time:
-
Permalink:
urban-buss/cellarbrain@823e3dc3cd4b9e6a0be400c969ddef138a94fa6a -
Branch / Tag:
refs/tags/v0.3.3 - Owner: https://github.com/urban-buss
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@823e3dc3cd4b9e6a0be400c969ddef138a94fa6a -
Trigger Event:
push
-
Statement type: