Skip to main content

AI sommelier for your wine cellar — ETL pipeline, DuckDB query layer, Markdown dossiers, and MCP server for wine cellar CSV exports

Project description

Cellarbrain

AI sommelier for your wine cellar. Transforms Vinocell CSV exports into normalised Parquet tables, per-wine Markdown dossiers, and an in-process DuckDB query layer that AI agents can use via the Model Context Protocol.

Quick start

# Clone and install (Python 3.11+)
git clone https://github.com/urban-buss/cellarbrain.git
cd cellarbrain
python -m venv .venv

# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate

pip install -e .

Usage

1. Run the ETL pipeline

Export your cellar from cellarbrain (File → Export → CSV) and place the files in raw/:

raw/
├── export-wines.csv
├── export-bottles-stored.csv
└── export-bottles-gone.csv      # optional

Then run:

# Full load (first time)
cellarbrain etl raw/export-wines.csv raw/export-bottles-stored.csv raw/export-bottles-gone.csv -o output

# Incremental sync (subsequent runs — detects changes, preserves IDs)
cellarbrain etl raw/export-wines.csv raw/export-bottles-stored.csv raw/export-bottles-gone.csv -o output --sync

Output goes to output/ — 12 Parquet entity files and per-wine Markdown dossiers under output/wines/.

2. Query your cellar

# SQL query (DuckDB syntax)
cellarbrain query "SELECT w.name, wy.name AS winery, w.vintage FROM wine w JOIN winery wy ON w.winery_id = wy.winery_id LIMIT 10"

# Output as CSV or JSON
cellarbrain query "SELECT * FROM wine LIMIT 5" --format csv
cellarbrain query "SELECT * FROM bottle WHERE status = 'stored'" --format json

# SQL from a file
cellarbrain query -f my_query.sql

3. Cellar statistics

cellarbrain stats                     # Overall summary
cellarbrain stats --by country        # Grouped by country
cellarbrain stats --by grape          # Grouped by grape variety
# Also: region, category, vintage, winery, cellar, provider, status

4. Wine dossiers

cellarbrain dossier 42                # Read dossier for wine #42
cellarbrain dossier --search Barolo   # Search wines by name, grape, region…
cellarbrain dossier --pending         # Wines with pending agent research

5. Validate output

cellarbrain validate                  # Check Parquet integrity

6. Start the MCP server

cellarbrain mcp                       # stdio transport (default)
cellarbrain mcp --transport sse       # SSE transport for HTTP clients

All subcommands accept -d <path> to point at a different data directory (default: output).

Legacy mode: The old cellarbrain <wines.csv> <bottles.csv> syntax still works but emits a deprecation warning.


Web Explorer

A local web dashboard for browsing your cellar, observability data, and running queries interactively.

cellarbrain dashboard          # opens at http://localhost:8017

Pages: overview, tool usage, errors, sessions, latency charts, live tail (SSE), cellar browser, bottles, drinking window, tracked wines, SQL playground, statistics, and workbench. Requires a prior ETL run and MCP log store.


MCP server

The MCP server exposes 7 read/write tools for AI agents (Claude, OpenClaw, Copilot, etc.). Tools are thin data primitives — all reasoning stays in the agent.

Tools

Tool Description
query_cellar Run read-only SQL against the cellar (DuckDB over Parquet)
cellar_stats Summary statistics, optionally grouped by 9 dimensions
find_wine Text search across name, winery, region, grape, vintage
read_dossier Read a wine's full Markdown dossier
update_dossier Write to agent-owned dossier sections (ETL sections protected)
reload_data Re-run the ETL pipeline in-process
pending_research List wines with empty agent sections, sorted by priority

Resources

URI Description
wine://list All wines with basic metadata
wine://cellar Wines currently in the cellar
wine://favorites Favorite wines
wine://{wine_id} Full dossier for a specific wine
cellar://stats Current cellar statistics
cellar://drinking-now Wines in their optimal drinking window
etl://last-run Last ETL run metadata
etl://changes Change log from the last ETL run

Prompts

Prompt Description
cellar_qa System prompt for cellar Q&A (embeds live stats)
food_pairing Food pairing workflow for a given dish
wine_research Deep research workflow for a single wine
batch_research Batch research across pending wines

Configure with Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "cellarbrain": {
      "command": "cellarbrain",
      "args": ["mcp"],
      "env": {}
    }
  }
}

If cellarbrain is installed in a virtualenv, use the full path:

{
  "mcpServers": {
    "cellarbrain": {
      "command": "/path/to/cellarbrain/.venv/bin/cellarbrain",
      "args": ["mcp"],
      "env": {}
    }
  }
}

To point at a different data directory:

{
  "mcpServers": {
    "cellarbrain": {
      "command": "cellarbrain",
      "args": ["-d", "/path/to/output", "mcp"],
      "env": {}
    }
  }
}

Configure with VS Code (Copilot)

Add to .vscode/mcp.json in your workspace:

{
  "servers": {
    "cellarbrain": {
      "command": "cellarbrain",
      "args": ["mcp"],
      "env": {}
    }
  }
}

Or with a virtualenv on Windows:

{
  "servers": {
    "cellarbrain": {
      "command": ".venv\\Scripts\\cellarbrain.exe",
      "args": ["-d", "output", "mcp"],
      "env": {}
    }
  }
}

Configure with OpenClaw

See .docs/design/openclaw-skill.md for the full skill integration design.


Data model

The ETL produces 12 normalised Parquet tables:

Table Description
wine Central wine catalog with all attributes
bottle Individual bottles (stored + consumed)
winery Producer lookup
appellation Country / region / subregion / classification
grape Grape variety lookup
wine_grape Wine–grape junction with blend percentages
tasting Personal tasting notes and scores
pro_rating Professional critic scores
cellar Physical storage locations
provider Retailers / sources
etl_run Pipeline run history
change_log Row-level insert / update / delete audit trail

Plus per-wine Markdown dossiers in output/wines/ with:

  • ETL-owned sections (identity, origin, inventory, tastings, etc.)
  • Agent-owned sections (producer profile, vintage report, food pairings, etc.)
  • YAML frontmatter tracking which agent sections are populated vs pending

See .docs/data-model/source/ and .docs/data-model/target/ for detailed field documentation.


Project structure

src/cellarbrain/
├── cli.py            # CLI entry point with subcommands
├── reader.py         # CSV readers
├── parsers.py        # Field-level parsers
├── transform.py      # Normalisation and entity building
├── writer.py         # Parquet writer with Arrow schemas
├── validate.py       # Post-ETL validation
├── incremental.py    # Change detection and sync
├── markdown.py       # Dossier generation with agent section preservation
├── query.py          # DuckDB query layer (stats, search, SQL)
├── dossier_ops.py    # Dossier read/write/pending operations
└── mcp_server.py     # FastMCP server (7 tools, 8 resources, 4 prompts)

Development

Note: Unit tests and smoke tests require a source checkout of the repository. They are not included in the PyPI package.

# Clone and set up for development
git clone https://github.com/urban-buss/cellarbrain.git
cd cellarbrain
python -m venv .venv
.venv\Scripts\activate        # Windows
source .venv/bin/activate      # macOS / Linux
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run only unit tests (fast, no CSV files needed)
pytest tests/ -v --ignore=tests/test_integration.py

# Run integration tests (requires raw/*.csv files)
pytest tests/test_integration.py -v

License

Private — not for redistribution.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cellarbrain-0.3.3.tar.gz (1.1 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cellarbrain-0.3.3-py3-none-any.whl (868.5 kB view details)

Uploaded Python 3

File details

Details for the file cellarbrain-0.3.3.tar.gz.

File metadata

  • Download URL: cellarbrain-0.3.3.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cellarbrain-0.3.3.tar.gz
Algorithm Hash digest
SHA256 e4607d76322340f8d3db1ee2f05a7f0001bd88fbe27e671445a603b11524c922
MD5 a164fabaea6ab8a9cdfce690888ff7b2
BLAKE2b-256 b981e261ac2e24fa8bfa6dc2c44bf153e991d11d2db49fc8d60a0e7edb718192

See more details on using hashes here.

Provenance

The following attestation bundles were made for cellarbrain-0.3.3.tar.gz:

Publisher: publish.yml on urban-buss/cellarbrain

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cellarbrain-0.3.3-py3-none-any.whl.

File metadata

  • Download URL: cellarbrain-0.3.3-py3-none-any.whl
  • Upload date:
  • Size: 868.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cellarbrain-0.3.3-py3-none-any.whl
Algorithm Hash digest
SHA256 97258da026a74dab54330647016da4e88d56f10dfb35823fc0c37364b50f69f6
MD5 af76a058f9793cd96e9321546a8eee03
BLAKE2b-256 1b325a727f89a174f5a510e921c76e8568b07a6597990fae8240424aae21da82

See more details on using hashes here.

Provenance

The following attestation bundles were made for cellarbrain-0.3.3-py3-none-any.whl:

Publisher: publish.yml on urban-buss/cellarbrain

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page