envbert-mcp
MCP server exposing raw envbert (DistilBERT EDD classification) to Claude — standalone, no LLM fallback, no Ollama/Azure dependency.
Why raw envbert, not envbert-agent
envbert-agent adds an LLM fallback step for low-confidence classifications — useful when there's no other reasoning layer downstream (CLI usage, batch pipelines writing straight to a CSV). But Claude is a reasoning layer already sitting right there. Routing through a second, hidden LLM call inside the tool — adding ~10s and, if configured for Azure, real cost — is redundant when Claude can look at a low-confidence label directly and decide what to do with it.
This server calls envbert.due_diligence.envbert_predict() directly. No
network hop, no Ollama, no Azure, no envbert-api. Sub-second responses.
If you need the LLM-fallback behaviour (e.g. for a pipeline with no LLM
downstream to resolve ambiguous labels), use envbert-agent or
envbert-api instead; see the cross-comparison below.
Architecture
Claude / Claude Code
│ MCP (stdio)
▼
envbert_mcp/server.py ←── this file, single process
│ in-process call
▼
envbert.due_diligence.envbert_predict()
│
▼
DistilBERT (d4data/environmental-due-diligence-model)
Everything runs in one process. The model loads once at startup (warmup, ~10-40s on a cold HuggingFace cache) and stays in memory for the life of the server.
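A minimal sketch of the load-once, warm-up-in-the-background pattern described above. It is not the server's actual code: `ModelHolder`, `fake_loader`, and the `ready` and `error` keys are hypothetical names chosen for illustration, and the loader is a stand-in that returns a dummy predictor instead of loading DistilBERT. Only the `loading` flag mirrors what check_envbert_status is described as reporting.

```python
import threading


class ModelHolder:
    """Loads a model once in a background thread and reports readiness."""

    def __init__(self, loader):
        self._loader = loader            # callable returning the loaded model
        self._model = None
        self._error = None
        self._ready = threading.Event()  # set once loading has finished

    def start_warmup(self):
        # Daemon thread so a slow load never blocks process shutdown.
        thread = threading.Thread(target=self._load, daemon=True)
        thread.start()
        return thread

    def _load(self):
        try:
            self._model = self._loader()
        except Exception as exc:         # report load failures via status()
            self._error = repr(exc)
        finally:
            self._ready.set()

    def status(self):
        loading = not self._ready.is_set()
        return {
            "loading": loading,
            "ready": not loading and self._error is None,
            "error": self._error,
        }

    def get(self, timeout=None):
        # Block until warmup finishes, then return the single shared model.
        if not self._ready.wait(timeout):
            raise TimeoutError("model is still loading")
        if self._error is not None:
            raise RuntimeError(self._error)
        return self._model


# Hypothetical stand-in for the real model load; it waits on an event so the
# "still loading" state can be observed deterministically.
release = threading.Event()


def fake_loader():
    release.wait(timeout=5)
    return lambda text: ("Geology", 0.9)


holder = ModelHolder(fake_loader)
holder.start_warmup()
status_during_warmup = holder.status()
release.set()
model = holder.get(timeout=5)
status_after_warmup = holder.status()
```

Keeping the model behind a single holder means every tool call shares one in-memory instance, and a status tool can answer immediately even while the load is still running.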
Tools
| Tool | Purpose |
|---|---|
| check_envbert_status | Confirm the model is loaded before relying on fast responses; the very first call in a session may be slow while warmup is still in progress |
| classify_environmental_text | Classify one sentence or paragraph: label + confidence, no LLM step |
| classify_environmental_document | Classify all paragraphs of a document concurrently, with category distribution and low-confidence flagging |
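To illustrate what classifying paragraphs concurrently can look like, here is a rough sketch using a thread pool. How the server actually schedules the work is not stated here, so treat this as one possible approach. `split_paragraphs`, `classify_document`, and `stub_predict` are hypothetical names, the blank-line paragraph split is an assumption, and the `(label, confidence)` return shape of the predictor is assumed rather than taken from envbert's API.

```python
from concurrent.futures import ThreadPoolExecutor


def split_paragraphs(document):
    # Assumption: paragraphs are separated by blank lines; empty blocks are skipped.
    return [p.strip() for p in document.split("\n\n") if p.strip()]


def classify_document(document, predict, max_workers=4):
    paragraphs = split_paragraphs(document)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so indices stay aligned.
        predictions = list(pool.map(predict, paragraphs))
    return [
        {"index": i, "text": text, "label": label, "confidence": confidence}
        for i, (text, (label, confidence)) in enumerate(zip(paragraphs, predictions))
    ]


def stub_predict(text):
    # Hypothetical stand-in for the real classifier: a keyword lookup only.
    if "shale" in text:
        return ("Geology", 0.91)
    return ("Contaminants", 0.42)


report = "Weathered shale was encountered.\n\n\n\nBenzene was detected in groundwater."
results = classify_document(report, stub_predict)
```

Because the results keep their paragraph index, a caller can map any label back to the exact passage it came from.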
Why low-confidence flagging matters here
Without an LLM fallback, a low envbert confidence score (e.g. 0.42) is the
final answer — there's no second opinion baked in. classify_environmental_document
surfaces a low_confidence_items list explicitly so Claude can apply its
own judgement to exactly those paragraphs, rather than treating every
result as equally reliable.
{
"category_distribution": {"Geology": 4, "Contaminants": 2, "Remediation Standards": 1},
"low_confidence_count": 1,
"low_confidence_items": [
{"index": 3, "label": "Remediation Standards", "confidence": 0.42}
],
"results": [ ... ]
}
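A response of this shape could be assembled from per-paragraph results roughly as follows. This is a sketch under assumptions, not the package's implementation: the `summarize` helper is hypothetical, and the 0.5 threshold is an illustrative cut-off, since the value the server actually uses is not given here.

```python
from collections import Counter

# Assumed cut-off for illustration only; the real threshold is not documented here.
LOW_CONFIDENCE_THRESHOLD = 0.5


def summarize(results, threshold=LOW_CONFIDENCE_THRESHOLD):
    # Collect only the items whose score falls below the threshold.
    low = [
        {"index": r["index"], "label": r["label"], "confidence": r["confidence"]}
        for r in results
        if r["confidence"] < threshold
    ]
    return {
        "category_distribution": dict(Counter(r["label"] for r in results)),
        "low_confidence_count": len(low),
        "low_confidence_items": low,
        "results": results,
    }


sample = [
    {"index": 0, "label": "Geology", "confidence": 0.93},
    {"index": 1, "label": "Contaminants", "confidence": 0.88},
    {"index": 2, "label": "Geology", "confidence": 0.81},
    {"index": 3, "label": "Remediation Standards", "confidence": 0.42},
]
summary = summarize(sample)
```

Listing the low-confidence items separately lets the caller revisit just those paragraphs instead of rescanning the full results array.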
Quickstart
pip install envbert envbert-mcp
Add to ~/.claude/mcp_config.json:
{
"mcpServers": {
"envbert": {
"command": "envbert-mcp"
}
}
}
Restart Claude Code. The model warms up in the background on first
launch — check_envbert_status will report "loading": true until ready.
Example prompts:
- "Is envbert ready?"
- "Classify this: 'weathered shale was encountered below the surface with fluvial deposits'"
- "Here's a 12-paragraph site report — classify each section and flag anything you're not confident about."
envbert-mcp vs envbert-agent / envbert-api — which to use
| | This package (raw envbert) | envbert-agent / envbert-api |
|---|---|---|
| Used by | Claude / MCP clients | CLI, pipelines, non-LLM consumers |
| LLM fallback | None | Yes — Ollama (local) or Azure |
| Typical latency | <1s once the model is loaded | <1s confident, ~10s on fallback |
| External dependencies | None beyond envbert | Ollama or Azure OpenAI |
| Confidence on ambiguous text | Raw model score only | LLM-resolved final label |
| Why this shape | Claude can reason over raw scores itself — a second hidden LLM call is redundant | No reasoning layer downstream; the agent must resolve ambiguity itself |
Both are legitimate — they're solving for different consumers of the classification, not competing implementations of the same thing.
Configuration
| Variable | Default | Description |
|---|---|---|
| LOG_LEVEL | INFO | Logging verbosity |
No other configuration needed — there's no backend URL, no LLM provider, no API keys. That's the point.
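For reference, reading a LOG_LEVEL variable with an INFO default can be done along these lines. This is a sketch only: `resolve_log_level` and `configure_logging` are hypothetical helper names, and falling back to INFO on an unrecognised value is an assumption about sensible behaviour, not documented behaviour of this package.

```python
import logging
import os
import sys


def resolve_log_level(env=None):
    env = os.environ if env is None else env
    name = env.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, name, None)
    # Unknown names fall back to INFO rather than raising.
    return level if isinstance(level, int) else logging.INFO


def configure_logging(env=None):
    # Log to stderr: with the stdio transport, stdout carries MCP protocol messages.
    logging.basicConfig(level=resolve_log_level(env), stream=sys.stderr)
```

Sending logs to stderr matters for any stdio-based MCP server, because stray output on stdout would corrupt the message stream the client is reading.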
Development
pip install -e ".[dev]"
pytest tests/ -v
Tests mock envbert_predict() directly — no real model download needed
to run the suite.
License
MIT