Agentic environmental due-diligence text classification using EnvBert and LangGraph
EnvBert-Agent
EnvBert-Agent is an agentic AI pipeline for environmental due diligence text classification.
It combines:
- EnvBert (domain-specific transformer backbone)
- Optional LLM fallback
- Agentic routing & confidence arbitration
- Quality control & explainability
- Modular LangGraph orchestration
Features
- Environmental domain classification using EnvBert
- Confidence-based fallback to LLM
- Agentic workflow orchestration via LangGraph
- CLI interface for quick usage
- Python SDK interface for integration
- Designed for due diligence, remediation, and compliance workflows
Requirements
- Python 3.9+
- Transformers (version constrained for TensorFlow compatibility)
- TensorFlow (required by EnvBert backbone)
- Ollama
Ollama Setup
- Download and install from: https://ollama.com/download
- Pull a model:
ollama pull llama3
- Common error:
ConnectionRefusedError: localhost:11434
This means Ollama is not running.
- Fix:
ollama serve
By default, the server listens on localhost:11434. Keep it running in a separate terminal.
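Before running the pipeline, it can help to confirm that something is actually listening on the Ollama port. The following is a minimal sketch using only the Python standard library; it assumes the localhost:11434 address from the error message above, and `is_port_open` is a hypothetical helper, not part of envbert-agent or Ollama.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 11434, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("A server is listening on localhost:11434.")
    else:
        print("Nothing is listening on localhost:11434; try running: ollama serve")
```

A successful TCP connection only shows that some process is bound to the port; it does not verify that the process is Ollama or that a model has been pulled.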
Using Azure OpenAI
Set environment variables:
Windows:
set AZURE_OPENAI_API_KEY=...
set AZURE_OPENAI_ENDPOINT=...
set AZURE_OPENAI_DEPLOYMENT_NAME=...
macOS/Linux:
export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_ENDPOINT=...
export AZURE_OPENAI_DEPLOYMENT_NAME=...
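A quick way to catch a missing variable before calling the pipeline is to check the environment from Python. This is a small sketch, not part of the package: the three variable names come from the instructions above, while `missing_azure_vars` is a hypothetical helper introduced here for illustration.

```python
import os
from typing import List, Mapping

# Variable names taken from the setup instructions above
REQUIRED_AZURE_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
)


def missing_azure_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_AZURE_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_azure_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Azure OpenAI variables are set.")
```

Passing the environment as a mapping keeps the check easy to exercise with a plain dict instead of the real process environment.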
Installation
Install from PyPI:
pip install envbert-agent
Then start Ollama (in a separate terminal) and run a quick check:
ollama serve
envbert-agent "BEHP was detected in groundwater."
Python Usage
from envbert_agent import run

# Classify a sentence with the default settings
text = "BEHP was detected in groundwater"
result = run(text)
print(result)

# Use Azure OpenAI for the LLM fallback instead
from envbert_agent.config import LLMConfig

config = LLMConfig(provider="azure")
result = run(
    "BEHP was detected in groundwater",
    llm_config=config,
)
print(result)
CLI Usage
After installation:
envbert-agent "BEHP was detected in groundwater"
envbert-agent "BEHP was detected in groundwater" --provider azure
Example output:
[INPUT]
raw_text: BEHP was the only SVOC detected in groundwater at concentrations exceeding the MCL.
clean_text: BEHP was the only SVOC detected in groundwater at concentrations exceeding the MCL.
language: en
quality_score: 1.0
[CLASSIFICATION]
envbert_label: Remediation Standards
envbert_confidence: 0.437928160341927
route: llm
llm_label: Contaminants
llm_confidence: 0.9
final_label: Contaminants
final_confidence: 0.9
[META]
llm_reasoning: The text mentions a specific contaminant (BEHP) and its concentration exceeding the Maximum Contaminated Level (MCL), indicating the presence of contaminants in groundwater.
decision_trace: LLM fallback
key_phrases: ['behp', 'svoc', 'detected', 'groundwater', 'concentrations', 'exceeding']
[MONITORING]
drift_flag: False
[INPUT]
raw_text: soil and groundwater are both contaminated on the site
clean_text: soil and groundwater are both contaminated on the site
language: en
quality_score: 1.0
[CLASSIFICATION]
envbert_label: Contaminated media
envbert_confidence: 0.7678613004581079
route: accept
llm_label: N/A
llm_confidence: N/A
final_label: Contaminated media
final_confidence: 0.7678613004581079
[META]
llm_reasoning: N/A
decision_trace: EnvBert backbone
key_phrases: ['soil', 'groundwater', 'both', 'contaminated', 'site']
[MONITORING]
drift_flag: False
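If the CLI output needs to be consumed by another script, the report layout shown above ([SECTION] headers followed by key: value lines) is simple enough to parse. The sketch below assumes the printed format is exactly as in these examples, which the package may not guarantee; `parse_report` is a hypothetical helper, and all values are kept as strings.

```python
from typing import Dict


def parse_report(text: str) -> Dict[str, Dict[str, str]]:
    """Parse '[SECTION]' headers and 'key: value' lines into nested dicts."""
    report: Dict[str, Dict[str, str]] = {}
    section = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # Start a new section, e.g. "[CLASSIFICATION]" -> "CLASSIFICATION"
            section = line[1:-1]
            report[section] = {}
        elif section is not None and ":" in line:
            # Split on the first colon only, so values may contain colons
            key, _, value = line.partition(":")
            report[section][key.strip()] = value.strip()
    return report


# Abbreviated copy of the second example report above
sample = """\
[CLASSIFICATION]
envbert_label: Contaminated media
envbert_confidence: 0.7678613004581079
route: accept
final_label: Contaminated media
[MONITORING]
drift_flag: False
"""

parsed = parse_report(sample)
print(parsed["CLASSIFICATION"]["final_label"])
print(float(parsed["CLASSIFICATION"]["envbert_confidence"]) >= 0.75)
```

The function handles one report at a time; if several reports are concatenated, later sections overwrite earlier ones with the same name. Where the Python API is available, working with the value returned by `run` is likely more robust than parsing printed text.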
Direct CLI Invocation from Python
from envbert_agent.cli import main
main(["BEHP was detected in groundwater"])
Architecture: Graph Edges & Flow
START
↓
[preprocess] ─────────────────┐
↓                             │
[envbert] ────────────────────┤
↓                             │
[arbitrate]                   │        Conditional Router:
├─ (quality < 0.4)            │        route = "review"
├─ (confidence ≥ 0.75) ───────┼──────→ route = "accept"
└─ (confidence < 0.75) ───────┘        route = "llm"
↓
┌──────────┬──────────┬──────────┐
↓ ↓ ↓
(review) (accept) (llm)
↓ ↓ ↓
│ ─────────→[llm]←─────
↓
[evaluate]
↓
[explain]
↓
[monitor]
↓
END
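The conditional router in the diagram can be read as a small decision function. The sketch below is an illustration of those three conditions in plain Python, not the package's actual implementation: the thresholds 0.4 and 0.75 come from the diagram, while the function name `choose_route` and the assumption that the quality check is evaluated first are hypothetical.

```python
QUALITY_THRESHOLD = 0.4      # below this, the diagram routes to "review"
CONFIDENCE_THRESHOLD = 0.75  # at or above this, the diagram routes to "accept"


def choose_route(quality_score: float, envbert_confidence: float) -> str:
    """Return "review", "accept", or "llm" following the router conditions."""
    # Assumption: the quality check takes precedence, as it is listed first
    if quality_score < QUALITY_THRESHOLD:
        return "review"
    if envbert_confidence >= CONFIDENCE_THRESHOLD:
        return "accept"
    return "llm"


# Values taken from the two example outputs in this document
print(choose_route(1.0, 0.437928160341927))   # llm
print(choose_route(1.0, 0.7678613004581079))  # accept
```

Both results agree with the `route` field in the example outputs: the 0.44-confidence prediction falls back to the LLM, and the 0.77-confidence prediction is accepted directly.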
License
MIT License
See the LICENSE file for details.
⚠️ Notes
- This package depends on the EnvBert backbone.
- Transformers version is constrained for TensorFlow compatibility.
- Future versions may migrate to a PyTorch backend for improved compatibility and lighter installation footprint.