Agentic environmental due-diligence text classification using EnvBert and LangGraph

EnvBert-Agent

EnvBert-Agent is an agentic AI pipeline for environmental due diligence text classification.

It combines:

  • EnvBert (domain-specific transformer backbone)
  • Optional LLM fallback
  • Agentic routing & confidence arbitration
  • Quality control & explainability
  • Modular LangGraph orchestration

Features

  • Environmental domain classification using EnvBert
  • Confidence-based fallback to LLM
  • Agentic workflow orchestration via LangGraph
  • CLI interface for quick usage
  • Python SDK interface for integration
  • Designed for due diligence, remediation, and compliance workflows

Requirements

  • Python 3.9+
  • Transformers (version constrained for TensorFlow compatibility)
  • TensorFlow (required by EnvBert backbone)
  • Ollama

Ollama Setup

  1. Download and install from: https://ollama.com/download

  2. Pull a model:

ollama pull llama3

  3. Start the server and keep it running in a separate terminal:

ollama serve

By default, it listens at:

http://localhost:11434

Common error:

ConnectionRefusedError: localhost:11434

This means Ollama is not running. Start it with ollama serve as in step 3.
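
The connection error described above can be detected before the pipeline is invoked. The following is a minimal sketch, not part of envbert-agent: the helper name ollama_is_running is hypothetical, and it only checks that something accepts TCP connections on the default port, not that a model has been pulled.

```python
import socket


def ollama_is_running(host: str = "localhost", port: int = 11434, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections at host:port."""
    try:
        # create_connection raises OSError subclasses on failure,
        # including ConnectionRefusedError and timeouts.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if ollama_is_running():
        print("Ollama port is reachable")
    else:
        print("Nothing is listening on localhost:11434; run `ollama serve` first")
```

A check like this is cheap enough to run once at startup, so a missing server surfaces as a clear message rather than a stack trace from deep inside the LLM fallback.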

Using Azure OpenAI

Set environment variables:

Windows:

set AZURE_OPENAI_API_KEY=...
set AZURE_OPENAI_ENDPOINT=...
set AZURE_OPENAI_DEPLOYMENT_NAME=...

macOS/Linux:

export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_ENDPOINT=...
export AZURE_OPENAI_DEPLOYMENT_NAME=...
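
A small pre-flight check can catch a missing variable before the Azure provider is selected. The sketch below is an illustration only and is not part of envbert-agent; missing_azure_vars and REQUIRED_AZURE_VARS are hypothetical names, and the only facts taken from this README are the three variable names.

```python
import os
from typing import List, Mapping, Optional

# The three variables listed in this README.
REQUIRED_AZURE_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
)


def missing_azure_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AZURE_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_azure_vars()
    if missing:
        print("Missing Azure OpenAI settings: " + ", ".join(missing))
    else:
        print("All Azure OpenAI settings are present")
```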

Installation

Install from PyPI:

pip install envbert-agent

Then start Ollama in a separate terminal and classify a sentence:

ollama serve
envbert-agent "BEHP was detected in groundwater"

Python Usage

from envbert_agent import run

text = "BEHP was detected in groundwater"
result = run(text)

print(result)

To use Azure OpenAI as the LLM fallback, pass an LLMConfig:

from envbert_agent import run
from envbert_agent.config import LLMConfig

config = LLMConfig(provider="azure")

result = run(
    "BEHP was detected in groundwater",
    llm_config=config
)

print(result)

CLI Usage

After installation:

envbert-agent "BEHP was detected in groundwater"
envbert-agent "BEHP was detected in groundwater" --provider azure

Example output:

[INPUT]
raw_text: BEHP was the only SVOC detected in groundwater at concentrations exceeding the MCL.
clean_text: BEHP was the only SVOC detected in groundwater at concentrations exceeding the MCL.
language: en
quality_score: 1.0

[CLASSIFICATION]
envbert_label: Remediation Standards
envbert_confidence: 0.437928160341927
route: llm
llm_label: Contaminants
llm_confidence: 0.9
final_label: Contaminants
final_confidence: 0.9

[META]
llm_reasoning: The text mentions a specific contaminant (BEHP) and its concentration exceeding the Maximum Contaminated Level (MCL), indicating the presence of contaminants in groundwater.
decision_trace: LLM fallback
key_phrases: ['behp', 'svoc', 'detected', 'groundwater', 'concentrations', 'exceeding']

[MONITORING]
drift_flag: False

[INPUT]
raw_text: soil and groundwater are both contaminated on the site
clean_text: soil and groundwater are both contaminated on the site
language: en
quality_score: 1.0

[CLASSIFICATION]
envbert_label: Contaminated media
envbert_confidence: 0.7678613004581079
route: accept
llm_label: N/A
llm_confidence: N/A
final_label: Contaminated media
final_confidence: 0.7678613004581079

[META]
llm_reasoning: N/A
decision_trace: EnvBert backbone
key_phrases: ['soil', 'groundwater', 'both', 'contaminated', 'site']

[MONITORING]
drift_flag: False
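
When the CLI output has to be consumed by another script, the bracketed sections and key: value lines shown above are simple to parse. The following is a minimal sketch that assumes the printed layout stays exactly as in these examples; parse_report is a hypothetical helper, not part of envbert-agent, and it handles one report at a time (a second report in the same text would overwrite the first).

```python
from typing import Dict


def parse_report(text: str) -> Dict[str, Dict[str, str]]:
    """Parse one printed report into {section: {key: value}}; values stay strings."""
    report: Dict[str, Dict[str, str]] = {}
    section = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # A header such as [CLASSIFICATION] opens a new section.
            section = line[1:-1]
            report[section] = {}
        elif section is not None and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            report[section][key.strip()] = value.strip()
    return report


sample = """\
[INPUT]
raw_text: soil and groundwater are both contaminated on the site
quality_score: 1.0

[CLASSIFICATION]
envbert_label: Contaminated media
envbert_confidence: 0.7678613004581079
route: accept
final_label: Contaminated media

[MONITORING]
drift_flag: False
"""

parsed = parse_report(sample)
print(parsed["CLASSIFICATION"]["final_label"])  # Contaminated media
```

Values are left as strings on purpose: fields such as llm_confidence can be N/A, so any conversion to float is better done by the caller.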

Direct CLI Invocation from Python

from envbert_agent.cli import main

main(["BEHP was detected in groundwater"])

Architecture: Graph Edges & Flow

START
  ↓
[preprocess] ────────────────┐
  ↓                           │
[envbert] ────────────────────┤
  ↓                           │
[arbitrate]                   │  Conditional Router:
  ├─ (quality < 0.4)          │      route = "review"
  ├─ (confidence ≥ 0.75) ─────┼──────→ route = "accept"
  └─ (confidence < 0.75) ─────┘      route = "llm"
       ↓
    ┌──────────┬──────────┬──────────┐
    ↓          ↓          ↓
  (review)   (accept)   (llm)
    ↓          ↓          ↓
    │ ─────────→[llm]←─────
            ↓
        [evaluate]
            ↓
        [explain]
            ↓
        [monitor]
            ↓
           END
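
The conditional router in the diagram can be read as a plain function of the quality score and the EnvBert confidence. The sketch below is not the package's actual implementation: choose_route and the constant names are hypothetical, the thresholds 0.4 and 0.75 are taken from the diagram, and the assumption that the quality check takes precedence over the confidence check is inferred from the order in which the branches are listed.

```python
QUALITY_THRESHOLD = 0.4      # diagram: quality < 0.4 -> "review"
CONFIDENCE_THRESHOLD = 0.75  # diagram: confidence >= 0.75 -> "accept"


def choose_route(quality_score: float, envbert_confidence: float) -> str:
    """Return "review", "accept", or "llm" following the diagram's conditions."""
    # Assumption: low-quality input is routed to review regardless of confidence.
    if quality_score < QUALITY_THRESHOLD:
        return "review"
    if envbert_confidence >= CONFIDENCE_THRESHOLD:
        return "accept"
    return "llm"


# The two example outputs in this README, replayed through the sketch.
print(choose_route(1.0, 0.437928160341927))   # llm
print(choose_route(1.0, 0.7678613004581079))  # accept
```

In a LangGraph graph, a function of this shape would typically serve as the condition on the edges leaving the arbitrate node, with each returned string mapped to the next node.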

License

MIT License

See the LICENSE file for details.

⚠️ Notes

  • This package depends on the EnvBert backbone.

  • Transformers version is constrained for TensorFlow compatibility.

  • Future versions may migrate to a PyTorch backend for improved compatibility and lighter installation footprint.

