Agentic environmental due-diligence text classification using EnvBert and LangGraph

EnvBert-Agent

EnvBert-Agent is an agentic AI pipeline for environmental due diligence text classification.

It combines:

  • EnvBert (domain-specific transformer backbone)
  • Optional LLM fallback
  • Agentic routing & confidence arbitration
  • Quality control & explainability
  • Modular LangGraph orchestration

Features

  • Environmental domain classification using EnvBert
  • Confidence-based fallback to LLM
  • Agentic workflow orchestration via LangGraph
  • CLI interface for quick usage
  • Python SDK interface for integration
  • Designed for due diligence, remediation, and compliance workflows

Requirements

  • Python 3.9+
  • Transformers (version constrained for TensorFlow compatibility)
  • TensorFlow (required by EnvBert backbone)
  • Ollama

Ollama Setup

  1. Download and install from: https://ollama.com/download

  2. Pull a model:

ollama pull llama3

  3. Common error:

ConnectionRefusedError: localhost:11434

This means Ollama is not running.

  4. Fix:

ollama serve

By default, the server listens at:

http://localhost:11434

Keep this running in a separate terminal.
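
The connection check described above can be scripted before running the pipeline. The following is a minimal sketch using only the Python standard library. The helper name ollama_is_reachable is hypothetical and is not part of envbert-agent or Ollama. It only tests whether something accepts TCP connections on the default address shown above, not whether a model has been pulled.

```python
import socket


def ollama_is_reachable(host="localhost", port=11434, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: it checks only that a listener exists on the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError and timeouts.
        return False


if __name__ == "__main__":
    if ollama_is_reachable():
        print("Ollama is listening on localhost:11434")
    else:
        print("Ollama is not reachable; start it with: ollama serve")
```

A False result corresponds to the ConnectionRefusedError case above.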

Using Azure OpenAI

Set environment variables:

Windows (Command Prompt):

set AZURE_OPENAI_API_KEY=...
set AZURE_OPENAI_ENDPOINT=...
set AZURE_OPENAI_DEPLOYMENT_NAME=...

macOS/Linux:

export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_ENDPOINT=...
export AZURE_OPENAI_DEPLOYMENT_NAME=...
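
A missing variable is easier to diagnose before the pipeline runs. The sketch below checks for the three variable names listed above. The helper missing_azure_vars is hypothetical and not part of envbert-agent; how the package itself reads or validates these variables is not shown in this README.

```python
import os

# Variable names taken from the setup instructions above.
REQUIRED_AZURE_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
)


def missing_azure_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper; pass a dict to check something other than os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AZURE_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_azure_vars()
    if missing:
        print("Missing Azure OpenAI settings: " + ", ".join(missing))
    else:
        print("All Azure OpenAI settings are present.")
```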

Installation

Install from PyPI:

pip install envbert-agent

Then start Ollama (in a separate terminal) and classify a sentence:

ollama serve
envbert-agent "BEHP was detected in groundwater."

Python Usage

from envbert_agent import run

text = "BEHP was detected in groundwater"
result = run(text)

print(result)

To use Azure OpenAI as the LLM provider, pass an LLMConfig:

from envbert_agent import run
from envbert_agent.config import LLMConfig

config = LLMConfig(provider="azure")

result = run(
    "BEHP was detected in groundwater",
    llm_config=config
)

print(result)

CLI Usage

After installation:

envbert-agent "BEHP was detected in groundwater"
envbert-agent "BEHP was detected in groundwater" --provider azure

Example output for two different input sentences:

[INPUT]
raw_text: BEHP was the only SVOC detected in groundwater at concentrations exceeding the MCL.
clean_text: BEHP was the only SVOC detected in groundwater at concentrations exceeding the MCL.
language: en
quality_score: 1.0

[CLASSIFICATION]
envbert_label: Remediation Standards
envbert_confidence: 0.437928160341927
route: llm
llm_label: Contaminants
llm_confidence: 0.9
final_label: Contaminants
final_confidence: 0.9

[META]
llm_reasoning: The text mentions a specific contaminant (BEHP) and its concentration exceeding the Maximum Contaminated Level (MCL), indicating the presence of contaminants in groundwater.
decision_trace: LLM fallback
key_phrases: ['behp', 'svoc', 'detected', 'groundwater', 'concentrations', 'exceeding']

[MONITORING]
drift_flag: False

[INPUT]
raw_text: soil and groundwater are both contaminated on the site
clean_text: soil and groundwater are both contaminated on the site
language: en
quality_score: 1.0

[CLASSIFICATION]
envbert_label: Contaminated media
envbert_confidence: 0.7678613004581079
route: accept
llm_label: N/A
llm_confidence: N/A
final_label: Contaminated media
final_confidence: 0.7678613004581079

[META]
llm_reasoning: N/A
decision_trace: EnvBert backbone
key_phrases: ['soil', 'groundwater', 'both', 'contaminated', 'site']

[MONITORING]
drift_flag: False
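
If the CLI text needs to be consumed by another script, a report like the ones above can be split into sections and key/value pairs. This is a minimal sketch that assumes the layout is exactly as shown: a bracketed section header followed by "key: value" lines. The function parse_report is hypothetical and not part of envbert-agent, and all values are kept as strings.

```python
def parse_report(text):
    """Parse '[SECTION]' headers and 'key: value' lines into nested dicts.

    Hypothetical helper; assumes the report layout shown in this README.
    """
    report = {}
    section = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # Section header such as [CLASSIFICATION]
            section = line[1:-1].lower()
            report[section] = {}
        elif section is not None and ": " in line:
            # Split on the first ': ' only, so values may contain colons.
            key, value = line.split(": ", 1)
            report[section][key] = value
    return report


if __name__ == "__main__":
    sample = (
        "[CLASSIFICATION]\n"
        "route: accept\n"
        "final_label: Contaminated media\n"
    )
    print(parse_report(sample))
```

Numeric fields such as final_confidence would still need an explicit float() conversion, and fields reported as N/A need separate handling.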

Direct CLI Invocation from Python

from envbert_agent.cli import main

main(["BEHP was detected in groundwater"])

Architecture: Graph Edges & Flow

START
  ↓
[preprocess] ────────────────┐
  ↓                           │
[envbert] ────────────────────┤
  ↓                           │
[arbitrate]                   │  Conditional Router:
  ├─ (quality < 0.4)          │      route = "review"
  ├─ (confidence ≥ 0.75) ─────┼──────→ route = "accept"
  └─ (confidence < 0.75) ─────┘      route = "llm"
       ↓
    ┌──────────┬──────────┬──────────┐
    ↓          ↓          ↓
  (review)   (accept)   (llm)
    ↓          ↓          ↓
    │ ─────────→[llm]←─────
            ↓
        [evaluate]
            ↓
        [explain]
            ↓
        [monitor]
            ↓
           END
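
The conditional router in the diagram can be read as a small decision function. The sketch below is an illustration of the thresholds shown above (0.4 for quality, 0.75 for confidence), not the package's actual code. The function name choose_route is hypothetical, and checking quality before confidence is an assumption based on the order of the branches in the diagram. What each route does afterwards is not modeled here.

```python
# Thresholds as shown in the diagram above.
QUALITY_THRESHOLD = 0.4
CONFIDENCE_THRESHOLD = 0.75


def choose_route(quality_score, envbert_confidence):
    """Return 'review', 'accept', or 'llm' for one classified text.

    Hypothetical sketch; assumes the quality check takes precedence.
    """
    if quality_score < QUALITY_THRESHOLD:
        return "review"
    if envbert_confidence >= CONFIDENCE_THRESHOLD:
        return "accept"
    return "llm"


if __name__ == "__main__":
    # Values taken from the two example outputs in this README.
    print(choose_route(1.0, 0.437928160341927))
    print(choose_route(1.0, 0.7678613004581079))
```

With the values from the example outputs, the first call falls below the confidence threshold and the second clears it, matching the route fields reported there.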

License

MIT License

See the LICENSE file for details.

⚠️ Notes

  • This package depends on the EnvBert backbone.

  • Transformers version is constrained for TensorFlow compatibility.

  • Future versions may migrate to a PyTorch backend for improved compatibility and a lighter installation footprint.
