
A local-first, explainable AI agent framework with self-healing, detailed error diagnostics, and interactive tool-calling traces.

Project description

🔬 Explainable Agent Lab

A local-first, explainable agent framework designed to guide developers in building robust AI agents.

Building reliable agents is hard: LLMs hallucinate, get stuck in infinite loops, or produce malformed tool calls. Explainable Agent Lab tackles this by putting explainability and guidance first.

Key Features:

  • Surface Hidden Errors: See exactly where and why an agent fails (e.g., low confidence, schema violations).
  • Self-Healing: The agent automatically analyzes its own errors and proposes alternative tool-based solutions.
  • Visual Terminal Tracking: Step-by-step interactive and colorful tracking using the rich library (--verbose).
  • Detailed Diagnostic Reports: Actionable suggestions on hallucination risks, loop patterns, and prompt improvements.

🚀 Quick Start

1. Install

# Recommended: Editable install
python -m venv .venv
# On Windows:
.venv\Scripts\activate
# On Mac/Linux:
# source .venv/bin/activate

pip install -e ".[dev]"  # quotes prevent shells like zsh from expanding [dev]

2. Connect Your Local LLM

You can use any OpenAI-compatible local server like Ollama or LM Studio.

  • Ollama: http://localhost:11434/v1 (e.g., model: ministral-3:14b)
  • LM Studio: http://localhost:1234/v1 (e.g., model: gpt-oss-20b)

Tip: Copy .env.example to .env to set your defaults.
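Under the hood, any OpenAI-compatible server accepts the same chat-completions request shape. As a minimal sketch (the URL and model name are example values, not requirements), the agent's requests look roughly like this on the wire:

```python
import json

# Example values: Ollama's default endpoint and a locally hosted model.
BASE_URL = "http://localhost:11434/v1"

payload = {
    "model": "ministral-3:14b",  # any model your local server serves
    "messages": [
        {"role": "user", "content": "calculate_math: (215*4)-12"},
    ],
    "temperature": 0.0,
}

# The agent POSTs this JSON to f"{BASE_URL}/chat/completions";
# here we only show the wire format, without making a network call.
body = json.dumps(payload)
print(body)
```

Because the format is standard, switching between Ollama and LM Studio is just a matter of changing `--base-url` and `--model`.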

3. Run the Agent

Example using Ollama:

python -m explainable_agent.cli \
  --base-url http://localhost:11434/v1 \
  --model ministral-3:14b \
  --task "calculate_math: (215*4)-12" \
  --verbose

💻 Using the Python API

Easily integrate the agent into your codebase or create custom tools using the @define_tool decorator.

Check out the examples/ directory, then run one:

python examples/custom_tool_usage.py
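The package's actual `@define_tool` signature lives in its source; as an illustration of the pattern, a minimal decorator-based tool registry might look like this (names here are stand-ins, not the real API):

```python
from typing import Callable, Dict

# Illustrative registry: the real framework keeps its own.
TOOLS: Dict[str, Callable] = {}

def define_tool(name: str):
    """Stand-in for the package's @define_tool decorator:
    registers a plain function under a tool name."""
    def wrap(fn: Callable) -> Callable:
        TOOLS[name] = fn
        return fn
    return wrap

@define_tool("add_numbers")
def add_numbers(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

# The agent can now look the tool up by name and call it.
result = TOOLS["add_numbers"](215 * 4, -12)
print(result)  # 848
```

The real decorator presumably also captures the function's signature and docstring to build the tool schema shown to the model.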

📊 Evaluation & Custom Datasets

Evaluate fine-tuned models against standard benchmarks or your own datasets. The pipeline parses messy outputs, repairs broken JSON, and generates actionable Markdown reports.
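The pipeline's actual repair logic is internal to the package; as a rough sketch of the kind of cleanup involved, a minimal repairer might pull the first JSON object out of surrounding chatter and drop trailing commas:

```python
import json
import re

def repair_json(text: str):
    """Illustrative sketch, not the real pipeline code: extract the
    first {...} span from messy model output and strip trailing
    commas before parsing."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    candidate = re.sub(r",\s*([}\]])", r"\1", match.group(0))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None

messy = ('Sure! Here is the call: '
         '{"tool": "calculate_math", "args": {"expr": "(215*4)-12",}} '
         'Hope that helps.')
print(repair_json(messy))
```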

1. Create a .jsonl dataset (See examples/custom_eval_sample.jsonl)
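A JSONL dataset is just one JSON object per line. The field names below are hypothetical (match whatever schema `examples/custom_eval_sample.jsonl` actually uses); the sketch only shows the file format:

```python
import json

# Hypothetical record shape for a tool-calling eval; the real field
# names come from examples/custom_eval_sample.jsonl.
record = {
    "prompt": "What is (215*4)-12?",
    "expected_tool": "calculate_math",
    "expected_args": {"expression": "(215*4)-12"},
}

line = json.dumps(record)  # one object per line = JSONL
print(line)
```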

2. Run the evaluation:

python scripts/eval_hf_tool_calls.py \
  --dataset examples/custom_eval_sample.jsonl \
  --model ministral-3:14b

We also support standard benchmarks out of the box:

  • HF Tool Calls: data/evals/hf_xlam_fc_sample.jsonl
  • BFCL SQL: data/evals/bfcl_sql/BFCL_v3_sql.json
  • SWE-bench Lite: data/evals/swebench_lite_test.jsonl

🛠️ Built-in Tools

The agent ships with ready-to-use tools: duckduckgo_search, calculate_math, read_text_file, list_workspace_files, now_utc, sqlite_init_demo, sqlite_list_tables, sqlite_describe_table, sqlite_query, sqlite_execute.
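The sqlite_* family suggests a small end-to-end workflow: initialize a demo database, inspect it, then query it. As an illustrative sketch of what those tools might wrap (using only the standard-library sqlite3 module, not the package's actual implementation):

```python
import sqlite3

# Roughly what sqlite_init_demo might do: create a demo database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [("ada",), ("grace",)])

# Roughly sqlite_list_tables: enumerate tables from sqlite_master.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]

# Roughly sqlite_query: run a read-only SELECT.
rows = conn.execute("SELECT name FROM users ORDER BY id").fetchall()
print(tables, rows)
```

Splitting read (`sqlite_query`) from write (`sqlite_execute`) tools lets an agent be granted query access without mutation rights.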


License: MIT | Current Release: v0.1.0

Project details


Download files

Download the file for your platform.

Source Distribution

explainable_agent-0.1.0.tar.gz (33.3 kB)

Uploaded Source

Built Distribution


explainable_agent-0.1.0-py3-none-any.whl (30.8 kB)

Uploaded Python 3

File details

Details for the file explainable_agent-0.1.0.tar.gz.

File metadata

  • Download URL: explainable_agent-0.1.0.tar.gz
  • Upload date:
  • Size: 33.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for explainable_agent-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 533053a380be83eee429a403bc571a0860eb28d958b393f06e52f92e3d866f69 |
| MD5 | e6c9b939d6d045e784302e9b4008a3ff |
| BLAKE2b-256 | a530ca20cfc995f45b21f1b1d9fc39d25eee0849e644d932de07c4c307cbd9cf |


File details

Details for the file explainable_agent-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for explainable_agent-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 377bec3aded4e016e1548f7ff6d49b2e1486a68beb65a87c9443de9c758206dd |
| MD5 | 2f1ebe16692caf039a01aec3d8986a29 |
| BLAKE2b-256 | d9235db15485ebdb681f38b8ea386851d6311da3bb09faba0a2d34becaaf07e6 |

