A framework-agnostic library for evaluating and improving AI agents

EvoLoop

EvoLoop is a framework-agnostic Python library that adds self-evolving capabilities to any AI agent or LLM workflow.

Unlike other frameworks that focus on building agents (like LangChain, CrewAI, or Agno), EvoLoop focuses exclusively on evaluating and optimizing them. It acts as a "gym" for your agents, providing tools to capture interactions, evaluate performance, and learn from mistakes.

✨ Features

  • Framework Agnostic: Works with LangChain, LangGraph, AutoGen, raw OpenAI API, or any other stack
  • Zero Configuration: Just add a decorator and start capturing traces
  • Lightweight: No heavy dependencies, SQLite storage by default
  • Multiple Integration Modes: Decorator, wrapper, or manual logging

📦 Installation

pip install evoloop

Or install from source:

git clone https://github.com/yourusername/evoloop.git
cd evoloop
pip install -e .

🚀 Quick Start

Option 1: Decorator (Simplest)

from evoloop import monitor

@monitor
def my_agent(question: str) -> str:
    # Your agent logic here
    return "Agent response"

# Use as normal - traces are captured automatically
response = my_agent("What is the capital of France?")
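To make "captured automatically" concrete, here is a minimal sketch of how a tracing decorator of this kind could work. It is not EvoLoop's implementation: `sketch_monitor`, `SketchTrace`, the in-memory `CAPTURED` list, and the `"success"` status value are all hypothetical names chosen for illustration.

```python
import functools
import time
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class SketchTrace:
    """Hypothetical trace record; not EvoLoop's actual type."""
    input: str
    output: str
    status: str
    duration_ms: float


# In-memory stand-in for a real storage backend (assumption).
CAPTURED: List[SketchTrace] = []


def sketch_monitor(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record input, output, status, and timing for each call to func."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        call_repr = repr((args, kwargs))
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            elapsed = (time.perf_counter() - start) * 1000
            CAPTURED.append(SketchTrace(call_repr, repr(exc), "error", elapsed))
            raise  # the caller still sees the original exception
        elapsed = (time.perf_counter() - start) * 1000
        CAPTURED.append(SketchTrace(call_repr, repr(result), "success", elapsed))
        return result
    return wrapper


@sketch_monitor
def echo_agent(question: str) -> str:
    return f"You asked: {question}"


echo_agent("What is the capital of France?")
```

The decorated function keeps its name and return value, so call sites do not need to change.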

Option 2: Wrapper (For LangGraph/LangChain)

from evoloop import wrap
from langgraph.prebuilt import create_react_agent

# `model` and `tools` are assumed to be defined elsewhere in your code
agent = create_react_agent(model, tools)
monitored_agent = wrap(agent, name="my_agent")

# Use as normal
result = monitored_agent.invoke({"messages": [...]})
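One way such a wrapper could be built is as a thin proxy that intercepts `invoke` and forwards every other attribute to the wrapped object. The sketch below illustrates that idea only: `SketchWrapped` and `FakeAgent` are hypothetical, and the real `wrap` may work differently.

```python
from typing import Any, Dict, List


class SketchWrapped:
    """Hypothetical proxy: records calls to `invoke`, forwards everything else."""

    def __init__(self, target: Any, name: str) -> None:
        self._target = target
        self._name = name
        self.calls: List[Dict[str, Any]] = []

    def invoke(self, payload: Any, **kwargs: Any) -> Any:
        result = self._target.invoke(payload, **kwargs)
        self.calls.append({"agent": self._name, "input": payload, "output": result})
        return result

    def __getattr__(self, attr: str) -> Any:
        # Called only when normal lookup fails, so other attributes pass through.
        return getattr(self._target, attr)


class FakeAgent:
    """Stand-in for any object exposing an `invoke` method."""
    description = "fake"

    def invoke(self, payload: Any, **kwargs: Any) -> Any:
        return {"messages": payload["messages"] + ["reply"]}


wrapped = SketchWrapped(FakeAgent(), name="my_agent")
result = wrapped.invoke({"messages": ["hi"]})
```

Because the proxy only depends on the presence of `invoke`, the same pattern applies to any object that follows that calling convention.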

Option 3: Manual Logging

from evoloop import log

# After your agent runs
trace = log(
    input_data=user_question,
    output_data=agent_response,
    metadata={"user_id": "123"}
)

📊 Viewing Traces

from evoloop import get_storage

storage = get_storage()

# Get recent traces
traces = storage.list_traces(limit=10)
for trace in traces:
    print(f"[{trace.status}] {trace.input[:50]}...")

# Count by status
print(f"Total: {storage.count()}")
print(f"Errors: {storage.count(status='error')}")
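For intuition about what a SQLite-backed trace store involves, the following sketch uses only the standard-library `sqlite3` module. The table layout and the helper names `save_trace`, `list_traces`, and `count_traces` are assumptions made for this example and do not describe EvoLoop's actual schema.

```python
import sqlite3
from typing import List, Optional, Tuple

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE traces ("
    "id INTEGER PRIMARY KEY, input TEXT, output TEXT, status TEXT)"
)


def save_trace(input_text: str, output_text: str, status: str) -> None:
    conn.execute(
        "INSERT INTO traces (input, output, status) VALUES (?, ?, ?)",
        (input_text, output_text, status),
    )
    conn.commit()


def list_traces(limit: int = 10) -> List[Tuple[str, str, str]]:
    """Return the most recent traces first."""
    cursor = conn.execute(
        "SELECT input, output, status FROM traces ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


def count_traces(status: Optional[str] = None) -> int:
    """Count all traces, or only those with the given status."""
    if status is None:
        row = conn.execute("SELECT COUNT(*) FROM traces").fetchone()
    else:
        row = conn.execute(
            "SELECT COUNT(*) FROM traces WHERE status = ?", (status,)
        ).fetchone()
    return row[0]


save_trace("q1", "a1", "success")
save_trace("q2", "boom", "error")
save_trace("q3", "a3", "success")
```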

🎯 Adding Context (Business Rules)

Attach context data for evaluation against business rules:

from evoloop import monitor
from evoloop.tracker import set_context
from evoloop.types import TraceContext

@monitor
def debt_agent(user_message: str, customer_data: dict) -> str:
    # Attach API data as context
    set_context(TraceContext(
        data=customer_data,
        source="customer_api"
    ))
    
    # Agent logic goes here; a placeholder stands in for the real reply
    response = "Agent response"
    return response

🛣️ Roadmap

  • Phase 1: Tracker Module (capture traces)
  • Phase 2: Judge Module (binary evaluation)
  • Phase 3: Reporter Module (error taxonomy)
  • Phase 4: CLI (evoloop eval, evoloop report)
  • Phase 5: Self-Evolution (prompt optimization)

📚 Philosophy

EvoLoop is inspired by the principles in "LLM Evals: Everything You Need to Know" by Hamel Husain:

  • Binary evaluations (Pass/Fail) over Likert scales (1-5)
  • Error analysis as the core of improvement
  • Domain-specific criteria over generic metrics
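These three principles can be combined in a few lines. The sketch below is illustrative only and does not represent EvoLoop's Judge or Reporter modules: each domain-specific criterion returns either `None` (pass) or a failure category, the verdict is strictly binary, and failures are tallied by category for error analysis. All names and sample data are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Verdict:
    passed: bool
    failure_category: Optional[str] = None


# A criterion returns None on pass, or the name of a failure category.
Criterion = Callable[[str, Dict], Optional[str]]


def states_correct_balance(output: str, context: Dict) -> Optional[str]:
    return None if str(context["balance"]) in output else "wrong_balance"


def avoids_legal_threats(output: str, context: Dict) -> Optional[str]:
    return "legal_threat" if "lawsuit" in output.lower() else None


CRITERIA: List[Criterion] = [states_correct_balance, avoids_legal_threats]


def judge(output: str, context: Dict, criteria: List[Criterion]) -> Verdict:
    """Binary verdict: fail on the first violated criterion, else pass."""
    for criterion in criteria:
        category = criterion(output, context)
        if category is not None:
            return Verdict(False, category)
    return Verdict(True)


samples = [
    ("You owe 120.", {"balance": 120}),
    ("You owe 90.", {"balance": 120}),
    ("Pay 120 or face a lawsuit.", {"balance": 120}),
]

verdicts = [judge(output, context, CRITERIA) for output, context in samples]
pass_rate = sum(v.passed for v in verdicts) / len(verdicts)
taxonomy = Counter(v.failure_category for v in verdicts if not v.passed)
```

Here `taxonomy` counts failures per category, which gives a starting point for deciding which kind of error to fix first.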

🧪 Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Type checking
mypy src/evoloop

# Linting
ruff check src/

📄 License

MIT License - see LICENSE for details.
