
Analyze AI agents to understand their performance and get concrete suggestions for improving them


Agentune



Open-source framework for continuously improving AI agents.

Agentune helps teams analyze, improve, and evaluate customer-facing AI agents through measurable, data-driven iterations — not guesswork.

Instead of tweaking prompts and hoping for the best, Agentune connects real conversations, context data, and simulations into a repeatable optimization loop that drives actual KPI improvements such as conversion, CSAT, and retention.


Why Agentune

Most agents are launched and left to stagnate — tuned by intuition, not evidence.

Agentune enables continuous agent improvement by combining analytics, optimization, and simulation in a single open framework:

  • Analyze – uncover what drives your agent’s KPIs up or down
  • Improve – generate actionable recommendations to lift performance
  • Simulate – safely test and benchmark improvements before deployment

The result: agents that don’t just respond — they learn what works.


The agentune-simulate library

Agentune Simulate is a separately installable library that enables you to create customer simulations to test and benchmark your agent's behavior before production.

Together with agentune, it forms the Analyze → Improve → Simulate loop — a disciplined framework for building smarter, higher-performing AI agents.

A future release will merge agentune-simulate into the main agentune package.


Real-World Use Cases

Agentune is built for teams who want to move beyond trial-and-error:

  • AI platform / infra teams managing production-grade agents across multiple domains or use cases
  • ML / data teams accountable for KPI impact, not just model accuracy
  • Product / ops teams who need to measure and harden conversational behavior before it reaches users

Common scenarios:

  • Diagnose why conversion or CSAT is dropping
  • Quantify which behaviors, intents, or flows impact KPIs
  • Test new prompt or policy versions safely
  • Continuously improve deployed agents over time

Agentune Analyze & Improve

Turn real conversations into insights that measurably improve your AI agents.

Agentune Analyze & Improve helps teams discover what drives an agent’s KPIs up or down — and generate concrete recommendations to enhance performance.
It transforms messy operational data into interpretable, data-driven actions that actually move business metrics.


Why It Matters

Most AI agents are optimized by intuition: a few sample chats, some prompt edits, and best guesses.

Agentune replaces guesswork with evidence.
Using structured and unstructured data from real conversations, it:

  • Identifies patterns that correlate with KPI outcomes
  • Surfaces interpretable insights (not opaque scores)
  • Recommends targeted changes to prompts, policies, and logic

No more trial-and-error tuning — just measurable improvement grounded in data.

For example, suppose you built a sales agent and now have a dataset of conversations whose outcomes are labeled win, undecided, or lost. With Agentune Analyze & Improve, you can discover which patterns or intents correlate with those outcomes and receive concrete recommendations for refining the agent's playbook, such as how it handles discounts, competitor mentions, or shipping questions.
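
To make this scenario concrete, here is a minimal sketch of what such a labeled dataset might look like and how outcome rates could be compared by hand. The `Conversation` class, its fields, and the `mentions_competitor` flag are hypothetical illustrations; they are not Agentune's input schema or API, and the toy records are invented.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Conversation:
    """Hypothetical record; not Agentune's real schema."""
    conversation_id: str
    outcome: str  # one of "win", "undecided", "lost"
    mentions_competitor: bool  # illustrative hand-made flag


def win_rate(conversations: List[Conversation]) -> float:
    """Fraction of conversations labeled "win" (0.0 for an empty list)."""
    if not conversations:
        return 0.0
    wins = sum(1 for c in conversations if c.outcome == "win")
    return wins / len(conversations)


# Toy data invented for illustration only.
DATASET = [
    Conversation("c1", "win", True),
    Conversation("c2", "lost", False),
    Conversation("c3", "win", True),
    Conversation("c4", "undecided", False),
    Conversation("c5", "lost", True),
    Conversation("c6", "win", False),
]

outcome_counts = Counter(c.outcome for c in DATASET)
with_flag = [c for c in DATASET if c.mentions_competitor]
without_flag = [c for c in DATASET if not c.mentions_competitor]

print(dict(outcome_counts))
print(f"win rate with competitor mention: {win_rate(with_flag):.2f}")
print(f"win rate without competitor mention: {win_rate(without_flag):.2f}")
```

A manual comparison like this only scales to a handful of hand-picked flags, which is the gap the automated analysis described in this README is meant to close.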

How It Works

Agentune Analyze & Improve follows a transparent, two-step process:

1. Analyze

  • Ingests conversations, outcomes, and optional context data (e.g., product, policy, CRM).
  • Generates semantic and structural features that capture patterns in language, behavior, or flow.
  • Selects statistically significant features correlated with KPI changes — these become your drivers of performance.

Example insights:

  • “Mentions of competitors early in chat increase conversion probability.”
  • “Discount discussion combined with shipping-time questions lowers CSAT.”
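
The idea of selecting statistically significant features can be sketched with a plain chi-square test on a 2x2 table (feature present or absent versus win or not win). This is a generic illustration using only the standard library; it is an assumption about one possible approach, not a description of the statistics Agentune uses internally. The feature names and counts are invented.

```python
import math
from typing import Dict, List, Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Table layout:
                        win    not win
        feature present  a        b
        feature absent   c        d

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence of association
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def select_features(
    tables: Dict[str, Tuple[int, int, int, int]], alpha: float = 0.05
) -> List[Tuple[str, float]]:
    """Keep features with p-value below alpha, most significant first."""
    scored = [(name, chi_square_2x2(*t)[1]) for name, t in tables.items()]
    return sorted((s for s in scored if s[1] < alpha), key=lambda s: s[1])


# Invented counts, for illustration only.
TABLES = {
    "competitor_mentioned_early": (40, 10, 20, 30),
    "greeting_used": (25, 25, 25, 25),
}

selected = select_features(TABLES)
for name, p in selected:
    print(f"{name}: p = {p:.2g}")
```

When many candidate features are tested at once, a real pipeline would normally also correct for multiple comparisons; that step is omitted here for brevity.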

2. Improve

  • Maps the discovered drivers into actionable recommendations — changes to prompts, tool usage, escalation logic, or playbooks.
  • Outputs a ranked list of improvement opportunities, each linked to its supporting data.

These recommendations can then be validated using Agentune Simulate before deployment.
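
As a rough sketch of what "a ranked list of improvement opportunities, each linked to its supporting data" could look like in code, the example below sorts recommendation records by an impact score. The `Recommendation` class, its fields, and the scoring are hypothetical; Agentune's real output format may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Recommendation:
    """Hypothetical structure; not Agentune's real output type."""
    title: str
    estimated_impact: float  # illustrative score, higher = more promising
    supporting_conversation_ids: List[str] = field(default_factory=list)


def rank(recommendations: List[Recommendation]) -> List[Recommendation]:
    """Sort by estimated impact, breaking ties by amount of supporting data."""
    return sorted(
        recommendations,
        key=lambda r: (r.estimated_impact, len(r.supporting_conversation_ids)),
        reverse=True,
    )


# Invented examples, for illustration only.
candidates = [
    Recommendation("Clarify shipping times up front", 0.4, ["c2", "c5"]),
    Recommendation("Address competitor comparisons directly", 0.7, ["c1", "c3"]),
    Recommendation("Offer discount only after qualifying", 0.4, ["c4", "c6", "c7"]),
]

for position, rec in enumerate(rank(candidates), start=1):
    print(position, rec.title, rec.supporting_conversation_ids)
```

Keeping the supporting conversation IDs on each record lets a reviewer trace any recommendation back to the evidence behind it.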


Example Usage

  1. Getting Started - 01_getting_started.ipynb - an introductory walkthrough of library fundamentals
  2. End-to-End Script Example - e2e_script_example.md - a runnable example executing the entire analysis workflow
  3. Advanced Examples - advanced_examples.md - customizing components, caching LLM requests, and advanced workflows

Testing & Costs

We've tested Agentune Analyze with the combination of OpenAI o3 and gpt-4o-mini. In our tests, the cost was approximately 5-10 cents per conversation.
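
For budgeting, the per-conversation figure can be turned into a rough range with simple arithmetic. The helper below is a hypothetical convenience, not part of Agentune; actual costs depend on the models used, conversation length, and provider pricing.

```python
from typing import Tuple


def estimate_cost_usd(
    num_conversations: int, low_cents: int = 5, high_cents: int = 10
) -> Tuple[float, float]:
    """Return a (low, high) cost estimate in US dollars.

    Defaults reflect the approximate 5-10 cents per conversation
    reported above; treat the result as a rough guide only.
    """
    if num_conversations < 0:
        raise ValueError("num_conversations must be non-negative")
    low = num_conversations * low_cents / 100
    high = num_conversations * high_cents / 100
    return low, high


low, high = estimate_cost_usd(2000)
print(f"Estimated cost for 2000 conversations: ${low:.2f} to ${high:.2f}")
```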

Installation

pip install agentune

Requirements

  • Python ≥ 3.12
  • Note for Mac users: If you encounter errors related to lightgbm, you may need to install OpenMP first: brew install libomp. See the LightGBM macOS installation guide for details.
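
A quick way to confirm that an environment meets the Python requirement, and to see which agentune version (if any) is installed, is sketched below. It relies only on the standard library (`sys` and `importlib.metadata`); the helper names are illustrative and not part of Agentune.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Sequence

MIN_PYTHON = (3, 12)  # from the requirement listed above


def python_is_supported(version_info: Sequence[int] = sys.version_info[:2]) -> bool:
    """True if the given (major, minor, ...) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package: str = "agentune") -> Optional[str]:
    """Return the installed version of a package, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("agentune version:", installed_version())
```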

Key Features

  • 🧩 Feature Generation – semantic, structural, and behavioral signals derived from real interactions
  • 📈 Feature Selection – statistical and semantic correlation with target KPIs
  • 💡 Actionable Insights – interpretable drivers with examples and metrics
  • 🧠 Context Awareness (upcoming) – integrates CRM, product, and policy metadata for deeper understanding

Roadmap

Current focus: advancing Analyze & Improve with structured, context-aware optimization.

Planned milestones:

  • Context-aware feature generation and insight discovery
  • Integration of context features into the recommendation layer for targeted improvement actions
  • Expanded evaluation and visualization tooling for Analyze & Improve results
  • Visualization tools for insight exploration
  • Seamless flow into agentune-simulate for validating improvements

Longer-term:

  • Multi-KPI analytics: understand how improving one KPI affects others, and account for those trade-offs in the recommendations
  • Optional multi-agent analytics and cross-agent benchmarking

Contributing

We welcome contributions from engineers who care about robust, measurable agents.

  • Open issues for bugs, integrations, or feature proposals
  • Early adopters: reach us at agentune-dev@sparkbeyond.com
  • 💬 Join our community on Discord to connect with maintainers, share ideas, and get support


