Build self-improving AI agents that learn from experience

Kayba - Make your agents self-improve from experience

Agentic Context Engine (ACE)

[!TIP]

Try our hosted solution for free at kayba.ai: automated agent self-improvement from your terminal. CLI + dashboard that analyzes traces, surfaces failures, and ships improvements directly from Claude Code, Codex, and more.


AI agents don't learn from experience. They repeat the same mistakes every session, forget what worked, and ignore what failed. ACE adds a persistent learning loop that makes them better over time.

ACE learns from mistakes in real time

The agent claims a seahorse emoji exists. ACE reflects on the error, and on the next attempt, the agent responds correctly — without human intervention.


Proven Results

| Metric | Result | Context |
| --- | --- | --- |
| 2x consistency | Doubles pass^4 on Tau2 airline benchmark | 15 learned strategies, no reward signals |
| 49% token reduction | Browser automation costs cut nearly in half | 10-run learning curve |
| $1.50 learning cost | Claude Code translated 14k lines to TypeScript | Zero build errors, all tests passing |

Quick Start

uv add ace-framework

Option A — Interactive setup (recommended):

ace setup            # Walks you through model selection, API keys, and connection validation

Option B — Manual configuration:

export OPENAI_API_KEY="your-key"    # or ANTHROPIC_API_KEY, or any of 100+ supported providers

Then use it:

from ace import ACELiteLLM

agent = ACELiteLLM(model="gpt-4o-mini")

# First attempt — the agent may hallucinate
answer = agent.ask("Is there a seahorse emoji?")

# Feed a correction — ACE extracts a strategy and updates the Skillbook
agent.learn_from_feedback("There is no seahorse emoji in Unicode.")

# Subsequent calls benefit from the learned strategy
answer = agent.ask("Is there a seahorse emoji?")

# Inspect what the agent has learned
print(agent.get_strategies())

No fine-tuning, no training data, no vector database.

-> Quick Start Guide | -> Setup Guide


How It Works

ACE maintains a Skillbook — a persistent collection of strategies that evolves with every task. Three specialized roles manage the learning loop:

| Role | Responsibility |
| --- | --- |
| Agent | Executes tasks, enhanced with Skillbook strategies |
| Reflector | Analyzes execution traces to extract what worked and what failed |
| SkillManager | Curates the Skillbook: adds, refines, and removes strategies |

The Recursive Reflector is the key innovation: instead of summarizing traces in a single pass, it writes and executes Python code in a sandboxed environment to programmatically search for patterns, isolate errors, and iterate until it finds actionable insights.
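
As a rough illustration of what programmatic trace analysis can look like, the sketch below scans a recorded trace for the first failing step and counts failures per tool. The trace format (a list of dicts with `step`, `tool`, `ok`, and `error` keys) and both helper names are hypothetical. They are not ACE's trace schema or API; the Recursive Reflector generates its own analysis code at run time.

```python
from collections import Counter

# Hypothetical trace format: one dict per step. ACE's real trace schema may differ.
trace = [
    {"step": 1, "tool": "search_flights", "ok": True, "error": None},
    {"step": 2, "tool": "book_flight", "ok": False, "error": "missing passenger_id"},
    {"step": 3, "tool": "book_flight", "ok": False, "error": "missing passenger_id"},
    {"step": 4, "tool": "book_flight", "ok": True, "error": None},
]


def first_failure(steps):
    """Return the first failing step, or None if every step succeeded."""
    return next((s for s in steps if not s["ok"]), None)


def failures_by_tool(steps):
    """Count failing steps per tool to show where errors cluster."""
    return Counter(s["tool"] for s in steps if not s["ok"])


failure = first_failure(trace)
print(failure["step"], failure["error"])
print(dict(failures_by_tool(trace)))
```

Queries like these narrow a long trace down to the few steps worth reflecting on, which is the kind of work a single-pass summary tends to skip.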

flowchart LR
    Skillbook[(Skillbook)]
    Start([Task]) --> Agent[Agent]
    Agent <--> Environment[Environment]
    Environment -- Trace --> Reflector[Reflector]
    Reflector --> SkillManager[SkillManager]
    SkillManager -- Updates --> Skillbook
    Skillbook -. Strategies .-> Agent

All roles are backed by PydanticAI agents with structured output validation. PydanticAI routes to 100+ LLM providers through its LiteLLM integration, with native support for OpenAI, Anthropic, Google, Bedrock, Groq, and more.

Based on the ACE paper (Stanford & SambaNova) and Dynamic Cheatsheet.
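
The learning loop described in this section can be sketched with toy stand-ins for each role. Everything below (`ToySkillbook`, `toy_agent`, `toy_reflector`, `toy_skill_manager`, `run_once`) is hypothetical and uses simple string matching in place of LLM calls; it only shows how data moves between the roles, not how ACE implements them.

```python
from dataclasses import dataclass, field


@dataclass
class ToySkillbook:
    """Hypothetical stand-in for the Skillbook: an ordered list of strategy strings."""
    strategies: list = field(default_factory=list)

    def add(self, strategy: str) -> None:
        if strategy not in self.strategies:  # skip exact duplicates
            self.strategies.append(strategy)


def toy_agent(question: str, skillbook: ToySkillbook) -> str:
    """Answer "Yes" by default; answer "No" when a learned strategy covers the topic."""
    for strategy in skillbook.strategies:
        if "seahorse" in question.lower() and "seahorse" in strategy.lower():
            return "No"
    return "Yes"


def toy_reflector(question: str, answer: str, feedback: dict):
    """Turn a failed attempt plus feedback into a candidate strategy, or None on success."""
    if answer == feedback["expected"]:
        return None
    return f"For '{question}': {feedback['note']}"


def toy_skill_manager(skillbook: ToySkillbook, candidate) -> None:
    """Curate the Skillbook: here, simply add any new candidate strategy."""
    if candidate is not None:
        skillbook.add(candidate)


def run_once(question: str, feedback: dict, skillbook: ToySkillbook) -> str:
    """One pass of the loop: Agent -> Reflector -> SkillManager -> Skillbook."""
    answer = toy_agent(question, skillbook)
    toy_skill_manager(skillbook, toy_reflector(question, answer, feedback))
    return answer


skillbook = ToySkillbook()
question = "Is there a seahorse emoji?"
feedback = {"expected": "No", "note": "There is no seahorse emoji in Unicode."}

first = run_once(question, feedback, skillbook)   # wrong answer, strategy learned
second = run_once(question, feedback, skillbook)  # strategy applied
print(first, second, len(skillbook.strategies))   # prints: Yes No 1
```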


Runners

| Runner | Class | Description |
| --- | --- | --- |
| LiteLLM | ACELiteLLM | Batteries-included agent with .ask(), .learn(), .save(); accepts any LiteLLM model string |
| Core | ACE | Full learning loop with batch epochs and evaluation |
| Trace Analyser | TraceAnalyser | Learn from pre-recorded traces without re-running tasks |
| browser-use | BrowserUse | Browser automation that improves with each run |
| LangChain | LangChain | Wrap any LangChain chain or agent with learning |
| Claude Code | ClaudeCode | Claude Code CLI tasks with learning |

Optional extras (quoted so that shells such as zsh do not expand the brackets):

uv add "ace-framework[browser-use]"    # Browser automation
uv add "ace-framework[langchain]"      # LangChain
uv add "ace-framework[logfire]"        # Observability (auto-instruments PydanticAI)
uv add "ace-framework[mcp]"            # MCP server for IDE integration
uv add "ace-framework[deduplication]"  # Embedding-based skill deduplication

Have existing agent logs? Extract strategies from them directly:

from ace import ACELiteLLM

agent = ACELiteLLM(model="gpt-4o-mini")
agent.learn_from_traces(your_existing_traces)
print(agent.get_strategies())

-> Examples


Benchmarks

Tau2 — Multi-Step Agentic Tasks

tau2-bench by Sierra Research covers airline-domain tasks that require tool use and policy adherence. The agent runs on Claude Haiku 4.5. Strategies are learned on the train split with no reward signals and evaluated on the held-out test split.

Tau2 Benchmark — ACE doubles consistency at pass^4

pass^k is the probability that all k independent attempts at a task succeed. With 15 learned strategies, ACE doubles pass^4.
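
To make the metric concrete: if a task is attempted n times and succeeds c times, a commonly used estimator of pass^k is C(c, k) / C(n, k), averaged over tasks. The sketch below implements that estimator. The function names and sample counts are illustrative assumptions, not ACE's evaluation code and not the benchmark results reported here.

```python
from math import comb


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate pass^k for one task from n attempts with c successes."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(c, k) is 0 when c < k, so any failure among n == k attempts gives 0.
    return comb(c, k) / comb(n, k)


def mean_pass_hat_k(results, k: int) -> float:
    """Average the per-task estimate over (n, c) pairs, one pair per task."""
    return sum(pass_hat_k(n, c, k) for n, c in results) / len(results)


# Made-up counts for three tasks, four attempts each: (attempts, successes).
results = [(4, 4), (4, 3), (4, 2)]
for k in (1, 2, 4):
    print(k, round(mean_pass_hat_k(results, k), 3))
```

Because every one of the k attempts must succeed, pass^k falls quickly as k grows unless the agent is consistent, which is why it is used here as a consistency measure.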

Claude Code — Autonomous Translation

ACE + Claude Code translated this library from Python to TypeScript with zero supervision:

| Metric | Result |
| --- | --- |
| Duration | ~4 hours |
| Commits | 119 |
| Lines written | ~14,000 |
| Build errors | 0 |
| Tests | All passing |
| Learning cost | ~$1.50 |

Pipeline Architecture

ACE is built on a composable pipeline engine. Each step declares what it requires and what it produces:

AgentStep -> EvaluateStep -> ReflectStep -> UpdateStep -> ApplyStep -> DeduplicateStep

Use learning_tail() for the standard learning sequence, or compose custom pipelines:

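from ace import Pipeline, AgentStep, EvaluateStep, learning_tail

# agent, env, reflector, skill_manager, and skillbook are assumed to be constructed earlier.
# Run the agent and evaluate it, then append the standard learning steps.
steps = [AgentStep(agent), EvaluateStep(env)] + learning_tail(reflector, skill_manager, skillbook)
pipeline = Pipeline(steps)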

The pipeline engine (pipeline/) is framework-agnostic with requires/provides contracts, immutable context, and error isolation. See Pipeline Design and Architecture.
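
The idea of requires/provides contracts can be illustrated with a small standalone sketch. It does not use the ACE pipeline engine: `ToyStep`, `validate`, and the context keys (`task`, `answer`, `feedback`, and so on) are hypothetical names chosen for the example, and only the step names echo the sequence above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a step's contract cannot change after creation
class ToyStep:
    name: str
    requires: frozenset
    provides: frozenset


def validate(steps, initial=frozenset()):
    """Check that each step's requirements come from the input or an earlier step."""
    available = set(initial)
    for step in steps:
        missing = step.requires - available
        if missing:
            raise ValueError(f"{step.name} is missing inputs: {sorted(missing)}")
        available |= step.provides
    return frozenset(available)


steps = [
    ToyStep("AgentStep", frozenset({"task", "skillbook"}), frozenset({"answer"})),
    ToyStep("EvaluateStep", frozenset({"answer"}), frozenset({"feedback"})),
    ToyStep("ReflectStep", frozenset({"answer", "feedback"}), frozenset({"reflection"})),
    ToyStep("UpdateStep", frozenset({"reflection"}), frozenset({"updates"})),
]

print(sorted(validate(steps, initial={"task", "skillbook"})))

try:
    validate(list(reversed(steps)), initial={"task", "skillbook"})
except ValueError as exc:
    print("rejected:", exc)
```

Checking contracts before running anything means a misordered pipeline fails at construction time instead of partway through an expensive LLM call.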


CLI

| Command | Description |
| --- | --- |
| ace setup | Interactive setup: model selection, API keys, connection validation |
| ace models <query> | Search available models with pricing |
| ace validate <model> | Test a model connection |
| ace config | Show current configuration |
| kayba | Cloud CLI: upload traces, fetch insights, manage prompts |
| ace-mcp | MCP server for IDE integration |

Documentation


Contributing

Contributions are welcome. See Contributing Guidelines.


Built by Kayba and the open-source community.
