Build self-improving AI agents that learn from experience

Agentic Context Engine (ACE)

GitHub stars Discord Twitter Follow PyPI version Python 3.12 License: MIT

AI agents that get smarter with every task

⭐ Star this repo if you find it useful!


What is ACE?

ACE enables AI agents to learn from their execution feedback—what works, what doesn't—and continuously improve. No fine-tuning, no training data, just automatic in-context learning.

The framework maintains a Skillbook: a living document of strategies that evolves with each task. When your agent succeeds, ACE extracts patterns. When it fails, ACE learns what to avoid. All learning happens transparently in context.

  • Self-Improving: Agents autonomously get smarter with each task
  • 20-35% Better Performance: Proven improvements on complex tasks
  • 49% Token Reduction: Demonstrated in browser automation benchmarks
  • No Context Collapse: Preserves valuable knowledge over time

LLM Quickstart

  1. Point your favorite coding agent (Cursor, Claude Code, Codex, etc.) at the Quick Start Guide
  2. Prompt away!

Quick Start

1. Install

pip install ace-framework

2. Set API Key

export OPENAI_API_KEY="your-api-key"

3. Run

from ace import ACELiteLLM

agent = ACELiteLLM(model="gpt-4o-mini")

answer = agent.ask("What does Kayba's ACE framework do?")
print(answer)  # "ACE allows AI agents to remember and learn from experience!"

Done! Your agent learns automatically from each interaction.
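If step 3 fails with an authentication error, a quick pre-flight check can confirm that the key from step 2 is visible to Python. The helper below is a hypothetical convenience and not part of ACE; it uses only the standard library.

```python
import os


def require_env(name: str = "OPENAI_API_KEY") -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Fail early with an actionable message instead of a provider error.
        raise RuntimeError(f"{name} is not set; export it before creating the agent.")
    return value
```

Calling `require_env()` before constructing `ACELiteLLM` surfaces a missing key immediately rather than on the first request.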

→ Quick Start Guide | → Setup Guide


Use Cases

Claude Code with Learning → Quick Start

Run coding tasks with Claude Code while ACE learns patterns from each execution, building expertise over time for your specific codebase and workflows.

Automated System Prompting

The Skillbook acts as an evolving system prompt that automatically improves based on execution feedback—no manual prompt engineering required.
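To make the idea concrete, here is a minimal sketch of how a collection of learned strategies could be rendered into a system prompt. The `Strategy` and `Skillbook` classes, their fields, and the scoring rule are assumptions made for illustration; they are not ACE's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Strategy:
    text: str
    helpful: int = 0  # times the strategy was judged useful
    harmful: int = 0  # times it was judged counterproductive


@dataclass
class Skillbook:
    strategies: list[Strategy] = field(default_factory=list)

    def add(self, text: str) -> Strategy:
        strategy = Strategy(text)
        self.strategies.append(strategy)
        return strategy

    def as_system_prompt(self, base: str) -> str:
        """Append strategies with a non-negative net score to a base prompt."""
        kept = [s for s in self.strategies if s.helpful - s.harmful >= 0]
        kept.sort(key=lambda s: s.helpful - s.harmful, reverse=True)
        if not kept:
            return base
        lines = [f"- {s.text}" for s in kept]
        return base + "\n\nLearned strategies:\n" + "\n".join(lines)
```

Because the prompt is rebuilt from the current strategies on every call, updating a score or adding a strategy changes the next prompt without any manual editing.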

Enhance Existing Agents

Wrap your existing agent (browser-use, LangChain, custom) with ACE learning. Your agent executes tasks normally while ACE analyzes results and builds a skillbook of effective strategies.

Build Self-Improving Agents

Create new agents with built-in learning for customer support, data extraction, code generation, research, content creation, and task automation.


Demos

The Seahorse Emoji Challenge

A challenge where LLMs often hallucinate that a seahorse emoji exists (it doesn't).

Seahorse Emoji ACE Demo

In this example:

  1. The agent incorrectly outputs a horse emoji
  2. ACE reflects on the mistake without external feedback
  3. On the second attempt, the agent correctly realizes there is no seahorse emoji

→ Try it yourself

Tau2 Benchmark

Evaluated on the airline domain of τ2-bench (Sierra Research), a benchmark for multi-step agentic tasks requiring tool use and policy adherence. The agent is Claude Haiku 4.5. Strategies were learned on the train split with no reward signals, and all results are reported on the held-out test split.

pass^k = probability that all k independent attempts succeed. Higher k is a stricter test of agent consistency.
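One common way to estimate pass^k from n recorded trials of a task, c of which succeeded, is the combinatorial estimator C(c, k) / C(n, k), averaged over tasks. The sketch below implements that estimator; whether the results reported here were computed exactly this way is an assumption, and the function names are illustrative.

```python
from math import comb


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that k attempts drawn from n trials all succeed."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(c, k) is 0 when k > c, so too few successes yields 0.0.
    return comb(c, k) / comb(n, k)


def benchmark_pass_hat_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-task estimate over (n_trials, n_successes) pairs."""
    return sum(pass_hat_k(n, c, k) for n, c in results) / len(results)
```

For two tasks with 4 trials each, one succeeding 4 times and the other 2 times, `benchmark_pass_hat_k([(4, 4), (4, 2)], 1)` gives 0.75 while `k=4` gives 0.5, which illustrates why a higher k is the stricter test.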

Tau2 Benchmark Results - Haiku 4.5

ACE doubles agent consistency at pass^4 using only 15 learned strategies — gains compound as the bar gets higher.

Browser Automation

Online Shopping Demo: ACE vs baseline agent shopping for 5 grocery items.

Online Shopping Demo Results

In this example:

  • ACE learns to navigate the website over 10 attempts
  • Performance stabilizes and step count decreases by 29.8%
  • Token costs fall by 49.0% for the base agent, or by 42.6% once ACE overhead is included
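The two token figures also imply how much the learning step itself costs. The sketch below works that out, assuming both percentages are measured against the same baseline token count; that reading is an assumption, and the function name is illustrative.

```python
def ace_overhead(base_reduction: float, net_reduction: float) -> tuple[float, float]:
    """Return ACE overhead as a share of baseline tokens and of agent tokens."""
    agent_tokens = 1.0 - base_reduction  # agent usage, with baseline = 1.0
    total_tokens = 1.0 - net_reduction   # agent usage plus ACE overhead
    overhead = total_tokens - agent_tokens
    return overhead, overhead / agent_tokens


share_of_baseline, share_of_agent = ace_overhead(0.490, 0.426)
print(f"{share_of_baseline:.1%} of baseline, {share_of_agent:.1%} of agent tokens")
# 6.4% of baseline, 12.5% of agent tokens
```

Under that assumption, learning costs about 6.4% of the baseline token budget, so most of the savings survive once it is counted.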

→ Try it yourself & see all demos

Claude Code Loop

In this example, Claude Code is enhanced with ACE and self-reflects after each execution while translating the ACE library from Python to TypeScript.

Python → TypeScript Translation:

| Metric | Result |
| --- | --- |
| Duration | ~4 hours |
| Commits | 119 |
| Lines written | ~14k |
| Outcome | Zero build errors, all tests passing |
| API cost | ~$1.5 (Sonnet for learning) |

→ Claude Code Loop


Integrations

ACE integrates with popular agent frameworks:

| Integration | ACE Class | Use Case |
| --- | --- | --- |
| LiteLLM | ACELiteLLM | Simple self-improving agent |
| LangChain | ACELangChain | Wrap LangChain chains/agents |
| browser-use | ACEAgent | Browser automation |
| Claude Code | ACEClaudeCode | Claude Code CLI |
| ace-learn CLI | ACEClaudeCode | Learn from Claude Code sessions |
| Opik | OpikIntegration | Production monitoring and cost tracking |

→ Integration Guide | → Examples


How Does ACE Work?

Inspired by the ACE research framework from Stanford & SambaNova.

ACE learns from execution feedback through three specialized roles that work together:

  1. Agent — Your agent, enhanced with strategies from the Skillbook
  2. Reflector — Analyzes execution traces to extract learnings. In recursive mode, the Reflector writes and runs Python code in a sandboxed REPL to programmatically query traces — finding patterns, errors, and insights that single-pass analysis misses
  3. SkillManager — Curates the Skillbook: adds new strategies, refines existing ones, and removes outdated patterns based on the Reflector's analysis

The key innovation is the Recursive Reflector — instead of summarizing traces in a single pass, it writes and executes Python code in a sandboxed environment to programmatically explore agent execution traces. It can search for patterns, isolate errors, query sub-agents for deeper analysis, and iterate until it finds actionable insights. These insights flow into the Skillbook — a living collection of strategies that evolves with every task.
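As an illustration of querying traces programmatically instead of summarizing them, the sketch below shows the kind of short analysis a Reflector might write. The trace format (a list of step dictionaries with `tool`, `ok`, and `error` keys) and the function name are hypothetical and do not reflect ACE's actual trace schema.

```python
from collections import Counter

# Hypothetical trace: one dict per agent step.
trace = [
    {"step": 1, "tool": "search", "ok": True, "error": None},
    {"step": 2, "tool": "click", "ok": False, "error": "element not found"},
    {"step": 3, "tool": "click", "ok": False, "error": "element not found"},
    {"step": 4, "tool": "add_to_cart", "ok": True, "error": None},
]


def failure_hotspots(steps: list[dict]) -> list[tuple[tuple[str, str], int]]:
    """Count failures per (tool, error) pair, most frequent first."""
    counts = Counter((s["tool"], s["error"]) for s in steps if not s["ok"])
    return counts.most_common()


print(failure_hotspots(trace))
# [(('click', 'element not found'), 2)]
```

A query like this turns a long trace into a specific, countable finding (here, repeated `click` failures) that can then be phrased as a strategy.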

flowchart LR
    Skillbook[(Skillbook<br>Learned Strategies)]
    Start([Query]) --> Agent[Agent<br>Enhanced with Skillbook]
    Agent <--> Environment[Task Environment<br>Evaluates & provides feedback]
    Environment -- Feedback --> Reflector[Reflector<br>Analyzes traces via<br>sandboxed code execution]
    Reflector --> SkillManager[SkillManager<br>Curates strategies]
    SkillManager -- Updates --> Skillbook
    Skillbook -. Injects context .-> Agent
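The loop in the diagram can be sketched with plain functions. Everything below is a deterministic stand-in written for illustration: the function names, the string-based skillbook, and the stub logic are assumptions, not ACE's classes or API, and no LLM is called.

```python
Skillbook = list[str]  # hypothetical: strategies stored as plain strings


def agent(query: str, skillbook: Skillbook) -> str:
    """Stub agent: answers correctly only if a relevant strategy exists."""
    if any("seahorse" in s for s in skillbook):
        return "There is no seahorse emoji."
    return "🐴"


def environment(answer: str) -> dict:
    """Stub environment: evaluates the answer and returns feedback."""
    return {"answer": answer, "success": "no seahorse emoji" in answer}


def reflector(feedback: dict) -> list[str]:
    """Stub reflector: turns a failed attempt into a candidate lesson."""
    if feedback["success"]:
        return []
    return ["Verify that a seahorse emoji exists before outputting one."]


def skill_manager(skillbook: Skillbook, lessons: list[str]) -> Skillbook:
    """Stub curator: adds new lessons and skips exact duplicates."""
    return skillbook + [lesson for lesson in lessons if lesson not in skillbook]


def run(query: str, attempts: int) -> tuple[list[str], Skillbook]:
    """Run the agent repeatedly, updating the skillbook after each attempt."""
    skillbook: Skillbook = []
    answers = []
    for _ in range(attempts):
        answer = agent(query, skillbook)
        feedback = environment(answer)
        skillbook = skill_manager(skillbook, reflector(feedback))
        answers.append(answer)
    return answers, skillbook
```

Running `run("Show me the seahorse emoji", 2)` yields a wrong first answer and a corrected second one, with a single strategy left in the skillbook.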


Contributing

We love contributions! Check out our Contributing Guide to get started.


Acknowledgment

Inspired by the ACE paper and Dynamic Cheatsheet.


Built with ❤️ by Kayba and the open-source community.

Download files

Source Distribution: ace_framework-0.8.0.tar.gz (222.8 kB)

Built Distribution: ace_framework-0.8.0-py3-none-any.whl (188.6 kB, Python 3)

File details

Details for the file ace_framework-0.8.0.tar.gz.

File metadata

  • Download URL: ace_framework-0.8.0.tar.gz
  • Upload date:
  • Size: 222.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ace_framework-0.8.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 21fe3b3adfae3ec0ef8adf17c2831bf8562f404ea2912a8d1d23e36d762d0e9d |
| MD5 | 9ace2af939660b6d3cdca44f10a236d3 |
| BLAKE2b-256 | 1011d5334872d27539f11b4cb120bc0b11da66bce87d42eac98078408b838231 |

See more details on using hashes here.

Provenance

The following attestation bundles were made for ace_framework-0.8.0.tar.gz:

Publisher: publish.yml on kayba-ai/agentic-context-engine

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ace_framework-0.8.0-py3-none-any.whl.

File metadata

  • Download URL: ace_framework-0.8.0-py3-none-any.whl
  • Upload date:
  • Size: 188.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ace_framework-0.8.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 215d2298d5d5fdd394c03c8275ec49b38e4533ac5757c5bf59cc2762d0541c55 |
| MD5 | 8759c9d7d31404c14c042702d4c3ffbd |
| BLAKE2b-256 | 8c56235a9a541c06e48c39a6681d281a08aa55bb0c0aeaa77287b986d8d5d98e |

See more details on using hashes here.

Provenance

The following attestation bundles were made for ace_framework-0.8.0-py3-none-any.whl:

Publisher: publish.yml on kayba-ai/agentic-context-engine

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
