Build self-improving AI agents that learn from experience
Agentic Context Engine (ACE)
AI agents that get smarter with every task 🧠
Agentic Context Engine learns from your agent's successes and failures. Just plug in and watch your agents improve.
Star ⭐️ this repo if you find it useful!
🤖 LLM Quickstart
- Point your favorite coding agent (Cursor, Claude Code, Codex, etc.) at Agents.md
- Prompt away!
✋ Quick Start
1. Install
```bash
pip install ace-framework
```
2. Set Your API Key
```bash
export OPENAI_API_KEY="your-api-key"
# Or use Claude, Gemini, or 100+ other providers
```
3. Create Your First ACE Agent
```python
from ace import LiteLLMClient, Generator, Playbook, Sample

# Initialize with any LLM
llm = LiteLLMClient(model="gpt-4o-mini")
generator = Generator(llm)

# Use it like a normal LLM (no learning yet)
result = generator.generate(
    question="What is 2+2?",
    context="Be direct"
)
print(f"Answer: {result.final_answer}")
```
That's it! Now let's make it learn and improve:
```python
from ace import OfflineAdapter, Reflector, Curator, SimpleEnvironment

# Create the ACE learning system
playbook = Playbook()
adapter = OfflineAdapter(
    playbook=playbook,
    generator=generator,
    reflector=Reflector(llm),
    curator=Curator(llm)
)

# Teach it from examples (it learns patterns)
samples = [
    Sample(question="What is 2+2?", ground_truth="4"),
    Sample(question="Capital of France?", ground_truth="Paris"),
]
results = adapter.run(samples, SimpleEnvironment(), epochs=1)
print(f"✅ Learned {len(playbook.bullets())} strategies!")

# Now use the improved agent
result = generator.generate(
    question="What is 5+3?",
    playbook=playbook  # ← uses learned strategies
)
print(f"🧠 Smarter answer: {result.final_answer}")

# Save and reuse later
playbook.save_to_file("my_agent.json")
```
🎉 Your agent just got smarter! It learned from examples and improved.
Want more? Read on.
Why Agentic Context Engine (ACE)?
AI agents make the same mistakes repeatedly.
ACE enables agents to learn from execution feedback about what works and what doesn't, and to improve continuously.
No training data, no fine-tuning, just automatic improvement.
Clear Benefits
- 📈 20-35% Better Performance: Proven improvements on complex tasks
- 🧠 Self-Improving: Agents get smarter with each task
- 🔄 No Context Collapse: Preserves valuable knowledge over time
- 🚀 100+ LLM Providers: Works with OpenAI, Anthropic, Google, and more
- 📊 Production Observability: Built-in Opik integration for enterprise monitoring
Demos
🌊 The Seahorse Emoji Challenge
A challenge where LLMs often hallucinate that a seahorse emoji exists (it doesn't). Watch ACE learn from its own mistakes in real time.
In this example:
- Round 1: The agent incorrectly outputs 🐴 (horse emoji)
- Self-Reflection: ACE reflects without any external feedback
- Round 2: With learned strategies from ACE, the agent successfully realizes there is no seahorse emoji
Try it yourself:
```bash
python examples/kayba_ace_test.py
```
🌐 Browser Use Automation A/B Test
A real-world comparison where both Browser Use agents check 10 domains for availability using browser automation. Same prompt, same Browser Use setup—but the ACE agent autonomously generates strategies from execution feedback.
Default Agent Behavior:
- Repeats failed actions throughout all runs
- 30% success rate (3/10 runs)
- 38.8 steps per domain on average
ACE Agent Behavior:
- First two domain checks: performs similarly to the baseline (double-digit steps per check)
- Then learns from mistakes and identifies the pattern
- Remaining checks: Consistent 3-step completion
- Agent autonomously figured out the optimal approach
| Metric | Default | ACE |
|---|---|---|
| Success rate | 30% | 100% |
| Avg steps per domain | 38.8 | 6.9 |
| Token cost | 1776k | 605k (incl. ACE) |
Try it yourself:
```bash
# Run baseline version
uv run python examples/browser-use/baseline_domain_checker.py

# Run ACE-enhanced version
uv run python examples/browser-use/ace_domain_checker.py
```
How does Agentic Context Engine (ACE) work?
Based on the ACE research framework from Stanford & SambaNova.
ACE uses three specialized roles that work together:
- 🎯 Generator - Executes tasks using learned strategies from the playbook
- 🔍 Reflector - Analyzes what worked and what didn't after each execution
- 📝 Curator - Updates the playbook with new strategies based on reflection
ACE teaches your agent to internalise:
- ✅ Successes → Extract patterns that work
- ❌ Failures → Learn what to avoid
- 🔧 Tool usage → Discover which tools work best for which tasks
- 🎯 Edge cases → Remember rare scenarios and how to handle them
The magic happens in the Playbook—a living document of strategies that evolves with experience.
Key innovation: All learning happens in context through incremental updates—no fine-tuning, no training data, and complete transparency into what your agent learned.
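To make the playbook idea concrete, here is a hypothetical sketch of its structure in plain Python. These are illustrative classes, not the library's actual API: each bullet records a strategy plus whether it proved helpful, harmful, or neutral, and the rendered playbook is injected as context for the Generator.

```python
from dataclasses import dataclass, field

# Hypothetical model of a playbook; the real ace-framework classes may differ.
@dataclass
class Bullet:
    text: str  # the strategy itself
    tag: str   # "helpful", "harmful", or "neutral"

@dataclass
class PlaybookSketch:
    bullets: list[Bullet] = field(default_factory=list)

    def add(self, text: str, tag: str) -> None:
        # Incremental update: append a new strategy bullet
        self.bullets.append(Bullet(text, tag))

    def context(self) -> str:
        # Render the playbook as in-context guidance for the Generator
        marks = {"helpful": "✓", "harmful": "✗", "neutral": "○"}
        return "\n".join(f"{marks[b.tag]} {b.text}" for b in self.bullets)

pb = PlaybookSketch()
pb.add("Answer arithmetic questions with the bare number", "helpful")
pb.add("Guessing an emoji that may not exist", "harmful")
print(pb.context())
```

Because updates are appends and edits to this list rather than a full rewrite, earlier strategies survive new learning, which is what avoids context collapse.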
```mermaid
---
config:
  look: neo
  theme: neutral
---
flowchart LR
    Playbook[("`**📚 Playbook**<br>(Evolving Context)<br><br>• Strategy Bullets<br>✓ Helpful strategies<br>✗ Harmful patterns<br>○ Neutral observations`")]
    Start(["**📝 Query**<br>User prompt or question"]) --> Generator["**⚙️ Generator**<br>Executes task using playbook"]
    Generator --> Reflector
    Playbook -. Provides Context .-> Generator
    Environment["**🌍 Task Environment**<br>Evaluates answer<br>Provides feedback"] -- Feedback +<br>Optional Ground Truth --> Reflector
    Reflector["**🔍 Reflector**<br>Analyzes what was helpful or harmful"]
    Reflector --> Curator["**📝 Curator**<br>Produces improvement deltas"]
    Curator --> DeltaOps["**🔀 Merger**<br>Updates the playbook with deltas"]
    DeltaOps -- Incremental<br>Updates --> Playbook
    Generator <--> Environment
```
Installation Options
```bash
# Basic installation
pip install ace-framework

# With demo support (browser automation)
pip install ace-framework[demos]

# With LangChain support
pip install ace-framework[langchain]

# With local model support
pip install ace-framework[transformers]

# With all features
pip install ace-framework[all]

# Development
pip install ace-framework[dev]

# Development from source (contributors) - UV method (10-100x faster)
git clone https://github.com/kayba-ai/agentic-context-engine
cd agentic-context-engine
uv sync

# Development from source (contributors) - traditional method
git clone https://github.com/kayba-ai/agentic-context-engine
cd agentic-context-engine
pip install -e .
```
Configuration
ACE works with any LLM provider through LiteLLM:
```python
# OpenAI
client = LiteLLMClient(model="gpt-4o")

# Anthropic Claude
client = LiteLLMClient(model="claude-3-5-sonnet-20241022")

# Google Gemini
client = LiteLLMClient(model="gemini-pro")

# Ollama (local)
client = LiteLLMClient(model="ollama/llama2")

# With fallbacks for reliability
client = LiteLLMClient(
    model="gpt-4",
    fallbacks=["claude-3-haiku", "gpt-3.5-turbo"]
)
```
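Conceptually, provider fallbacks amount to trying each model in order until one succeeds. A minimal sketch of that pattern in plain Python (not LiteLLM's actual implementation; `fake_call` is a stand-in for a real provider call):

```python
def generate_with_fallbacks(models, call):
    """Try each model in order; return the first successful result.

    `call(model)` is any function that either returns a response
    or raises an exception (e.g. rate limit, provider outage).
    """
    last_error = None
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # in practice, catch provider-specific errors
            last_error = exc
    raise RuntimeError(f"All models failed: {last_error}")

# Simulated: the primary "fails", so the first fallback answers
def fake_call(model):
    if model == "gpt-4":
        raise TimeoutError("primary provider down")
    return f"answer from {model}"

model, answer = generate_with_fallbacks(
    ["gpt-4", "claude-3-haiku", "gpt-3.5-turbo"], fake_call
)
print(model, answer)  # → claude-3-haiku answer from claude-3-haiku
```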
Observability with Opik
ACE includes built-in Opik integration for production monitoring and debugging.
Quick Start
```bash
# Install with Opik support
pip install ace-framework opik

# Set your Opik API key (or use local deployment)
export OPIK_API_KEY="your-api-key"
export OPIK_PROJECT_NAME="ace-project"
```
What Gets Tracked
When Opik is available, ACE automatically logs:
- Generator: Input questions, reasoning, and final answers
- Reflector: Error analysis and bullet classifications
- Curator: Playbook updates and delta operations
- Playbook Evolution: Changes to strategies over time
Viewing Traces
```python
# Opik tracing is automatic - just run your ACE code normally
from ace import Generator, Reflector, Curator, Playbook
from ace.llm_providers import LiteLLMClient

# All role interactions are automatically tracked
generator = Generator(llm_client)
output = generator.generate(
    question="What is 2+2?",
    context="Show your work",
    playbook=playbook
)
# View traces at https://www.comet.com/opik or your local Opik instance
```
Graceful Degradation
If Opik is not installed or configured, ACE continues to work normally without tracing. No code changes needed.
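This kind of graceful degradation is typically an optional-import guard. A generic sketch of the pattern (not ACE's actual internals; the `opik.log` call shape is a placeholder, consult Opik's docs for the real API):

```python
try:
    import opik  # optional dependency
except ImportError:
    opik = None

TRACING_ENABLED = opik is not None

def trace(name, payload):
    """Log to Opik when it is installed; otherwise do nothing."""
    if opik is None:
        return  # no-op: the main workflow keeps running without tracing
    try:
        # Placeholder call shape; see Opik's documentation for the real API
        opik.log(name=name, payload=payload)
    except Exception:
        pass  # tracing must never break the main workflow

trace("generator", {"question": "What is 2+2?"})
print(f"tracing enabled: {TRACING_ENABLED}")
```

The key design choice is that tracing failures are swallowed: observability is additive, so the agent behaves identically with or without it.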
📊 Benchmarks
Evaluate ACE performance with scientific rigor using our comprehensive benchmark suite.
Quick Benchmark
```bash
# Compare baseline vs ACE on any benchmark
uv run python scripts/run_benchmark.py simple_qa --limit 50 --compare

# Run with proper train/test split (prevents overfitting)
uv run python scripts/run_benchmark.py finer_ord --limit 100

# Baseline evaluation (no ACE learning)
uv run python scripts/run_benchmark.py hellaswag --limit 50 --skip-adaptation
```
Available Benchmarks
| Benchmark | Description | Domain |
|---|---|---|
| simple_qa | Question Answering (SQuAD) | General |
| finer_ord | Financial Named Entity Recognition | Finance |
| mmlu | Massive Multitask Language Understanding | General Knowledge |
| hellaswag | Commonsense Reasoning | Common Sense |
| arc_easy/arc_challenge | AI2 Reasoning Challenge | Reasoning |
Evaluation Modes
- ACE Mode: Train/test split with learning (shows true generalization)
- Baseline Mode: Direct evaluation without learning (`--skip-adaptation`)
- Comparison Mode: Side-by-side baseline vs ACE (`--compare`)
The benchmark system prevents overfitting with automatic 80/20 train/test splits and provides overfitting analysis to ensure honest metrics.
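The 80/20 split is the standard guard against evaluating on the same samples the playbook learned from. A minimal sketch of such a split (a hypothetical helper, not the benchmark script's actual code):

```python
import random

def train_test_split(samples, train_fraction=0.8, seed=42):
    """Shuffle deterministically, then split so test items
    never appear in the learning phase."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # fixed seed keeps runs reproducible
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]

samples = [f"q{i}" for i in range(10)]
train, test = train_test_split(samples)
print(len(train), len(test))  # → 8 2
```

Learning runs only on `train`; reported accuracy comes only from `test`, which is what makes the improvement numbers a measure of generalization rather than memorization.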
→ Full Benchmark Documentation
Documentation
- Quick Start Guide - Get running in 5 minutes
- API Reference - Complete API documentation
- Examples - Ready-to-run code examples
- ACE Framework Guide - Deep dive into Agentic Context Engineering
- Prompt Engineering - Advanced prompt techniques
- Changelog - See recent changes
Contributing
We love contributions! Check out our Contributing Guide to get started.
Acknowledgment
Based on the ACE paper and inspired by Dynamic Cheatsheet.
If you use ACE in your research, please cite:
```bibtex
@article{zhang2024ace,
  title={Agentic Context Engineering},
  author={Zhang et al.},
  journal={arXiv:2510.04618},
  year={2024}
}
```
⭐ Star this repo if you find it useful!
Built with ❤️ by Kayba and the open-source community.