Agentic Context Engine (ACE) 🚀

Build self-improving AI agents that learn from experience

🧠 ACE is a framework for building AI agents that get smarter over time by learning from their mistakes and successes.

💡 Based on the paper "Agentic Context Engineering" from Stanford/SambaNova - ACE helps your LLM agents build a "playbook" of strategies that improves with each task.

🔌 Works with any LLM - OpenAI, Anthropic Claude, Google Gemini, and 100+ more providers out of the box!

Quick Links

📦 PyPI Package | 📚 Documentation | 🐛 Issues | 💬 Discussions

Quick Start

Minimum Python 3.9 required

Install ACE:

# Basic installation
pip install ace-framework

# With LangChain support (for advanced routing & chains)
pip install ace-framework[langchain]

# For development
pip install -r requirements.txt

Set up your API key:

# Copy the example environment file
cp .env.example .env

# Add your OpenAI key (or Anthropic, Google, etc.)
echo "OPENAI_API_KEY=your-key-here" >> .env

Run your first agent:

from ace import LiteLLMClient, OfflineAdapter, Generator, Reflector, Curator
from ace import Playbook, Sample, TaskEnvironment, EnvironmentResult
from dotenv import load_dotenv

load_dotenv()

# Create your agent with any LLM
client = LiteLLMClient(model="gpt-3.5-turbo")  # or claude-3, gemini-pro, etc.

# Set up ACE components
adapter = OfflineAdapter(
    playbook=Playbook(),
    generator=Generator(client),
    reflector=Reflector(client),
    curator=Curator(client)
)

# Define a simple task
class SimpleEnv(TaskEnvironment):
    def evaluate(self, sample, output):
        correct = sample.ground_truth.lower() in output.final_answer.lower()
        return EnvironmentResult(
            feedback="Correct!" if correct else "Try again",
            ground_truth=sample.ground_truth
        )

# Train your agent
samples = [
    Sample(question="What is 2+2?", ground_truth="4"),
    Sample(question="Capital of France?", ground_truth="Paris"),
]

results = adapter.run(samples, SimpleEnv(), epochs=1)
print(f"Agent learned {len(adapter.playbook.bullets())} strategies!")

How It Works

ACE uses three AI "roles" that work together to help your agent improve:

  1. 🎯 Generator - Tries to solve tasks using the current playbook
  2. 🔍 Reflector - Analyzes what went wrong (or right)
  3. 📝 Curator - Updates the playbook with new strategies

Think of it like a sports team reviewing game footage to get better!

The ACE Learning Loop

flowchart TD
    Start([New Task/Question]) --> Generator

    subgraph Playbook ["📚 Playbook (Evolving Context)"]
        Bullets["📝 Strategy Bullets<br/>• Helpful strategies ✓<br/>• Harmful patterns ✗<br/>• Neutral observations ○"]
    end

    Generator["🎯 Generator<br/>Uses playbook strategies<br/>to produce answer"] --> Output[Answer Output]

    Playbook -.->|Provides Context| Generator

    Output --> Environment["🌍 Task Environment<br/>Evaluates answer<br/>Provides feedback"]

    Environment --> Reflector["🔍 Reflector<br/>Analyzes outcome<br/>Tags bullet contributions:<br/>• Which helped?<br/>• Which hurt?<br/>• What's missing?"]

    Reflector --> Curator["📝 Curator<br/>Emits delta operations"]

    Curator --> DeltaOps{{"🔄 Delta Operations<br/>ADD new strategies<br/>UPDATE existing ones<br/>TAG helpful/harmful<br/>REMOVE outdated"}}

    DeltaOps -->|Incremental<br/>Updates| Playbook

    Environment -->|Ground Truth +<br/>Feedback| Reflector

    style Playbook fill:#e1f5fe
    style Generator fill:#fff3e0
    style Reflector fill:#f3e5f5
    style Curator fill:#e8f5e9
    style DeltaOps fill:#fff9c4

Key Insights:

  • Incremental Learning: The playbook evolves through small delta updates, not complete rewrites
  • No Context Collapse: Strategies are preserved and refined, preventing loss of valuable knowledge
  • Self-Improving: Each task makes the agent smarter by updating its strategy playbook
  • Three-Role Architecture: Separation of concerns - generating, analyzing, and updating are distinct phases

Examples

Simple Q&A Agent

python examples/simple_ace_example.py

Advanced Examples with Different LLMs

python examples/quickstart_litellm.py

LangChain Integration (Advanced)

# With routing and load balancing
python examples/langchain_example.py

Save and Load Playbooks

# Save a trained playbook for later use
adapter = OfflineAdapter(...)
results = adapter.run(samples, environment, epochs=3)
adapter.playbook.save_to_file("my_trained_playbook.json")

# Load a pre-trained playbook
from ace import Playbook, OnlineAdapter
playbook = Playbook.load_from_file("my_trained_playbook.json")
adapter = OnlineAdapter(playbook=playbook, ...)

Check out examples/playbook_persistence.py for a complete example!

Explore more in the examples/ folder!

Supported LLM Providers

ACE works with 100+ LLM providers through LiteLLM:

  • OpenAI - GPT-4, GPT-3.5-turbo
  • Anthropic - Claude 3 (Opus, Sonnet, Haiku)
  • Google - Gemini Pro, PaLM
  • Cohere - Command models
  • Local Models - Ollama, Transformers
  • And many more!

Just change the model name:

# OpenAI
client = LiteLLMClient(model="gpt-4")

# Anthropic Claude
client = LiteLLMClient(model="claude-3-sonnet-20240229")

# Google Gemini
client = LiteLLMClient(model="gemini-pro")

# With fallbacks for reliability
client = LiteLLMClient(
    model="gpt-4",
    fallbacks=["claude-3-haiku", "gpt-3.5-turbo"]
)

Key Features

  • โœ… Self-Improving - Agents learn from experience and build knowledge
  • โœ… Provider Agnostic - Switch LLMs with one line of code
  • โœ… Production Ready - Automatic retries, fallbacks, and error handling
  • โœ… Cost Efficient - Track costs and use cheaper models as fallbacks
  • โœ… Async Support - Built for high-performance applications
  • โœ… Fully Typed - Great IDE support and type safety

Advanced Usage

Online Learning (Learn While Running)

from ace import OnlineAdapter

# Agent improves while processing real tasks
adapter = OnlineAdapter(
    playbook=existing_playbook,  # Can start with existing knowledge
    generator=Generator(client),
    reflector=Reflector(client),
    curator=Curator(client)
)

# Process tasks one by one, learning from each
for task in real_world_tasks:
    result = adapter.process(task, environment)
    # Agent automatically updates its strategies

Custom Task Environments

class CodeTestingEnv(TaskEnvironment):
    def evaluate(self, sample, output):
        # Run the generated code
        test_passed = run_tests(output.final_answer)

        return EnvironmentResult(
            feedback=f"Tests {'passed' if test_passed else 'failed'}",
            ground_truth=sample.ground_truth,
            metrics={"pass_rate": 1.0 if test_passed else 0.0}
        )

Streaming Responses

# Get responses token by token
for chunk in client.complete_with_stream("Write a story"):
    print(chunk, end="", flush=True)

Async Operations

import asyncio

async def main():
    response = await client.acomplete("Solve this problem...")
    print(response.text)

asyncio.run(main())

Experimental v2 Prompts (Beta)

We've developed enhanced v2 prompts that provide better performance through state-of-the-art prompt engineering. These are experimental and in active development.

What's New in v2

  • ๐ŸŽฏ Confidence Scoring: Know when the AI is certain vs uncertain
  • ๐Ÿ“ Enhanced Reasoning: More detailed step-by-step explanations
  • ๐Ÿ”ง Domain Optimization: Specialized prompts for math and code
  • โœ… Better Structure: Based on analysis of 80+ production AI systems

Quick Start with v2

from ace.prompts_v2 import PromptManager

# Use v2 prompts (experimental)
manager = PromptManager(default_version="2.0")

# Create components with v2 prompts
generator = Generator(llm, prompt_template=manager.get_generator_prompt())
reflector = Reflector(llm, prompt_template=manager.get_reflector_prompt())
curator = Curator(llm, prompt_template=manager.get_curator_prompt())

# Or use domain-specific variants
math_generator = Generator(llm, prompt_template=manager.get_generator_prompt(domain="math"))
code_generator = Generator(llm, prompt_template=manager.get_generator_prompt(domain="code"))

v1 vs v2 Comparison

Feature            | v1 (Default) | v2 (Experimental)
-------------------|--------------|----------------------
Token Usage        | Baseline     | +30-50% more
Confidence Scoring | ❌           | ✅ Tracks uncertainty
Reasoning Detail   | Basic        | Enhanced with steps
Domain Variants    | ❌           | ✅ Math, Code optimized
Output Validation  | Basic        | Strict JSON schemas
Status             | Stable       | 🔬 Beta/Experimental

When to Use v2

  • โœ… Use v2 if you need: Confidence scores, detailed reasoning, domain-specific optimization
  • โš ๏ธ Consider v1 if: Token cost is critical, you need maximum stability
  • ๐Ÿ”ฌ Note: v2 is experimental and actively evolving based on user feedback

Examples

# Compare v1 vs v2 performance
python examples/compare_v1_v2_prompts.py

# See v2 features in action
python examples/advanced_prompts_v2.py

See PROMPT_ENGINEERING.md for detailed documentation on v2 prompts.

Architecture

ACE implements the Agentic Context Engineering method from the research paper:

  • Playbook: A structured memory that stores successful strategies
  • Bullets: Individual strategies with helpful/harmful counters
  • Delta Operations: Incremental updates that preserve knowledge
  • Three Roles: Generator, Reflector, and Curator working together

The framework prevents "context collapse" - a common problem where agents forget important information over time.
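To make the bullet-with-counters idea concrete, here is a minimal sketch. The `Bullet` class and `prune` function are hypothetical illustrations, not ACE's actual types; the point is that per-strategy counters let the curator drop net-harmful strategies while leaving the rest of the playbook intact.

```python
from dataclasses import dataclass

@dataclass
class Bullet:
    """A single playbook strategy with helpful/harmful tallies (illustrative)."""
    text: str
    helpful: int = 0
    harmful: int = 0

def prune(bullets: list[Bullet]) -> list[Bullet]:
    # Keep strategies that have helped at least as often as they have hurt,
    # so accumulated knowledge survives instead of being rewritten away.
    return [b for b in bullets if b.helpful >= b.harmful]

bullets = [
    Bullet("Verify arithmetic before answering", helpful=3, harmful=0),
    Bullet("Answer in one word", helpful=1, harmful=4),
]
kept = prune(bullets)
print([b.text for b in kept])  # the net-negative strategy is dropped
```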

Repository Structure

ace/
├── ace/                    # Core library
│   ├── playbook.py        # Strategy storage & persistence
│   ├── roles.py           # Generator, Reflector, Curator
│   ├── adaptation.py      # Training loops
│   └── llm_providers/     # LLM integrations
├── examples/              # Ready-to-run examples
└── tests/                 # Unit tests

Contributing

We welcome contributions! Feel free to:

  • ๐Ÿ› Report bugs
  • ๐Ÿ’ก Suggest features
  • ๐Ÿ”ง Submit PRs
  • ๐Ÿ“š Improve documentation

Citation

If you use ACE in your research or project, please cite the original papers:

ACE Paper (Primary Reference)

@article{zhang2025ace,
  title={Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models},
  author={Zhang, Qizheng and Hu, Changran and Upasani, Shubhangi and Ma, Boyuan and Hong, Fenglu and
          Kamanuru, Vamsidhar and Rainton, Jay and Wu, Chen and Ji, Mengmeng and Li, Hanchen and
          Thakker, Urmish and Zou, James and Olukotun, Kunle},
  journal={arXiv preprint arXiv:2510.04618},
  year={2025}
}

Dynamic Cheatsheet (Foundation Work)

ACE builds upon the adaptive memory concepts from Dynamic Cheatsheet:

@article{suzgun2025dynamiccheatsheet,
  title={Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory},
  author={Suzgun, Mirac and Yuksekgonul, Mert and Bianchi, Federico and Jurafsky, Dan and Zou, James},
  year={2025},
  eprint={2504.07952},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2504.07952}
}

This Implementation

If you use this specific implementation, you can also reference:

This repository: https://github.com/Kayba-ai/agentic-context-engine
PyPI package: https://pypi.org/project/ace-framework/
Based on the open reproduction at: https://github.com/sci-m-wang/ACE-open

License

MIT License - see LICENSE file for details.

Troubleshooting

Installation Issues

Problem: pip install ace-framework fails

  • Solution: Ensure Python 3.9+ is installed: python --version
  • Solution: Upgrade pip: pip install --upgrade pip

Problem: ImportError when using LangChain integration

  • Solution: Install with extras: pip install ace-framework[langchain]

Problem: LiteLLM API errors

  • Solution: Check your API keys are set correctly in .env
  • Solution: Verify your API key has sufficient credits/quota

Common Errors

"Unrecognized request argument": The LLM provider doesn't support a parameter

  • This is usually handled automatically, but if it persists, please report an issue

Memory issues with large playbooks:

  • Use online adaptation mode instead of offline
  • Periodically save and reload playbooks

Rate limiting errors:

  • Add delays between requests
  • Use exponential backoff (built into LiteLLM)

For more help, check the Issues page.


Note: This is an independent implementation based on the ACE paper (arXiv:2510.04618) and builds upon concepts from Dynamic Cheatsheet. For the original reproduction scaffold, see sci-m-wang/ACE-open.

Made with โค๏ธ by Kayba and the open-source community
