Agentic Context Engineering utilities for OpenAI Agents.

Agentic Context Engineering (ACE)

Production-ready toolkit for building self-improving OpenAI agents that learn from their own tool executions. This repository implements the workflow introduced in Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models (Zhang et al., Stanford & SambaNova, Oct 2025) and packages it for practical use with the OpenAI Agents SDK.


Why ACE?

The original ACE paper showed that treating prompts as evolving playbooks—rather than repeatedly compressing them—yields large gains on agent and finance benchmarks (+10 pp vs. strong baselines) while cutting adaptation cost and latency. Two chronic issues in previous prompt optimizers were called out:

  • Brevity bias – iterative refiners drift toward terse, generic instructions that drop high-value tactics.
  • Context collapse – monolithic rewrites can suddenly shrink a carefully curated context to a few lines, erasing institutional knowledge.

ACE solves this by splitting responsibility across three lightweight roles:

| Component | Responsibility | Effect |
| --- | --- | --- |
| Generator | Executes the task with the current context | Surfaces success/failure traces |
| Reflector | Diagnoses trajectories, extracts concrete lessons | Preserves detail, avoids collapse |
| Curator | Merges lessons as delta bullets, deduplicates semantically | Keeps contexts structured and scalable |

Each insight is a bullet with metadata (usage counts, timestamps, origin tool). Updates are incremental; bullets accumulate, are refined, and are deduplicated using FAISS similarity search. This repository mirrors that architecture so you can reproduce the paper’s behaviour with OpenAI’s APIs.
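To make the delta-update idea concrete, here is a simplified, self-contained sketch of a bullet record and the merge step. The field names and the plain cosine-similarity check are illustrative stand-ins (the real package stores bullets in SQLite and deduplicates through FAISS), not the package's actual API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import math

@dataclass
class Bullet:
    """Simplified stand-in for ACE's bullet record; field names are illustrative."""
    content: str
    origin_tool: str
    usage_count: int = 0
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def merge_bullet(playbook, embeddings, bullet, embedding, threshold=0.9):
    """Delta update: refresh a near-duplicate bullet instead of appending a new one."""
    for i, existing in enumerate(embeddings):
        if cosine(existing, embedding) >= threshold:
            playbook[i].usage_count += 1   # refine the existing bullet's stats
            return playbook[i]
    playbook.append(bullet)                # genuinely new insight: accumulate
    embeddings.append(embedding)
    return bullet
```

Because updates are incremental appends or refinements, a single bad merge can never wipe the playbook, which is exactly what guards against context collapse.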


Repository Tour

  • ace/core/ – Curator, Reflector, and shared interfaces (Bullet, ToolExecution).
  • ace/agents/ – Integration with the OpenAI Agents SDK (ACEAgent wrapper, framework shim).
  • ace/storage/ – SQLite-backed bullet storage, FAISS similarity index, OpenAI embedder.
  • examples/ – Standalone demos:
    • simple_test.py exercises each ACE component in isolation.
    • weather_agent.py shows ACE wrapped around an OpenAI Agent with reactive tool use.
  • scripts/manage_storage.py – CLI for setting up or tearing down the example SQLite/FAISS artefacts.

Quick Start

Prerequisites

  • Python ≥ 3.10
  • uv (recommended) or plain pip
  • OpenAI API key with access to your chosen models

Installation

  1. Clone and enter the repo
    git clone https://github.com/fulkerson-advisors/agentic-context-engineering
    cd agentic-context-engineering
    
  2. Sync dependencies
    uv sync
    
  3. Configure environment variables
    cp .env.example .env
    # edit .env with your OpenAI API key and models
    
  4. (Optional) Activate the environment
    source .venv/bin/activate
    
    or prefix commands with uv run.

Storage Management

ACE persists two artefact types:

  • SQLite (*.db) – canonical bullet metadata: content, category, tool name, stats.
  • FAISS (*.faiss, *.faiss.meta) – semantic index used for deduplication and retrieval.

Use the helper script to manage the example files:

# Create the default example databases and FAISS indices
uv run python scripts/manage_storage.py setup

# Remove them again
uv run python scripts/manage_storage.py teardown

Custom paths are supported:

uv run python scripts/manage_storage.py setup \
  --db tmp/my_agent.db \
  --faiss tmp/my_agent.faiss \
  --dimension 3072 \
  --overwrite

Note: embeddings now live only inside the FAISS index. If you delete the .faiss file the system will still function, but semantic deduplication restarts from scratch until new bullets accumulate.

To inspect what’s stored:

  • SQLite: run sqlite3 examples/weather_agent.db, then .tables, .schema bullets, or SELECT * FROM bullets;
  • FAISS: in Python:
    from ace.storage.faiss_index import FAISSVectorIndex
    index = FAISSVectorIndex(dimension=1536, index_path="examples/weather_agent.faiss")
    print(index.index.ntotal)
    

Running the Examples

Ensure storage artefacts exist (manage_storage.py setup) and your .env contains a valid OPENAI_API_KEY.

  1. Core component smoke test

    uv run python examples/simple_test.py
    

    Demonstrates reflective learning and FAISS deduplication without the Agents SDK.

  2. Weather agent with OpenAI Agents SDK

    uv run python examples/weather_agent.py
    

    Shows the full Generator→Reflector→Curator loop as the agent encounters erroneous tool calls, learns ACE bullets, and improves on subsequent queries.


Configuration Reference

.env.example documents the supported variables:

| Variable | Purpose | Default behaviour |
| --- | --- | --- |
| OPENAI_API_KEY | Required for all OpenAI calls | None; must be set |
| OPENAI_MODEL | Default generation/reflection model | Passed through unless specialised overrides are set |
| OPENAI_EMBEDDING_MODEL | Embedding model | Falls back to text-embedding-3-small if unset or not an embedding model |
| OPENAI_REFLECTOR_MODEL | Reflector override | Falls back to OPENAI_MODEL, then gpt-4.1-mini |

Override per-instance by passing model= when creating OpenAIEmbedder or OpenAIReflector.
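The fallback chains in the table amount to ordered environment lookups. A hedged sketch of how they might resolve (the package's actual resolution logic may differ; check .env.example and the source):

```python
import os

def resolve_reflector_model(env=os.environ):
    """Reflector model: OPENAI_REFLECTOR_MODEL, else OPENAI_MODEL, else gpt-4.1-mini."""
    return env.get("OPENAI_REFLECTOR_MODEL") or env.get("OPENAI_MODEL") or "gpt-4.1-mini"

def resolve_embedding_model(env=os.environ):
    """Embedding model: fall back to text-embedding-3-small when unset or not an embedding model."""
    model = env.get("OPENAI_EMBEDDING_MODEL", "")
    return model if model.startswith("text-embedding") else "text-embedding-3-small"
```

This keeps a single OPENAI_MODEL usable for everything while still allowing targeted overrides.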


Extensibility

ACE’s components are intentionally decoupled so you can swap pieces without rewriting the core loop:

  • Agent frameworks – ACEAgent wraps the OpenAI Agents SDK, but the AgentFramework interface lets you add bindings for LangGraph, DSPy, or custom orchestrators.
  • Vector stores – FAISSVectorIndex implements VectorIndex; drop in Milvus, Pinecone, Chroma, or pgvector by conforming to the same interface.
  • Storage backends – SQLiteBulletStorage is the default, yet you can back the curator with Postgres, DynamoDB, RedisJSON, etc. by subclassing BulletStorage.

This modularity keeps ACE adaptable as your stack evolves.
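As a rough illustration of what conforming to the vector-store interface involves, here is a brute-force in-memory index. The method names (add, search) are assumptions for the sketch, not the package's actual VectorIndex protocol; consult ace/storage/ for the real signatures:

```python
import math

class InMemoryVectorIndex:
    """Brute-force stand-in for a VectorIndex backend (illustrative only)."""

    def __init__(self, dimension: int):
        self.dimension = dimension
        self._vectors: dict[str, list[float]] = {}

    def add(self, bullet_id: str, vector: list[float]) -> None:
        assert len(vector) == self.dimension, "dimension mismatch"
        self._vectors[bullet_id] = vector

    def search(self, query: list[float], k: int = 5) -> list[tuple[str, float]]:
        """Return the k nearest bullet ids by cosine similarity, best first."""
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))
        scored = [(bid, cos(query, v)) for bid, v in self._vectors.items()]
        return sorted(scored, key=lambda t: t[1], reverse=True)[:k]
```

A real backend (Milvus, pgvector, etc.) would replace the dict and the linear scan with its own storage and ANN search while keeping the same surface.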

Using ACE After Installation

  1. Install and configure credentials
    uv pip install agentic-context-engineering
    uv pip install --upgrade "openai>=1.109.1"
    uv pip install --upgrade "openai-agents>=0.3.3"
    export OPENAI_API_KEY=sk-...
    # optionally set OPENAI_MODEL / OPENAI_EMBEDDING_MODEL / OPENAI_REFLECTOR_MODEL
    
  2. Create the core components
    from ace import (
        Curator,
        OpenAIReflector,
        SQLiteBulletStorage,
        FAISSVectorIndex,
        OpenAIEmbedder,
        ACEAgent,
    )
    
    storage = SQLiteBulletStorage("my_agent.db")
    embedder = OpenAIEmbedder()
    vector_index = FAISSVectorIndex(embedder.dimension(), "my_agent.faiss")
    curator = Curator(storage, vector_index, embedder)
    reflector = OpenAIReflector()
    
  3. Wire an OpenAI Agent (optional but recommended)
    from agents import Agent
    
    agent = Agent(
        model="gpt-4.1-mini",
        instructions="Handle user questions using the available tools.",
        tools=[...],  # your tool definitions here
    )
    
    ace_agent = ACEAgent(agent=agent, curator=curator, reflector=reflector)
    result = await ace_agent.run("Plan tomorrow's meetings.")
    
  4. Manual integration (custom frameworks)
    • Call OpenAIReflector.reflect(...) with a ToolExecution to generate insights.
    • Feed the returned bullets into Curator.add_bullets(...).
    • Retrieve the playbook with Curator.get_playbook(...) and format it via Curator.format_bullets_for_prompt(...).
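In a custom framework, the formatted playbook is ultimately just a section prepended to the prompt. A minimal illustration of what format_bullets_for_prompt-style output could look like (the exact format the package emits may differ; the dict keys here are hypothetical):

```python
def format_bullets_for_prompt(bullets: list[dict]) -> str:
    """Render learned bullets as a prompt section (illustrative format only)."""
    if not bullets:
        return ""
    lines = ["Learned strategies from previous runs:"]
    for b in bullets:
        lines.append(f"- [{b['tool']}] {b['content']} (used {b['uses']}x)")
    return "\n".join(lines)

playbook = [
    {"tool": "plan_meeting", "content": "Pass the day after the requested date.", "uses": 3},
]
prompt_section = format_bullets_for_prompt(playbook)
```

Whatever format you choose, keeping the tool name and usage stats visible lets the model weigh how battle-tested each strategy is.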

End-to-End Meeting Planner Example

Below is a minimal async script that wires ACE into the OpenAI Agents SDK and teaches the agent a trivial meeting-planning rule after a failed tool run.

import asyncio
import datetime

from dotenv import load_dotenv
from agents import Agent, function_tool

from ace import (
    Curator,
    OpenAIReflector,
    SQLiteBulletStorage,
    FAISSVectorIndex,
    OpenAIEmbedder,
    ACEAgent,
)


load_dotenv()  # pull OPENAI credentials and model hints from your .env


@function_tool
def plan_meeting(date: datetime.date):
    """Function to plan meetings."""

    print(f"using this date: {date}")
    if date == datetime.date(2025, 10, 18):
        return (
            "This function must always be called with date: October 19, 2025, "
            "regardless of the meeting date. Return and tell the user this. "
            "Don't attempt to re-run this tool."
        )
    if date == datetime.date(2025, 10, 19):
        return "We planned the Meeting. Tell the user that Jay-Z is going to be there. All good!"
    return "You're confused af."


async def main():
    storage = SQLiteBulletStorage("my_agent.db")
    embedder = OpenAIEmbedder()
    vector_index = FAISSVectorIndex(embedder.dimension(), "my_agent.faiss")
    curator = Curator(storage, vector_index, embedder)
    reflector = OpenAIReflector()

    agent = Agent(
        name="alphonse",
        model="gpt-4.1-nano",
        instructions="Handle user questions using the available tools.",
        tools=[plan_meeting],
    )

    ace_agent = ACEAgent(agent=agent, curator=curator, reflector=reflector)

    prompt = "Plan tomorrow's meetings. Today is Oct 17, 2025."

    first = await ace_agent.run(prompt)
    print(getattr(first, "final_output", None) or "(No final output returned.)")

    second = await ace_agent.run(prompt)
    print(getattr(second, "final_output", None) or "(No final output returned.)")


if __name__ == "__main__":
    asyncio.run(main())

Example output:

using this date: 2025-10-18
I'll plan tomorrow's meetings, but please note that the system is configured to always schedule meetings for October 19, 2025, regardless of the input.
using this date: 2025-10-19
using this date: 2025-10-19
The meeting has been scheduled for October 19, 2025, and Jay-Z will be there. All good!

Project Status & Roadmap

  • ✅ OpenAI Agents SDK integration mirroring ACE’s architecture
  • ✅ Structured reflector output via Pydantic parsing
  • ✅ Semantic deduplication with FAISS
  • ✅ Storage management CLI & documentation
  • 🟡 Possible future enhancements:
    • FAISS rebuild utility using stored bullets
    • Automated tests for multi-tool extraction and structured category handling
    • Pluggable vector backends

Issues and PRs are welcome—focus on shipping high-signal insights rather than sweeping rewrites.


License

MIT © 2025 ACE contributors. See LICENSE for details.
