
The intelligence layer for AI agents. Self-learning loops that make your agents smarter over time.


AgentLoops

Your agents have memory. Now give them a brain.



The Missing Layer

Every AI agent stack has memory. None of them learn.

Frameworks    Observability    Memory          ???         Evaluation
(LangChain,   (LangSmith,     (Mem0, Letta)              (Braintrust,
 CrewAI)       Arize)          $140M+ funded               RAGAS)
    │              │               │             │              │
    ▼              ▼               ▼             ▼              ▼
 ┌──────┐    ┌──────────┐    ┌─────────┐   ┌─────────┐   ┌──────────┐
 │Build │───▶│ Observe  │───▶│Remember │──▶│  LEARN  │──▶│ Evaluate │
 │agents│    │  runs    │    │  facts  │   │patterns │   │ quality  │
 └──────┘    └──────────┘    └─────────┘   └─────────┘   └──────────┘
                                                 ▲
                                                 │
                                          AgentLoops fills
                                            this gap

Memory stores what happened. AgentLoops extracts why it worked and feeds that back into your agents automatically.

See It Work (30 seconds, no API key needed)

pip install agentloops
python -c "import urllib.request; exec(urllib.request.urlopen('https://raw.githubusercontent.com/mhollweck/agentloops/main/examples/quickstart/main.py').read())"
Output: watch your agent go from 62% → 100%
  ╔══════════════════════════════════════════════════════════╗
  ║     AgentLoops — Watch Your Agent Learn in Real Time     ║
  ╚══════════════════════════════════════════════════════════╝

  PHASE 1: Agent runs without learning
  ──────────────────────────────────────────────────────
    ✓ meeting_booked       │ VP Eng at Stripe — personalized technical email
    ✓ replied              │ CTO at startup — case study approach
    ✗ no_reply             │ Director at Shopify — listicle subject
    ✓ meeting_booked       │ VP Eng at Datadog — question about their product
    ✓ meeting_booked       │ Head of Eng at SaaS — congratulated on launch
    ✗ unsubscribed         │ CIO at Wells Fargo — listicle subject
    ✓ meeting_booked       │ VP Eng at Notion — observed their API latency
    ✗ no_reply             │ CTO at tiny startup — generic advice

  Success rate: 62%       Active rules: 0

  LEARNING: Extracting rules from performance data...
  ──────────────────────────────────────────────────────
    [92%] IF prospect is VP Engineering THEN lead with technical observation
    [85%] IF subject is listicle style THEN avoid for enterprise
    [75%] IF prospect had a recent public event THEN reference it

  PHASE 2: Agent runs WITH learned rules
  ──────────────────────────────────────────────────────
    ✓ meeting_booked       │ VP Eng at Cloudflare — observed Workers API latency
    ✓ meeting_booked       │ VP Eng at Twilio — noted Voice API incidents
    ✓ replied              │ CTO at 30-person SaaS — ROI case study
    ✓ meeting_booked       │ VP Eng at Figma — praised real-time collab speed

  Success rate: 100%

  QUALITY GATE: Checking output before sending...
    Personalized email: PASS ✓ (score: 1.0)
    Listicle email:     FAIL ✗ ⚠ Output uses a listicle pattern that a learned rule says to avoid

  ╔══════════════════════════════════════════════════════════╗
  ║  RESULT: 62% → 100% success rate                         ║
  ║  Your agent just learned from its own performance.       ║
  ╚══════════════════════════════════════════════════════════╝

Quick Start

pip install agentloops
from agentloops import AgentLoops

# Uses ANTHROPIC_API_KEY env var for reflection
loops = AgentLoops("sales-outreach", agent_type="sales-sdr")

# Track every agent run — learning happens automatically
loops.track(input=task, output=result, outcome="meeting_booked")

# Inject learned rules into your prompt
enhanced_prompt = loops.enhance_prompt(base_prompt)

That's it. Two methods. Your agent now learns from every run.

When you pass agent_type="sales-sdr", AgentLoops loads pre-seeded IF/THEN rules for that agent type -- so your agent starts smart on day one instead of learning from scratch. Ten agent types ship out of the box (sales, support, content, code, recruiting, legal, and more).

Learning triggers automatically after enough outcomes. You can also call reflect(), evolve(), and forget() manually for fine-grained control.

Multi-LLM Support

AgentLoops works with Anthropic (default), OpenAI, or any custom LLM:

# OpenAI
loops = AgentLoops("my-agent", llm_provider="openai", api_key="sk-...")

# Custom LLM (local Ollama, Groq, Mistral, etc.)
loops = AgentLoops("my-agent", llm_provider="custom", llm_fn=my_llm_callable)
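For illustration, a custom llm_fn wrapping a local Ollama server might look like the sketch below. It assumes llm_fn is a callable that takes a prompt string and returns the completion text; check the API Reference for the exact contract.

```python
import json
import urllib.request

def my_llm_callable(prompt: str) -> str:
    """Send a prompt to a local Ollama server and return the generated text."""
    body = json.dumps({"model": "llama3", "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Any provider works the same way, as long as the callable returns plain text.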

Collective Intelligence — Your Agent Starts Smart

Shipping now: Pre-seeded starter rules for 10 agent types. Coming soon: Live cross-customer intelligence network.

Every agent on AgentLoops learns from its own runs. The vision: aggregate anonymized learnings across ALL agents of the same type into a global intelligence pool. Today, your agent starts with curated starter rules for its type. Soon, it'll inherit live proven rules from every agent on the platform.

More customers → More outcome data → Better global rules
  → New customers start smarter → Better results → More customers

This is the Waze model. The free map is great. The live traffic data is what makes it indispensable.

What's available now: 10 agent types with curated starter rules (sales-sdr, customer-support, content-creator, and 7 more). Your agent starts smart on day 1 instead of learning from scratch.

Meta-Learning — The Learning Engine Learns Too

AgentLoops doesn't just improve your agent's behavior -- it improves the quality of its own learning over time. The meta-learner tracks which reflections produce impactful rules, which rule formats (evidence-backed vs not, "avoid" vs "do", confidence levels) correlate with positive outcomes, and generates meta-rules that get injected into future reflection prompts. The result: your agent's learning gets sharper with every cycle, not just its behavior.

# Access meta-learning insights
impacts = loops.meta_learner.get_rule_impacts()
patterns = loops.meta_learner.get_best_rule_patterns()
meta_rules = loops.meta_learner.get_meta_rules()

When collective intelligence is active, meta-learnings are shared too -- new customers don't just get starter rules, they get starter learning strategies.

What's coming (the network): Every user contributes anonymized learnings. You pay for freshness and depth:

Tier        Price       Intelligence
Free        $0          3 agent types, manual learning triggers, curated starter rules
Pro         $99/mo      Unlimited agent types, auto learning, live global rules from network
Team        $249/mo     Shared namespace across org's agents, team analytics
Enterprise  Contact us  Live rules + benchmarking + custom filters + dedicated support

No other tool does this. Mem0 stores facts. Letta learns inside their platform. AgentLoops learns across the entire ecosystem.

Before vs After

Without AgentLoops -- your agent makes the same mistakes forever:

# Day 1: Agent sends cold email, gets ignored
# Day 30: Agent sends the same cold email, gets ignored
# Day 90: Agent sends the same cold email, gets ignored
# You manually rewrite the prompt. Again.

With AgentLoops -- your agent evolves:

# Day 1: Agent sends cold email, gets ignored
# Day 2-9: Agent keeps tracking outcomes...
# Day 10: Auto-reflection triggers → "Emails without personalization get 0% reply rate"
# Day 11: enhance_prompt() injects: "IF cold outreach THEN personalize first line"
# Day 50: Auto-evolution → Convention: "Always reference prospect's recent work"
# Day 51: Agent books meetings. You never touched the prompt.

The 7 Mechanisms

AgentLoops implements seven learning mechanisms, inspired by Reflexion and cognitive memory architectures and refined through months of production use.

1. Self-Reflection: agent evaluates its own output and writes patterns to conventions (runs after every run)
2. Spike Detection: detects performance anomalies and triggers follow-up (runs continuously)
3. Quality Gate: pre-flight validation via loops.check(), combining built-in, rule-based, and custom checks (runs before output)
4. Decision Rules: extracts IF/THEN rules from performance data (runs weekly)
5. Cross-Evaluation: compares predictions vs actual outcomes (runs weekly)
6. Contradiction Resolution: detects and resolves conflicting learned rules (runs weekly)
7. Selective Forgetting: prunes stale patterns that no longer apply (runs daily)

These aren't theoretical. They've been running in production across 7 agents processing real data for months.
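To make the spike-detection idea concrete, here is a toy illustration of the underlying concept (not AgentLoops internals): flag a run when the recent success rate deviates sharply from the baseline before it.

```python
def detect_spike(outcomes: list[int], window: int = 5, threshold: float = 0.4) -> bool:
    """Toy spike detector: compare the last `window` runs' success rate to the baseline before them."""
    if len(outcomes) < 2 * window:
        return False  # not enough history to compare
    baseline = sum(outcomes[:-window]) / len(outcomes[:-window])
    recent = sum(outcomes[-window:]) / window
    return abs(recent - baseline) >= threshold

# Seven successes followed by five failures: a clear downward spike
print(detect_spike([1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]))  # True
```

The real mechanism runs continuously and triggers a follow-up reflection; this sketch only shows the anomaly test at its core.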

Multi-Outcome System

Not every agent has a simple pass/fail outcome. AgentLoops supports rich outcome definitions so learning works for any metric:

from agentloops import AgentLoops, OutcomeConfig, MetricDef

# Binary (default) — success or failure
loops = AgentLoops("my-agent", outcome=OutcomeConfig.binary())

# Categorical — multiple outcome values
loops = AgentLoops("my-agent", outcome=OutcomeConfig.categorical(["booked", "replied", "ignored"]))

# Numeric — scored outcomes with a goal direction
loops = AgentLoops("my-agent", outcome=OutcomeConfig.numeric(goal="minimize"))

# Multi-metric — weighted composite scoring
loops = AgentLoops("my-agent", outcome=OutcomeConfig(metrics=[
    MetricDef("booking_rate", "categorical", weight=3.0, success_values=["booked"]),
    MetricDef("latency", "duration", weight=1.0, target_value=500),
]))

# Score a run with multiple metrics
score = loops.outcome.score({"booking_rate": "booked", "latency": 320})

The outcome config tells the reflection and rule engines what "good" looks like, so they generate rules that actually optimize for your goals.
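For intuition, a weighted composite over the two metrics above could reduce to a single score roughly like this (a hypothetical sketch of the idea, not AgentLoops' actual scoring code):

```python
def composite_score(metrics: dict) -> float:
    """Toy composite: booking_rate (weight 3.0) plus latency against a 500ms target (weight 1.0)."""
    # Categorical metric: full credit on a success value, none otherwise
    booking = 1.0 if metrics["booking_rate"] == "booked" else 0.0
    # Duration metric: full credit at or below the target, decaying above it
    latency = min(1.0, 500 / metrics["latency"])
    # Weighted average over both metrics
    return (3.0 * booking + 1.0 * latency) / (3.0 + 1.0)

print(composite_score({"booking_rate": "booked", "latency": 320}))  # 1.0
```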

Quality Gates

Validate agent output before it reaches users:

result = loops.check(output=agent_response, input=user_query)

if result.passed:
    deliver(agent_response)
else:
    print(result.failures)  # ["Output contains hallucination markers", "Violates rule: IF pricing question THEN include disclaimer"]
    regenerate()

Built-in checks catch empty outputs, hallucination markers, and length violations. Rule-based checks validate output against learned "avoid" rules. You can also pass custom check functions. See the API Reference for full configuration.
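A custom check can be as small as a function that inspects the output. The sketch below assumes a check receives the output string and returns a (passed, message) pair, and that loops.check() accepts extra checks via a parameter; both are assumptions, so see the API Reference for the real contract.

```python
import re

def no_listicle_subject(output: str) -> tuple[bool, str]:
    """Hypothetical custom check: fail outputs whose first line reads like a listicle subject."""
    subject = output.splitlines()[0] if output else ""
    if re.match(r"\s*\d+\s+(ways|reasons|tips|mistakes)\b", subject, re.IGNORECASE):
        return False, "Subject uses a listicle pattern"
    return True, ""

print(no_listicle_subject("7 ways to cut cloud costs\nHi Jordan, ...")[0])  # False
```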

How It Works

                    ┌─────────────────┐
                    │   Your Agent    │
                    └────────┬────────┘
                             │
                    loops.track(input, output, outcome)
                             │
                             ▼
              ┌──────────────────────────────┐
              │        AgentLoops Core       │
              │                              │
              │  ┌────────┐  ┌────────────┐  │
              │  │Reflect │  │Quality Gate│  │
              │  └───┬────┘  └─────┬──────┘  │
              │      │             │         │
              │      ▼             ▼         │
              │  ┌─────────────────────┐     │
              │  │   Conventions DB    │     │
              │  │  (IF/THEN rules,    │     │
              │  │   patterns, spikes) │     │
              │  └──────────┬──────────┘     │
              │             │                │
              │  ┌──────────▼──────────┐     │
              │  │ Evolve / Forget /   │     │
              │  │ Resolve Conflicts   │     │
              │  └─────────────────────┘     │
              └──────────────┬───────────────┘
                             │
                    loops.enhance_prompt(base)
                             │
                             ▼
                    ┌──────────────────┐
                    │  Enhanced Agent  │
                    │  (with learned   │
                    │   conventions)   │
                    └──────────────────┘

Comparison

Feature                    AgentLoops  Mem0     Letta            DIY
Memory storage             --          Yes      Yes              Yes
Self-reflection            Yes         --       --               Manual
Automatic rule extraction  Yes         --       --               Manual
Spike detection            Yes         --       --               Manual
Contradiction resolution   Yes         --       --               No
Selective forgetting       Yes         --       Partial          No
Prompt enhancement         Yes         --       --               Manual
Convention evolution       Yes         --       --               No
Framework-agnostic         Yes         Yes      No               Yes
Lines of code to add       ~5          ~10      ~50              ~500
Focus                      Learning    Storage  Stateful agents  --

AgentLoops is not a replacement for memory systems. It's the layer that sits on top of them and actually learns.

MCP Server — No Code Needed

Don't write Python? Use AgentLoops via MCP with any compatible agent:

pip install "agentloops[mcp]"
{
  "mcpServers": {
    "agentloops": {
      "command": "python",
      "args": ["-m", "agentloops_mcp"],
      "env": { "ANTHROPIC_API_KEY": "sk-ant-..." }
    }
  }
}

Your agent gets 7 tools: recall, remember, reflect, get_rules, check, enhance_prompt, list_agent_types. Same learning engine, zero code. See MCP docs for details.

Framework Agnostic

Works with any agent framework. Or no framework at all.

# With LangChain — drop-in callback handler
from agentloops.adapters.langchain import AgentLoopsCallback

handler = AgentLoopsCallback(loops, outcome_fn=lambda run: "success" if run.success else "failure")
result = chain.invoke(prompt, config={"callbacks": [handler]})
# Automatically tracks chain runs, errors, and outcomes

# With CrewAI — callback for tasks and crews
from agentloops.adapters.crewai import AgentLoopsCrewCallback

callback = AgentLoopsCrewCallback(loops, outcome_fn=lambda task: task.output.quality_score)
crew = Crew(agents=[agent], tasks=[task], callbacks=[callback])
# Automatically tracks task completions and crew results

# With raw OpenAI/Anthropic calls
response = client.chat.completions.create(
    messages=[{"role": "system", "content": loops.enhance_prompt(system_prompt)}]
)
loops.track(input=user_msg, output=response, outcome=metric)

Use Cases

  • Sales agents that learn which outreach patterns book meetings
  • Support agents that learn which responses resolve tickets faster
  • Help desk agents that learn guest preferences, upsell timing, and escalation patterns (hotels, airlines, SaaS)
  • Content agents that learn which formats drive engagement
  • Coding agents that learn which patterns produce fewer bugs
  • Research agents that learn which sources yield better insights

If your agent runs more than once, it should be learning.

Why Self-Learning Matters (Not Just Memory)

Memory systems (Mem0, Letta, Zep) store facts: "this user prefers window seats" or "last order was a latte." That's recall — the agent remembers what happened.

Learning is fundamentally different. Learning means the agent changes its behavior based on outcomes:

Hotel help desk
  Memory:   "Guest in 412 asked for extra towels last time"
  Learning: "Guests who book suites AND request late checkout convert 3x on spa upsells — offer proactively"

Sales outreach
  Memory:   "Last email to this prospect was June 3"
  Learning: "CTOs at Series B companies respond 4x more to technical deep-dives than ROI pitches"

Support tickets
  Memory:   "Customer had billing issue last month"
  Learning: "Billing tickets mentioning 'cancel' resolve 60% faster when you lead with empathy + immediate credit"

Memory gives you a notebook. Learning gives you judgment.

Without learning, your agent makes the same mistakes on run #1,000 as run #1. It remembers more facts but never gets smarter. It's the difference between a new hire who takes great notes and a senior employee who has developed intuition from thousands of reps.

AgentLoops doesn't replace memory — it sits on top of it. Memory stores what happened. AgentLoops learns what to do about it.

Documentation

Community

Contributing

Contributions welcome. See CONTRIBUTING.md for guidelines.

License

MIT


Built by Maria Hollweck at Asobi Labs
