
The intelligence layer for AI agents. Self-learning loops that make your agents smarter over time.


AgentLoops

Your agents have memory. Now give them a brain.



The Missing Layer

Every AI agent stack has memory. None of them learn.

Frameworks    Observability    Memory          ???         Evaluation
(LangChain,   (LangSmith,     (Mem0, Letta)              (Braintrust,
 CrewAI)       Arize)          $140M+ funded               RAGAS)
    │              │               │             │              │
    ▼              ▼               ▼             ▼              ▼
 ┌──────┐    ┌──────────┐    ┌─────────┐   ┌─────────┐   ┌──────────┐
 │Build │───▶│ Observe  │───▶│Remember │──▶│  LEARN  │──▶│ Evaluate │
 │agents│    │  runs    │    │  facts  │   │patterns │   │ quality  │
 └──────┘    └──────────┘    └─────────┘   └─────────┘   └──────────┘
                                                 ▲
                                                 │
                                          AgentLoops fills
                                            this gap

Memory stores what happened. AgentLoops extracts why it worked and feeds that back into your agents automatically.

See It Work (30 seconds, no API key needed)

pip install agentloops
python -c "import urllib.request; exec(urllib.request.urlopen('https://raw.githubusercontent.com/mhollweck/agentloops/main/examples/quickstart/main.py').read())"
Output: watch your agent go from 62% → 100%
  ╔══════════════════════════════════════════════════════════╗
  ║     AgentLoops — Watch Your Agent Learn in Real Time     ║
  ╚══════════════════════════════════════════════════════════╝

  PHASE 1: Agent runs without learning
  ──────────────────────────────────────────────────────
    ✓ meeting_booked       │ VP Eng at Stripe — personalized technical email
    ✓ replied              │ CTO at startup — case study approach
    ✗ no_reply             │ Director at Shopify — listicle subject
    ✓ meeting_booked       │ VP Eng at Datadog — question about their product
    ✓ meeting_booked       │ Head of Eng at SaaS — congratulated on launch
    ✗ unsubscribed         │ CIO at Wells Fargo — listicle subject
    ✓ meeting_booked       │ VP Eng at Notion — observed their API latency
    ✗ no_reply             │ CTO at tiny startup — generic advice

  Success rate: 62%       Active rules: 0

  LEARNING: Extracting rules from performance data...
  ──────────────────────────────────────────────────────
    [92%] IF prospect is VP Engineering THEN lead with technical observation
    [85%] IF subject is listicle style THEN avoid for enterprise
    [75%] IF prospect had a recent public event THEN reference it

  PHASE 2: Agent runs WITH learned rules
  ──────────────────────────────────────────────────────
    ✓ meeting_booked       │ VP Eng at Cloudflare — observed Workers API latency
    ✓ meeting_booked       │ VP Eng at Twilio — noted Voice API incidents
    ✓ replied              │ CTO at 30-person SaaS — ROI case study
    ✓ meeting_booked       │ VP Eng at Figma — praised real-time collab speed

  Success rate: 100%

  QUALITY GATE: Checking output before sending...
    Personalized email: PASS ✓ (score: 1.0)
    Listicle email:     FAIL ✗ ⚠ Output uses listicle pattern which rule says to avoid

  ╔══════════════════════════════════════════════════════════╗
  ║  RESULT: 62% → 100% success rate                         ║
  ║  Your agent just learned from its own performance.       ║
  ╚══════════════════════════════════════════════════════════╝

Quick Start

pip install agentloops

Using an AI coding assistant? Just tell it: "Install agentloops and add self-learning to my agent." It will handle the integration automatically. Works with Claude Code, Cursor, Copilot, and others.

Or add it manually — it's 3 lines:

from agentloops import AgentLoops

# Uses ANTHROPIC_API_KEY env var for reflection
loops = AgentLoops("sales-outreach", agent_type="sales-sdr")

# Track every agent run — learning happens automatically
loops.track(input=task, output=result, outcome="meeting_booked")

# Inject learned rules into your prompt
enhanced_prompt = loops.enhance_prompt(base_prompt)

That's it. Two methods. Your agent now learns from every run.

When you pass agent_type="sales-sdr", AgentLoops loads pre-seeded IF/THEN rules for that agent type, so your agent starts smart on day one instead of learning from scratch. Ten agent types are available out of the box (sales, support, content, code, recruiting, legal, and more).

Learning triggers automatically after enough outcomes. You can also call reflect(), evolve(), and forget() manually for fine-grained control.
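The "triggers automatically after enough outcomes" behavior can be pictured as a simple outcome buffer. This is an illustrative sketch of the concept only, not the library's internals; the class name, threshold, and callback are hypothetical:

```python
# Illustrative sketch: buffer tracked outcomes and fire a reflection callback
# once enough have accumulated. Names and the threshold are hypothetical.
class ReflectionTrigger:
    def __init__(self, threshold, on_reflect):
        self.threshold = threshold
        self.on_reflect = on_reflect  # called with the batched outcomes
        self.buffer = []

    def record(self, outcome):
        self.buffer.append(outcome)
        if len(self.buffer) >= self.threshold:
            self.on_reflect(list(self.buffer))
            self.buffer.clear()

batches = []
trigger = ReflectionTrigger(threshold=3, on_reflect=batches.append)
for outcome in ["no_reply", "meeting_booked", "no_reply", "replied"]:
    trigger.record(outcome)
# one batch of three outcomes fired; the fourth waits for the next cycle
```

Calling reflect() manually corresponds to firing the callback yourself instead of waiting for the threshold.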

Multi-LLM Support

AgentLoops works with Anthropic (default), OpenAI, or any custom LLM:

# OpenAI
loops = AgentLoops("my-agent", llm_provider="openai", api_key="sk-...")

# Custom LLM (local Ollama, Groq, Mistral, etc.)
loops = AgentLoops("my-agent", llm_provider="custom", llm_fn=my_llm_callable)
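The exact shape of llm_fn isn't spelled out above; a reasonable guess (treat it as an assumption) is a plain callable from prompt string to completion string, which you can back with any local or hosted model:

```python
# Hypothetical llm_fn contract: prompt string in, completion string out.
# The stub body is a placeholder; swap in a real call to Ollama, Groq,
# Mistral, or anything else you run.
def my_llm_callable(prompt: str) -> str:
    # e.g. POST to http://localhost:11434/api/generate for a local Ollama
    return f"[completion for a {len(prompt)}-char prompt]"

reply = my_llm_callable("Summarize this week's outcomes.")
```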

Collective Intelligence — Your Agent Starts Smart

Every agent on AgentLoops learns from its own runs. But the real power is the network: anonymized rule patterns are contributed to a global intelligence pool, so what one agent learns makes ALL agents of that type smarter.

More agents → More outcome data → Better global rules
  → New agents start smarter → Better results → More agents

This is the Waze model. The free map is useful. The live traffic data is what makes it indispensable.

What's available now: 10 agent types with curated starter rules (sales-sdr, customer-support, content-creator, and 7 more). Your agent starts smart on day 1 instead of learning from scratch.

Privacy & Data

When you set an agent_type, AgentLoops contributes anonymized rule patterns to the collective network. Here's exactly what happens:

  • Sent: Generalized IF/THEN patterns only (e.g., "IF prospect is VP Engineering THEN lead with technical observation"). Only rules with confidence ≥ 0.6.
  • Sanitized before sending: Company names, URLs, emails, and dollar amounts are stripped. "IF prospect is VP at Stripe" becomes "IF prospect is VP at [ENTITY]".
  • Never sent: Raw inputs, raw outputs, metadata, user data, or anything that could identify you or your customers.
  • Privacy threshold: Rules only enter the global pool after 5+ independent agents discover the same pattern. No single user's rule can leak.
  • Opt out anytime:
# Option 1: per-instance
loops = AgentLoops("my-agent", agent_type="sales-sdr", collective=False)

# Option 2: global
import agentloops.collective
agentloops.collective.opt_out()

# Option 3: environment variable
# AGENTLOOPS_COLLECTIVE_DISABLED=1

This follows the Homebrew model — on by default for the network to work, transparent about what's sent, easy to disable.
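The scrubbing described above can be sketched as a few substitution passes. This is illustrative only; the library's real patterns and entity detection aren't published in this README, so the regexes and the fixed entity list are assumptions:

```python
import re

# Illustrative sanitizer. A real system would use entity recognition rather
# than a hard-coded name list; everything here is an assumption.
def sanitize_rule(rule: str, entities=("Stripe", "Shopify", "Wells Fargo")) -> str:
    rule = re.sub(r"https?://\S+", "[URL]", rule)                 # strip URLs
    rule = re.sub(r"\b[\w.+-]+@[\w-]+\.\w+\b", "[EMAIL]", rule)   # strip emails
    rule = re.sub(r"\$\d[\d,.]*", "[AMOUNT]", rule)               # strip dollar amounts
    for name in entities:                                          # strip company names
        rule = rule.replace(name, "[ENTITY]")
    return rule

print(sanitize_rule("IF prospect is VP at Stripe THEN mention $40,000 savings"))
# → IF prospect is VP at [ENTITY] THEN mention [AMOUNT] savings
```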

Meta-Learning — The Learning Engine Learns Too

AgentLoops doesn't just improve your agent's behavior; it improves the quality of its own learning over time. The meta-learner tracks which reflections produce impactful rules and which rule formats (evidence-backed vs. not, "avoid" vs. "do", confidence levels) correlate with positive outcomes, then generates meta-rules that are injected into future reflection prompts. The result: your agent's learning gets sharper with every cycle, not just its behavior.

# Access meta-learning insights
impacts = loops.meta_learner.get_rule_impacts()
patterns = loops.meta_learner.get_best_rule_patterns()
meta_rules = loops.meta_learner.get_meta_rules()

When collective intelligence is active, meta-learnings are shared too: new customers don't just get starter rules; they get starter learning strategies.

No other tool does this. Mem0 stores facts. Letta learns inside their platform. AgentLoops learns across the entire ecosystem.

Before vs After

Without AgentLoops, your agent makes the same mistakes forever:

# Day 1: Agent sends cold email, gets ignored
# Day 30: Agent sends the same cold email, gets ignored
# Day 90: Agent sends the same cold email, gets ignored
# You manually rewrite the prompt. Again.

With AgentLoops, your agent evolves:

# Day 1: Agent sends cold email, gets ignored
# Day 2-9: Agent keeps tracking outcomes...
# Day 10: Auto-reflection triggers → "Emails without personalization get 0% reply rate"
# Day 11: enhance_prompt() injects: "IF cold outreach THEN personalize first line"
# Day 50: Auto-evolution → Convention: "Always reference prospect's recent work"
# Day 51: Agent books meetings. You never touched the prompt.

The 7 Mechanisms

AgentLoops implements seven learning mechanisms, inspired by Reflexion, cognitive memory architectures, and months of production use.

  #  Mechanism                  What it does                                           When it runs
  1  Self-Reflection            Agent evaluates its own output, writes patterns        After every run
                                to conventions
  2  Spike Detection            Detects performance anomalies, triggers follow-up      Continuous
  3  Quality Gate               Pre-flight validation via loops.check() —              Before output
                                built-in + rule-based + custom checks
  4  Decision Rules             Extracts IF/THEN rules from performance data           Weekly
  5  Cross-Evaluation           Compares predictions vs actual outcomes                Weekly
  6  Contradiction Resolution   Detects and resolves conflicting learned rules         Weekly
  7  Selective Forgetting       Prunes stale patterns that no longer apply             Daily

These aren't theoretical. They've been running in production across 7 agents processing real data for months.
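To give a flavor of mechanism 2, spike detection can be pictured as comparing a recent window's success rate against the overall baseline. This is an illustrative sketch, not the library's actual algorithm; the window size and threshold are made up:

```python
# Illustrative only: flag when the last `window` runs underperform the
# all-time success rate by more than `threshold`.
def detect_spike(outcomes, window=4, threshold=0.4):
    if len(outcomes) < window:
        return False
    baseline = sum(outcomes) / len(outcomes)
    recent = sum(outcomes[-window:]) / window
    return baseline - recent > threshold

history = [True] * 6 + [False] * 4   # success rate collapses at the end
print(detect_spike(history))         # → True
```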

Multi-Outcome System

Not every agent has a simple pass/fail outcome. AgentLoops supports rich outcome definitions so learning works for any metric:

from agentloops import AgentLoops, OutcomeConfig, MetricDef

# Binary (default) — success or failure
loops = AgentLoops("my-agent", outcome=OutcomeConfig.binary())

# Categorical — multiple outcome values
loops = AgentLoops("my-agent", outcome=OutcomeConfig.categorical(["booked", "replied", "ignored"]))

# Numeric — scored outcomes with a goal direction
loops = AgentLoops("my-agent", outcome=OutcomeConfig.numeric(goal="minimize"))

# Multi-metric — weighted composite scoring
loops = AgentLoops("my-agent", outcome=OutcomeConfig(metrics=[
    MetricDef("booking_rate", "categorical", weight=3.0, success_values=["booked"]),
    MetricDef("latency", "duration", weight=1.0, target_value=500),
]))

# Score a run with multiple metrics
score = loops.outcome.score({"booking_rate": "booked", "latency": 320})

The outcome config tells the reflection and rule engines what "good" looks like, so they generate rules that actually optimize for your goals.
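One plausible reading of the weighted composite above (the exact normalization isn't documented here, so treat this as a sketch): score each metric into [0, 1], then take the weight-normalized average.

```python
# Sketch of weighted composite scoring; the per-metric scorers and the
# averaging rule are assumptions, not AgentLoops internals.
def composite_score(run, metrics):
    """metrics: list of (name, weight, scorer) with scorer(value) in [0, 1]."""
    total_weight = sum(weight for _, weight, _ in metrics)
    weighted = sum(weight * scorer(run[name]) for name, weight, scorer in metrics)
    return weighted / total_weight

metrics = [
    ("booking_rate", 3.0, lambda v: 1.0 if v == "booked" else 0.0),
    ("latency", 1.0, lambda v: 1.0 if v <= 500 else 0.0),
]
print(composite_score({"booking_rate": "booked", "latency": 320}, metrics))   # → 1.0
print(composite_score({"booking_rate": "ignored", "latency": 320}, metrics))  # → 0.25
```

With weight 3.0 on booking_rate, a missed booking costs three times as much composite score as a missed latency target, which matches the MetricDef weights in the example above.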

Quality Gates

Validate agent output before it reaches users:

result = loops.check(output=agent_response, input=user_query)

if result.passed:
    deliver(agent_response)
else:
    print(result.failures)  # ["Output contains hallucination markers", "Violates rule: IF pricing question THEN include disclaimer"]
    regenerate()

Built-in checks catch empty outputs, hallucination markers, and length violations. Rule-based checks validate output against learned "avoid" rules. You can also pass custom check functions. See the API Reference for full configuration.
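A custom check function might look like the following. This is hypothetical: the exact signature AgentLoops expects isn't shown in this README, so the (passed, message) return shape is an assumption:

```python
import re

# Hypothetical custom check: reject listicle-style openers, mirroring the
# learned "avoid listicle" rule from the demo. The (passed, message) return
# shape is an assumption about the check-function contract.
def no_listicle_opener(output: str):
    if re.match(r"\s*\d+\s+(ways|tips|reasons)\b", output, re.IGNORECASE):
        return False, "Output uses a listicle-style opener"
    return True, ""

print(no_listicle_opener("7 Ways to Cut Cloud Spend")[0])      # → False
print(no_listicle_opener("Noticed your p99 latency work")[0])  # → True
```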

How It Works

                    ┌─────────────────┐
                    │   Your Agent    │
                    └────────┬────────┘
                             │
                    loops.track(input, output, outcome)
                             │
                             ▼
              ┌──────────────────────────────┐
              │        AgentLoops Core       │
              │                              │
              │  ┌────────┐  ┌────────────┐  │
              │  │Reflect │  │Quality Gate│  │
              │  └───┬────┘  └─────┬──────┘  │
              │      │             │         │
              │      ▼             ▼         │
              │  ┌─────────────────────┐     │
              │  │    Conventions DB   │     │
              │  │   (IF/THEN rules,   │     │
              │  │   patterns, spikes) │     │
              │  └─────────┬───────────┘     │
              │            │                 │
              │  ┌─────────▼───────────┐     │
              │  │ Evolve / Forget /   │     │
              │  │ Resolve Conflicts   │     │
              │  └─────────────────────┘     │
              └──────────────┬───────────────┘
                             │
                    loops.enhance_prompt(base)
                             │
                             ▼
                    ┌──────────────────┐
                    │  Enhanced Agent  │
                    │  (with learned   │
                    │   conventions)   │
                    └──────────────────┘

Comparison

Feature                     AgentLoops   Mem0      Letta             DIY
Memory storage              --           Yes       Yes               Yes
Self-reflection             Yes          --        --                Manual
Automatic rule extraction   Yes          --        --                Manual
Spike detection             Yes          --        --                Manual
Contradiction resolution    Yes          --        --                No
Selective forgetting        Yes          --        Partial           No
Prompt enhancement          Yes          --        --                Manual
Convention evolution        Yes          --        --                No
Framework-agnostic          Yes          Yes       No                Yes
Lines of code to add        ~5           ~10       ~50               ~500
Focus                       Learning     Storage   Stateful agents   --

AgentLoops is not a replacement for memory systems. It's the layer that sits on top of them and actually learns.

MCP Server — No Code Needed

Don't write Python? Use AgentLoops via MCP with any compatible agent:

pip install agentloops[mcp]
{
  "mcpServers": {
    "agentloops": {
      "command": "python",
      "args": ["-m", "agentloops_mcp"],
      "env": { "ANTHROPIC_API_KEY": "sk-ant-..." }
    }
  }
}

Your agent gets 7 tools: recall, remember, reflect, get_rules, check, enhance_prompt, list_agent_types. Same learning engine, zero code. See MCP docs for details.

Framework Agnostic

Works with any agent framework. Or no framework at all.

# With LangChain — drop-in callback handler
from agentloops.adapters.langchain import AgentLoopsCallback

handler = AgentLoopsCallback(loops, outcome_fn=lambda run: "success" if run.success else "failure")
result = chain.invoke(prompt, config={"callbacks": [handler]})
# Automatically tracks chain runs, errors, and outcomes

# With CrewAI — callback for tasks and crews
from agentloops.adapters.crewai import AgentLoopsCrewCallback

callback = AgentLoopsCrewCallback(loops, outcome_fn=lambda task: task.output.quality_score)
crew = Crew(agents=[agent], tasks=[task], callbacks=[callback])
# Automatically tracks task completions and crew results

# With raw OpenAI/Anthropic calls
response = client.chat.completions.create(
    messages=[{"role": "system", "content": loops.enhance_prompt(system_prompt)}]
)
loops.track(input=user_msg, output=response, outcome=metric)

Use Cases

  • Sales agents that learn which outreach patterns book meetings
  • Support agents that learn which responses resolve tickets faster
  • Help desk agents that learn guest preferences, upsell timing, and escalation patterns (hotels, airlines, SaaS)
  • Content agents that learn which formats drive engagement
  • Coding agents that learn which patterns produce fewer bugs
  • Research agents that learn which sources yield better insights

If your agent runs more than once, it should be learning.

Why Self-Learning Matters (Not Just Memory)

Memory systems (Mem0, Letta, Zep) store facts: "this user prefers window seats" or "last order was a latte." That's recall — the agent remembers what happened.

Learning is fundamentally different. Learning means the agent changes its behavior based on outcomes:

  • Hotel help desk
      Memory:   "Guest in 412 asked for extra towels last time"
      Learning: "Guests who book suites AND request late checkout convert 3x on spa upsells — offer proactively"
  • Sales outreach
      Memory:   "Last email to this prospect was June 3"
      Learning: "CTOs at Series B companies respond 4x more to technical deep-dives than ROI pitches"
  • Support tickets
      Memory:   "Customer had billing issue last month"
      Learning: "Billing tickets mentioning 'cancel' resolve 60% faster when you lead with empathy + immediate credit"

Memory gives you a notebook. Learning gives you judgment.

Without learning, your agent makes the same mistakes on run #1,000 as run #1. It remembers more facts but never gets smarter. It's the difference between a new hire who takes great notes and a senior employee who has developed intuition from thousands of reps.

AgentLoops doesn't replace memory — it sits on top of it. Memory stores what happened. AgentLoops learns what to do about it.

Documentation

Community

Contributing

Contributions welcome. See CONTRIBUTING.md for guidelines.

License

MIT


Built by Maria Hollweck at Asobi Labs
