A local-first self-improvement kernel for agents. Turns traces into tested memory so agents improve without fine-tuning.

These details have not been verified by PyPI

Project links

Project description

library_name: purpose-agent license: mit language:

en tags:
agents
self-improving
multi-agent
memory-system
local-first
slm
safety
event-driven
rag
tools pipeline_tag: text-generation

🧠 Purpose Agent

The framework where AI agents actually learn from experience.

Local-first · Self-improving · Domain-agnostic · Production-hardened

pip install purpose-agent

🎯 What Problem Does This Solve?

Every other agent framework (LangChain, CrewAI, AutoGen) runs the same way every time. Your agent fails at a task? Next time, it fails the exact same way. No learning. No memory. No improvement.

Purpose Agent is different. After every task:

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│   Task → Execute → Score → Extract Lessons → Remember      │
│     ↑                                           │           │
│     └───── Next task uses lessons ──────────────┘           │
│                                                             │
│   Run 1: Agent struggles ──────── Φ = 3.0                  │
│   Run 2: Uses learned heuristics ─ Φ = 7.0                 │
│   Run 3: Refined further ──────── Φ = 9.5                  │
│                                                             │
└─────────────────────────────────────────────────────────────┘

No fine-tuning. No GPU training. Just memory + experience.

⚡ 3-Line Quickstart

import purpose_agent as pa

team = pa.purpose("Help me write Python code")
result = team.run("Write a fibonacci function")

That's it. The framework auto-detects your model, builds the right team, executes the task, scores the result, and stores lessons for next time.

🏗️ Architecture at a Glance

╔══════════════════════════════════════════════════════════════════╗
║                     PURPOSE AGENT v3.0                          ║
╠══════════════════════════════════════════════════════════════════╣
║                                                                  ║
║  ┌──────────┐    ┌─────────────┐    ┌──────────────────┐       ║
║  │  YOU      │───▶│  EASY API   │───▶│  ORCHESTRATOR    │       ║
║  │ (purpose) │    │ (auto-team) │    │ (step loop)      │       ║
║  └──────────┘    └─────────────┘    └────────┬─────────┘       ║
║                                               │                  ║
║              ┌────────────────────────────────┼──────────┐      ║
║              │                                ▼          │      ║
║              │    ┌──────────┐    ┌──────────────────┐   │      ║
║              │    │  ACTOR   │───▶│  ENVIRONMENT     │   │      ║
║              │    │ (decide) │    │  (execute)       │   │      ║
║              │    └──────────┘    └────────┬─────────┘   │      ║
║              │                             │             │      ║
║              │         ┌───────────────────▼─────┐      │      ║
║              │         │  PURPOSE FUNCTION (Φ)   │      │      ║
║              │         │  Score: 0 ──────── 10   │      │      ║
║              │         │  O(1) state-delta mode  │      │      ║
║              │         └───────────────────┬─────┘      │      ║
║              │                             │             │      ║
║              │    ┌────────────────────────▼─────────┐  │      ║
║              │    │  MEMORY (immune-scanned)          │  │      ║
║              │    │  7 types · 5 statuses · scoped   │  │      ║
║              │    │  quarantine → test → promote      │  │      ║
║              │    └──────────────────────────────────┘  │      ║
║              │                                          │      ║
║              └──── SELF-IMPROVEMENT LOOP ───────────────┘      ║
║                                                                  ║
╚══════════════════════════════════════════════════════════════════╝

🎨 Three Ways to Use It

🟢 Level 1 — Just Describe What You Want

import purpose_agent as pa

# Auto-detects the right team composition
team = pa.purpose("Write Python code and test it")   # → architect + coder + tester
team = pa.purpose("Research quantum computing")       # → researcher + analyst
team = pa.purpose("Analyze sales data")              # → analyst + reporter
team = pa.purpose("Write a blog post")               # → writer + editor

result = team.run("Create a sorting algorithm")
team.teach("Always handle edge cases")    # Inject knowledge directly
print(team.status())                       # See what it's learned

🟡 Level 2 — Choose Your Model & Add Knowledge

import purpose_agent as pa

# 10+ providers supported
team = pa.purpose("Code helper", model="ollama:qwen3:1.7b")          # Local, free
team = pa.purpose("Code helper", model="openrouter:meta-llama/llama-3.3-70b-instruct")
team = pa.purpose("Code helper", model="groq:llama-3.3-70b-versatile")
team = pa.purpose("Code helper", model="openai:gpt-4o")

# Add your own documents as knowledge
team = pa.purpose("Answer questions about our product",
    knowledge="./docs/",              # Load entire folder
    model="qwen3:1.7b",
)
answer = team.ask("What's our refund policy?")

🔴 Level 3 — Full Control

import purpose_agent as pa

# ── Spark: single intelligent agent ──
spark = pa.Spark("coder", model="ollama:qwen3:1.7b", tools=[pa.PythonExecTool()])
result = spark.run("Write fibonacci")

# ── Flow: workflow with conditional routing ──
flow = pa.Flow()
flow.add_node("research", pa.Spark("researcher"))
flow.add_node("write", pa.Spark("writer"))
flow.add_edge(pa.BEGIN, "research")
flow.add_conditional_edge("write", check_fn, {"pass": pa.DONE_SIGNAL, "revise": "research"})
result = flow.run(state)

# ── swarm: parallel execution ──
results = pa.swarm(["task_a", "task_b", "task_c"], agents=[a1, a2, a3])

# ── Council: multi-agent deliberation ──
council = pa.Council([pa.Spark("alice"), pa.Spark("bob"), pa.Spark("carol")])
result = council.run("Should we use microservices?", rounds=3)

# ── Vault: knowledge RAG ──
vault = pa.Vault.from_directory("./research_papers/")
agent = pa.Spark("analyst", tools=[vault.as_tool()])

# ── Generate entire systems ──
from purpose_agent.mas_generator import generate
system = generate("Monitor GitHub repos for CVEs and alert the team")
# → 4 agents + workflow + tools + eval suite + routing policy

🛡️ Safety & Security

┌─────────────────────────────────────────────┐
│           MEMORY IMMUNE SYSTEM              │
│                                             │
│  candidate ──→ immune scan ──→ quarantine   │
│                    │                │       │
│              ┌─────▼─────┐    ┌────▼────┐  │
│              │  REJECTED  │    │  TEST   │  │
│              │ (5 scans)  │    │ (replay)│  │
│              └────────────┘    └────┬────┘  │
│                                    │       │
│                              ┌─────▼─────┐ │
│                              │ PROMOTED  │ │
│                              │ (active)  │ │
│                              └───────────┘ │
└─────────────────────────────────────────────┘

5 threat scanners: prompt injection, score manipulation, tool misuse, privacy leaks, scope overreach

PEP 578 kernel sandbox: Unbypassable audit hooks at the C-interpreter level. No Docker needed.

Falsification critic: Code is scored by CPU-executed assertions, not LLM hallucinations.

🔬 First-Principles Engineering

Problem	Old Approach	Purpose Agent
Token cost grows O(N²)	Pass full history to critic	O(1) state-delta — only pass what changed
SLMs hallucinate scores	"Rate this 0-10" → guess	Falsification — generate asserts, CPU executes, score = math
Sandbox bypassed via dynamic code	AST analysis (weak)	PEP 578 audit hooks — kernel-level, unbypassable
Heuristics overflow context	Inject all 200 heuristics	MoH cap K=10 — only top heuristics by Q-value
UNKNOWN action crashes	Parse failure → crash	Safe fallback to DONE — never propagates garbage

📦 What's Inside (45+ modules)

🔧 Core Engine

Module	What
`orchestrator.py`	Main step loop with 3 critic modes (standard/delta/falsification)
`actor.py`	ReAct agent with 3-tier memory + heuristic cap
`purpose_function.py`	Φ(s) scorer with 7 anti-gaming rules
`experience_replay.py`	Thread-safe trajectory storage with Q-value retrieval
`optimizer.py`	Trajectory → heuristic distillation

🧬 Self-Improvement

Module	What
`memory.py`	7 memory kinds × 5 statuses, scoped, versioned
`memory_ci.py`	Quarantine → immune scan → test → promote/reject
`memory_homeostasis.py`	Budget enforcement, consolidation, archive
`immune.py`	5 threat scanners for memory safety
`breakthroughs.py`	Self-improving critic, MoH, hindsight relabeling, evolution

⚡ First-Principles

Module	What
`state_delta.py`	O(1) Markovian state-diff for critic
`falsification_critic.py`	Popperian scoring via adversarial assertions
`sandbox_hooks.py`	PEP 578 kernel-level audit hooks
`hardening.py`	Null safety, timeouts, validation, graceful degradation
`sre_patches.py`	5 auto-applied critical vulnerability fixes

🌐 Protocols & Interop

Module	What
`protocols/mcp_bridge.py`	MCP tool server integration
`protocols/a2a.py`	Agent-to-Agent delegation with circuit breaker
`protocols/agui.py`	AG-UI frontend streaming
`protocols/agents_md.py`	AGENTS.md repo-local instructions
`quorum.py`	Consensus/disagreement topology switching

🧠 Intelligence

Module	What
`routing.py`	Smart model selection (local-first, cost-aware)
`mas_generator.py`	Use-case → complete multi-agent system
`skills/schema.py`	Versioned, evolvable, testable skill cards
`skills/ci.py`	Skill testing + rollback + Darwinian selection
`llm_compiler.py`	Parallel tool execution via DAG planning

📈 Optimization

Module	What
`optimization/fingerprint.py`	Capability profiling from traces
`optimization/dataset.py`	Trace → filtered training dataset
`optimization/prompt_pack.py`	Epigenetic prompt optimization
`optimization/shadow_eval.py`	Candidate vs baseline comparison
`optimization/optimizer.py`	Improving/plateau/degrading policy
`optimization/lora_plan.py`	LoRA/distillation dry-run planning

🏗️ Runtime

Module	What
`runtime/events.py`	30 canonical event types
`runtime/event_bus.py`	Async pub/sub with backpressure
`runtime/state.py`	Typed execution state for checkpointing
`runtime/checkpoint.py`	InMemory/JSONL/SQLite durability
`streaming_v3.py`	AG-UI compatible stream adapters

🔌 Supported Providers

from purpose_agent import resolve_backend

resolve_backend("ollama:qwen3:1.7b")                    # Local (free)
resolve_backend("openrouter:meta-llama/llama-3.3-70b-instruct")
resolve_backend("groq:llama-3.3-70b-versatile")
resolve_backend("openai:gpt-4o")
resolve_backend("together:meta-llama/Llama-3.3-70B-Instruct-Turbo")
resolve_backend("fireworks:accounts/fireworks/models/llama-v3p1-70b")
resolve_backend("cerebras:llama-3.3-70b")
resolve_backend("deepseek:deepseek-chat")
resolve_backend("mistral:mistral-large-latest")
resolve_backend("hf:Qwen/Qwen3-32B")

📊 Real-World Test Results

Tested with Llama-3.3-70B and Gemma-4-26B via OpenRouter:

Test	Llama-70B	Gemma-26B
fibonacci (4 unit tests)	✅ 100%	✅ 100%
fizzbuzz (4 unit tests)	✅ 100%	✅ 100%
factorial (3 unit tests)	✅ 100%	✅ 100%
Self-improvement (heuristic growth)	0→18	0→11
Immune system (adversarial)	93% catch	—
Production test (19 checks)	19/19 ✅	—

250+ automated tests. Zero failures required for release.

📚 Research Foundation

Built on 13 published papers. Every module traces back to a specific result.

Paper	Module	Contribution
Ng et al. 1999 (PBRS)	purpose_function	Φ preserves optimal policy
MUSE (2510.08002)	actor, optimizer	3-tier memory hierarchy
REMEMBERER (2306.07929)	experience_replay	Q-value retrieval
Reflexion (2303.11366)	orchestrator	Verbal reinforcement
SPC (2504.19162)	immune	Anti-reward-hacking
Meta-Rewarding (2407.19594)	meta_rewarding	Self-improving critic
DSPy (2310.03714)	prompt_optimizer	Automatic few-shot bootstrap
LLMCompiler (2312.04511)	llm_compiler	Parallel tool DAG
Retroformer (2308.02151)	retroformer	Structured reflection
TinyAgent (2409.00608)	slm_backends	SLM-native patterns
DeepSeek MoE (2401.06066)	breakthroughs	MoH sparse selection
HER (1707.01495)	breakthroughs	Hindsight relabeling
Self-Taught Eval (2408.02666)	self_taught	Synthetic critic training

Full proofs: PURPOSE_LEARNING.md · Research trace: COMPILED_RESEARCH.md

🚀 Install

pip install purpose-agent                    # Core (zero dependencies)
pip install purpose-agent[openai]            # + OpenAI/Groq/OpenRouter
pip install purpose-agent[ollama]            # + Local Ollama
pip install purpose-agent[all]              # Everything

For local models (recommended — free, private):

curl -fsSL https://ollama.ai/install.sh | sh
ollama pull qwen3:1.7b

🖥️ CLI

python -m purpose_agent     # Interactive wizard
purpose-agent               # Same, via entry point

📄 License

MIT — use it for anything.

Built on 13 papers. Zero fine-tuning. Agents that actually improve.

PyPI · Architecture · Formal Proofs · Changelog

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

3.0.1

May 5, 2026

3.0.0

May 3, 2026

2.1.1

May 1, 2026

2.1.0

May 1, 2026

2.0.1

May 1, 2026

2.0.0

Apr 30, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

purpose_agent-3.0.1.tar.gz (216.3 kB view details)

Uploaded May 5, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

purpose_agent-3.0.1-py3-none-any.whl (199.0 kB view details)

Uploaded May 5, 2026 Python 3

File details

Details for the file purpose_agent-3.0.1.tar.gz.

File metadata

Download URL: purpose_agent-3.0.1.tar.gz
Upload date: May 5, 2026
Size: 216.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for purpose_agent-3.0.1.tar.gz
Algorithm	Hash digest
SHA256	`09dba784736dddb7975cdcde4df134b08cb017dbae150eb94a7db940426b71be`
MD5	`ceb4baa7889e89dfd03ccc824ddb24df`
BLAKE2b-256	`8f6ec428402e9d533fb7a0deae9fe2b001b8f3add939b10cdf16621ded21ebc3`

See more details on using hashes here.

File details

Details for the file purpose_agent-3.0.1-py3-none-any.whl.

File metadata

Download URL: purpose_agent-3.0.1-py3-none-any.whl
Upload date: May 5, 2026
Size: 199.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for purpose_agent-3.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2db2d3c6f11ed8b97d11e0c82be87be820fb57d886dd73c670e8948c59f62291`
MD5	`b3271ee0892f1407d1d84c7861005b9d`
BLAKE2b-256	`7e3725a4e1b518249344fcc1546888e502f311751262af9baf9b67f40ebbc14d`

See more details on using hashes here.

purpose-agent 3.0.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

🧠 Purpose Agent

The framework where AI agents actually learn from experience.

🎯 What Problem Does This Solve?

⚡ 3-Line Quickstart

🏗️ Architecture at a Glance

🎨 Three Ways to Use It

🟢 Level 1 — Just Describe What You Want

🟡 Level 2 — Choose Your Model & Add Knowledge

🔴 Level 3 — Full Control

🛡️ Safety & Security

🔬 First-Principles Engineering

📦 What's Inside (45+ modules)

🔌 Supported Providers

📊 Real-World Test Results

📚 Research Foundation

🚀 Install

🖥️ CLI

📄 License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes