A local-first self-improvement kernel for agents. Turns traces into tested memory so agents improve without fine-tuning.
Project description
library_name: purpose-agent license: mit language:
- en tags:
- agents
- self-improving
- multi-agent
- memory-system
- local-first
- slm
- safety
- event-driven
- rag
- tools pipeline_tag: text-generation
๐ง Purpose Agent
The framework where AI agents actually learn from experience.
Local-first ยท Self-improving ยท Domain-agnostic ยท Production-hardened
pip install purpose-agent
๐ฏ What Problem Does This Solve?
Every other agent framework (LangChain, CrewAI, AutoGen) runs the same way every time. Your agent fails at a task? Next time, it fails the exact same way. No learning. No memory. No improvement.
Purpose Agent is different. After every task:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โ
โ Task โ Execute โ Score โ Extract Lessons โ Remember โ
โ โ โ โ
โ โโโโโโ Next task uses lessons โโโโโโโโโโโโโโโ โ
โ โ
โ Run 1: Agent struggles โโโโโโโโ ฮฆ = 3.0 โ
โ Run 2: Uses learned heuristics โ ฮฆ = 7.0 โ
โ Run 3: Refined further โโโโโโโโ ฮฆ = 9.5 โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
No fine-tuning. No GPU training. Just memory + experience.
โก 3-Line Quickstart
import purpose_agent as pa
team = pa.purpose("Help me write Python code")
result = team.run("Write a fibonacci function")
That's it. The framework auto-detects your model, builds the right team, executes the task, scores the result, and stores lessons for next time.
๐๏ธ Architecture at a Glance
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ PURPOSE AGENT v3.0 โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฃ
โ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโ โ
โ โ YOU โโโโโถโ EASY API โโโโโถโ ORCHESTRATOR โ โ
โ โ (purpose) โ โ (auto-team) โ โ (step loop) โ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโฌโโโโโโโโโโ โ
โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโผโโโโโโโโโโโ โ
โ โ โผ โ โ
โ โ โโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโ โ โ
โ โ โ ACTOR โโโโโถโ ENVIRONMENT โ โ โ
โ โ โ (decide) โ โ (execute) โ โ โ
โ โ โโโโโโโโโโโโ โโโโโโโโโโฌโโโโโโโโโโ โ โ
โ โ โ โ โ
โ โ โโโโโโโโโโโโโโโโโโโโโผโโโโโโ โ โ
โ โ โ PURPOSE FUNCTION (ฮฆ) โ โ โ
โ โ โ Score: 0 โโโโโโโโ 10 โ โ โ
โ โ โ O(1) state-delta mode โ โ โ
โ โ โโโโโโโโโโโโโโโโโโโโโฌโโโโโโ โ โ
โ โ โ โ โ
โ โ โโโโโโโโโโโโโโโโโโโโโโโโโโผโโโโโโโโโโ โ โ
โ โ โ MEMORY (immune-scanned) โ โ โ
โ โ โ 7 types ยท 5 statuses ยท scoped โ โ โ
โ โ โ quarantine โ test โ promote โ โ โ
โ โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ โ
โ โ โ โ
โ โโโโโ SELF-IMPROVEMENT LOOP โโโโโโโโโโโโโโโโ โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
๐จ Three Ways to Use It
๐ข Level 1 โ Just Describe What You Want
import purpose_agent as pa
# Auto-detects the right team composition
team = pa.purpose("Write Python code and test it") # โ architect + coder + tester
team = pa.purpose("Research quantum computing") # โ researcher + analyst
team = pa.purpose("Analyze sales data") # โ analyst + reporter
team = pa.purpose("Write a blog post") # โ writer + editor
result = team.run("Create a sorting algorithm")
team.teach("Always handle edge cases") # Inject knowledge directly
print(team.status()) # See what it's learned
๐ก Level 2 โ Choose Your Model & Add Knowledge
import purpose_agent as pa
# 10+ providers supported
team = pa.purpose("Code helper", model="ollama:qwen3:1.7b") # Local, free
team = pa.purpose("Code helper", model="openrouter:meta-llama/llama-3.3-70b-instruct")
team = pa.purpose("Code helper", model="groq:llama-3.3-70b-versatile")
team = pa.purpose("Code helper", model="openai:gpt-4o")
# Add your own documents as knowledge
team = pa.purpose("Answer questions about our product",
knowledge="./docs/", # Load entire folder
model="qwen3:1.7b",
)
answer = team.ask("What's our refund policy?")
๐ด Level 3 โ Full Control
import purpose_agent as pa
# โโ Spark: single intelligent agent โโ
spark = pa.Spark("coder", model="ollama:qwen3:1.7b", tools=[pa.PythonExecTool()])
result = spark.run("Write fibonacci")
# โโ Flow: workflow with conditional routing โโ
flow = pa.Flow()
flow.add_node("research", pa.Spark("researcher"))
flow.add_node("write", pa.Spark("writer"))
flow.add_edge(pa.BEGIN, "research")
flow.add_conditional_edge("write", check_fn, {"pass": pa.DONE_SIGNAL, "revise": "research"})
result = flow.run(state)
# โโ swarm: parallel execution โโ
results = pa.swarm(["task_a", "task_b", "task_c"], agents=[a1, a2, a3])
# โโ Council: multi-agent deliberation โโ
council = pa.Council([pa.Spark("alice"), pa.Spark("bob"), pa.Spark("carol")])
result = council.run("Should we use microservices?", rounds=3)
# โโ Vault: knowledge RAG โโ
vault = pa.Vault.from_directory("./research_papers/")
agent = pa.Spark("analyst", tools=[vault.as_tool()])
# โโ Generate entire systems โโ
from purpose_agent.mas_generator import generate
system = generate("Monitor GitHub repos for CVEs and alert the team")
# โ 4 agents + workflow + tools + eval suite + routing policy
๐ก๏ธ Safety & Security
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ MEMORY IMMUNE SYSTEM โ
โ โ
โ candidate โโโ immune scan โโโ quarantine โ
โ โ โ โ
โ โโโโโโโผโโโโโโ โโโโโโผโโโโโ โ
โ โ REJECTED โ โ TEST โ โ
โ โ (5 scans) โ โ (replay)โ โ
โ โโโโโโโโโโโโโโ โโโโโโฌโโโโโ โ
โ โ โ
โ โโโโโโโผโโโโโโ โ
โ โ PROMOTED โ โ
โ โ (active) โ โ
โ โโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
5 threat scanners: prompt injection, score manipulation, tool misuse, privacy leaks, scope overreach
PEP 578 kernel sandbox: Unbypassable audit hooks at the C-interpreter level. No Docker needed.
Falsification critic: Code is scored by CPU-executed assertions, not LLM hallucinations.
๐ฌ First-Principles Engineering
| Problem | Old Approach | Purpose Agent |
|---|---|---|
| Token cost grows O(Nยฒ) | Pass full history to critic | O(1) state-delta โ only pass what changed |
| SLMs hallucinate scores | "Rate this 0-10" โ guess | Falsification โ generate asserts, CPU executes, score = math |
| Sandbox bypassed via dynamic code | AST analysis (weak) | PEP 578 audit hooks โ kernel-level, unbypassable |
| Heuristics overflow context | Inject all 200 heuristics | MoH cap K=10 โ only top heuristics by Q-value |
| UNKNOWN action crashes | Parse failure โ crash | Safe fallback to DONE โ never propagates garbage |
๐ฆ What's Inside (45+ modules)
๐ง Core Engine
| Module | What |
|---|---|
orchestrator.py |
Main step loop with 3 critic modes (standard/delta/falsification) |
actor.py |
ReAct agent with 3-tier memory + heuristic cap |
purpose_function.py |
ฮฆ(s) scorer with 7 anti-gaming rules |
experience_replay.py |
Thread-safe trajectory storage with Q-value retrieval |
optimizer.py |
Trajectory โ heuristic distillation |
๐งฌ Self-Improvement
| Module | What |
|---|---|
memory.py |
7 memory kinds ร 5 statuses, scoped, versioned |
memory_ci.py |
Quarantine โ immune scan โ test โ promote/reject |
memory_homeostasis.py |
Budget enforcement, consolidation, archive |
immune.py |
5 threat scanners for memory safety |
breakthroughs.py |
Self-improving critic, MoH, hindsight relabeling, evolution |
โก First-Principles
| Module | What |
|---|---|
state_delta.py |
O(1) Markovian state-diff for critic |
falsification_critic.py |
Popperian scoring via adversarial assertions |
sandbox_hooks.py |
PEP 578 kernel-level audit hooks |
hardening.py |
Null safety, timeouts, validation, graceful degradation |
sre_patches.py |
5 auto-applied critical vulnerability fixes |
๐ Protocols & Interop
| Module | What |
|---|---|
protocols/mcp_bridge.py |
MCP tool server integration |
protocols/a2a.py |
Agent-to-Agent delegation with circuit breaker |
protocols/agui.py |
AG-UI frontend streaming |
protocols/agents_md.py |
AGENTS.md repo-local instructions |
quorum.py |
Consensus/disagreement topology switching |
๐ง Intelligence
| Module | What |
|---|---|
routing.py |
Smart model selection (local-first, cost-aware) |
mas_generator.py |
Use-case โ complete multi-agent system |
skills/schema.py |
Versioned, evolvable, testable skill cards |
skills/ci.py |
Skill testing + rollback + Darwinian selection |
llm_compiler.py |
Parallel tool execution via DAG planning |
๐ Optimization
| Module | What |
|---|---|
optimization/fingerprint.py |
Capability profiling from traces |
optimization/dataset.py |
Trace โ filtered training dataset |
optimization/prompt_pack.py |
Epigenetic prompt optimization |
optimization/shadow_eval.py |
Candidate vs baseline comparison |
optimization/optimizer.py |
Improving/plateau/degrading policy |
optimization/lora_plan.py |
LoRA/distillation dry-run planning |
๐๏ธ Runtime
| Module | What |
|---|---|
runtime/events.py |
30 canonical event types |
runtime/event_bus.py |
Async pub/sub with backpressure |
runtime/state.py |
Typed execution state for checkpointing |
runtime/checkpoint.py |
InMemory/JSONL/SQLite durability |
streaming_v3.py |
AG-UI compatible stream adapters |
๐ Supported Providers
from purpose_agent import resolve_backend
resolve_backend("ollama:qwen3:1.7b") # Local (free)
resolve_backend("openrouter:meta-llama/llama-3.3-70b-instruct")
resolve_backend("groq:llama-3.3-70b-versatile")
resolve_backend("openai:gpt-4o")
resolve_backend("together:meta-llama/Llama-3.3-70B-Instruct-Turbo")
resolve_backend("fireworks:accounts/fireworks/models/llama-v3p1-70b")
resolve_backend("cerebras:llama-3.3-70b")
resolve_backend("deepseek:deepseek-chat")
resolve_backend("mistral:mistral-large-latest")
resolve_backend("hf:Qwen/Qwen3-32B")
๐ Real-World Test Results
Tested with Llama-3.3-70B and Gemma-4-26B via OpenRouter:
| Test | Llama-70B | Gemma-26B |
|---|---|---|
| fibonacci (4 unit tests) | โ 100% | โ 100% |
| fizzbuzz (4 unit tests) | โ 100% | โ 100% |
| factorial (3 unit tests) | โ 100% | โ 100% |
| Self-improvement (heuristic growth) | 0โ18 | 0โ11 |
| Immune system (adversarial) | 93% catch | โ |
| Production test (19 checks) | 19/19 โ | โ |
250+ automated tests. Zero failures required for release.
๐ Research Foundation
Built on 13 published papers. Every module traces back to a specific result.
| Paper | Module | Contribution |
|---|---|---|
| Ng et al. 1999 (PBRS) | purpose_function | ฮฆ preserves optimal policy |
| MUSE (2510.08002) | actor, optimizer | 3-tier memory hierarchy |
| REMEMBERER (2306.07929) | experience_replay | Q-value retrieval |
| Reflexion (2303.11366) | orchestrator | Verbal reinforcement |
| SPC (2504.19162) | immune | Anti-reward-hacking |
| Meta-Rewarding (2407.19594) | meta_rewarding | Self-improving critic |
| DSPy (2310.03714) | prompt_optimizer | Automatic few-shot bootstrap |
| LLMCompiler (2312.04511) | llm_compiler | Parallel tool DAG |
| Retroformer (2308.02151) | retroformer | Structured reflection |
| TinyAgent (2409.00608) | slm_backends | SLM-native patterns |
| DeepSeek MoE (2401.06066) | breakthroughs | MoH sparse selection |
| HER (1707.01495) | breakthroughs | Hindsight relabeling |
| Self-Taught Eval (2408.02666) | self_taught | Synthetic critic training |
Full proofs: PURPOSE_LEARNING.md ยท Research trace: COMPILED_RESEARCH.md
๐ Install
pip install purpose-agent # Core (zero dependencies)
pip install purpose-agent[openai] # + OpenAI/Groq/OpenRouter
pip install purpose-agent[ollama] # + Local Ollama
pip install purpose-agent[all] # Everything
For local models (recommended โ free, private):
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull qwen3:1.7b
๐ฅ๏ธ CLI
python -m purpose_agent # Interactive wizard
purpose-agent # Same, via entry point
๐ License
MIT โ use it for anything.
Built on 13 papers. Zero fine-tuning. Agents that actually improve.
PyPI ยท Architecture ยท Formal Proofs ยท Changelog
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file purpose_agent-3.0.1.tar.gz.
File metadata
- Download URL: purpose_agent-3.0.1.tar.gz
- Upload date:
- Size: 216.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
09dba784736dddb7975cdcde4df134b08cb017dbae150eb94a7db940426b71be
|
|
| MD5 |
ceb4baa7889e89dfd03ccc824ddb24df
|
|
| BLAKE2b-256 |
8f6ec428402e9d533fb7a0deae9fe2b001b8f3add939b10cdf16621ded21ebc3
|
File details
Details for the file purpose_agent-3.0.1-py3-none-any.whl.
File metadata
- Download URL: purpose_agent-3.0.1-py3-none-any.whl
- Upload date:
- Size: 199.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2db2d3c6f11ed8b97d11e0c82be87be820fb57d886dd73c670e8948c59f62291
|
|
| MD5 |
b3271ee0892f1407d1d84c7861005b9d
|
|
| BLAKE2b-256 |
7e3725a4e1b518249344fcc1546888e502f311751262af9baf9b67f40ebbc14d
|