A local-first self-improvement kernel for agents. Turns traces into tested memory so agents improve without fine-tuning.
Project description
library_name: purpose-agent license: mit language:
- en tags:
- reinforcement-learning
- agents
- self-improving
- memory-system
- multi-agent
- slm
- local-first
- evaluation
- safety
- immune-system pipeline_tag: text-generation
Purpose Agent
A local-first self-improvement kernel for AI agents.
Agents that learn from experience — without fine-tuning, cloud infrastructure, or vendor lock-in. Tested with real models. Published on PyPI.
pip install purpose-agent
import purpose_agent as pa
team = pa.purpose("Help me write Python code")
result = team.run("Write a fibonacci function")
print(result)
team.teach("Always add type hints")
# Next run uses what it learned
How It Works (30-Second Version)
- You give it a purpose. "Help me write Python code."
- It builds a team. Architect + Coder + Tester — auto-selected from your description.
- It runs the task. The agent writes code. A separate critic (the Purpose Function) scores every step.
- It learns. Good patterns are extracted as heuristics. Bad patterns are flagged. Dangerous content is blocked by an immune system.
- Next run is better. Heuristics from past runs are injected into the prompt. The agent gets smarter without any weight updates.
Real-World Test Results
Tested with Llama-3.3-70B and Gemma-4-26B via OpenRouter:
| Model | fibonacci | fizzbuzz | factorial | Self-Improvement |
|---|---|---|---|---|
| Llama-3.3-70B | ✓ 100% | ✓ 100% | ✓ 100% | 0→3→9→18 heuristics |
| Gemma-4-26B | ✓ 100% | ✓ 100% | ✓ 100% | 0→3→6→11 heuristics |
0-day production test: 19/19 pass on Llama-3.3-70B across all 3 usage levels. Immune system: 93% adversarial catch rate, 0% false positives. Test suite: 119 unit tests, all passing. See LAUNCH_READINESS.md.
Install
pip install purpose-agent # Core (zero dependencies)
pip install purpose-agent[openai] # + OpenAI / Groq / OpenRouter
pip install purpose-agent[ollama] # + Local Ollama
pip install purpose-agent[all] # Everything
Three Levels of Usage
Level 1 — Describe what you want
import purpose_agent as pa
team = pa.purpose("Write Python code and test it") # → architect + coder + tester
team = pa.purpose("Research quantum computing") # → researcher + analyst
team = pa.purpose("Write blog posts about AI") # → writer + editor
result = team.run("Write a sorting algorithm")
team.teach("Always handle edge cases")
print(team.status()) # See what it's learned
Level 2 — Choose your model
# Local (free, private)
team = pa.purpose("Code helper", model="qwen3:1.7b")
# Cloud providers
team = pa.purpose("Code helper", model="openrouter:meta-llama/llama-3.3-70b-instruct")
team = pa.purpose("Code helper", model="groq:llama-3.3-70b-versatile")
team = pa.purpose("Code helper", model="openai:gpt-4o")
# Any OpenAI-compatible API
from purpose_agent import resolve_backend
backend = resolve_backend("openrouter:google/gemma-4-26b-a4b-it", api_key="sk-or-...")
Supported providers: OpenRouter, Groq, OpenAI, Ollama, HuggingFace, Together, Fireworks, Cerebras, DeepSeek, Mistral.
Level 3 — Full control
Purpose Agent has its own API vocabulary — original names, not borrowed from other frameworks.
import purpose_agent as pa
# ── Spark: a single intelligent agent ──
spark = pa.Spark("coder", model="openrouter:meta-llama/llama-3.3-70b-instruct")
result = spark.run("Write a fibonacci function")
# ── Flow: workflow engine with conditional routing ──
flow = pa.Flow()
flow.add_node("research", pa.Spark("researcher", model="qwen3:1.7b"))
flow.add_node("write", pa.Spark("writer", model="qwen3:1.7b"))
flow.add_edge(pa.BEGIN, "research")
flow.add_edge("research", "write")
flow.add_conditional_edge("write", review_fn, {"pass": pa.DONE_SIGNAL, "retry": "research"})
result = flow.run(initial_state)
# ── swarm: run tasks concurrently ──
results = pa.swarm(["task 1", "task 2", "task 3"], agents=[spark_a, spark_b, spark_c])
# ── Council: agents deliberate together ──
council = pa.Council([pa.Spark("researcher"), pa.Spark("coder"), pa.Spark("reviewer")])
result = council.run("Design a web scraper", rounds=3)
# ── Vault: knowledge store with RAG-as-a-tool ──
vault = pa.Vault.from_directory("./docs")
spark = pa.Spark("assistant", tools=[vault.as_tool()])
result = spark.run("What does the documentation say about X?")
# ── LLMCompiler: parallel tool execution via DAG planning ──
compiler = pa.LLMCompiler(planner_llm=backend, tool_registry=registry)
result = compiler.compile_and_execute("Calculate X and search Y simultaneously")
API Reference (Level 3)
| Name | What | Example |
|---|---|---|
pa.Spark(name, model, tools) |
Create an intelligent agent | pa.Spark("coder", model="qwen3:1.7b") |
pa.Flow() |
Workflow engine with nodes and edges | flow.add_node("step", handler) |
pa.swarm(tasks, agents) |
Run tasks concurrently | pa.swarm(["a","b"], [s1, s2]) |
pa.Council(agents) |
Agent deliberation rounds | council.run("topic", rounds=3) |
pa.Vault.from_texts(list) |
Knowledge store for RAG | vault.query("search term") |
pa.BEGIN |
Flow start node | flow.add_edge(pa.BEGIN, "first") |
pa.DONE_SIGNAL |
Flow end node | flow.add_edge("last", pa.DONE_SIGNAL) |
Evidence-Gated Memory
Agents don't just accumulate knowledge blindly. Every new memory goes through a pipeline:
candidate → immune scan → quarantine → replay test → promote (or reject)
- Immune scan blocks prompt injection, score manipulation, API key leaks, tool misuse
- Quarantine holds memories until they're tested
- Promotion happens only after evidence shows the memory helps
- Rejection preserves the memory for audit but never exposes it to the agent
Seven memory types: purpose_contract, user_preference, skill_card, episodic_case, failure_pattern, critic_calibration, tool_policy.
Honest Evaluation
from purpose_agent import RunMode
RunMode.LEARNING_TRAIN # Full read/write — this is where agents learn
RunMode.LEARNING_VALIDATION # Read + staging — validates before promoting
RunMode.EVAL_TEST # NO writes — numbers you can trust
Secure Tools
- CalculatorTool — AST-validated, no
eval()on arbitrary text - PythonExecTool — subprocess with timeout + isolated temp directory
- ReadFile/WriteFile — sandboxed to declared root directory
Architecture
See ARCHITECTURE.md for the complete technical documentation.
34 Python modules, ~500KB:
Core Engine → Actor, Purpose Function, Experience Replay, Optimizer, Orchestrator
V2 Kernel → Memory, Immune, Trace, Compiler, Memory CI, Eval Port, Benchmark
Research → Meta-Rewarding, Self-Taught, Prompt Optimizer, LLM Compiler, Retroformer
Breakthroughs → Self-Improving Critic, MoH, Hindsight Relabeling, Heuristic Evolution
Capabilities → Spark, Flow, swarm, Council, Vault
Easy API → purpose(), Team, quickstart wizard
Literature
Built on 13 published papers. Full research trace: COMPILED_RESEARCH.md. Formal proofs: PURPOSE_LEARNING.md.
| Paper | What it contributes |
|---|---|
| MUSE | 3-tier memory hierarchy |
| LATS | LLM-as-value-function |
| REMEMBERER | Q-value experience replay |
| Reflexion | Verbal reinforcement |
| SPC | Anti-reward-hacking |
| CER | Experience distillation |
| MemRL | Two-phase retrieval |
| TinyAgent | SLM-native patterns |
| Meta-Rewarding | Self-improving critic |
| Self-Taught Eval | Synthetic critic training |
| DSPy | Automatic prompt optimization |
| LLMCompiler | Parallel function calling |
| Retroformer | Structured reflection |
CLI
python -m purpose_agent # Interactive wizard
purpose-agent # Same, via entry point
License
MIT
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file purpose_agent-2.1.1.tar.gz.
File metadata
- Download URL: purpose_agent-2.1.1.tar.gz
- Upload date:
- Size: 135.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4c7f9acaf3ae5a09aab451db7703eb08d4e3e9f8df560879d39675d70b594568
|
|
| MD5 |
0f936812843c1ddc7cb7b704380e842e
|
|
| BLAKE2b-256 |
d67bd430b56df3122ab0ddb437bff3539795f60ec01be3dd28733bdc76fea638
|
File details
Details for the file purpose_agent-2.1.1-py3-none-any.whl.
File metadata
- Download URL: purpose_agent-2.1.1-py3-none-any.whl
- Upload date:
- Size: 130.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a01d3e57c7e6cbc848907ab2b1a07186883372b0e4f3203d922e4f8fa68b477a
|
|
| MD5 |
b0344fbc94244118891aa0ae397e353b
|
|
| BLAKE2b-256 |
1df674d79b75e23ba737c5ce9683576c773185379d1d96173cef25a99cbcd346
|