Skip to main content

A local-first self-improvement kernel for agents. Turns traces into tested memory so agents improve without fine-tuning.

Project description


library_name: purpose-agent license: mit language:

  • en tags:
  • reinforcement-learning
  • agents
  • self-improving
  • memory-system
  • multi-agent
  • slm
  • local-first
  • evaluation
  • safety
  • immune-system pipeline_tag: text-generation

Purpose Agent

A local-first self-improvement kernel for AI agents.

Agents that learn from experience — without fine-tuning, cloud infrastructure, or vendor lock-in. Tested with real models. Published on PyPI.

pip install purpose-agent
import purpose_agent as pa

team = pa.purpose("Help me write Python code")
result = team.run("Write a fibonacci function")
print(result)

team.teach("Always add type hints")
# Next run uses what it learned

How It Works (30-Second Version)

  1. You give it a purpose. "Help me write Python code."
  2. It builds a team. Architect + Coder + Tester — auto-selected from your description.
  3. It runs the task. The agent writes code. A separate critic (the Purpose Function) scores every step.
  4. It learns. Good patterns are extracted as heuristics. Bad patterns are flagged. Dangerous content is blocked by an immune system.
  5. Next run is better. Heuristics from past runs are injected into the prompt. The agent gets smarter without any weight updates.

Real-World Test Results

Tested with Llama-3.3-70B and Gemma-4-26B via OpenRouter:

Model fibonacci fizzbuzz factorial Self-Improvement
Llama-3.3-70B ✓ 100% ✓ 100% ✓ 100% 0→3→9→18 heuristics
Gemma-4-26B ✓ 100% ✓ 100% ✓ 100% 0→3→6→11 heuristics

0-day production test: 19/19 pass on Llama-3.3-70B across all 3 usage levels. Immune system: 93% adversarial catch rate, 0% false positives. Test suite: 119 unit tests, all passing. See LAUNCH_READINESS.md.

Install

pip install purpose-agent                    # Core (zero dependencies)
pip install purpose-agent[openai]            # + OpenAI / Groq / OpenRouter
pip install purpose-agent[ollama]            # + Local Ollama
pip install purpose-agent[all]               # Everything

Three Levels of Usage

Level 1 — Describe what you want

import purpose_agent as pa

team = pa.purpose("Write Python code and test it")  # → architect + coder + tester
team = pa.purpose("Research quantum computing")       # → researcher + analyst
team = pa.purpose("Write blog posts about AI")        # → writer + editor

result = team.run("Write a sorting algorithm")
team.teach("Always handle edge cases")
print(team.status())  # See what it's learned

Level 2 — Choose your model

# Local (free, private)
team = pa.purpose("Code helper", model="qwen3:1.7b")

# Cloud providers
team = pa.purpose("Code helper", model="openrouter:meta-llama/llama-3.3-70b-instruct")
team = pa.purpose("Code helper", model="groq:llama-3.3-70b-versatile")
team = pa.purpose("Code helper", model="openai:gpt-4o")

# Any OpenAI-compatible API
from purpose_agent import resolve_backend
backend = resolve_backend("openrouter:google/gemma-4-26b-a4b-it", api_key="sk-or-...")

Supported providers: OpenRouter, Groq, OpenAI, Ollama, HuggingFace, Together, Fireworks, Cerebras, DeepSeek, Mistral.

Level 3 — Full control

Purpose Agent has its own API vocabulary — original names, not borrowed from other frameworks.

import purpose_agent as pa

# ── Spark: a single intelligent agent ──
spark = pa.Spark("coder", model="openrouter:meta-llama/llama-3.3-70b-instruct")
result = spark.run("Write a fibonacci function")

# ── Flow: workflow engine with conditional routing ──
flow = pa.Flow()
flow.add_node("research", pa.Spark("researcher", model="qwen3:1.7b"))
flow.add_node("write", pa.Spark("writer", model="qwen3:1.7b"))
flow.add_edge(pa.BEGIN, "research")
flow.add_edge("research", "write")
flow.add_conditional_edge("write", review_fn, {"pass": pa.DONE_SIGNAL, "retry": "research"})
result = flow.run(initial_state)

# ── swarm: run tasks concurrently ──
results = pa.swarm(["task 1", "task 2", "task 3"], agents=[spark_a, spark_b, spark_c])

# ── Council: agents deliberate together ──
council = pa.Council([pa.Spark("researcher"), pa.Spark("coder"), pa.Spark("reviewer")])
result = council.run("Design a web scraper", rounds=3)

# ── Vault: knowledge store with RAG-as-a-tool ──
vault = pa.Vault.from_directory("./docs")
spark = pa.Spark("assistant", tools=[vault.as_tool()])
result = spark.run("What does the documentation say about X?")

# ── LLMCompiler: parallel tool execution via DAG planning ──
compiler = pa.LLMCompiler(planner_llm=backend, tool_registry=registry)
result = compiler.compile_and_execute("Calculate X and search Y simultaneously")

API Reference (Level 3)

Name What Example
pa.Spark(name, model, tools) Create an intelligent agent pa.Spark("coder", model="qwen3:1.7b")
pa.Flow() Workflow engine with nodes and edges flow.add_node("step", handler)
pa.swarm(tasks, agents) Run tasks concurrently pa.swarm(["a","b"], [s1, s2])
pa.Council(agents) Agent deliberation rounds council.run("topic", rounds=3)
pa.Vault.from_texts(list) Knowledge store for RAG vault.query("search term")
pa.BEGIN Flow start node flow.add_edge(pa.BEGIN, "first")
pa.DONE_SIGNAL Flow end node flow.add_edge("last", pa.DONE_SIGNAL)

Evidence-Gated Memory

Agents don't just accumulate knowledge blindly. Every new memory goes through a pipeline:

candidate → immune scan → quarantine → replay test → promote (or reject)
  • Immune scan blocks prompt injection, score manipulation, API key leaks, tool misuse
  • Quarantine holds memories until they're tested
  • Promotion happens only after evidence shows the memory helps
  • Rejection preserves the memory for audit but never exposes it to the agent

Seven memory types: purpose_contract, user_preference, skill_card, episodic_case, failure_pattern, critic_calibration, tool_policy.

Honest Evaluation

from purpose_agent import RunMode

RunMode.LEARNING_TRAIN       # Full read/write — this is where agents learn
RunMode.LEARNING_VALIDATION  # Read + staging — validates before promoting
RunMode.EVAL_TEST            # NO writes — numbers you can trust

Secure Tools

  • CalculatorTool — AST-validated, no eval() on arbitrary text
  • PythonExecTool — subprocess with timeout + isolated temp directory
  • ReadFile/WriteFile — sandboxed to declared root directory

Architecture

See ARCHITECTURE.md for the complete technical documentation.

34 Python modules, ~500KB:

Core Engine   → Actor, Purpose Function, Experience Replay, Optimizer, Orchestrator
V2 Kernel     → Memory, Immune, Trace, Compiler, Memory CI, Eval Port, Benchmark
Research      → Meta-Rewarding, Self-Taught, Prompt Optimizer, LLM Compiler, Retroformer
Breakthroughs → Self-Improving Critic, MoH, Hindsight Relabeling, Heuristic Evolution
Capabilities  → Spark, Flow, swarm, Council, Vault
Easy API      → purpose(), Team, quickstart wizard

Literature

Built on 13 published papers. Full research trace: COMPILED_RESEARCH.md. Formal proofs: PURPOSE_LEARNING.md.

Paper What it contributes
MUSE 3-tier memory hierarchy
LATS LLM-as-value-function
REMEMBERER Q-value experience replay
Reflexion Verbal reinforcement
SPC Anti-reward-hacking
CER Experience distillation
MemRL Two-phase retrieval
TinyAgent SLM-native patterns
Meta-Rewarding Self-improving critic
Self-Taught Eval Synthetic critic training
DSPy Automatic prompt optimization
LLMCompiler Parallel function calling
Retroformer Structured reflection

CLI

python -m purpose_agent  # Interactive wizard
purpose-agent            # Same, via entry point

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

purpose_agent-3.0.0.tar.gz (207.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

purpose_agent-3.0.0-py3-none-any.whl (197.2 kB view details)

Uploaded Python 3

File details

Details for the file purpose_agent-3.0.0.tar.gz.

File metadata

  • Download URL: purpose_agent-3.0.0.tar.gz
  • Upload date:
  • Size: 207.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for purpose_agent-3.0.0.tar.gz
Algorithm Hash digest
SHA256 64e4a210a2a68248875b875b5f6cc4912424a21e475426a52b92bd393332e39f
MD5 e6a5766a63c8be21faa72800bb0659ee
BLAKE2b-256 2a9e31f0abe7a7fc05b374047da34f90decef748ec9b6763912725d0b052dea5

See more details on using hashes here.

File details

Details for the file purpose_agent-3.0.0-py3-none-any.whl.

File metadata

  • Download URL: purpose_agent-3.0.0-py3-none-any.whl
  • Upload date:
  • Size: 197.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for purpose_agent-3.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 88f5611433ff7de436d27396134d5fffa9173c5f2fc62a8098d807a4d33cb111
MD5 97d2be9c96b1f50f39b8c81fb7e8d3ae
BLAKE2b-256 1967b9f934b1079bf48e31a3c8057ea6e9795b619b3962e3f19ef5d70c4de2c2

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page