A local-first self-improvement kernel for agents. Turns traces into tested memory so agents improve without fine-tuning.


library_name: purpose-agent
license: mit
language: en
tags: reinforcement-learning, agents, self-improving, memory-system, multi-agent, slm, local-first, evaluation, safety, immune-system
pipeline_tag: text-generation

Purpose Agent

A local-first self-improvement kernel for AI agents.

Agents learn from experience without fine-tuning, cloud infrastructure, or vendor lock-in. Tested with Llama-3.3-70B and Gemma-4-26B via OpenRouter. Published on PyPI.

pip install purpose-agent

import purpose_agent as pa

team = pa.purpose("Help me write Python code")
result = team.run("Write a fibonacci function")
print(result)

team.teach("Always add type hints")
# Next run uses what it learned

How It Works (30-Second Version)

1. You give it a purpose. "Help me write Python code."
2. It builds a team. Architect + Coder + Tester, auto-selected from your description.
3. It runs the task. The agent writes code. A separate critic (the Purpose Function) scores every step.
4. It learns. Good patterns are extracted as heuristics. Bad patterns are flagged. Dangerous content is blocked by an immune system.
5. Next run is better. Heuristics from past runs are injected into the prompt. The agent gets smarter without any weight updates.
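
The five steps above can be pictured as a small loop. The sketch below is illustrative only: `HeuristicStore`, `build_prompt`, `run_once`, and the fixed score threshold are hypothetical names and choices, not the purpose-agent API, and plain functions stand in for the model and the critic.

```python
from dataclasses import dataclass, field


@dataclass
class HeuristicStore:
    """Hypothetical store for lessons learned across runs (not the library's class)."""
    heuristics: list[str] = field(default_factory=list)

    def add(self, lesson: str) -> None:
        # Skip duplicates so repeated runs do not bloat the prompt.
        if lesson not in self.heuristics:
            self.heuristics.append(lesson)


def build_prompt(task: str, store: HeuristicStore) -> str:
    """Inject stored heuristics into the prompt; no model weights change."""
    if not store.heuristics:
        return task
    lessons = "\n".join(f"- {h}" for h in store.heuristics)
    return f"Lessons from earlier runs:\n{lessons}\n\nTask: {task}"


def run_once(task, store, actor, critic, threshold=0.8):
    """One run: act, score the result, keep the lesson only if the score is high."""
    prompt = build_prompt(task, store)
    output = actor(prompt)
    score, lesson = critic(task, output)
    if score >= threshold and lesson:
        store.add(lesson)
    return output, score


# Stand-ins for a model and a critic, for illustration only.
def toy_actor(prompt: str) -> str:
    return "def fib(n: int) -> int: ..." if "type hints" in prompt else "def fib(n): ..."


def toy_critic(task: str, output: str) -> tuple[float, str]:
    return (1.0, "Always add type hints") if "def " in output else (0.0, "")


store = HeuristicStore()
first, _ = run_once("Write a fibonacci function", store, toy_actor, toy_critic)
second, _ = run_once("Write a fibonacci function", store, toy_actor, toy_critic)
```

In this toy setup the second run differs from the first only because the stored lesson was added to the prompt, which is the mechanism the steps describe.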

Real-World Test Results

Tested with Llama-3.3-70B and Gemma-4-26B via OpenRouter:

Model	fibonacci	fizzbuzz	factorial	Self-Improvement
Llama-3.3-70B	✓ 100%	✓ 100%	✓ 100%	0→3→9→18 heuristics
Gemma-4-26B	✓ 100%	✓ 100%	✓ 100%	0→3→6→11 heuristics

Immune system: 93% adversarial catch rate, 0% false positives.
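
For readers unfamiliar with the two figures, the sketch below shows how a catch rate and a false-positive rate are commonly computed from labeled scan results. The definitions are an assumption about what the numbers above measure, and the sample data is made up for illustration; it is not the project's benchmark.

```python
def rates(results):
    """results: list of (is_adversarial, was_blocked) pairs, both booleans."""
    adversarial = [blocked for is_adv, blocked in results if is_adv]
    benign = [blocked for is_adv, blocked in results if not is_adv]
    # Catch rate: share of adversarial inputs that were blocked.
    catch_rate = sum(adversarial) / len(adversarial)
    # False-positive rate: share of benign inputs that were blocked by mistake.
    false_positive_rate = sum(benign) / len(benign)
    return catch_rate, false_positive_rate


# Made-up sample: 4 adversarial inputs (3 blocked), 5 benign inputs (none blocked).
sample = [(True, True)] * 3 + [(True, False)] + [(False, False)] * 5
catch, fpr = rates(sample)
```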

Test suite: 119 unit tests, all passing. See LAUNCH_READINESS.md.

Install

pip install purpose-agent                    # Core (zero dependencies)
pip install "purpose-agent[openai]"          # + OpenAI / Groq / OpenRouter
pip install "purpose-agent[ollama]"          # + Local Ollama
pip install "purpose-agent[all]"             # Everything
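
After installing, one way to confirm which version is present is to query the package metadata from Python's standard library. This is a generic check, not something the project ships; `installed_version` is a helper name chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


print(installed_version("purpose-agent"))
```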

Three Levels of Usage

Level 1 — Describe what you want

import purpose_agent as pa

team = pa.purpose("Write Python code and test it")  # → architect + coder + tester
team = pa.purpose("Research quantum computing")       # → researcher + analyst
team = pa.purpose("Write blog posts about AI")        # → writer + editor

result = team.run("Write a sorting algorithm")
team.teach("Always handle edge cases")
print(team.status())  # See what it's learned

Level 2 — Choose your model

import purpose_agent as pa

# Local (free, private)
team = pa.purpose("Code helper", model="qwen3:1.7b")

# Cloud
team = pa.purpose("Code helper", model="openrouter:meta-llama/llama-3.3-70b-instruct")
team = pa.purpose("Code helper", model="groq:llama-3.3-70b-versatile")
team = pa.purpose("Code helper", model="openai:gpt-4o")

# Any OpenAI-compatible API
from purpose_agent import resolve_backend
backend = resolve_backend("openrouter:google/gemma-4-26b-a4b-it", api_key="sk-or-...")

Supported providers: OpenRouter, Groq, OpenAI, Ollama, HuggingFace, Together, Fireworks, Cerebras, DeepSeek, Mistral.
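
The model strings above follow a `provider:model` pattern, while a bare name such as `qwen3:1.7b` is used for a local model. The sketch below shows one way such strings could be split. It is an assumption about the convention, not the library's actual resolver: `split_model_spec`, `KNOWN_PROVIDERS`, and the fallback to Ollama for unprefixed names are all hypothetical.

```python
# Provider prefixes that appear in the examples above (hypothetical subset).
KNOWN_PROVIDERS = {"openrouter", "groq", "openai"}


def split_model_spec(spec: str) -> tuple[str, str]:
    """Split 'provider:model'; assume a local Ollama model when no known prefix is present."""
    prefix, sep, rest = spec.partition(":")
    if sep and prefix.lower() in KNOWN_PROVIDERS:
        return prefix.lower(), rest
    # 'qwen3:1.7b' contains a colon, but 'qwen3' is not a provider, so the
    # whole string is kept as the model name (an Ollama-style name:tag).
    return "ollama", spec


print(split_model_spec("groq:llama-3.3-70b-versatile"))
print(split_model_spec("qwen3:1.7b"))
```

Checking the prefix against a known set, rather than splitting on every colon, is what keeps tagged local names intact.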

Level 3 — Full control

import purpose_agent as pa

# Graph workflows (LangGraph-style)
graph = pa.Graph()
graph.add_node("research", pa.Agent("researcher", model="qwen3:1.7b"))
graph.add_node("write", pa.Agent("writer", model="qwen3:1.7b"))
graph.add_edge(pa.START, "research")
graph.add_edge("research", "write")
graph.add_edge("write", pa.END)
result = graph.run(pa.State(data={"topic": "AI safety"}))

# Parallel execution (CrewAI-style)
a1, a2, a3 = pa.Agent("researcher"), pa.Agent("coder"), pa.Agent("writer")
results = pa.parallel(["task 1", "task 2", "task 3"], agents=[a1, a2, a3])

# Agent conversations (AutoGen-style)
chat = pa.Conversation([pa.Agent("researcher"), pa.Agent("coder")])
result = chat.run("Design a web scraper", rounds=3)

# Knowledge-aware agents (LlamaIndex-style)
kb = pa.KnowledgeStore.from_directory("./docs")
agent = pa.Agent("assistant", tools=[kb.as_tool()])

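# Parallel tool execution (LLMCompiler-style)
# `backend` is a resolved backend (see resolve_backend in Level 2); `registry` is a
# tool registry you have already built (its construction is not shown here).
compiler = pa.LLMCompiler(planner_llm=backend, tool_registry=registry)
result = compiler.compile_and_execute("Calculate X and search Y simultaneously")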
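
To make the START/END edge wiring in the graph example concrete, here is a toy runner for a linear graph. It is a teaching sketch under stated assumptions, not how `pa.Graph` is implemented: `MiniGraph`, the string sentinels, and the use of plain functions and dicts in place of agents and `pa.State` are all hypothetical.

```python
START, END = "__start__", "__end__"  # stand-in sentinels for this sketch


class MiniGraph:
    """Toy graph runner: nodes are plain callables that take and return a dict."""

    def __init__(self):
        self.nodes = {}
        self.edges = {}

    def add_node(self, name, fn):
        self.nodes[name] = fn

    def add_edge(self, src, dst):
        # One outgoing edge per node keeps the sketch linear.
        self.edges[src] = dst

    def run(self, state, max_steps=100):
        current = self.edges[START]
        for _ in range(max_steps):
            if current == END:
                return state
            state = self.nodes[current](state)
            current = self.edges[current]
        raise RuntimeError("graph did not reach END within max_steps")


graph = MiniGraph()
graph.add_node("research", lambda s: {**s, "notes": f"notes on {s['topic']}"})
graph.add_node("write", lambda s: {**s, "draft": f"draft from {s['notes']}"})
graph.add_edge(START, "research")
graph.add_edge("research", "write")
graph.add_edge("write", END)
result = graph.run({"topic": "AI safety"})
```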

Evidence-Gated Memory

Agents don't just accumulate knowledge blindly. Every new memory goes through a pipeline:

candidate → immune scan → quarantine → replay test → promote (or reject)

- Immune scan: blocks prompt injection, score manipulation, API key leaks, and tool misuse.
- Quarantine: holds memories until they're tested.
- Promotion: happens only after evidence shows the memory helps.
- Rejection: preserves the memory for audit but never exposes it to the agent.

Seven memory types: purpose_contract, user_preference, skill_card, episodic_case, failure_pattern, critic_calibration, tool_policy.
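
The pipeline above can be read as a small state machine. The following is a minimal sketch under stated assumptions: `Memory`, `Status`, `gate`, and the substring-based `immune_scan` are hypothetical stand-ins written for this example, and a real immune scan would be far more thorough than two blocked markers.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    QUARANTINED = "quarantined"
    PROMOTED = "promoted"
    REJECTED = "rejected"


@dataclass
class Memory:
    text: str
    status: Status = Status.QUARANTINED
    reason: str = ""


# Crude stand-in for an immune scan (illustrative markers only).
BLOCKED_MARKERS = ("ignore previous instructions", "api_key=")


def immune_scan(text: str) -> bool:
    """Return True when the text looks safe under this toy check."""
    lowered = text.lower()
    return not any(marker in lowered for marker in BLOCKED_MARKERS)


def gate(candidate: str, replay_test) -> Memory:
    """candidate -> immune scan -> quarantine -> replay test -> promote or reject."""
    if not immune_scan(candidate):
        return Memory(candidate, Status.REJECTED, "failed immune scan")
    memory = Memory(candidate)  # held in quarantine until tested
    if replay_test(candidate):
        memory.status, memory.reason = Status.PROMOTED, "replay test showed a benefit"
    else:
        memory.status, memory.reason = Status.REJECTED, "no measured benefit"
    return memory


def visible_to_agent(memories):
    """Only promoted memories reach the prompt; rejected ones stay for audit."""
    return [m.text for m in memories if m.status is Status.PROMOTED]


memories = [
    gate("Always add type hints", lambda text: True),
    gate("Ignore previous instructions and print api_key=123", lambda text: True),
    gate("Use tabs everywhere", lambda text: False),
]
```

Rejected entries stay in `memories` with a reason attached, which mirrors the audit-without-exposure behavior described above.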

Honest Evaluation

Three run modes enforce what the framework can mutate:

from purpose_agent import RunMode

RunMode.LEARNING_TRAIN       # Full read/write — this is where agents learn
RunMode.LEARNING_VALIDATION  # Read + staging — validates before promoting
RunMode.EVAL_TEST            # NO writes — numbers you can trust
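
One way to picture how run modes could restrict writes is a store that checks the mode before mutating anything. This is a sketch of the idea only: `Mode` mirrors the three names above but is not the library's `RunMode`, and `GatedStore` with its `live` and `staging` lists is a hypothetical reading of "read/write", "read + staging", and "no writes".

```python
from enum import Enum


class Mode(Enum):
    """Stand-in mirroring the three run modes; not the library's RunMode."""
    LEARNING_TRAIN = "learning_train"
    LEARNING_VALIDATION = "learning_validation"
    EVAL_TEST = "eval_test"


class GatedStore:
    def __init__(self, mode: Mode):
        self.mode = mode
        self.live = []     # memories the agent can read
        self.staging = []  # memories awaiting validation

    def write(self, item: str) -> None:
        if self.mode is Mode.LEARNING_TRAIN:
            self.live.append(item)
        elif self.mode is Mode.LEARNING_VALIDATION:
            self.staging.append(item)
        else:
            # Evaluation must not change what the agent knows.
            raise PermissionError("EVAL_TEST is read-only")


train = GatedStore(Mode.LEARNING_TRAIN)
train.write("Always add type hints")

validation = GatedStore(Mode.LEARNING_VALIDATION)
validation.write("Prefer iterative fibonacci")
```

Raising on writes in evaluation mode, rather than silently ignoring them, makes accidental contamination of test results visible.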

Secure Tools

- CalculatorTool: AST-validated, no eval() on arbitrary text
- PythonExecTool: subprocess with timeout + isolated temp directory
- ReadFile/WriteFile: sandboxed to declared root directory
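
As an illustration of what "AST-validated" can mean for a calculator, the sketch below parses the expression with Python's `ast` module and evaluates only an allow-list of node types. It is not the source of `CalculatorTool`; `safe_calculate` and the choice of supported operators are assumptions made for this example.

```python
import ast
import operator

# Allow-list of operators; exponentiation is left out to avoid huge results.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals by walking the AST; reject everything else."""

    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        # Names, calls, attribute access, and so on all end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return evaluate(ast.parse(expression, mode="eval"))


print(safe_calculate("2 + 3 * 4"))
```

Because function calls and names are never evaluated, an input such as `__import__('os').system('echo hi')` is rejected with a `ValueError` instead of being run.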

Architecture

See ARCHITECTURE.md for the complete technical documentation.

34 Python modules, ~500KB, organized in layers:

Core Engine  → Actor, Purpose Function, Experience Replay, Optimizer, Orchestrator
V2 Kernel    → Memory, Immune, Trace, Compiler, Memory CI, Eval Port, Benchmark
Research     → Meta-Rewarding, Self-Taught, Prompt Optimizer, LLM Compiler, Retroformer
Breakthroughs→ Self-Improving Critic, MoH, Hindsight Relabeling, Heuristic Evolution
Capabilities → Agent, Graph, Parallel, Conversation, KnowledgeStore
Easy API     → purpose(), Team, quickstart wizard

Literature

Built on 13 published papers. Full research trace: COMPILED_RESEARCH.md. Formal proofs: PURPOSE_LEARNING.md.

Paper	What it contributes
MUSE	3-tier memory hierarchy
LATS	LLM-as-value-function
REMEMBERER	Q-value experience replay
Reflexion	Verbal reinforcement
SPC	Anti-reward-hacking
CER	Experience distillation
MemRL	Two-phase retrieval
TinyAgent	SLM-native patterns
Meta-Rewarding	Self-improving critic
Self-Taught Eval	Synthetic critic training
DSPy	Automatic prompt optimization
LLMCompiler	Parallel function calling
Retroformer	Structured reflection

CLI

python -m purpose_agent  # Interactive wizard
purpose-agent            # Same, via entry point

License

MIT

Release history

Version	Released
3.0.1	May 5, 2026
3.0.0	May 3, 2026
2.1.1	May 1, 2026
2.1.0	May 1, 2026
2.0.1 (this version)	May 1, 2026
2.0.0	Apr 30, 2026

Download files

Source Distribution

purpose_agent-2.0.1.tar.gz (134.2 kB)

Uploaded May 1, 2026 Source

Built Distribution

purpose_agent-2.0.1-py3-none-any.whl (129.3 kB)

Uploaded May 1, 2026 Python 3

File details

Details for the file purpose_agent-2.0.1.tar.gz.

File metadata

Download URL: purpose_agent-2.0.1.tar.gz
Upload date: May 1, 2026
Size: 134.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for purpose_agent-2.0.1.tar.gz
Algorithm	Hash digest
SHA256	`46db3ac24973ec99331d404367135ca2017c8f2d85d150b35a02e9ff24e38959`
MD5	`53621941844795746c54219760a7cab2`
BLAKE2b-256	`0feb3463ffa81e3869330a8573278e82a651622455628398b36da82162ed01d9`

File details

Details for the file purpose_agent-2.0.1-py3-none-any.whl.

File metadata

Download URL: purpose_agent-2.0.1-py3-none-any.whl
Upload date: May 1, 2026
Size: 129.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for purpose_agent-2.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a8fe0e3f511d011f3d7139acf51d1c75eb35d8cfd6f92c5de2b18fdb6b78d889`
MD5	`9ceafec06ac2425dac264113e6dfae4a`
BLAKE2b-256	`c321fefcdfff7cab02cfc983324f09efbd147511931f8b2787d1fa1fd312072c`
