Self-evolving agent framework powered by reinforcement learning, evolutionary algorithms, and self-improving policies

Maintainers

ariaxhan

Armature

AI agents that get better every time they run.

You deploy an AI agent. It works okay. Then you spend weeks tweaking prompts, swapping models, adjusting temperatures — hoping something sticks. When user behavior shifts, you do it all over again.

Armature wraps your agent in a reinforcement learning loop. The system learns what works, drops what doesn't, and adapts in real time. No retraining. No redeployment. Your agent evolves on its own.

The API

Three functions. That's it.

import asyncio
from armature import configure_runtime, runtime_select, runtime_update

async def your_llm_call(**params):
    # Placeholder so the example runs end to end; swap in your real LLM client call
    return f"(LLM response generated with {params})"

async def main():
    # Define the strategies your agent can choose between
    await configure_runtime("my_agent", config={
        "arms": [
            {"arm_id": "concise", "params": {"temperature": 0.3, "max_tokens": 500}},
            {"arm_id": "detailed", "params": {"temperature": 0.7, "max_tokens": 2000}},
            {"arm_id": "creative", "params": {"temperature": 0.9, "max_tokens": 1500}},
        ],
        "storage": {"backend": "sqlite", "path": "armature.db"}
    })

    # System picks the best strategy using Thompson Sampling
    selection = await runtime_select("my_agent", user_id="user_123")

    # Use selection.params in your LLM call
    response = await your_llm_call(**selection.params)

    # Tell the system how it went (0.0 = bad, 1.0 = good)
    await runtime_update(
        "my_agent",
        decision_id=selection.decision_id,
        reward=1.0
    )

asyncio.run(main())

Everything else (learning algorithms, exploration/exploitation tradeoffs, storage, convergence tracking) is handled for you.
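
The reward you report has to land between 0.0 and 1.0, and real outcomes rarely arrive in that shape. One way to get there is to blend a few signals into a single score. The sketch below is only an illustration: `compute_reward`, its parameters, and the weights are hypothetical and are not part of Armature's API.

```python
from typing import Optional


def compute_reward(resolved: bool, latency_s: float,
                   thumbs_up: Optional[bool] = None,
                   latency_budget_s: float = 10.0) -> float:
    """Blend outcome signals into one reward in [0.0, 1.0].

    Hypothetical helper: the signals and weights are illustrative assumptions.
    """
    # Task success dominates the score.
    score = 0.6 if resolved else 0.0
    # Faster responses earn up to 0.2; anything over budget earns nothing.
    speed = max(0.0, 1.0 - latency_s / latency_budget_s)
    score += 0.2 * speed
    # Explicit feedback earns up to 0.2; missing feedback counts as neutral.
    if thumbs_up is None:
        score += 0.1
    elif thumbs_up:
        score += 0.2
    # Clamp to guard against out-of-range inputs such as a negative latency.
    return min(1.0, max(0.0, score))
```

The result can then be passed as the `reward` argument of `runtime_update` in place of the hard-coded `1.0` shown above.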

Use Cases

Research agent optimization

You have an agent that runs parallel research tasks — verifying claims, cross-referencing sources, scoring confidence. Different research strategies work better for different domains. Armature learns that deep-dive verification works for neuroscience claims while rapid cross-referencing is better for psychology replication studies. Your research agent routes to the right strategy automatically.

Customer support routing

Your support bot handles thousands of conversations. Some users want concise answers, others want detailed walkthroughs. Instead of one-size-fits-all, the system learns which response style resolves issues fastest for each question type. Resolution rates improve without anyone touching the prompts.

Multi-model routing

You have access to GPT-4, Claude, Gemini, and open-source models. Each excels at different tasks. Instead of picking one or building manual routing rules, Thompson Sampling learns which model performs best for which task type — and shifts traffic as models update.

Content generation pipelines

You generate reports, summaries, or scripts from data. The system learns which model parameters, prompt structures, and output formats produce content that passes your quality bar. Configurations that work get selected more; configurations that don't get dropped.

Knowledge retrieval

Your agent pulls from multiple data sources — databases, APIs, documents. Different retrieval paths work better for different query types. Armature learns which retrieval strategy actually answers questions accurately, cutting wasted API calls and improving hit rates.

Self-improving classifiers

You classify text, images, or data with an LLM. The system learns which prompt/model combinations produce the most accurate classifications for each category, and evolves toward better accuracy over time without retraining.

How It Works

The core is Thompson Sampling — a Bayesian algorithm for the multi-armed bandit problem:

Your agent has "arms" — different configurations, prompts, models, or strategies
For each request, the system picks an arm — balancing exploration (trying new things) with exploitation (using what works)
You report the outcome — a reward signal from 0 to 1
The system updates its beliefs — Bayesian updates shift probability toward what works

This typically converges on the best option within 15-30 interactions, and it never fully abandons exploration, so it can recover if conditions change.
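
The four steps above can be sketched with a Beta-Bernoulli bandit in plain Python. This is a minimal illustration of the technique, not Armature's internal implementation: the `BetaBandit` class, the `simulate` helper, and the success rates used in the usage note are all assumptions made up for the example.

```python
import random


class BetaBandit:
    """Beta-Bernoulli Thompson Sampling over a fixed set of arms."""

    def __init__(self, arm_ids, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) is a uniform prior: no opinion about any arm yet.
        self.alpha = {arm: 1.0 for arm in arm_ids}
        self.beta = {arm: 1.0 for arm in arm_ids}

    def select(self):
        # Draw one plausible success rate per arm and play the highest draw.
        # Uncertain arms produce wide draws, which is where exploration comes from.
        draws = {arm: self.rng.betavariate(self.alpha[arm], self.beta[arm])
                 for arm in self.alpha}
        return max(draws, key=draws.get)

    def update(self, arm, reward):
        # A reward in [0, 1] is split into fractional success and failure counts.
        self.alpha[arm] += reward
        self.beta[arm] += 1.0 - reward


def simulate(true_rates, rounds, seed=0):
    """Run the select/update loop against simulated arms and count pulls."""
    bandit = BetaBandit(true_rates, seed=seed)
    pulls = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if bandit.rng.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
        pulls[arm] += 1
    return pulls
```

Calling `simulate({"concise": 0.3, "detailed": 0.8, "creative": 0.5}, rounds=500)` with those invented rates should show most pulls going to `detailed`, while the other arms still receive an occasional pull.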

On top of this:

Evolutionary algorithms breed new configurations by combining traits of top performers
Semantic caching recognizes similar requests and reuses results, cutting API costs by 70-80%
Confidence extraction detects hedging language and uncertainty in LLM outputs
RLP (Reinforced Learned Policy) teaches the agent to reason before acting, after enough data accumulates
SAO (Self-Alignment Optimization) generates training data from successful episodes for self-improvement

The system starts simple (Thompson Sampling only) and progressively unlocks more sophisticated learning as interaction data accumulates.
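
To make "breeding" concrete, here is a rough sketch of uniform crossover plus mutation applied to arm parameters like the ones in the API example. It is a generic evolutionary step under stated assumptions, not Armature's engine: `breed`, `next_generation`, the 10% mutation range, and the temperature cap of 2.0 are all hypothetical.

```python
import random


def breed(parent_a: dict, parent_b: dict, rng: random.Random,
          mutation_rate: float = 0.2) -> dict:
    """Create a child config via uniform crossover plus bounded mutation."""
    child = {}
    for key in parent_a:
        # Uniform crossover: each parameter comes from one parent at random.
        value = parent_a[key] if rng.random() < 0.5 else parent_b[key]
        # Mutation: occasionally nudge the value by up to +/-10%.
        if rng.random() < mutation_rate:
            value = value * rng.uniform(0.9, 1.1)
        child[key] = value
    # Keep parameters in a sane range and integer-valued where required.
    child["temperature"] = round(min(2.0, max(0.0, child["temperature"])), 3)
    child["max_tokens"] = max(1, int(round(child["max_tokens"])))
    return child


def next_generation(scored_arms: list, n_children: int,
                    rng: random.Random) -> list:
    """Breed children from the two highest-scoring (config, score) pairs."""
    ranked = sorted(scored_arms, key=lambda pair: pair[1], reverse=True)
    best, runner_up = ranked[0][0], ranked[1][0]
    return [breed(best, runner_up, rng) for _ in range(n_children)]
```

Each child could then be registered as a new arm, so the bandit decides whether the offspring actually outperform their parents.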

Install

pip install armature-ai

Or from source:

git clone https://github.com/ariaxhan/armature-ai.git
cd armature-ai
pip install -e ".[dev]"

Try the examples

cd examples/00_quickstart
python 02_confidence_extraction.py  # No API key needed
python 06_thompson_sampling_loop.py  # Watch Thompson Sampling converge

See the full Cookbook for 40+ runnable examples.

What's Inside

Component	Status	What it does
Thompson Sampling	Production	Bayesian arm selection, converges in 15-30 pulls
Evolutionary Engine	Production	Breeds new configs from top performers
Storage Backends	Production	SQLite, PostgreSQL, Redis, in-memory
Semantic Caching	Production	Embedding-based dedup, 70-80% cost reduction
Confidence Extraction	Production	Detects hedging, certainty, and calibration gaps
Context Graph	Beta	Knowledge graph with relationship traversal
Safety Guardrails	Beta	Framework-level input/output validation
Observability	Beta	Learning curves, drift detection, cost tracking
RLP	Experimental	Think-before-acting policy (needs 500+ interactions)
SAO	Experimental	Self-generated training data (needs 1000+ interactions)
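
The "embedding-based dedup" idea in the Semantic Caching row can be pictured as a nearest-neighbor lookup with a similarity threshold. The sketch below assumes the caller supplies embeddings as plain lists of floats from whatever embedding model is in use; the `SemanticCache` class and the 0.95 threshold are hypothetical and do not describe Armature's actual cache.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Return a stored response when a new request is close enough to an old one."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, embedding):
        # Linear scan for the most similar stored request.
        best_score, best_response = 0.0, None
        for stored, response in self.entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_response = score, response
        # Only reuse the response if the match clears the threshold.
        return best_response if best_score >= self.threshold else None

    def put(self, embedding, response):
        self.entries.append((list(embedding), response))
```

A hit skips the LLM call entirely, which is where the cost saving comes from. The threshold trades savings against the risk of reusing an answer for a question that only looks similar.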

Integrations

Category	Supported
LLM Providers	OpenAI, Anthropic Claude, Google Gemini, Azure OpenAI, Groq, any LiteLLM-compatible provider
Agent Frameworks	Agno, LangGraph, or extend with custom adapters
Storage	SQLite (dev), PostgreSQL (prod), Redis (caching), in-memory
Observability	Weights & Biases Weave

Architecture

┌──────────────────────────────────────────────────┐
│              THE CONVERGENCE                      │
├──────────────────────────────────────────────────┤
│                                                   │
│  OPTIMIZATION       Learning engine               │
│  Thompson Sampling, evolutionary algorithms,      │
│  semantic caching, confidence extraction          │
│                                                   │
│  KNOWLEDGE          Structured context            │
│  Context graph, relationship traversal,           │
│  progressive disclosure, mergeable graphs         │
│                                                   │
│  SAFETY             Framework-enforced            │
│  Input validation, execution control,             │
│  output validation, full audit trail              │
│                                                   │
│  OBSERVABILITY      Built-in metrics              │
│  Learning curves, calibration, cost tracking,     │
│  drift detection                                  │
│                                                   │
│  STORAGE            Pluggable backends            │
│  SQLite / PostgreSQL / Redis / Memory             │
│                                                   │
└──────────────────────────────────────────────────┘

Safety checks run at the framework level, not inside prompts. The model can't be prompted to bypass them.
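
As a rough picture of what "framework level" means, the sketch below wraps a model call in checks that live in ordinary code. Everything in it is a hypothetical illustration rather than Armature's guardrail API: `guarded_call`, `GuardrailViolation`, the length limit, and the secret-like pattern are assumptions chosen for the example.

```python
import re
from typing import Callable


class GuardrailViolation(Exception):
    """Raised when a request or response fails a framework-level check."""


# Hypothetical rules for illustration only.
MAX_PROMPT_CHARS = 4000
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{20,}")


def guarded_call(model: Callable[[str], str], prompt: str) -> str:
    """Run `model` on `prompt` with validation before and after the call."""
    # Input validation happens before the model sees anything.
    if len(prompt) > MAX_PROMPT_CHARS:
        raise GuardrailViolation("prompt exceeds length limit")
    response = model(prompt)
    # Output validation runs in code, so no prompt text can switch it off.
    if SECRET_PATTERN.search(response):
        raise GuardrailViolation("response contains a secret-like token")
    return response
```

Because the checks sit outside the model, a prompt such as "ignore all previous rules" changes nothing about whether they run.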

Research Foundation

Built on peer-reviewed work:

Thompson Sampling — Bayesian exploration/exploitation
RLP — Reinforced Learned Policy (NVIDIA, 2024)
SAO — Self-Alignment Optimization (Hugging Face, 2024)
NeMo Guardrails — Framework-level safety (NVIDIA)

Documentation

Getting Started — Full setup guide
API Reference — Function documentation
Configuration — All config options
Examples — 40+ working examples
Changelog — Release history

Contributing

Contributions welcome. See CONTRIBUTING.md.

We especially welcome: integration examples, new storage backends, safety audit findings, and documentation improvements.

Credits

Armature was originally built at Persistos by Aria Han, Shreyash Hamal, and Myat Pyae Paing. The original work (v0.1.x) included the optimization engine, SDK interface, evaluators, natural language processor, Agno adapters, Convex storage backend, Weave integration, and CLI.

The 1.0 release and all subsequent development are by Aria Han: Thompson Sampling persistence, context graph, knowledge layer, semantic caching, safety guardrails, RLP/SAO plugins, a cookbook of 40+ examples, and the architecture rewrite from API optimization tool to self-evolving agent framework.

License

MIT — see LICENSE.

Stop tuning. Start evolving.

This version

1.0.0

Mar 26, 2026

Download files

Source Distribution

armature_ai-1.0.0.tar.gz (391.2 kB)

Uploaded Mar 26, 2026 Source

Built Distribution

armature_ai-1.0.0-py3-none-any.whl (489.4 kB)

Uploaded Mar 26, 2026 Python 3

File details

Details for the file armature_ai-1.0.0.tar.gz.

File metadata

Download URL: armature_ai-1.0.0.tar.gz
Upload date: Mar 26, 2026
Size: 391.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for armature_ai-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`c0d8442c1e2e755cd23302b5791c121215d8f6c9c184d89a7e308432b05ee366`
MD5	`0fee763a8bf0182ed7e471cc3cbaade6`
BLAKE2b-256	`cc1665415250b9d69b0e94dae478de43c4cf1575b75ee5b97d383ecfec649f42`
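
A downloaded file can be checked against the SHA256 digest above with the standard library's `hashlib`. This is a minimal sketch: the helper names `sha256_of_file` and `verify` are made up for the example, and the local path you pass in is whatever location you saved the archive to.

```python
import hashlib

# SHA256 digest of armature_ai-1.0.0.tar.gz, copied from the table above.
EXPECTED_SDIST_SHA256 = (
    "c0d8442c1e2e755cd23302b5791c121215d8f6c9c184d89a7e308432b05ee366"
)


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_SDIST_SHA256) -> bool:
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of_file(path) == expected.strip().lower()
```

For example, `verify("armature_ai-1.0.0.tar.gz")` should return `True` for an intact copy of the source distribution and `False` for a corrupted or altered one.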

Provenance

The following attestation bundles were made for armature_ai-1.0.0.tar.gz:

Publisher: publish.yml on ariaxhan/armature-ai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: armature_ai-1.0.0.tar.gz
- Subject digest: c0d8442c1e2e755cd23302b5791c121215d8f6c9c184d89a7e308432b05ee366
- Sigstore transparency entry: 1183763281
- Sigstore integration time: Mar 26, 2026
Source repository:
- Permalink: ariaxhan/armature-ai@15bd6878dbcae2fcc90b3b4582c5b5914fb4217c
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/ariaxhan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@15bd6878dbcae2fcc90b3b4582c5b5914fb4217c
- Trigger Event: release

File details

Details for the file armature_ai-1.0.0-py3-none-any.whl.

File metadata

Download URL: armature_ai-1.0.0-py3-none-any.whl
Upload date: Mar 26, 2026
Size: 489.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for armature_ai-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ba6c9cec559e68ea7835a4f7d92efe1229a8ec97c2b9ed92f8a9e524f8946b54`
MD5	`3e0e6a89c375930ec76266099d18017f`
BLAKE2b-256	`bb8d734877f8d87e64bb80a156d90280538b30b3430241d410adaa85307b7b28`

Provenance

The following attestation bundles were made for armature_ai-1.0.0-py3-none-any.whl:

Publisher: publish.yml on ariaxhan/armature-ai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: armature_ai-1.0.0-py3-none-any.whl
- Subject digest: ba6c9cec559e68ea7835a4f7d92efe1229a8ec97c2b9ed92f8a9e524f8946b54
- Sigstore transparency entry: 1183763290
- Sigstore integration time: Mar 26, 2026
Source repository:
- Permalink: ariaxhan/armature-ai@15bd6878dbcae2fcc90b3b4582c5b5914fb4217c
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/ariaxhan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@15bd6878dbcae2fcc90b3b4582c5b5914fb4217c
- Trigger Event: release
