Self-evolving agent framework — agents that create their own tools

ARISE — Adaptive Runtime Improvement through Self-Evolution

PyPI version | Python 3.11+ | License: MIT | Docs

Your agent works great on the tasks you planned for. ARISE handles the ones you didn't.

ARISE is a framework-agnostic middleware that gives LLM agents the ability to create their own tools at runtime. When your agent fails at a task, ARISE detects the capability gap, synthesizes a Python tool, validates it in a sandbox, and promotes it to the active library — no human intervention required.

Documentation | Quick Start | PyPI

pip install arise-ai

from arise import ARISE
from arise.rewards import task_success

arise = ARISE(
    agent_fn=my_agent,           # any (task, tools) -> str function
    reward_fn=task_success,
    model="gpt-4o-mini",         # cheap model for tool synthesis
)

result = arise.run("Fetch all users from the paginated API")
# Agent fails → ARISE synthesizes fetch_all_paginated tool → agent succeeds
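
The `agent_fn` argument above only needs the `(task, tools) -> str` shape. A minimal sketch of such a function follows. It assumes `tools` arrives as a sequence of plain Python callables, which this README does not specify, and both `my_agent` and `word_count` are hypothetical stand-ins rather than part of ARISE.

```python
from typing import Callable, Sequence


def word_count(text: str) -> int:
    """Hypothetical tool: count whitespace-separated words."""
    return len(text.split())


def my_agent(task: str, tools: Sequence[Callable]) -> str:
    """Toy agent with the (task, tools) -> str shape.

    Assumption: tools is a sequence of callables. A real agent would hand
    the task and the tool descriptions to an LLM instead of matching names.
    """
    by_name = {tool.__name__: tool for tool in tools}
    if "count" in task.lower() and "word_count" in by_name:
        return str(by_name["word_count"](task))
    # No suitable tool: report failure so a reward function can score it.
    return "FAILED: no tool available for this task"


print(my_agent("Count the words here", [word_count]))  # 4
print(my_agent("Count the words here", []))
```

The second call shows the failure path: with no matching tool the agent returns a failure string, which is the kind of outcome a reward function would score as 0.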

What It Looks Like

Episode 1  | FAIL  | reward=0.00 | skills=2   Task: "Fetch paginated users with auth"
Episode 2  | FAIL  | reward=0.00 | skills=2
Episode 3  | FAIL  | reward=0.00 | skills=2

[Evolution triggered — 3 failures on API tasks]
  → Synthesizing 'parse_json_response'... 3/3 tests passed ✓
  → Synthesizing 'fetch_all_paginated'... sandbox fail → refine → 1/1 passed ✓

Episode 4  | OK    | reward=1.00 | skills=4   Agent now has the tools it needs
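
The log above can be read as a failure counter that triggers evolution once a threshold is reached. The toy model below illustrates that control flow only. It is not ARISE's implementation: the class `EvolutionLoop`, the threshold of 3, and the starting skill names are assumptions chosen to mirror the log, and "synthesis" is reduced to adding a name to a set.

```python
from dataclasses import dataclass, field


@dataclass
class EvolutionLoop:
    """Toy model of the fail -> evolve -> succeed cycle (not ARISE's code)."""

    failure_threshold: int = 3  # assumed value, matching the log above
    skills: set = field(default_factory=lambda: {"http_get", "read_file"})
    consecutive_failures: int = 0

    def run_episode(self, required_skill: str) -> float:
        """Return reward 1.0 if the needed skill exists, else 0.0."""
        if required_skill in self.skills:
            self.consecutive_failures = 0
            return 1.0
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            # Stand-in for synthesis + sandbox validation + promotion.
            self.skills.add(required_skill)
            self.consecutive_failures = 0
        return 0.0


loop = EvolutionLoop()
rewards = [loop.run_episode("fetch_all_paginated") for _ in range(4)]
print(rewards)  # [0.0, 0.0, 0.0, 1.0]
```

The third failure both scores 0 and triggers evolution, so the new skill is first available in episode 4, as in the log.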

Key Features

  • Self-evolving tool library — fail → detect gap → synthesize → sandbox test → promote
  • Framework-agnostic — works with any (task, tools) -> str function, as well as Strands, LangGraph, and CrewAI
  • Sandboxed validation — subprocess or Docker, adversarial testing, import restrictions
  • Distributed mode — S3 + SQS for stateless deployments (Lambda, ECS, AgentCore)
  • Skill registry — share evolved tools across projects
  • Version control + rollback — SQLite checkpoints, arise rollback <version>
  • A/B testing — refined skills tested against originals before promotion
  • Web Console — create agents, watch evolution live, inspect evolved code (arise console)
  • Dashboard — terminal TUI and web UI for monitoring
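
To make the "sandbox test" step more concrete, here is a rough sketch of subprocess-based validation with a static import check. It only illustrates the general technique: the function names, the deny-list, and the timeout are assumptions, and ARISE's actual sandbox may work differently.

```python
import ast
import subprocess
import sys
import tempfile
from pathlib import Path

BLOCKED_IMPORTS = {"os", "subprocess", "socket", "shutil"}  # assumed deny-list


def imports_are_allowed(source: str) -> bool:
    """Statically reject code that imports a blocked top-level module."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        if any(name.split(".")[0] in BLOCKED_IMPORTS for name in names):
            return False
    return True


def validate_tool(tool_source: str, test_source: str, timeout: float = 10.0) -> bool:
    """Run the tool plus its assert-based tests in a separate interpreter."""
    if not imports_are_allowed(tool_source):
        return False
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(tool_source + "\n\n" + test_source, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                capture_output=True,
                timeout=timeout,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return False
    # A failed assert or any uncaught exception gives a non-zero exit code.
    return proc.returncode == 0


tool = "import json\n\ndef parse_json_response(text):\n    return json.loads(text)\n"
tests = "assert parse_json_response('{\"a\": 1}') == {'a': 1}\n"
print(validate_tool(tool, tests))  # True
```

A separate process with a deny-list is not a hard security boundary; it mainly keeps a crashing or hanging candidate from taking down the host process. Container-based isolation, such as the Docker backend listed above, is the stronger option.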

Benchmark Results

Model          Condition  AcmeCorp (SRE)  DataCorp (Data Eng)
Claude Sonnet  ARISE      78%
Claude Sonnet  No tools   63%
GPT-4o-mini    ARISE      57%             92%
GPT-4o-mini    No tools   48%             50%

In these runs, ARISE improves task success by 9 to 42 percentage points over the no-tools baseline, depending on model and domain. See the full benchmark results.
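
The reported range can be checked directly from the table. The snippet below uses only the percentages listed above; the label "reported benchmark" for the Claude Sonnet row is a placeholder, since that row lists a single value per condition.

```python
# Success rates (percent) copied from the table above: (with ARISE, no tools).
results = {
    ("Claude Sonnet", "reported benchmark"): (78, 63),
    ("GPT-4o-mini", "AcmeCorp (SRE)"): (57, 48),
    ("GPT-4o-mini", "DataCorp (Data Eng)"): (92, 50),
}

# Gain in percentage points = ARISE rate minus no-tools rate.
gains = {key: arise - baseline for key, (arise, baseline) in results.items()}

for (model, benchmark), gain in gains.items():
    print(f"{model:13} {benchmark:20} +{gain} pp")

print(min(gains.values()), max(gains.values()))  # 9 42
```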

ARISE Console

A web UI for creating agents, watching evolution live, and inspecting evolved tools:

arise console
# Opens http://localhost:8080

  • Create agents — pick model, set system prompt, choose reward function
  • Live terminal feed — watch episodes and evolution in real-time via WebSocket
  • Skill inspector — syntax-highlighted code, test suite, performance metrics
  • Editable config — change reward function, system prompt, failure threshold on the fly
  • All Skills / Evolution Log — global views across all agents

Documentation

Full documentation is available at arise-ai.dev.

Examples

Example                  Description
quickstart_evolution.py  Full evolution loop: agent fails → ARISE evolves tool → agent succeeds
quickstart.py            Math agent evolves statistics tools
api_agent.py             HTTP agent evolves auth + pagination (mock server)
devops_agent.py          DevOps agent evolves log analysis tools
strands_agent.py         Strands integration with Bedrock
demo/agentcore/          AgentCore deployment with A2A protocol

Install

pip install arise-ai              # core (just pydantic)
pip install arise-ai[aws]         # + boto3 for distributed mode
pip install arise-ai[litellm]     # + litellm for multi-provider LLM
pip install arise-ai[docker]      # + docker sandbox backend
pip install arise-ai[dashboard]   # + rich, fastapi for dashboard
pip install arise-ai[otel]        # + opentelemetry for tracing
pip install arise-ai[all]         # everything
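
If it is unclear which extras are present in an environment, a quick check is to look for the top-level modules each extra is expected to bring in. This is a generic sketch using the standard library, not an ARISE feature. The mapping from extras to module names is an assumption based on the comments above (for example, that the `docker` extra provides an importable `docker` module).

```python
from importlib.util import find_spec

# Extra name -> top-level modules it is assumed to provide (see list above).
EXTRAS = {
    "aws": ["boto3"],
    "litellm": ["litellm"],
    "docker": ["docker"],
    "dashboard": ["rich", "fastapi"],
    "otel": ["opentelemetry"],
}


def available_extras() -> dict[str, bool]:
    """Report, per extra, whether all of its modules can be located."""
    return {
        extra: all(find_spec(module) is not None for module in modules)
        for extra, modules in EXTRAS.items()
    }


for extra, present in available_extras().items():
    print(f"arise-ai[{extra}]: {'available' if present else 'missing'}")
```

`find_spec` only locates modules without importing them, so the check is cheap and has no side effects.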

Related Work

ARISE builds on ideas from LATM, VOYAGER, CREATOR, ADAS, and CRAFT. ARISE adds the production layer: framework-agnostic integration, sandboxed validation, adversarial testing, version control, distributed deployment, and A/B testing.

License

MIT
