Self-evolving AI agents via LoRA: just talk to your agent, it learns.

Project description

🦎 EvoClaw

Just talk to your agent; it learns and EVOLVES.

No GPU Required · Fully Async · Skill Evolution

EvoClaw turns live conversations into continuous training data, automatically.
Works with any OpenAI-compatible API. Uses free Groq for PRM scoring. Trains with Tinker cloud LoRA.


🔥 What is EvoClaw?

EvoClaw wraps your existing AI agent behind an OpenAI-compatible proxy. On every conversation turn:

  1. The response is scored by a PRM (Process Reward Model) via Groq
  2. Skills are extracted from high-quality responses and stored
  3. Stored skills are injected into future prompts (immediate improvement, no retraining needed)
  4. Failed turns trigger automatic skill evolution via an LLM
  5. All turns feed Tinker LoRA training (GRPO or OPD)

After every batch_size samples, updated weights are saved to Tinker with no service interruption.
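
The per-turn flow can be pictured roughly like this (a conceptual sketch only; the names, data structures, and selection logic below are illustrative assumptions, not EvoClaw's actual internals):

PRM_THRESHOLD = 0.65
skill_bank = []        # learned skills, injected into later prompts
training_buffer = []   # samples queued for the next Tinker train step

def handle_turn(user_msg, call_upstream, score_with_prm, evolve_skill):
    # 3. inject recently learned skills into the system prompt
    system = "Apply these learned skills:\n" + "\n".join(skill_bank[-5:])
    reply = call_upstream(system, user_msg)        # forward to the upstream API
    score = score_with_prm(user_msg, reply)        # 1. PRM score via Groq
    if score >= PRM_THRESHOLD:
        skill_bank.append(f"Good pattern: {reply[:80]}")    # 2. extract and store
    else:
        skill_bank.append(evolve_skill(user_msg, reply))    # 4. evolve a corrective skill
    training_buffer.append({"user": user_msg, "assistant": reply, "reward": score})  # 5. feed training
    return reply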


🚀 Quick Start

pip install evoclaw

evoclaw init   # enter your Groq + Tinker API keys
evoclaw start  # proxy starts on localhost:8080

Then point your existing OpenAI client at EvoClaw:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="any-string",  # Not checked by proxy
)

# Just use it normally; EvoClaw learns in the background
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Explain impermanent loss"}]
)

That's it. Start chatting. EvoClaw learns automatically.


🤖 Key Features

Skill Injection

At every turn, the most relevant learned skills are injected into the system prompt.
Immediate behavior improvement, no waiting for retraining.
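
A rough picture of what injection amounts to (the keyword-overlap ranking here is a stand-in; EvoClaw's actual relevance selection may differ):

def inject_skills(messages, skills, k=3):
    # Rank learned skills by naive keyword overlap with the user's messages
    user_text = " ".join(m["content"] for m in messages if m["role"] == "user").lower()
    ranked = sorted(skills, key=lambda s: -sum(w in user_text for w in s.lower().split()))
    skill_block = "\n".join(f"- {s}" for s in ranked[:k])
    # Prepend the top-k skills as a system message
    system = {"role": "system", "content": "Apply these learned skills:\n" + skill_block}
    return [system] + messages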

Skill Evolution

When the agent fails (low PRM score), EvoClaw uses an LLM to generate a new skill
that would have prevented the failure. Over time, the skill bank grows smarter.
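
For illustration, skill evolution can be thought of as one extra LLM call, reusing the OpenAI-compatible client from the Quick Start (the prompt wording and model choice below are assumptions, not EvoClaw's exact prompt):

def evolve_skill(client, user_msg, bad_reply):
    # Ask an LLM to write the instruction that would have prevented the failure
    prompt = (
        "The assistant answered poorly.\n"
        f"User: {user_msg}\nAssistant: {bad_reply}\n"
        "Write one short, general instruction (a skill) that would have "
        "prevented this failure. Reply with the instruction only."
    )
    out = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": prompt}],
    )
    return out.choices[0].message.content.strip()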

Tinker LoRA Training

All conversations feed into online LoRA training via Tinker.
No GPU required. Updated weights are hot-swapped with no downtime.

Two Learning Modes

  • GRPO: Reinforcement learning from implicit conversation rewards
  • OPD: On-policy distillation from high-quality responses
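
To give a feel for the difference (a simplified sketch, not EvoClaw's trainer code): GRPO turns the PRM scores of a batch into group-relative advantages, while OPD keeps high-scoring turns as supervised targets.

def grpo_advantages(rewards):
    # Group-relative advantages: centre each reward on the batch mean
    # (real GRPO typically also divides by the group's standard deviation).
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

def opd_targets(samples, threshold=0.65):
    # On-policy distillation: keep only high-scoring turns as supervised targets.
    return [s for s in samples if s["reward"] >= threshold]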

Works with Any Provider

Unlike MetaClaw (OpenClaw + Kimi-2.5 only), EvoClaw works with:

  • Groq (free, recommended)
  • OpenAI
  • Anthropic
  • Any OpenAI-compatible endpoint

โš™๏ธ Configuration

All settings in EvoClawConfig:

Field                   Default               Description
model_name              Qwen/Qwen3-4B         Tinker base model
lora_rank               32                    LoRA rank
batch_size              32                    Samples before train step
loss_fn                 importance_sampling   grpo / opd / cross_entropy
use_prm                 True                  PRM scoring
prm_threshold           0.65                  Min score to learn from
use_skills              True                  Skill injection
enable_skill_evolution  True                  Auto-generate skills from failures
proxy_port              8080                  Proxy listen port
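
For example, the defaults above map to a config like this (assuming the fields are plain keyword arguments and that EvoClawConfig is importable from the package root):

from evoclaw import EvoClawConfig  # import path assumed from the package name

config = EvoClawConfig(
    model_name="Qwen/Qwen3-4B",        # Tinker base model
    lora_rank=32,
    batch_size=32,                     # samples before each train step
    loss_fn="importance_sampling",     # or "grpo" / "opd" / "cross_entropy"
    use_prm=True,
    prm_threshold=0.65,                # minimum PRM score to learn from
    use_skills=True,
    enable_skill_evolution=True,
    proxy_port=8080,
)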

💪 Skill Packs

Pre-built skills for common domains:

config = EvoClawConfig(
    skill_packs=["general", "coding", "crypto", "defi", "security", "agentic"]
)

🔄 Training Loop Example

python examples/run_conversation_rl.py           # GRPO mode
python examples/run_conversation_rl.py --mode opd  # OPD mode
python examples/run_conversation_rl.py --no-train  # Skill injection only

Train from your own conversation file:

evoclaw train --file conversations.jsonl
# Format: {"user": "...", "assistant": "..."}
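
For instance, a compatible file can be written like this (the example turns are placeholders):

import json

turns = [
    {"user": "Explain impermanent loss", "assistant": "Impermanent loss is ..."},
    {"user": "How do I hedge it?", "assistant": "One common approach is ..."},
]
# One JSON object per line, matching the {"user": ..., "assistant": ...} format
with open("conversations.jsonl", "w") as f:
    for turn in turns:
        f.write(json.dumps(turn) + "\n")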

📊 Monitor Progress

evoclaw status        # Skills + trainer status
evoclaw skills        # List all learned skills  
evoclaw skills --category crypto  # Filter by category

๐Ÿ—๏ธ Architecture

User/Agent
    │
    ▼
┌─────────────────────────────────┐
│  EvoClaw Proxy (localhost:8080) │
│  - Inject skills into prompt    │
│  - Forward to upstream API      │
│  - Score response async (Groq)  │
│  - Evolve skills on failure     │
│  - Feed samples to Tinker       │
└─────────────────────────────────┘
    │              │
    ▼              ▼
Groq API     Tinker LoRA
(responses)  (training)

📄 License

MIT

Acknowledgements

Built on top of MetaClaw, Tinker, and Groq.


Download files

Download the file for your platform.

Source Distribution

evoclaw-0.2.0.tar.gz (25.6 kB)

Uploaded Source

Built Distribution


evoclaw-0.2.0-py3-none-any.whl (28.0 kB)

Uploaded Python 3

File details

Details for the file evoclaw-0.2.0.tar.gz.

File metadata

  • Download URL: evoclaw-0.2.0.tar.gz
  • Upload date:
  • Size: 25.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for evoclaw-0.2.0.tar.gz
Algorithm Hash digest
SHA256 efe4d21eea4d46c7313e3a8c18f73394cab02b29de557787c143a48a894063cc
MD5 052e6e57bfef70574cf8bf26766753e8
BLAKE2b-256 147d174acf000e802d9c4b593fb90b1caac8508a6f1311e26ef8fda94ad3a023


File details

Details for the file evoclaw-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: evoclaw-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 28.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for evoclaw-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4159875b5735cd60bcbc8194041dbab3ebfe9d8f26bd9a0762de00f45245280f
MD5 c05125fc0dfba5a4c555391c6391e32d
BLAKE2b-256 f2f90f7044ae877b3cbb7833ca33529732b6ece63811f5d7aa29b4bca4a74e11

