Agentic Research and Evaluation Suite

ARES

ARES (Agentic Research and Evaluation Suite) is an RL-first framework for training and evaluating agents.

Quick Start

Get ARES running in minutes - no API keys required!

Prerequisites

Python 3.12 or higher
Docker - For running code agents in containers
uv - Fast Python package installer and resolver

To install uv, follow the instructions at https://docs.astral.sh/uv/getting-started/installation/
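
The prerequisites above can be checked with a small script before installing. This is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper, not part of ARES, and it only confirms that the `docker` and `uv` executables are on `PATH`, not that the Docker daemon is running.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 12), tools=("docker", "uv")):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for tool in tools:
        # shutil.which only looks for an executable on PATH; it does not
        # verify that the tool works (e.g. that the Docker daemon is up).
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

If the script prints nothing, the interpreter version and both tools were found.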

Installation

For now, we recommend running ARES locally from the repository root (the directory containing this README):

uv sync --all-groups

and you're ready to get started.

Your First ARES "Agent"

No API keys needed!

Run the Hello World example to see the RL loop in action:

uv run -m examples.00_hello_world

This example uses:

✓ Local Docker containers (no cloud account needed)
✓ A mock LLM (no API keys needed)

You'll see how ARES treats code agent interactions as a reinforcement learning problem, with LLM requests as observations and LLM responses as actions.

Examples

ARES includes several examples that demonstrate different usage patterns:

1. Minimal Loop (Local Docker + Real LLM)

File: examples/01_minimal_loop.py
What you'll need: Docker, Martian API key

# Set up your API key first (see Cloud Setup below)
uv run -m examples.01_minimal_loop

Shows the RL loop with a real LLM (via Martian API) and local Docker containers.

2. Local LLM (Fully Local)

File: examples/02_local_llm.py
What you'll need: Docker

uv run -m examples.02_local_llm

Demonstrates running ARES completely locally using a local LLM (Qwen2.5-3B-Instruct). No cloud services required.

Cloud Setup (Optional)

For production use or larger-scale experiments, you can use cloud containers and API-based LLMs.

Option 1: Using Martian API for LLM Inference

1. Create an account at https://app.withmartian.com
2. Copy the example environment file: cp .env.example .env
3. Add your Martian API key: CHAT_COMPLETION_API_KEY=your_key_here

Option 2: Using Daytona for Cloud Containers

By default, ARES uses Daytona for container management. To set this up:

1. Create a Daytona account at https://www.daytona.io
2. Copy the example environment file: cp .env.example .env
3. Add your Daytona credentials:
   - DAYTONA_API_KEY=your_key_here
   - DAYTONA_API_URL=your_url_here

See .env.example for all available configuration options.
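
Before running a cloud-backed example, it can help to confirm that the variables named above are actually filled in. The sketch below is an assumption-laden helper, not part of ARES: it parses only plain KEY=VALUE lines (no quoting or `export` handling), and it treats the placeholder values `your_key_here` and `your_url_here` as missing. How ARES itself loads `.env` is not described here.

```python
from pathlib import Path

# Variable names taken from the setup steps above.
MARTIAN_VARS = ("CHAT_COMPLETION_API_KEY",)
DAYTONA_VARS = ("DAYTONA_API_KEY", "DAYTONA_API_URL")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_vars(values: dict[str, str], required: tuple[str, ...]) -> list[str]:
    """Return required names that are absent, empty, or still a placeholder."""
    placeholders = {"", "your_key_here", "your_url_here"}
    return [name for name in required if values.get(name, "") in placeholders]


def check_env_file(path: str = ".env") -> list[str]:
    """Report unset Martian and Daytona variables in the given .env file."""
    env_path = Path(path)
    text = env_path.read_text() if env_path.exists() else ""
    return missing_vars(parse_env_text(text), MARTIAN_VARS + DAYTONA_VARS)
```

Calling `check_env_file()` from the directory that holds `.env` returns the names that still need values; pass only `MARTIAN_VARS` or `DAYTONA_VARS` to `missing_vars` if you use just one of the two options.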

API Usage

ARES environments use an async version of the dm_env spec. Here's a complete example:

import asyncio

from ares.code_agents import mini_swe_agent
from ares.containers import docker  # Use local Docker, or import daytona for cloud
from ares.environments import swebench_env
from ares.llms import chat_completions_compatible


async def main():
    # Create an LLM client (requires CHAT_COMPLETION_API_KEY in .env)
    llm_client = chat_completions_compatible.ChatCompletionCompatibleLLMClient(
        model="openai/gpt-4o-mini"
    )

    # Load SWE-bench tasks
    all_tasks = swebench_env.swebench_verified_tasks()
    tasks = [all_tasks[0]]  # Run on only one task for now

    # Create environment with local Docker and MiniSWE agent
    async with swebench_env.SweBenchEnv(
        tasks=tasks,
        container_factory=docker.DockerContainer,  # Use local Docker
        code_agent_factory=mini_swe_agent.MiniSWECodeAgent,
    ) as env:
        # The RL loop
        ts = await env.reset()
        while not ts.last():
            # Environment sends observation (LLM request) to agent
            action = await llm_client(ts.observation)

            # Environment processes action (LLM response) and returns next state
            ts = await env.step(action)
            print(f"Step complete. Reward: {ts.reward}")


if __name__ == "__main__":
    asyncio.run(main())

This example uses:

Container backend: Local Docker (change to daytona.DaytonaContainer for cloud)
LLM backend: Martian API (or any OpenAI-compatible API)
Code agent: MiniSWE agent from the mini-swe-agent library

Release history

0.0.2 (Jan 29, 2026)
0.0.1 (Jan 20, 2026), this version

Download files

Source Distribution

martian_ares-0.0.1.tar.gz (32.3 kB view details)

Uploaded Jan 20, 2026 Source

Built Distribution

martian_ares-0.0.1-py3-none-any.whl (31.7 kB view details)

Uploaded Jan 20, 2026 Python 3

File details

Details for the file martian_ares-0.0.1.tar.gz.

File metadata

Download URL: martian_ares-0.0.1.tar.gz
Upload date: Jan 20, 2026
Size: 32.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for martian_ares-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`b1a45dfbd20f8017e4b9852ac06757e0c0ea61589f0528168a3a918724f8cfac`
MD5	`f6a47cb60bb05da3f7df5d84a3d12f2d`
BLAKE2b-256	`8941e59fa30f197eebc22e413e47101bb913481336eb7f290939198c5b268f0a`

File details

Details for the file martian_ares-0.0.1-py3-none-any.whl.

File metadata

Download URL: martian_ares-0.0.1-py3-none-any.whl
Upload date: Jan 20, 2026
Size: 31.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for martian_ares-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bfdf50357589e871334441b44b9a1d6e617b6d89d1f84cdace9e2ef96f859654`
MD5	`bdf8e0276b4f0d952ac98b574e0a3a9e`
BLAKE2b-256	`f7052aa4344301d0cbae6f64632f547d6ebd7e1b5acd33fa936b4f6c6c4cbec9`

martian-ares 0.0.1
