
RL as a service SDK - Core AI functionality and tracing


Synth-AI SDK


Synth-AI — Reinforcement Learning-as-a-Service for agents.
Docs: Get Started →


🚀 Install version 0.2.16

pip install synth-ai
# or
uv add synth-ai

Import:

import synth_ai

CLI (with uvx):

uvx synth-ai setup
uvx synth-ai demo
uvx synth-ai deploy
uvx synth-ai run
uvx synth-ai baseline  # For coding agents: get baseline scores

Full quickstart: https://docs.usesynth.ai/sdk/get-started
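To confirm which version was actually installed, here is a minimal sketch using the standard library's importlib.metadata. It assumes the distribution name is synth-ai, as in the pip command above; the helper name installed_version is illustrative and not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "synth-ai") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


print("synth-ai:", installed_version() or "not installed")
```

Comparing the printed value with the version named in the install heading is a quick way to spot a stale environment.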


When you run uvx synth-ai setup, the SDK opens your browser to the Synth dashboard for a one-time pairing (handshake) with your signed-in session. It then detects your user and organization, ensures both API keys exist, and writes them to your project's .env file.

Fast and effective reinforcement learning for agents, via an API.
Easily scale GPU topologies, train multi-node, and integrate with existing agent software.

Highlights

  • Scale GPU topologies (A10Gs, H100s, multi-node available on request)
  • Thin FastAPI wrapper integration
  • Supports OSS models like Qwen3 (GPT-OSS GA soon)
  • Own your trained models

⚙️ Getting Started

Synth-AI ships with a built-in RL example: training Qwen3-0.6B on math reasoning.

  1. Create accounts at Synth and Modal

  2. Then run:

    uvx synth-ai demo
    uvx synth-ai setup
    uvx synth-ai deploy
    uvx synth-ai run
    
  3. To walk through your first RL run, see
    👉 Synth-AI SDK Docs
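If you would rather drive the four commands from a script, the steps above can be sketched roughly as follows. This is an illustration, not part of the SDK: run_steps and its runner parameter are hypothetical names, and because setup opens a browser the sequence still needs someone at the keyboard.

```python
import subprocess
from typing import Callable, List, Sequence

# Order taken from the numbered steps above.
STEPS = ("demo", "setup", "deploy", "run")


def run_steps(
    steps: Sequence[str] = STEPS,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> List[str]:
    """Run `uvx synth-ai <step>` for each step in order, stopping at the first failure.

    Returns the list of steps that exited with status 0.
    """
    completed: List[str] = []
    for step in steps:
        result = runner(["uvx", "synth-ai", step])
        if result.returncode != 0:
            break  # Later steps depend on earlier ones, so do not continue.
        completed.append(step)
    return completed
```

Calling run_steps() with no arguments runs the real CLI; passing a custom runner makes it possible to dry-run the sequence without touching your account.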


🤖 For Coding Agents: Get Started with Baselines

Baselines are the fastest way for coding agents to evaluate changes and measure improvement on Synth tasks.

Why Use Baselines?

Baselines provide a self-contained evaluation system:

  • No infrastructure required — runs locally, no deployed task app needed
  • Quick feedback loop — get task-by-task results in seconds
  • Compare changes — establish a baseline score before making modifications
  • Auto-discoverable — finds baseline files automatically in your codebase

Quick Start for Coding Agents

# 1. List available baselines
uvx synth-ai baseline list

# 2. Run a quick 3-task baseline to get started
uvx synth-ai baseline banking77 --split train --seeds 0,1,2

# 3. Get your baseline score (full train split) and save it
uvx synth-ai baseline banking77 --split train --output results_before.json

# 4. Make your changes to the code...

# 5. Re-run to compare performance
uvx synth-ai baseline banking77 --split train --output results_after.json
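To turn two saved runs into a before/after comparison, something like the sketch below may help. It assumes each run was saved with --output and that the JSON file holds the aggregate metrics under a top-level key; the key name aggregate_metrics is a guess, not a documented schema, so inspect a real results file and adjust it. The function names are illustrative.

```python
import json
from pathlib import Path
from typing import Dict

# ASSUMPTION: hypothetical key name; check it against a real results file.
METRICS_KEY = "aggregate_metrics"


def load_metrics(path: str, key: str = METRICS_KEY) -> Dict[str, float]:
    """Read a results file and return its numeric aggregate metrics."""
    data = json.loads(Path(path).read_text())
    metrics = data.get(key, {})
    return {k: float(v) for k, v in metrics.items() if isinstance(v, (int, float))}


def metric_deltas(before: Dict[str, float], after: Dict[str, float]) -> Dict[str, float]:
    """Return after minus before for every metric present in both runs."""
    return {name: after[name] - before[name] for name in before if name in after}
```

A positive delta on success_rate or mean_outcome_reward then indicates that the change helped on the seeds that were evaluated.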

Available Baselines

# Filter by task type
uvx synth-ai baseline list --tag rl          # RL tasks
uvx synth-ai baseline list --tag nlp         # NLP tasks
uvx synth-ai baseline list --tag vision      # Vision tasks

# Run specific baselines
uvx synth-ai baseline warming_up_to_rl       # Crafter survival game
uvx synth-ai baseline pokemon_vl             # Pokemon Red (vision)
uvx synth-ai baseline gepa                   # Banking77 classification

Baseline Results

Each baseline run provides:

  • Task-by-task results — see exactly which seeds succeed/fail
  • Aggregate metrics — success rate, mean/std rewards, total tasks
  • Serializable output — save to JSON with --output results.json
  • Model comparison — test different models with --model

Example output:

============================================================
Baseline Evaluation: Banking77 Intent Classification
============================================================
Split(s): train
Tasks: 10
Success: 8/10
Execution time: 12.34s

Aggregate Metrics:
  mean_outcome_reward: 0.8000
  success_rate: 0.8000
  total_tasks: 10
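For intuition about how the aggregate numbers relate to per-task results, here is a small sketch that recomputes them from a list of rewards. It is not the SDK's implementation: the success threshold of 1.0 and the name std_outcome_reward are assumptions made for the example.

```python
from statistics import fmean, pstdev
from typing import Dict, Sequence


def aggregate(rewards: Sequence[float], success_threshold: float = 1.0) -> Dict[str, float]:
    """Summarize per-task rewards.

    ASSUMPTION: a task counts as a success when its reward reaches the threshold.
    """
    if not rewards:
        raise ValueError("no task results to aggregate")
    successes = sum(1 for r in rewards if r >= success_threshold)
    return {
        "mean_outcome_reward": fmean(rewards),
        "std_outcome_reward": pstdev(rewards),  # population standard deviation
        "success_rate": successes / len(rewards),
        "total_tasks": len(rewards),
    }


# Eight successful and two failed tasks, matching the example output above.
summary = aggregate([1.0] * 8 + [0.0] * 2)
```

With binary rewards the mean reward and the success rate coincide, which is why both read 0.8000 in the example output.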

Creating Custom Baselines

Coding agents can create new baseline files to test custom tasks:

# my_task_baseline.py
from synth_ai.baseline import BaselineConfig, BaselineTaskRunner, DataSplit, TaskResult

class MyTaskRunner(BaselineTaskRunner):
    async def run_task(self, seed: int) -> TaskResult:
        # Your task logic here
        return TaskResult(...)

my_baseline = BaselineConfig(
    baseline_id="my_task",
    name="My Custom Task",
    description="Evaluate my custom task",
    task_runner=MyTaskRunner,
    splits={
        "train": DataSplit(name="train", seeds=list(range(10))),
    },
)

Place this file in examples/baseline/ or name it *_baseline.py for auto-discovery.
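To check which of your files satisfy those two naming rules, a rough approximation is sketched below. It only mirrors the rules as stated here (a *_baseline.py name anywhere, or any .py file in examples/baseline/); the SDK's actual discovery logic may differ, and find_baseline_files is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def find_baseline_files(root: str = ".") -> List[Path]:
    """Return .py files under root that match either discovery rule."""
    root_path = Path(root)
    # Rule 1: any file named *_baseline.py, at any depth.
    matches = set(root_path.rglob("*_baseline.py"))
    # Rule 2: any .py file placed directly in examples/baseline/.
    baseline_dir = root_path / "examples" / "baseline"
    if baseline_dir.is_dir():
        matches.update(baseline_dir.glob("*.py"))
    return sorted(matches)
```

Running it from the project root before uvx synth-ai baseline list gives a quick sanity check that a new file is named where discovery would look.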


🔐 SDK → Dashboard Pairing

When you run uvx synth-ai setup (or the legacy uvx synth-ai rl_demo setup), the SDK:

  • Opens your browser to the Synth dashboard and pairs with your signed-in session
  • Detects your user and organization automatically
  • Ensures both API keys exist
  • Writes them to your project's .env as:

    SYNTH_API_KEY=
    ENVIRONMENT_API_KEY=
    

✅ No keys printed or requested interactively — all handled via browser pairing.
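After pairing, you may want to confirm that both keys landed in .env without echoing their values. The sketch below does that with a deliberately simple KEY=VALUE parser; it does not handle export prefixes or multi-line values, and missing_env_keys is an illustrative name rather than an SDK function.

```python
from pathlib import Path
from typing import Dict, List

# Key names as written by the pairing step described above.
REQUIRED_KEYS = ("SYNTH_API_KEY", "ENVIRONMENT_API_KEY")


def missing_env_keys(env_path: str = ".env") -> List[str]:
    """Return the required keys that are absent or empty in a .env file."""
    values: Dict[str, str] = {}
    path = Path(env_path)
    if path.is_file():
        for raw in path.read_text().splitlines():
            line = raw.strip()
            # Skip blanks, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("\"'")
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

An empty list means both keys are present and non-empty; the values themselves are never printed.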

Environment overrides


📚 Documentation


🧠 Meta


