Serverless Posttraining for Agents - Core AI functionality and tracing
Project description
Synth-AI SDK
Synth-AI — Serverless Posttraining for Agents.
Docs: Get Started →
🚀 Install latest version (0.2.25.dev1)
pip install synth-ai
# or
uv add synth-ai
Import:
import synth_ai
CLI (with uvx):
uvx synth-ai setup
uvx synth-ai demo
uvx synth-ai deploy
uvx synth-ai run
uvx synth-ai baseline # For coding agents: get baseline scores
Full quickstart: https://docs.usesynth.ai/sdk/get-started
When you run uvx synth-ai setup, the SDK opens your browser to the Synth dashboard for a one‑time pairing (handshake) with your signed‑in session. The SDK will automatically:
Fast and effective serverless posttraining for agents, via an API.
Easily scale GPU topologies, train multi-node, and integrate with existing agent software.
Highlights
- Scale GPU topologies (A10Gs, H100s, multi-node available on request)
- Thin FastAPI wrapper integration
- Supports OSS models like Qwen3 (GPT-OSS GA soon)
- Own your trained models
⚙️ Getting Started
Synth-AI ships with a built-in RL example: training Qwen3-0.6B on math reasoning.
-
Then run:
uvx synth-ai demo uvx synth-ai setup uvx synth-ai deploy uvx synth-ai run
-
To walk through your first RL run, see
👉 Synth-AI SDK Docs
🤖 For Coding Agents: Get Started with Baselines
Baselines are the fastest way for coding agents to evaluate changes and measure improvement on Synth tasks.
Why Use Baselines?
Baselines provide a self-contained evaluation system that:
- ✅ No infrastructure required — runs locally, no deployed task app needed
- ✅ Quick feedback loop — get task-by-task results in seconds
- ✅ Compare changes — establish a baseline score before making modifications
- ✅ Auto-discoverable — finds baseline files automatically in your codebase
Quick Start for Coding Agents
# 1. List available baselines
uvx synth-ai baseline list
# 2. Run a quick 3-task baseline to get started
uvx synth-ai baseline banking77 --split train --seeds 0,1,2
# 3. Get your baseline score (full train split)
uvx synth-ai baseline banking77 --split train
# 4. Make your changes to the code...
# 5. Re-run to compare performance
uvx synth-ai baseline banking77 --split train --output results_after.json
Available Baselines
# Filter by task type
uvx synth-ai baseline list --tag rl # RL tasks
uvx synth-ai baseline list --tag nlp # NLP tasks
uvx synth-ai baseline list --tag vision # Vision tasks
# Run specific baselines
uvx synth-ai baseline warming_up_to_rl # Crafter survival game
uvx synth-ai baseline pokemon_vl # Pokemon Red (vision)
uvx synth-ai baseline gepa # Banking77 classification
Baseline Results
Each baseline run provides:
- Task-by-task results — see exactly which seeds succeed/fail
- Aggregate metrics — success rate, mean/std rewards, total tasks
- Serializable output — save to JSON with
--output results.json - Model comparison — test different models with
--model
Example output:
============================================================
Baseline Evaluation: Banking77 Intent Classification
============================================================
Split(s): train
Tasks: 10
Success: 8/10
Execution time: 12.34s
Aggregate Metrics:
mean_outcome_reward: 0.8000
success_rate: 0.8000
total_tasks: 10
Creating Custom Baselines
Coding agents can create new baseline files to test custom tasks:
# my_task_baseline.py
from synth_ai.baseline import BaselineConfig, BaselineTaskRunner, DataSplit, TaskResult
class MyTaskRunner(BaselineTaskRunner):
async def run_task(self, seed: int) -> TaskResult:
# Your task logic here
return TaskResult(...)
my_baseline = BaselineConfig(
baseline_id="my_task",
name="My Custom Task",
description="Evaluate my custom task",
task_runner=MyTaskRunner,
splits={
"train": DataSplit(name="train", seeds=list(range(10))),
},
)
Place this file in examples/baseline/ or name it *_baseline.py for auto-discovery.
🔐 SDK → Dashboard Pairing
When you run uvx synth-ai setup (or legacy uvx synth-ai rl_demo setup):
-
The SDK opens your browser to the Synth dashboard to pair your SDK with your signed-in session.
-
Automatically detects your user + organization
-
Ensures both API keys exist
-
Writes them to your project’s
.envas:SYNTH_API_KEY= ENVIRONMENT_API_KEY=
✅ No keys printed or requested interactively — all handled via browser pairing.
Environment overrides
SYNTH_CANONICAL_ORIGIN→ override dashboard base URL (default: https://www.usesynth.ai/dashboard)SYNTH_CANONICAL_DEV→1|true|onto use local dashboard (http://localhost:3000)
🎯 Prompt Optimization
Automatically optimize prompts for classification, reasoning, and instruction-following tasks using evolutionary algorithms. Synth supports two state-of-the-art algorithms: GEPA (Genetic Evolution of Prompt Architectures) and MIPRO (Meta-Instruction PROposer).
References:
- GEPA: Agrawal et al. (2025). "GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning." arXiv:2507.19457
- MIPRO: Opsahl-Ong et al. (2024). "Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs." arXiv:2406.11695
How It Works
Prompt optimization uses an interceptor pattern that ensures optimized prompts never reach task apps. All prompt modifications happen in the backend via an inference interceptor that substitutes prompts before they reach the LLM.
✅ CORRECT FLOW:
Backend → register_prompt → Interceptor → substitutes → LLM
❌ WRONG FLOW:
Backend → prompt_template in payload → Task App (NEVER DO THIS)
Algorithms
GEPA (Genetic Evolution of Prompt Architectures)
- Population-based evolutionary search
- LLM-guided mutations for intelligent prompt modifications
- Pareto optimization balancing performance and prompt length
- Best for: Broad exploration, diverse prompt variants, classification tasks
- Results: Improves accuracy from 60-75% (baseline) to 85-90%+ over 15 generations
MIPRO (Meta-Instruction PROposer)
- Meta-LLM (e.g., GPT-4o-mini) generates instruction variants
- TPE (Tree-structured Parzen Estimator) guides Bayesian search
- Bootstrap phase collects few-shot examples from high-scoring seeds
- Best for: Efficient optimization, task-specific improvements, faster convergence
- Results: Achieves similar accuracy gains with fewer evaluations (~96 rollouts vs ~1000 for GEPA)
Quick Start
-
Build a prompt evaluation task app
# Task app evaluates prompt performance (classification accuracy, QA correctness, etc.) -
Create a prompt learning config
[prompt_learning] algorithm = "gepa" # or "mipro" task_app_url = "https://my-task-app.modal.run" [prompt_learning.initial_prompt] messages = [ { role = "system", content = "You are a banking assistant..." }, { role = "user", pattern = "Customer Query: {query}..." } ] [prompt_learning.gepa] initial_population_size = 20 num_generations = 15
-
Launch optimization
uvx synth-ai train --type prompt_learning --config config.toml
-
Query results
from synth_ai.learning import get_prompt_text best_prompt = get_prompt_text(job_id="pl_abc123", rank=1)
Full documentation: Prompt Learning Guide →
📚 Documentation
- SDK Docs: https://docs.usesynth.ai/sdk/get-started
- Prompt Learning: https://docs.usesynth.ai/prompt-learning/overview
- CLI Reference: https://docs.usesynth.ai/cli
- API Reference: https://docs.usesynth.ai/api
- Changelog: https://docs.usesynth.ai/changelog
🧠 Meta
- Package:
synth-ai - Import:
synth_ai - Source: github.com/synth-laboratories/synth-ai
- License: MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file synth_ai-0.2.26.dev1.tar.gz.
File metadata
- Download URL: synth_ai-0.2.26.dev1.tar.gz
- Upload date:
- Size: 2.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1c79f17ac4cd78e1070bb4eb0cf092eed7176e02720cb1b3d094cee4803840f5
|
|
| MD5 |
05646537d3ff5f7c341c75bac7b5e531
|
|
| BLAKE2b-256 |
1b9dfd1a0d2cfefa0dbcaa7e29622db279ea120d20ecb4d008217a9c0a4677ec
|
File details
Details for the file synth_ai-0.2.26.dev1-py3-none-any.whl.
File metadata
- Download URL: synth_ai-0.2.26.dev1-py3-none-any.whl
- Upload date:
- Size: 3.3 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c01805894cd2e5969e721db40873a97eef3b8ccd5dfeade53c8888044bb80eed
|
|
| MD5 |
b443881b99f952680d5b15741733366d
|
|
| BLAKE2b-256 |
622761839ca3990d3184a6470e55a332c6ed7f3d85424307dadc860a5e0988f7
|