Python SDK for the OpenReward platform.
OpenReward Python SDK
The official Python SDK for OpenReward — a platform for building, hosting, and training on RL environments for language models.
The SDK has two complementary roles:
- Build environments — define evaluation tasks, expose tools, and serve them via a standards-compliant API that can be deployed on the OpenReward platform.
- Train agents — connect to any environment (local or hosted), run agent loops, and log rollouts with rewards back to OpenReward.
Installation
pip install openreward
For environments that process documents (PDF, DOCX, Excel, PowerPoint):
pip install "openreward[tools]"
Requires Python 3.11+.
Core concepts
Environment
An Environment subclass defines a benchmark or task distribution. Implement three required methods:
| Method | Purpose |
|---|---|
| list_splits() | Return split names, e.g. ["train", "test"] |
| list_tasks(split) | Return a deterministically ordered list of task dicts |
| get_prompt() | Return the task instructions as a list of TextBlock / ImageBlock |
Actions are defined as async methods decorated with @tool. Each tool receives a Pydantic model as input and returns a ToolOutput.
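To make the shape of an environment concrete, here is a self-contained sketch that deliberately does not import openreward. The Environment base class, the tool decorator, and TextBlock below are simplified stand-ins written for illustration, and making list_splits and list_tasks classmethods is an assumption. The real SDK passes a Pydantic model to each tool and returns a ToolOutput; take exact signatures from docs.openreward.ai.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Stand-in for the SDK's TextBlock (hypothetical shape)."""
    text: str


def tool(fn):
    """Stand-in for the SDK's @tool decorator: it only marks the method."""
    fn.is_tool = True
    return fn


class Environment(ABC):
    """Stand-in base class listing the three required methods."""

    def __init__(self, task: dict):
        self.task = task

    @classmethod
    @abstractmethod
    def list_splits(cls) -> list[str]: ...

    @classmethod
    @abstractmethod
    def list_tasks(cls, split: str) -> list[dict]: ...

    @abstractmethod
    def get_prompt(self) -> list[TextBlock]: ...


class AdditionEnv(Environment):
    TASKS = {
        "train": [{"a": 1, "b": 2}, {"a": 3, "b": 4}],
        "test": [{"a": 10, "b": 5}],
    }

    @classmethod
    def list_splits(cls) -> list[str]:
        return ["train", "test"]

    @classmethod
    def list_tasks(cls, split: str) -> list[dict]:
        # Return a copy in a fixed order so task indices are stable across calls.
        return list(cls.TASKS[split])

    def get_prompt(self) -> list[TextBlock]:
        return [TextBlock(f"What is {self.task['a']} + {self.task['b']}?")]

    @tool
    async def submit(self, answer: int) -> dict:
        # A plain dict keeps the sketch dependency-free; the SDK returns ToolOutput.
        correct = answer == self.task["a"] + self.task["b"]
        return {"reward": 1.0 if correct else 0.0, "finished": True}


env = AdditionEnv(AdditionEnv.list_tasks("train")[0])
result = asyncio.run(env.submit(3))
```

The point of the fixed ordering in list_tasks is that a task index refers to the same task on every call, which is what "deterministically ordered" asks for.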
ToolOutput
Every tool returns a ToolOutput containing:
- blocks: a list of TextBlock or ImageBlock results
- reward: optional float reward signal
- finished: whether the episode is complete
- metadata: optional arbitrary metadata
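The four fields can be pictured as a small record. The dataclasses below are stand-ins rather than the SDK's own types, and the defaults (reward unset, finished false, empty metadata) are assumptions made for the sketch. The episode_return helper is likewise hypothetical; it shows one way an agent loop might consume reward and finished.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class TextBlock:
    """Stand-in for the SDK's TextBlock (hypothetical shape)."""
    text: str


@dataclass
class ToolOutput:
    """Stand-in mirroring the four fields described above."""
    blocks: list[TextBlock]
    reward: Optional[float] = None
    finished: bool = False
    metadata: dict[str, Any] = field(default_factory=dict)


def episode_return(outputs: list[ToolOutput]) -> float:
    """Sum rewards up to and including the first finished step."""
    total = 0.0
    for out in outputs:
        if out.reward is not None:  # steps without a reward contribute nothing
            total += out.reward
        if out.finished:
            break  # ignore anything emitted after the episode ended
    return total


steps = [
    ToolOutput([TextBlock("ran tests")]),
    ToolOutput([TextBlock("all passed")], reward=1.0, finished=True),
    ToolOutput([TextBlock("late call")], reward=5.0),
]
total = episode_return(steps)
```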
Server
Server wraps one or more Environment classes in a FastAPI app and exposes the Open Reward Standard API over HTTP with SSE streaming.
Key endpoints:
| Endpoint | Description |
|---|---|
| POST /create | Spawn a new environment session |
| POST /{env}/call | Execute a tool (streamed via SSE) |
| GET /{env}/prompt | Get the current task prompt |
| GET /{env}/tools | List available tools |
| POST /{env}/tasks | List all tasks for a split |
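Because tool calls are streamed via SSE, a client that does not use the SDK has to split the response into events. The sketch below parses the generic text/event-stream line format using only the standard library. The event names (chunk, result) and the JSON payload fields in the sample are assumptions for illustration, not the documented wire format of the Open Reward Standard API.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {"event": str, "data": str} for each event in an SSE line stream."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)


# Hypothetical stream contents; the real payload shape may differ.
sample = [
    ": keep-alive\n",
    "event: chunk\n",
    'data: {"text": "hello"}\n',
    "\n",
    "event: result\n",
    'data: {"reward": 1.0, "finished": true}\n',
    "\n",
]
events = list(iter_sse_events(sample))
final = json.loads(events[-1]["data"])
```

In practice the lines would come from a streaming HTTP response body rather than a list.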
Sandboxes
Environments that need isolated compute (e.g. code execution) can spin up Docker containers via the sandbox API using SandboxSettings. Containers are managed automatically — started in setup() and torn down in teardown().
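The lifecycle this implies can be sketched without Docker. FakeSandbox, CodeEnv, and run_session below are hypothetical stand-ins, not SDK classes, and SandboxSettings is not modelled. The sketch only illustrates the pairing of setup() and teardown(), with teardown guaranteed to run even when a tool call fails.

```python
import asyncio


class FakeSandbox:
    """Stand-in for a container handle; records lifecycle events only."""

    def __init__(self, image: str):
        self.image = image
        self.log: list[str] = []

    async def start(self) -> None:
        self.log.append("started")

    async def run(self, command: str) -> str:
        if command == "boom":
            raise RuntimeError("command failed")
        self.log.append(f"ran {command}")
        return f"(output of {command})"

    async def stop(self) -> None:
        self.log.append("stopped")


class CodeEnv:
    """Stand-in environment that owns one sandbox per session."""

    def __init__(self):
        self.sandbox = FakeSandbox(image="python:3.11-slim")

    async def setup(self) -> None:
        await self.sandbox.start()

    async def teardown(self) -> None:
        await self.sandbox.stop()


async def run_session(env: CodeEnv, command: str) -> str:
    await env.setup()
    try:
        return await env.sandbox.run(command)
    finally:
        # Runs even if the command raises, so the container is not leaked.
        await env.teardown()


env = CodeEnv()
output = asyncio.run(run_session(env, "pytest"))
```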
Toolsets
Group reusable tools into Toolset classes and compose them across environments via the toolsets class attribute.
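As a rough picture of what composition through a toolsets attribute means, the sketch below gathers marked methods from an environment class and from each class it lists. The tool decorator, the plain toolset classes, and collect_tools are all stand-ins invented for this example; the SDK's Toolset base class and its discovery mechanism may work differently.

```python
def tool(fn):
    """Stand-in for the SDK's @tool decorator: it only marks the method."""
    fn.is_tool = True
    return fn


class FileTools:
    """Hypothetical reusable toolset."""

    @tool
    async def read_file(self, path: str) -> str: ...


class SearchTools:
    """Hypothetical reusable toolset."""

    @tool
    async def search(self, query: str) -> str: ...


class ResearchEnv:
    # Reuse both toolsets and add one environment-specific tool.
    toolsets = [FileTools, SearchTools]

    @tool
    async def submit(self, answer: str) -> str: ...


def collect_tools(env_cls) -> list[str]:
    """Return the sorted names of tools defined on env_cls and its toolsets."""
    names = []
    for cls in (env_cls, *getattr(env_cls, "toolsets", [])):
        for name, member in vars(cls).items():
            if getattr(member, "is_tool", False):
                names.append(name)
    return sorted(names)


available = collect_tools(ResearchEnv)
```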
Rollout logging
Log agent trajectories with reward signals back to OpenReward for analysis and training. The client's rollout API supports normalized message types as well as raw outputs from Anthropic, OpenAI, and Google GenAI SDKs.
Scaffolding a new environment
The orwd CLI generates a project skeleton:
# Minimal environment
orwd init my-env
# Environment with a Docker sandbox for code execution
orwd init my-env --template sandbox
Deploying to OpenReward
- Push your environment to a GitHub repository.
- Connect the repository in the OpenReward dashboard.
- Configure compute resources (CPU, memory, scaling).
- Every push to the connected branch triggers an automatic build and deployment.
Your environment is then accessible to any agent via the OpenReward API using the username/environment-name namespace.
Environment variables
| Variable | Description |
|---|---|
| OPENREWARD_API_KEY | API key for authentication |
| OPENREWARD_URL | Override base URL (default: https://openreward.ai) |
| OPENREWARD_USE_STRUCTURED_LOGS | Set to 1 for JSON logging (recommended in production) |
| OPENREWARD_ROLLOUT_LOGGING_FORMAT | pretty or structured for rollout log output |
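The SDK reads these variables itself, so no extra code is needed to use them. The sketch below only spells out the semantics in the table using the standard library. Settings and load_settings are hypothetical names, and leaving the rollout format unset when the variable is absent is an assumption, since the table does not state a default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]
    base_url: str
    structured_logs: bool
    rollout_format: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the four OPENREWARD_* variables from a mapping (os.environ by default)."""
    return Settings(
        api_key=env.get("OPENREWARD_API_KEY"),
        # The default base URL is the one given in the table.
        base_url=env.get("OPENREWARD_URL", "https://openreward.ai"),
        # Only the literal value "1" switches JSON logging on.
        structured_logs=env.get("OPENREWARD_USE_STRUCTURED_LOGS") == "1",
        rollout_format=env.get("OPENREWARD_ROLLOUT_LOGGING_FORMAT"),
    )


settings = load_settings(
    {"OPENREWARD_API_KEY": "example-key", "OPENREWARD_USE_STRUCTURED_LOGS": "1"}
)
```

Passing a mapping instead of reading os.environ directly keeps the function easy to exercise in tests.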
Documentation
Full documentation, guides, and examples are at docs.openreward.ai.
License
Apache 2.0