Python SDK for the OpenReward platform.

OpenReward Python SDK

The official Python SDK for OpenReward — a platform for building, hosting, and training on RL environments for language models.

The SDK has two complementary roles:

  • Build environments — define evaluation tasks, expose tools, and serve them via a standards-compliant API that can be deployed on the OpenReward platform.
  • Train agents — connect to any environment (local or hosted), run agent loops, and log rollouts with rewards back to OpenReward.

Installation

pip install openreward

For environments that process documents (PDF, DOCX, Excel, PowerPoint), install the tools extra:

pip install "openreward[tools]"

Requires Python 3.11+.

Core concepts

Environment

An Environment subclass defines a benchmark or task distribution. Implement three required methods:

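| Method            | Purpose                                                          |
|-------------------|------------------------------------------------------------------|
| list_splits()     | Return split names, e.g. ["train", "test"]                       |
| list_tasks(split) | Return a deterministically ordered list of task dicts            |
| get_prompt()      | Return the task instructions as a list of TextBlock / ImageBlock |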

Actions are defined as async methods decorated with @tool. Each tool receives a Pydantic model as input and returns a ToolOutput.

ToolOutput

Every tool returns a ToolOutput containing:

  • blocks — a list of TextBlock or ImageBlock results
  • reward — optional float reward signal
  • finished — whether the episode is complete
  • metadata — optional arbitrary metadata

Server

Server wraps one or more Environment classes in a FastAPI app and exposes the Open Reward Standard API over HTTP with SSE streaming.

Key endpoints:

| Endpoint           | Description                       |
|--------------------|-----------------------------------|
| POST /create       | Spawn a new environment session   |
| POST /{env}/call   | Execute a tool (streamed via SSE) |
| GET /{env}/prompt  | Get the current task prompt       |
| GET /{env}/tools   | List available tools              |
| POST /{env}/tasks  | List all tasks for a split        |
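
Because tool calls are streamed via SSE, a client that does not use the SDK has to split the response into events. The following is a minimal sketch of a generic Server-Sent Events parser operating on already-decoded lines. The event names (chunk, done) and JSON payloads in the sample are invented for illustration; the actual event schema of the API is not described here.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of decoded lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            name, _, value = line.partition(":")
            value = value.removeprefix(" ")  # one optional space after the colon
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)  # multiple data lines are joined with "\n"


# Hypothetical stream; real event names and payloads may differ.
sample = [
    ": keep-alive\n",
    "event: chunk\n",
    'data: {"text": "hello"}\n',
    "\n",
    "event: done\n",
    'data: {"reward": 1.0, "finished": true}\n',
    "\n",
]
events = [(e["event"], json.loads(e["data"])) for e in iter_sse_events(sample)]
```

In practice the lines would come from a streaming HTTP response rather than a list; the parser only needs an iterable of text lines.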

Sandboxes

Environments that need isolated compute (e.g. code execution) can spin up Docker containers via the sandbox API using SandboxSettings. Containers are managed automatically: they are started in setup() and torn down in teardown().
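
The lifecycle ordering can be illustrated with a small sketch. FakeSandbox and CodeEnv below are hypothetical stand-ins (no Docker is involved, and SandboxSettings is not modeled), and whether the real setup() and teardown() are async or take arguments is an assumption. The point is only that teardown runs even when a tool call fails.

```python
import asyncio


class FakeSandbox:
    """Hypothetical stand-in for a managed container."""

    def __init__(self) -> None:
        self.running = False

    async def start(self) -> None:
        self.running = True

    async def stop(self) -> None:
        self.running = False

    async def run(self, command: str) -> str:
        if not self.running:
            raise RuntimeError("sandbox is not running")
        return f"ran: {command}"


class CodeEnv:
    def __init__(self) -> None:
        self.sandbox = FakeSandbox()

    async def setup(self) -> None:
        await self.sandbox.start()  # container comes up before any tool call

    async def teardown(self) -> None:
        await self.sandbox.stop()  # container is released at session end


async def run_session(env: CodeEnv, command: str) -> str:
    await env.setup()
    try:
        return await env.sandbox.run(command)
    finally:
        await env.teardown()  # runs even if the tool call raises


env = CodeEnv()
output = asyncio.run(run_session(env, "pytest -q"))
```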

Toolsets

Group reusable tools into Toolset classes and compose them across environments via the toolsets class attribute.
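
One way to picture this composition is sketched below. The Toolset base class, the tool decorator, and the all_tool_names helper are hypothetical stand-ins written for this example; only the idea of a toolsets class attribute comes from the description above, and the SDK's own discovery mechanism may work differently.

```python
def tool(fn):
    """Stand-in decorator: marks a method as an agent-callable tool."""
    fn.is_tool = True
    return fn


class Toolset:
    """Hypothetical base class for a reusable group of tools."""

    @classmethod
    def tool_names(cls) -> list[str]:
        return sorted(
            name for name, member in vars(cls).items()
            if getattr(member, "is_tool", False)
        )


class FileTools(Toolset):
    @tool
    async def read_file(self, path: str) -> str: ...

    @tool
    async def write_file(self, path: str, content: str) -> str: ...


class SearchTools(Toolset):
    @tool
    async def grep(self, pattern: str) -> str: ...


class CodingEnv:
    # Reusable groups are attached declaratively via a class attribute.
    toolsets = [FileTools, SearchTools]

    @tool
    async def submit(self, answer: str) -> str: ...

    @classmethod
    def all_tool_names(cls) -> list[str]:
        own = [n for n, m in vars(cls).items() if getattr(m, "is_tool", False)]
        shared = [n for ts in cls.toolsets for n in ts.tool_names()]
        return sorted(own + shared)


names = CodingEnv.all_tool_names()
```

A second environment could list only SearchTools in its toolsets attribute and reuse grep without redefining it.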

Rollout logging

Log agent trajectories with reward signals back to OpenReward for analysis and training. The client's rollout API supports normalized message types as well as raw outputs from Anthropic, OpenAI, and Google GenAI SDKs.

Scaffolding a new environment

The orwd CLI generates a project skeleton:

# Minimal environment
orwd init my-env

# Environment with a Docker sandbox for code execution
orwd init my-env --template sandbox

Deploying to OpenReward

  1. Push your environment to a GitHub repository.
  2. Connect the repository in the OpenReward dashboard.
  3. Configure compute resources (CPU, memory, scaling).
  4. Every push to the connected branch triggers an automatic build and deployment.

Your environment is then accessible to any agent via the OpenReward API using the username/environment-name namespace.

Environment variables

| Variable                          | Description                                           |
|-----------------------------------|-------------------------------------------------------|
| OPENREWARD_API_KEY                | API key for authentication                            |
| OPENREWARD_URL                    | Override base URL (default: https://openreward.ai)    |
| OPENREWARD_USE_STRUCTURED_LOGS    | Set to 1 for JSON logging (recommended in production) |
| OPENREWARD_ROLLOUT_LOGGING_FORMAT | pretty or structured for rollout log output           |

Documentation

Full documentation, guides, and examples are at docs.openreward.ai.

License

Apache 2.0
