hud-python

SDK for the HUD platform.

Project description

HUD is a platform for building RL environments for AI agents, across coding, browser, computer-use, and robotics. Define an environment, write tasks, and run them as evals and training across any model, at any scale.

To learn more, see the documentation and environment reference.

Install

# Install the CLI (recommended)
uv tool install hud-python --python 3.12

# …or as a library
pip install hud-python

Get your API key at hud.ai/project/api-keys and set it:

hud set HUD_API_KEY=your-key-here
# or: export HUD_API_KEY=your-key-here

Then scaffold your first environment:

hud init my-env

Figure: Agent running on SheetBench.

The protocol

HUD is protocol-first. An agent and an environment exchange just three things: a manifest (the environment's capabilities and tasks), a tasks.start call that returns the prompt, and a tasks.grade call that returns the reward. In between, the agent works on its own, driving the capabilities directly. HUD owns only that thin envelope, so any model or harness can plug into any environment.

sequenceDiagram
    participant Agent
    participant Env as Environment
    participant Caps as Capabilities (ssh · mcp · cdp · rfb · robot)
    Agent->>Env: manifest exchange
    Env-->>Agent: capabilities + tasks
    Agent->>Env: tasks.start
    Env-->>Agent: prompt
    rect rgb(238,238,238)
    Note over Agent,Caps: the agent works, driving capabilities directly
    Agent->>Caps: shell · browser · GUI · tools · robot
    Caps-->>Agent: observations
    end
    Agent->>Env: tasks.grade
    Env-->>Agent: reward

Because the protocol only exposes capabilities (never a fixed agent), an environment outlives any single harness: new harnesses and models keep running against the same environments, benchmarks, and tasks.
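
As a rough illustration of the envelope described above, the sketch below models the three exchanges with a toy in-memory environment. The names ToyEnvironment, manifest, start, grade, and run_episode, as well as the dictionary shapes, are hypothetical stand-ins chosen for this example; they are not the HUD wire format or SDK API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnvironment:
    """Hypothetical in-memory stand-in for an environment; not the HUD SDK."""

    capabilities: list[str] = field(default_factory=lambda: ["ssh"])
    # task name -> (prompt, expected answer)
    tasks: dict[str, tuple[str, str]] = field(
        default_factory=lambda: {"add": ("What is 2 + 3?", "5")}
    )

    def manifest(self) -> dict:
        # Exchange 1: advertise capabilities and task names.
        return {"capabilities": self.capabilities, "tasks": sorted(self.tasks)}

    def start(self, task: str) -> str:
        # Exchange 2: the tasks.start step returns the prompt.
        return self.tasks[task][0]

    def grade(self, task: str, answer: str) -> float:
        # Exchange 3: the tasks.grade step returns the reward.
        return 1.0 if answer.strip() == self.tasks[task][1] else 0.0


def run_episode(env: ToyEnvironment, task: str, solve) -> float:
    """Drive one episode; `solve` is any callable mapping prompt -> answer."""
    assert task in env.manifest()["tasks"]
    prompt = env.start(task)
    answer = solve(prompt)  # the agent does its own work here
    return env.grade(task, answer)


reward = run_episode(ToyEnvironment(), "add", lambda prompt: "5")
print(reward)
```

Because run_episode only depends on the three calls, any solve callable can be swapped in without touching the environment, which is the property the protocol is designed around.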

Package & run anywhere

A built image is the end product for your tasks: one build packs every task from a single definition. The recommended path is hud deploy, which builds and registers your environment on HUD in one step. After that, sync a taskset and run it remotely:

hud deploy
hud sync tasks my-taskset
hud eval my-taskset --remote

For local iteration, the same protocol works against a container on your laptop:

docker build -f Dockerfile.hud -t my-env .
docker run -d --name run1 -p 8765:8765 my-env
hud task start fix_bug --url tcp://127.0.0.1:8765
hud task grade fix_bug --url tcp://127.0.0.1:8765 --answer "..."
docker rm -f run1
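
When scripting the local loop, it can help to wait until the container's port accepts connections before calling hud task start. The helper below is a minimal standard-library sketch; wait_for_port is a hypothetical name, and treating "port is open" as "environment is ready" is an assumption, not something the HUD docs state.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not up yet; retry shortly
    return False


# Example use after `docker run ... -p 8765:8765 my-env`:
#     if not wait_for_port("127.0.0.1", 8765):
#         raise SystemExit("environment did not come up in time")
```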

→ Run & deploy

Environments & templates

A template is an async generator registered with @env.template(): it yields a prompt, receives the agent's answer, then yields a reward. Calling the template mints a runnable Task, so one function can span a whole dataset of variants. The simplest template needs no capabilities, just a prompt and a grader:

from hud import Environment

env = Environment(name="letter-count")

@env.template()
async def count_letter(word: str = "strawberry", letter: str = "r"):
    answer = yield f"How many '{letter}'s are in '{word}'? Reply with just the number."
    yield 1.0 if answer and str(word.count(letter)) in answer else 0.0

tasks = [count_letter(word=w) for w in ("strawberry", "raspberry", "blueberry")]

Run it immediately against any model:

hud eval tasks.py claude --group 3

Each graded evaluation is a trace (the SDK's live handle is a Run). With HUD_API_KEY set, every rollout is recorded on hud.ai. Tasks that need a shell, browser, GUI, or robot declare capabilities (below); everything else — variants, grading, batching — stays identical.
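
To make the two-yield shape concrete, the sketch below drives an equivalent plain async generator by hand, using only the standard library. It illustrates the async-generator mechanics and is not a description of how the HUD runtime invokes templates: drive is a hypothetical helper, and the @env.template() decorator is deliberately left out so the example runs without the SDK.

```python
import asyncio


async def count_letter(word: str = "strawberry", letter: str = "r"):
    # First yield hands out the prompt and receives the answer via asend().
    answer = yield f"How many '{letter}'s are in '{word}'? Reply with just the number."
    # Second yield hands back the reward.
    yield 1.0 if answer and str(word.count(letter)) in answer else 0.0


async def drive(template, answer: str) -> tuple[str, float]:
    """Hypothetical driver: fetch the prompt, send an answer, collect the reward."""
    prompt = await template.__anext__()
    reward = await template.asend(answer)
    await template.aclose()
    return prompt, reward


prompt, reward = asyncio.run(drive(count_letter(word="raspberry"), "3"))
print(prompt)
print(reward)
```

Note that the grader uses a substring check, so an answer such as "13" would also match an expected count of 3. A stricter grader could compare the stripped answer for equality.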

→ Quickstart · Tasks & tasksets

Capabilities & harnesses

A capability is a connection the environment exposes; a harness attaches its own tools to it. The same environment serves a one-shot Q&A or a full computer-use rollout, depending on which capabilities the harness opens.

Protocol	What it exposes
`ssh`	Shell + files in a sandboxed workspace (`env.workspace(root)`)
`mcp`	Tools over the Model Context Protocol
`cdp`	Browser control over the Chrome DevTools Protocol
`rfb`	Full computer-use over VNC: screen + keyboard/mouse
`robot` (beta)	Schema-driven robot observation/action loop over WebSocket

Ships natively: Claude, OpenAI (Responses), OpenAI-compatible endpoints, and Gemini via create_agent("claude-sonnet-4-5") (or gpt-…, gemini-…). The harness wires capability-backed tools for the model you choose at run time.

Bring your own: a harness attaches to a capability and defines a tool spec — wrap browser-use on cdp, a VLA policy on robot, or your own agent on ssh / mcp. No protocol work required.
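
One way to picture the split between capabilities and harnesses is as a lookup: the environment exposes protocols, and the harness attaches tools only to the ones it knows how to drive. In the sketch below the protocol names come from the table above, while the tool names and the attach_tools helper are hypothetical and do not correspond to any HUD API.

```python
# Protocol names are from the capability table; tool names are made up.
TOOLS_BY_PROTOCOL = {
    "ssh": ["run_shell", "read_file", "write_file"],
    "mcp": ["call_mcp_tool"],
    "cdp": ["navigate", "click", "type_text"],
    "rfb": ["screenshot", "move_mouse", "press_key"],
    "robot": ["observe", "act"],
}


def attach_tools(exposed: list[str], supported: set[str]) -> dict[str, list[str]]:
    """Return the tools a harness would attach: one entry per capability
    that the environment exposes and the harness knows how to drive."""
    return {
        protocol: TOOLS_BY_PROTOCOL[protocol]
        for protocol in exposed
        if protocol in supported and protocol in TOOLS_BY_PROTOCOL
    }


exposed = ["ssh", "cdp"]
# A one-shot Q&A harness opens nothing; a browser harness opens only cdp.
print(attach_tools(exposed, supported=set()))
print(attach_tools(exposed, supported={"cdp", "rfb"}))
```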

→ Capabilities · Models · Robots

Deploy on the platform

From the platform UI you can run batches, compare models on the same taskset, and inspect every trace.

→ Run & deploy

Train on rewards

Every rollout returns a Run carrying a trace_id and a reward, so the tasks you evaluate are already training data. Run a group per task and pass the graded runs to TrainingClient.step():

import asyncio

from hud import TrainingClient
from hud.agents import create_agent
from hud.eval import Job


async def main() -> None:
    # Request token IDs with each completion via the extra request body.
    agent = create_agent("arith-rl", completion_kwargs={"extra_body": {"return_token_ids": True}})
    trainer = TrainingClient("arith-rl")
    taskset, runtime = ...  # your Taskset and where rollouts run

    session = await Job.start("arith-rl", group=8)
    start = len(session.runs)  # remember where this batch begins
    await taskset.run(agent, runtime=runtime, group=8, job=session)
    # Train only on the runs added to the session since `start`.
    await trainer.step(session.runs[start:], learning_rate=1e-5, group_size=8)


asyncio.run(main())

HUD is the environment-and-reward source for your own GRPO/PPO loop — the same environment trains any model, text or multimodal, unchanged.
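
For readers wiring rewards into their own loop, the sketch below shows the group-relative advantage normalization commonly used in GRPO-style training: each rollout's reward is centered and scaled by the statistics of its own group. This is a generic illustration under that assumption; it does not describe what TrainingClient.step() computes internally, and group_advantages is a hypothetical helper.

```python
from statistics import mean, pstdev


def group_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one task's group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Eight rollouts of the same task (a group of 8) with binary rewards.
rewards = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]
advantages = group_advantages(rewards)
print([round(a, 3) for a in advantages])
```

If every rollout in a group earns the same reward, all advantages are zero and the group contributes no gradient signal, which is one reason tasks with mixed outcomes are more useful for training.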

→ Training · Designing tasks for signal

Enterprise

Building agents at scale? We work with teams on custom environments, benchmarks, and training.

📅 Book a call · 📧 founders@hud.ai

Contributing

We welcome contributions! See CONTRIBUTING.md.

Key areas: Agents · Environments · Capabilities · Eval

Citation

@software{hud2025agentevalplatform,
  author = {HUD and Jay Ram and Lorenss Martinsons and Parth Patel and Govind Pimpale and Dylan Bowman and Jaideep Chawla and Nguyen Nhat Minh},
  title  = {HUD: An Evaluation and RL Environments Platform for Agents},
  date   = {2025-04},
  url    = {https://github.com/hud-evals/hud-python},
  langid = {en}
}

MIT License · LICENSE

Release history

0.6.8.dev1 pre-release

Jun 25, 2026

0.6.8.dev0 pre-release (this version)

Jun 25, 2026

0.6.7

Jun 21, 2026

0.6.6

Jun 20, 2026

0.6.5

Jun 20, 2026

0.6.4

Jun 20, 2026

0.6.3

Jun 20, 2026

0.6.2

Jun 20, 2026

0.6.1

Jun 19, 2026

0.6.0

Jun 19, 2026

0.5.41

Apr 28, 2026

0.5.40

Apr 26, 2026

0.5.39

Apr 20, 2026

0.5.38 yanked

Apr 19, 2026

0.5.37 yanked

Apr 16, 2026

0.5.36 yanked

Apr 16, 2026

0.5.35 yanked

Apr 7, 2026

0.5.34 yanked

Mar 24, 2026

0.5.33 yanked

Mar 17, 2026

0.5.32 yanked

Mar 17, 2026

0.5.31 yanked

Mar 13, 2026

0.5.30 yanked

Mar 11, 2026

0.5.29

Feb 28, 2026

0.5.28

Feb 26, 2026

0.5.27

Feb 22, 2026

0.5.26

Feb 19, 2026

0.5.25

Feb 17, 2026

0.5.24

Feb 13, 2026

0.5.23

Feb 12, 2026

0.5.22

Feb 9, 2026

0.5.21

Feb 8, 2026

0.5.20

Feb 7, 2026

0.5.19

Feb 7, 2026

0.5.18

Feb 3, 2026

0.5.17

Jan 29, 2026

0.5.16

Jan 26, 2026

0.5.15

Jan 22, 2026

0.5.14

Jan 21, 2026

0.5.13

Jan 18, 2026

0.5.12

Jan 16, 2026

0.5.11

Jan 15, 2026

0.5.10

Jan 15, 2026

0.5.9

Jan 14, 2026

0.5.8

Jan 13, 2026

0.5.7

Jan 13, 2026

0.5.6

Jan 12, 2026

0.5.5

Jan 11, 2026

0.5.4

Jan 9, 2026

0.5.3

Jan 9, 2026

0.5.2

Jan 7, 2026

0.5.1

Jan 2, 2026

0.5.0

Dec 17, 2025

0.4.74

Dec 12, 2025

0.4.73

Dec 8, 2025

0.4.72

Dec 7, 2025

0.4.71

Dec 7, 2025

0.4.70

Dec 5, 2025

0.4.69

Dec 1, 2025

0.4.68

Nov 28, 2025

0.4.67

Nov 22, 2025

0.4.66

Nov 21, 2025

0.4.65

Nov 20, 2025

0.4.64

Nov 20, 2025

0.4.63

Nov 20, 2025

0.4.62

Nov 8, 2025

0.4.61

Nov 7, 2025

0.4.60

Nov 3, 2025

0.4.59

Oct 29, 2025

0.4.58

Oct 25, 2025

0.4.57

Oct 24, 2025

0.4.56

Oct 23, 2025

0.4.55

Oct 23, 2025

0.4.54

Oct 20, 2025

0.4.53

Oct 12, 2025

0.4.52

Oct 2, 2025

0.4.51

Oct 1, 2025

0.4.50

Oct 1, 2025

0.4.49

Oct 1, 2025

0.4.48

Oct 1, 2025

0.4.47

Sep 26, 2025

0.4.46

Sep 26, 2025

0.4.45

Sep 26, 2025

0.4.44

Sep 24, 2025

0.4.43

Sep 24, 2025

0.4.42

Sep 24, 2025

0.4.41

Sep 24, 2025

0.4.40

Sep 23, 2025

0.4.39

Sep 23, 2025

0.4.38

Sep 23, 2025

0.4.37

Sep 23, 2025

0.4.36

Sep 22, 2025

0.4.35

Sep 22, 2025

0.4.34

Sep 20, 2025

0.4.33

Sep 19, 2025

0.4.32

Sep 19, 2025

0.4.31

Sep 19, 2025

0.4.30

Sep 18, 2025

0.4.29

Sep 18, 2025

0.4.28

Sep 18, 2025

0.4.27

Sep 17, 2025

0.4.26

Sep 14, 2025

0.4.25

Sep 13, 2025

0.4.24

Sep 12, 2025

0.4.23

Sep 12, 2025

0.4.22

Sep 11, 2025

0.4.21

Sep 9, 2025

0.4.20

Sep 8, 2025

0.4.19

Sep 7, 2025

0.4.18

Sep 5, 2025

0.4.17

Aug 31, 2025

0.4.16

Aug 30, 2025

0.4.15

Aug 30, 2025

0.4.14

Aug 27, 2025

0.4.13

Aug 27, 2025

0.4.12

Aug 27, 2025

0.4.11

Aug 26, 2025

0.4.10

Aug 26, 2025

0.4.9

Aug 26, 2025

0.4.8

Aug 26, 2025

0.4.7

Aug 26, 2025

0.4.6

Aug 26, 2025

0.4.5

Aug 26, 2025

0.4.4

Aug 26, 2025

0.4.3

Aug 26, 2025

0.4.2

Aug 26, 2025

0.4.1

Aug 24, 2025

0.4.0

Aug 24, 2025

0.3.5

Aug 5, 2025

0.3.4

Aug 5, 2025

0.3.3

Aug 5, 2025

0.3.2

Aug 5, 2025

0.3.1

Aug 5, 2025

0.3.0

Aug 2, 2025

0.2.10

Jul 21, 2025

0.2.9

Jul 21, 2025

0.2.8

Jul 17, 2025

0.2.7

Jun 24, 2025

0.2.6

May 28, 2025

0.2.5

May 26, 2025

0.2.4

May 6, 2025

0.2.3

May 6, 2025

0.2.2

Apr 29, 2025

0.2.1

Apr 26, 2025

0.2.0

Apr 18, 2025

0.1.5

Apr 6, 2025

0.1.4

Apr 2, 2025

0.1.3

Mar 31, 2025

0.1.2a0 pre-release

Mar 30, 2025

0.1.1

Mar 30, 2025

0.1.0b3 pre-release

Mar 10, 2025

0.1.0b2 pre-release

Mar 4, 2025

Download files

Download the file for your platform.

Source Distribution

hud_python-0.6.8.dev0.tar.gz (338.6 kB)

Uploaded Jun 25, 2026 Source

Built Distribution

hud_python-0.6.8.dev0-py3-none-any.whl (435.3 kB)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file hud_python-0.6.8.dev0.tar.gz.

File metadata

Download URL: hud_python-0.6.8.dev0.tar.gz
Upload date: Jun 25, 2026
Size: 338.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.24 (uv publish, Ubuntu 24.04, CI)

File hashes

Hashes for hud_python-0.6.8.dev0.tar.gz
Algorithm	Hash digest
SHA256	`5f3e7c39b80bbacd9dd20d67b021c1ca84fff9802e47dc693de1df030f18a8bb`
MD5	`dfe30c32ac97eb75c97e697395c7c22b`
BLAKE2b-256	`69485a9110382b4512f43cb57ddcdc65b8c7142d29767c3713938e4b310ade71`


File details

Details for the file hud_python-0.6.8.dev0-py3-none-any.whl.

File metadata

Download URL: hud_python-0.6.8.dev0-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 435.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.24 (uv publish, Ubuntu 24.04, CI)

File hashes

Hashes for hud_python-0.6.8.dev0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9a35173ac01b796ce2c3aba4880a93c21546ff9c6e1e518cd07ba58ff020b59d`
MD5	`08f2e144b8dddb336f2c33c71ef50e9b`
BLAKE2b-256	`4c003ceb7c958830c7c4ad7d8e0e85ad1643a7306d2b298448a5073a8a398f2d`
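
The published digests can be checked against a downloaded file before installing it. The sketch below uses only the standard library and the SHA256 values listed in the hash tables above; sha256_of and verify are hypothetical helper names, and the file paths are whatever your download step produced.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the hash tables for this release.
EXPECTED_SHA256 = {
    "hud_python-0.6.8.dev0.tar.gz": "5f3e7c39b80bbacd9dd20d67b021c1ca84fff9802e47dc693de1df030f18a8bb",
    "hud_python-0.6.8.dev0-py3-none-any.whl": "9a35173ac01b796ce2c3aba4880a93c21546ff9c6e1e518cd07ba58ff020b59d",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA256 so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path) -> bool:
    """True only when the file's digest matches the published one for its name."""
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected


# Example use:
#     verify(Path("dist/hud_python-0.6.8.dev0-py3-none-any.whl"))
```

Files with an unrecognized name are rejected rather than silently accepted, so a typo in the path fails closed.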


hud-python 0.6.8.dev0
