Skip to main content

Harness Gym — environments for agents whose policy boundary is the tool surface

Project description

hgym — Harness Gym

Environments for agents whose policy boundary is the tool surface.

hgym is a fresh start of LLM Gym, rebuilt around one idea: the tool surface — the set of tools an agent can call — is the program. Same environment, same model: {search, compose, terminate} is one research program; {search, compose, draft, critique, revise, terminate} is draft-and-revise; {answer, terminate} is one-shot. Surfaces compose as config diffs, not Python forks, which makes them a target an optimizer can range over.

Environments are ToolUsingEnvs: a task loader, an initial observation, a pure verifier, and a set of MCP servers. Tools are MCP servers (in-process, stdio, or HTTP); episodes are session-keyed for safe concurrent rollouts; termination is a reserved terminate tool or the horizon; verification is a pure function over the recorded trajectory.

Design goals:

  • Zero infrastructure. pip install hgym, export an API key, run an episode. No Docker, no databases, no gateway servers. Traces are local JSONL files.
  • Provider-neutral. A thin model-client seam speaking the OpenAI-compatible wire schema; bring any provider, proxy, or local server.
  • The optimizer workflow is first-class. Export an environment's harness as an editable directory (templates, model, tool manifest), let a human or an agent edit it, re-run, score.

Status

Pre-alpha. The core engine is ported and under active development; see the spec and roadmap. Not yet ready for use.

Quickstart (target API)

import hgym

rollouts = await hgym.run_episodes(
    env_name="wordle_v1",
    model="openai/gpt-5.2-mini",
    num_tasks=50,
)

License

Apache-2.0. Portions derived from llmgym (© TensorZero, Apache-2.0) — see NOTICE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hgym-0.0.1.tar.gz (158.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hgym-0.0.1-py3-none-any.whl (78.3 kB view details)

Uploaded Python 3

File details

Details for the file hgym-0.0.1.tar.gz.

File metadata

  • Download URL: hgym-0.0.1.tar.gz
  • Upload date:
  • Size: 158.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.20 {"installer":{"name":"uv","version":"0.11.20","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for hgym-0.0.1.tar.gz
Algorithm Hash digest
SHA256 3ce3046bd77eb5570878d6f512b6c0dd02e3afa46d01efebd85ae1e41493f347
MD5 f97ffc438ddcc8831b6b6e953da6c531
BLAKE2b-256 60158d6733d7f44e6799c83b6a89330b7d7e907268be018c47aa965b60108b37

See more details on using hashes here.

File details

Details for the file hgym-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: hgym-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 78.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.20 {"installer":{"name":"uv","version":"0.11.20","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for hgym-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 7ab8ace84582f6800d0153593515a774f3a07126986c92afea0bbf0c7d360172
MD5 4b56fff861bc0179640d7929a9f9401a
BLAKE2b-256 c5c15676e96227ced05556b20040ce7cfdb1af81b3138848982f09f4fc7130c7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page