
Inference runtime for Causal-GPT-RL policy bundles


Causal GPT RL

Public inference runtime for Causal-GPT-RL policies.

This repository contains the code needed to load a policy bundle and run it in an environment. It is intentionally focused on inference: model construction from bundle config, bundle loading, action decoding, rolling context state, optional state normalization, and simple evaluation helpers.

Model creation workflows and experiment infrastructure are outside this runtime boundary.

What This Package Provides

  • causal_gpt_rl.model: autoregressive policy model definitions and JSON-safe state/action specs.
  • causal_gpt_rl.inference.load_runner(...): load an exported bundle into a ready-to-run PolicyRunner.
  • PolicyRunner: step-wise policy execution with reset(...), act(...), and observe(...).
  • run_episodes(...): small single-environment evaluation helper.
  • export_bundle(...): write a public inference bundle from an in-memory model.
  • convert_legacy_bundle_to_safetensors(...): migrate old .pt bundle weights to safetensors.

Bundle Format

A deployment bundle is a single directory:

bundle/
  model.safetensors
  config.json
  state_normalizer.safetensors  # optional

model.safetensors contains only the model state dict needed for inference. config.json contains public metadata required to reconstruct the runner: model config, observation specs, action specs, and context length. state_normalizer.safetensors is optional and stores state normalization statistics when the policy expects normalized observations.

The bundle does not include experiment metadata or development-only state.
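To make the layout concrete, the sketch below writes and reads back a minimal config.json in this bundle shape. The field names and values (model_config, observation_specs, action_specs, and the spec contents) are illustrative assumptions based on the description above; the real schema is whatever export_bundle(...) writes.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical config.json contents matching the fields described above.
# The actual schema is defined by export_bundle(...) and may differ.
config = {
    "model_config": {"n_layer": 4, "n_head": 4, "n_embd": 128},
    "observation_specs": [{"name": "state", "shape": [17], "dtype": "float32"}],
    "action_specs": [{"name": "action", "shape": [6], "low": -1.0, "high": 1.0}],
    "context_length": 64,
}

bundle_dir = Path(tempfile.mkdtemp()) / "bundle"
bundle_dir.mkdir(parents=True)
(bundle_dir / "config.json").write_text(json.dumps(config, indent=2))

# A loader only needs this public metadata (plus the weights) to rebuild
# the runner; no experiment metadata is present.
loaded = json.loads((bundle_dir / "config.json").read_text())
print(loaded["context_length"])  # → 64
```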

Installation

Install the runtime dependencies in your environment:

pip install torch transformers safetensors numpy gymnasium

For MuJoCo environments, install the appropriate Gymnasium extras as well:

pip install "gymnasium[mujoco]"

To load bundles directly from Hugging Face Hub, install the Hub extra:

pip install "causal-gpt-rl[hub]"

If you are developing directly from this repository, install it editable:

pip install -e .

Quick Start

import gymnasium as gym

from causal_gpt_rl.inference import load_runner

env = gym.make("HalfCheetah-v5")
runner = load_runner(
    "path/to/bundle",
    device="cuda",          # or "cpu"
    kv_cache_max_len=None,  # default: 4 * context_length
    use_windowed=False,     # use cached incremental inference by default
)

obs, _ = env.reset()
runner.reset(obs)

done = False
while not done:
    action = runner.act()
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    if not done:
        runner.observe(obs)

The compatibility style runner.act(obs) is also supported. On the first call after reset(obs), the observation is already in the context buffer, so the argument-free act() is the cleaner form.

Evaluation Helper

For simple single-environment evaluation:

import gymnasium as gym

from causal_gpt_rl.inference import load_runner, run_episodes

env = gym.make("HalfCheetah-v5")
runner = load_runner("path/to/bundle", device="cuda")

stats = run_episodes(env, runner, num_episodes=5, seed=0)
print(stats["return_mean"], stats["return_std"])

Hugging Face Hub

Hub model repositories should use one repository with per-environment subfolders:

ccnets/causal-gpt-rl/
  ant-v5/
    model.safetensors
    config.json
    state_normalizer.safetensors  # optional
    README.md

Then load the desired environment bundle directly:

from causal_gpt_rl.inference import load_runner_from_hub

runner = load_runner_from_hub(
    repo_id="ccnets/causal-gpt-rl",
    subfolder="ant-v5",
    device="cuda",
)

Root-level bundles are still supported by omitting subfolder.

run_episodes(...) is intentionally single-env only. For vectorized or batched evaluation, drive PolicyRunner directly with num_envs > 1.
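The batched driving pattern looks like the sketch below. It uses a minimal stub in place of a real PolicyRunner so it is self-contained; the stub's random actions, the dimensions, and the fake constant-reward environment are all illustrative assumptions, but the reset/act/observe call shape matches the interface described above.

```python
import random

class StubRunner:
    """Minimal stand-in exposing the reset/act/observe interface described above."""
    def __init__(self, num_envs, act_dim, seed=0):
        self.num_envs = num_envs
        self.act_dim = act_dim
        self.rng = random.Random(seed)

    def reset(self, obs):
        self._last_obs = obs  # list of per-env observations

    def act(self):
        # One action vector per environment (batched decoding when num_envs > 1).
        return [[self.rng.uniform(-1.0, 1.0) for _ in range(self.act_dim)]
                for _ in range(self.num_envs)]

    def observe(self, obs):
        self._last_obs = obs

num_envs, horizon = 4, 10
runner = StubRunner(num_envs=num_envs, act_dim=6)

obs = [[0.0] * 17 for _ in range(num_envs)]  # fake batched reset observations
runner.reset(obs)
returns = [0.0] * num_envs
for _ in range(horizon):
    actions = runner.act()
    # A real vector env would step with `actions` here; we use a stand-in reward.
    returns = [r + 1.0 for r in returns]
    runner.observe(obs)

print(returns)  # → [10.0, 10.0, 10.0, 10.0]
```

With the real runner, replace the stub with load_runner(...) and step a Gymnasium vector environment in the loop body.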

Public API

The stable top-level inference surface is:

from causal_gpt_rl.inference import (
    PolicyRunner,
    load_runner,
    run_episodes,
    export_bundle,
    convert_legacy_bundle_to_safetensors,
    load_runner_from_hub,
)

Lower-level components such as ContextBuffer, ContextCache, and StateNormalizer remain available from their submodules for advanced use, but they are not the preferred public entrypoint.

Runtime Notes

  • load_runner(...) accepts a local bundle path.
  • load_runner_from_hub(...) downloads a Hugging Face Hub model repository and then loads the bundle.
  • Continuous actions are clipped to the bounds stored in action_specs.
  • Discrete actions are decoded to integer environment actions.
  • Multi-discrete actions support batched decoding when num_envs > 1.
  • Invalid runtime sizes such as non-positive context_length, num_envs, or kv_cache_max_len raise ValueError.
  • When use_windowed=False, cached incremental inference is used. When kv_cache_max_len is omitted, the default cache cap is 4 * context_length.
  • When use_windowed=True, the full rolling window is passed each step and the KV cache is not used.
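The size-validation and default-cap rules above can be sketched as a small helper. This is a reimplementation of the documented behavior for illustration, not the library's actual code, and the function name is hypothetical.

```python
def resolve_cache_cap(context_length, num_envs=1, kv_cache_max_len=None):
    """Validate runtime sizes and apply the documented default KV cache cap.

    Sketch of the rules listed above, not the library's implementation.
    """
    for name, value in [("context_length", context_length), ("num_envs", num_envs)]:
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    if kv_cache_max_len is None:
        return 4 * context_length  # documented default: 4 * context_length
    if kv_cache_max_len <= 0:
        raise ValueError(f"kv_cache_max_len must be positive, got {kv_cache_max_len}")
    return kv_cache_max_len

print(resolve_cache_cap(64))                        # → 256
print(resolve_cache_cap(64, kv_cache_max_len=512))  # → 512
```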

Development Checks

Useful local checks:

python -m compileall -q causal_gpt_rl
python -m unittest discover -s tests

For package build checks:

python -m build
python -m twine check dist/*
