Causal GPT RL
Public inference runtime for Causal-GPT-RL policies.
This repository contains the code needed to load a policy bundle and run it in an environment. It is intentionally focused on inference: model construction, bundle loading, action decoding, rolling context state, optional state normalization, and simple evaluation helpers.
Model creation workflows and experiment infrastructure are outside this runtime boundary.
What This Package Provides
- causal_gpt_rl.model: autoregressive policy model definitions and JSON-safe state/action specs.
- causal_gpt_rl.inference.load_runner(...): load an exported bundle into a ready-to-run PolicyRunner.
- PolicyRunner: step-wise policy execution with reset(...), act(...), and observe(...).
- run_episodes(...): small single-environment evaluation helper.
- export_bundle(...): write a public inference bundle from an in-memory model.
- convert_legacy_bundle_to_safetensors(...): migrate old .pt bundle weights to safetensors.
Bundle Format
A deployment bundle is a single directory:
bundle/
model.safetensors
config.json
state_normalizer.safetensors # optional
model.safetensors contains only the model state dict needed for inference.
config.json contains public metadata required to reconstruct the runner:
model config, observation specs, action specs, and context length.
state_normalizer.safetensors is optional and stores state normalization
statistics when the policy expects normalized observations.
The bundle does not include experiment metadata or development-only state.
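A loader built on this layout can sanity-check a bundle directory before touching the weights. The sketch below is illustrative only: the function name validate_bundle and the exact config.json key names (model_config, observation_specs, action_specs, context_length) are assumptions based on the metadata listed above, not the library's actual schema.

```python
import json
from pathlib import Path

REQUIRED_FILES = ("model.safetensors", "config.json")

def validate_bundle(bundle_dir):
    """Illustrative check that a directory matches the bundle layout above."""
    bundle = Path(bundle_dir)
    for name in REQUIRED_FILES:
        if not (bundle / name).is_file():
            raise FileNotFoundError(f"bundle is missing required file: {name}")
    config = json.loads((bundle / "config.json").read_text())
    # Public metadata needed to reconstruct the runner; key names are assumed.
    for key in ("model_config", "observation_specs", "action_specs", "context_length"):
        if key not in config:
            raise KeyError(f"config.json is missing expected key: {key}")
    # state_normalizer.safetensors is optional per the layout above.
    has_normalizer = (bundle / "state_normalizer.safetensors").is_file()
    return config, has_normalizer
```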
Installation
Install the runtime dependencies in your environment:
pip install torch transformers safetensors numpy gymnasium
For MuJoCo environments, install the appropriate Gymnasium extras as well:
pip install "gymnasium[mujoco]"
To load bundles directly from Hugging Face Hub, install the Hub extra:
pip install "causal-gpt-rl[hub]"
If you are developing directly from this repository, install it editable:
pip install -e .

Quick Start
import gymnasium as gym
from causal_gpt_rl.inference import load_runner
env = gym.make("HalfCheetah-v5")
runner = load_runner(
"path/to/bundle",
device="cuda", # or "cpu"
kv_cache_max_len=None, # default: 4 * context_length
use_windowed=False, # use cached incremental inference by default
)
obs, _ = env.reset()
runner.reset(obs)
done = False
while not done:
action = runner.act()
obs, reward, terminated, truncated, info = env.step(action)
done = terminated or truncated
if not done:
runner.observe(obs)
The compatibility style runner.act(obs) is also supported. On the first call
after reset(obs), the observation is already in the buffer, so act() is the
cleaner form.
Evaluation Helper
For simple single-environment evaluation:
import gymnasium as gym
from causal_gpt_rl.inference import load_runner, run_episodes
env = gym.make("HalfCheetah-v5")
runner = load_runner("path/to/bundle", device="cuda")
stats = run_episodes(env, runner, num_episodes=5, seed=0)
print(stats["return_mean"], stats["return_std"])
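The stats keys above can be understood as simple aggregates over per-episode returns. A minimal sketch of how they could be computed (the aggregation details, such as population vs. sample standard deviation, are assumptions, not the library's exact implementation):

```python
import math

def summarize_returns(episode_returns):
    """Aggregate per-episode returns into stats like those shown above (sketch)."""
    n = len(episode_returns)
    mean = sum(episode_returns) / n
    # Population standard deviation over the evaluated episodes (assumed).
    var = sum((r - mean) ** 2 for r in episode_returns) / n
    return {"return_mean": mean, "return_std": math.sqrt(var)}
```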
Hugging Face Hub
Hub model repositories should use one repository with per-environment subfolders:
ccnets/causal-gpt-rl/
ant-v5/
model.safetensors
config.json
state_normalizer.safetensors # optional
README.md
Then load the desired environment bundle directly:
from causal_gpt_rl.inference import load_runner_from_hub
runner = load_runner_from_hub(
repo_id="ccnets/causal-gpt-rl",
subfolder="ant-v5",
device="cuda",
)
Root-level bundles are still supported by omitting subfolder.
run_episodes(...) is intentionally single-env only. For vectorized or
batched evaluation, drive PolicyRunner directly with num_envs > 1.
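The shape of such a batched loop can be sketched with a stand-in runner. DummyRunner and the batched env_step callable below are hypothetical; the real PolicyRunner's batched reset/act/observe signatures may differ, so treat this only as an outline of the control flow.

```python
# Sketch of a batched evaluation loop. DummyRunner stands in for a
# PolicyRunner loaded with num_envs > 1; its batched reset/act/observe
# signatures here are assumptions, not the library's exact API.
class DummyRunner:
    def __init__(self, num_envs):
        self.num_envs = num_envs
        self._obs = None

    def reset(self, obs_batch):
        self._obs = list(obs_batch)

    def act(self):
        # Trivial stand-in policy: one action per environment.
        return [0.0 for _ in range(self.num_envs)]

    def observe(self, obs_batch):
        self._obs = list(obs_batch)

def run_batched(runner, env_step, initial_obs, num_steps):
    """Drive a batched runner for a fixed number of steps and sum rewards."""
    runner.reset(initial_obs)
    returns = [0.0] * runner.num_envs
    for _ in range(num_steps):
        actions = runner.act()
        obs, rewards = env_step(actions)  # hypothetical batched env step
        for i, r in enumerate(rewards):
            returns[i] += r
        runner.observe(obs)
    return returns
```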
Public API
The stable top-level inference surface is:
from causal_gpt_rl.inference import (
PolicyRunner,
load_runner,
run_episodes,
export_bundle,
convert_legacy_bundle_to_safetensors,
load_runner_from_hub,
)
Lower-level components such as ContextBuffer, ContextCache, and
StateNormalizer remain available from their submodules for advanced use, but
they are not the preferred public entrypoint.
Runtime Notes
- load_runner(...) accepts a local bundle path. load_runner_from_hub(...) downloads a Hugging Face Hub model repository and then loads the bundle.
- Continuous actions are clipped to the bounds stored in action_specs.
- Discrete actions are decoded to integer environment actions.
- Multi-discrete actions support batched decoding when num_envs > 1.
- Invalid runtime sizes such as non-positive context_length, num_envs, or kv_cache_max_len raise ValueError.
- When use_windowed=False, cached incremental inference is used. When kv_cache_max_len is omitted, the default cache cap is 4 * context_length.
- When use_windowed=True, the full rolling window is passed each step and the KV cache is not used.
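The clipping, decoding, and default-size rules above can be sketched in a few lines. The function names and the low/high bound fields are illustrative stand-ins for whatever action_specs actually stores; only the documented behavior (clip to bounds, integer decoding, 4 * context_length default, ValueError on non-positive sizes) is taken from the notes above.

```python
def decode_continuous(raw_action, low, high):
    """Clip each continuous action dimension to its stored bounds."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(raw_action, low, high)]

def decode_discrete(logits):
    """Decode per-action scores to an integer environment action (argmax sketch)."""
    return max(range(len(logits)), key=lambda i: logits[i])

def default_kv_cache_max_len(context_length, kv_cache_max_len=None):
    """Apply the documented default when kv_cache_max_len is omitted."""
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    if kv_cache_max_len is None:
        return 4 * context_length
    if kv_cache_max_len <= 0:
        raise ValueError("kv_cache_max_len must be positive")
    return kv_cache_max_len
```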
Development Checks
Useful local checks:
python -m compileall -q causal_gpt_rl
python -m unittest discover -s tests
For package build checks:
python -m build
python -m twine check dist/*
File details
Details for the file causal_gpt_rl-0.1.0.tar.gz.

File metadata
- Download URL: causal_gpt_rl-0.1.0.tar.gz
- Size: 30.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dc769f5b0eb7acef89ed1dd7d1b07a71bfc79f83aad7896660667c3502377484 |
| MD5 | 2c5a43ffd730d8b0d2b51705c6dc02a0 |
| BLAKE2b-256 | 0ffc09553494c2ff9f3fc5475cc8e80d3f826d8c4cc6a6f71dda55413f3b1d91 |
Provenance
The following attestation bundles were made for causal_gpt_rl-0.1.0.tar.gz:

Publisher: publish.yml on ccnets-team/causal-gpt-rl

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: causal_gpt_rl-0.1.0.tar.gz
- Subject digest: dc769f5b0eb7acef89ed1dd7d1b07a71bfc79f83aad7896660667c3502377484
- Sigstore transparency entry: 1522955478
- Permalink: ccnets-team/causal-gpt-rl@e3b5cb9ed3b03cbc44ace7cc33a7158810004596
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/ccnets-team
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e3b5cb9ed3b03cbc44ace7cc33a7158810004596
- Trigger Event: release
File details
Details for the file causal_gpt_rl-0.1.0-py3-none-any.whl.

File metadata
- Download URL: causal_gpt_rl-0.1.0-py3-none-any.whl
- Size: 37.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 35ed55d5e90b345c84457e270076c9ae8915dff06eea5f2f2cef1e663e79f637 |
| MD5 | 8aec1e51bcc581647103976e03c8dff1 |
| BLAKE2b-256 | 82354b6267f6298dea848f78eb7aee86ac60a58b57b24fafce2022c6908dbbb7 |
Provenance
The following attestation bundles were made for causal_gpt_rl-0.1.0-py3-none-any.whl:

Publisher: publish.yml on ccnets-team/causal-gpt-rl

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: causal_gpt_rl-0.1.0-py3-none-any.whl
- Subject digest: 35ed55d5e90b345c84457e270076c9ae8915dff06eea5f2f2cef1e663e79f637
- Sigstore transparency entry: 1522955501
- Permalink: ccnets-team/causal-gpt-rl@e3b5cb9ed3b03cbc44ace7cc33a7158810004596
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/ccnets-team
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e3b5cb9ed3b03cbc44ace7cc33a7158810004596
- Trigger Event: release