SB3-like, simulator-agnostic control algorithms over any gymnasium.Env — PID, MPPI/CEM/iCEM, iLQR, CBF, and GPU-native RL (PPO/SAC/TD3) with vectorized on-device training
tau-ctrl
SB3-like, simulator-agnostic control algorithms — feedback, sampling-based MPC, safety filtering, and GPU-native RL behind one interface.
Like Stable-Baselines3, but for the whole controller spectrum, over any gymnasium.Env. No simulator dependency: works with whatever hands you an env (e.g. tau-sim). Unlike SB3, the RL methods train on-device and scale to vectorized environments for real GPU speedup.
Installation
pip install tau-ctrl # PID, MPPI/CEM/iCEM, iLQR, CBF (numpy + scipy + gymnasium)
pip install tau-ctrl[torch] # + RL: PPO, SAC, TD3, and vectorized on-device training
Usage
from tau_ctrl import make
ctrl = make("mppi", env, horizon=25, n_samples=300) # or MPPI(env, ...), SAC(env), ...
action, _ = ctrl.predict(obs) # SB3-style
ctrl.learn(total_timesteps=100_000) # trainable methods (ppo/sac/td3)
ctrl.save("ctrl.pkl")
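To make the SB3-style contract concrete, here is a minimal sketch of a controller that exposes `predict(obs)` and returns an `(action, state)` tuple, using an independent PD loop over selected observation indices. `ToyPD` and `simulate` are hypothetical names written for this illustration; they are not tau-ctrl classes, and the unit-mass double integrator is an assumed toy plant.

```python
class ToyPD:
    """Hypothetical PD controller with an SB3-style predict(); not tau-ctrl's class."""

    def __init__(self, kp, kd, target, q_idx, dq_idx):
        self.kp, self.kd = kp, kd
        self.target = list(target)
        self.q_idx, self.dq_idx = list(q_idx), list(dq_idx)

    def predict(self, obs, state=None):
        # One independent PD loop per selected (position, velocity) index pair.
        action = [
            self.kp * (t - obs[qi]) - self.kd * obs[di]
            for t, qi, di in zip(self.target, self.q_idx, self.dq_idx)
        ]
        # The second element mirrors SB3's (action, state) return tuple.
        return action, state


def simulate(ctrl, steps=2000, dt=0.01):
    """Assumed toy plant: unit-mass double integrator, obs = [position, velocity]."""
    obs = [1.0, 0.0]
    for _ in range(steps):
        (u,), _ = ctrl.predict(obs)
        vel = obs[1] + u * dt   # semi-implicit Euler step
        pos = obs[0] + vel * dt
        obs = [pos, vel]
    return obs


final_obs = simulate(ToyPD(kp=8.0, kd=5.0, target=[0.0], q_idx=[0], dq_idx=[1]))
```

Because every controller answers the same `predict` call, the simulation loop does not need to know which method produced the action.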
Algorithms
| Method | Family | Needs | Notes |
|---|---|---|---|
| pid | feedback | obs only | independent PID/PD over selected obs indices |
| mppi | sampling MPC | get_state/set_state | Model-Predictive Path Integral; plans against the env's own reward; noise_beta>0 for smoother ("colored-noise") torques |
| cem | sampling MPC | get_state/set_state | Cross-Entropy Method MPC |
| icem | sampling MPC | get_state/set_state | Improved CEM: colored noise + elite memory across iterations |
| ilqr | gradient MPC | get_state/set_state | Iterative LQR via finite-difference linearization; fast, precise convergence on smooth dynamics |
| cbf | safety filter | get_state/set_state | wraps any base controller, projects its action to keep h(x) >= 0 |
| ppo | RL (on-policy) | torch | GPU-automatic via device="auto" |
| sac | RL (off-policy) | torch | replay buffer + twin critics + auto entropy tuning; far more sample-efficient than PPO |
| td3 | RL (off-policy) | torch | replay buffer + twin critics + delayed policy updates + target smoothing |
Model-based methods (mppi/cem/icem/ilqr/cbf) need the env to be branchable
(expose get_state() / set_state()) so they can roll candidate sequences
forward without disturbing the live episode. All torch-based methods
auto-select CUDA when available (device="auto") and seed torch's RNG
reproducibly.
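The branchable-env requirement can be sketched as follows: snapshot the state, roll a candidate sequence forward, then restore the snapshot so the live episode is untouched. The planner below is a simplified MPPI-style update (exponentially weighted average of sampled perturbations, without a control-cost term). `ToyPointEnv`, `branch_return`, and `mppi_update` are hypothetical names for this illustration, and the env's `step` returns only a reward rather than the full gymnasium tuple; none of this is tau-ctrl's implementation.

```python
import math
import random


class ToyPointEnv:
    """Hypothetical 1-D point mass with reward -position**2 (not a tau-ctrl env)."""

    def __init__(self):
        self.pos, self.vel = 1.0, 0.0

    def get_state(self):
        return (self.pos, self.vel)

    def set_state(self, state):
        self.pos, self.vel = state

    def step(self, action, dt=0.05):
        # Simplified step: returns the reward only.
        self.vel += action * dt
        self.pos += self.vel * dt
        return -self.pos ** 2


def branch_return(env, actions):
    """Score an action sequence, then rewind so the live episode is untouched."""
    snapshot = env.get_state()
    try:
        return sum(env.step(a) for a in actions)
    finally:
        env.set_state(snapshot)


def mppi_update(env, nominal, n_samples=300, sigma=0.5, temperature=1.0, seed=0):
    """One simplified MPPI-style update of a nominal action sequence."""
    rng = random.Random(seed)
    noise = [[rng.gauss(0.0, sigma) for _ in nominal] for _ in range(n_samples)]
    returns = [
        branch_return(env, [u + e for u, e in zip(nominal, eps)]) for eps in noise
    ]
    best = max(returns)  # subtract the max for numerical stability
    weights = [math.exp((r - best) / temperature) for r in returns]
    total = sum(weights)
    weights = [w / total for w in weights]
    updated = [
        u + sum(w * eps[t] for w, eps in zip(weights, noise))
        for t, u in enumerate(nominal)
    ]
    return updated, weights


env = ToyPointEnv()
before = env.get_state()
plan, weights = mppi_update(env, nominal=[0.0] * 25)
after = env.get_state()
```

Planning runs 300 rollouts through the env, yet `before` and `after` are identical, which is the property the planners rely on.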
GPU-native RL: vectorized, on-device training
SB3 steps CPU environments, and its policy update waits on a per-step Python loop.
tau-ctrl's SAC/TD3 instead keep the replay buffer and the update on the target
device and, given a batched env, step thousands of environments in parallel
with no numpy in the hot loop. Trainer.auto probes your env and hardware and
picks the fastest correct strategy, so the same call works across the range of
env setups, from fully on-device simulators to a single real robot:
import gymnasium as gym
from tau_ctrl import Trainer
# One line: probes env + hardware, wraps as needed, trains on the best engine.
model = Trainer.auto("sac", env=gym.make_vec("HalfCheetah-v4", num_envs=64),
total_timesteps=1_000_000)
action, _ = model.predict(obs)
| You have | Adapter | GPU helps |
|---|---|---|
| native TorchVecEnv (or MJX/Brax via jax_to_torch, Isaac Gym) | none (fits directly) | env and update on device (the real win) |
| gymnasium.vector.VectorEnv | GymVectorAdapter | buffer + update on device (env stepping stays CPU) |
| a single, non-batchable env (PyBullet, classic MuJoCo) + a factory | SyncTorchVecEnv | batched update on device |
| one env you can't replicate (a real robot) | none (single-env path) | only the update |
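The table above can be read as a decision procedure. The sketch below restates it as a plain function, purely to show the order in which the cases apply. `EnvTraits` and `pick_adapter` are hypothetical names, and the boolean flags are assumptions standing in for whatever probing Trainer.auto actually performs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class EnvTraits:
    """Hypothetical summary of an env; tau-ctrl's real probe may differ."""
    is_torch_vec: bool = False    # already a batched, on-device env
    is_gym_vector: bool = False   # a gymnasium.vector.VectorEnv
    has_factory: bool = False     # a callable that can build more copies


def pick_adapter(traits: EnvTraits) -> Tuple[Optional[str], str]:
    """Return (adapter name or None, what runs on the device), per the table."""
    if traits.is_torch_vec:
        return None, "env and update on device"
    if traits.is_gym_vector:
        return "GymVectorAdapter", "buffer + update on device (env stepping stays CPU)"
    if traits.has_factory:
        return "SyncTorchVecEnv", "batched update on device"
    return None, "only the update"


real_robot = pick_adapter(EnvTraits())
gym_vec = pick_adapter(EnvTraits(is_gym_vector=True))
```

The cases are checked from most to least GPU-friendly, so an env that qualifies for several rows lands on the fastest one.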
See benchmarks/RESULTS.md for head-to-head numbers vs
Stable-Baselines3 and skrl, and examples/ for runnable scripts
(quickstart.py, vectorized_rl.py, adaptive_training.py).
Safety filtering
# CBF wraps any base controller and only intervenes when a barrier is at risk
from tau_ctrl import CBFFilter, make
v_max = 1.0  # example velocity limit (assumed value); state[1] is taken to be the velocity
base = make("pid", env, kp=8.0, kd=5.0, target=[0.0], q_idx=[0], dq_idx=[1])
safe = CBFFilter(env, base=base, barriers=lambda state: v_max - state[1], alpha=0.5)
action, _ = safe.predict(obs)
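For intuition about what the projection does, here is a minimal closed-form sketch for the velocity barrier h = v_max - v. It assumes the simplest possible dynamics, dv/dt = u, so that dh/dt = -u and the barrier condition dh/dt + alpha*h >= 0 reduces to u <= alpha*(v_max - v). `cbf_filter` and `simulate` are hypothetical helpers, not tau-ctrl's CBFFilter, which may solve a more general problem.

```python
def cbf_filter(u_base, vel, v_max=1.0, alpha=0.5):
    """Closest action to u_base that satisfies dh/dt + alpha*h >= 0
    for h = v_max - vel, assuming d(vel)/dt = u (so dh/dt = -u)."""
    u_upper = alpha * (v_max - vel)
    return min(u_base, u_upper)  # unchanged whenever the base action is already safe


def simulate(steps=2000, dt=0.01, u_base=5.0):
    """Drive the toy system with an aggressive base action and track peak velocity."""
    vel, peak = 0.0, 0.0
    for _ in range(steps):
        vel += cbf_filter(u_base, vel) * dt
        peak = max(peak, vel)
    return peak


peak_velocity = simulate()
```

The base action of 5.0 would push the velocity far past the limit on its own; with the filter the velocity approaches v_max asymptotically and never crosses it.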
Auto-tuning
from tau_ctrl import AutoTuner
tuner = AutoTuner({"kp": (1, 500), "kd": (0.1, 50)}, method="bayesian", n_iterations=50)
result = tuner.tune(cost_fn) # cost_fn: dict of params -> scalar cost
print(result["best_params"])
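As an illustration of the `cost_fn` contract (a dict of params in, a scalar cost out), the sketch below scores PD gains on an assumed unit-mass double integrator. To stay self-contained it searches with plain random sampling, which is a stand-in and not the Bayesian method AutoTuner offers. `random_search` and the `best_cost` key are hypothetical; only `best_params` appears in the usage above.

```python
import random


def cost_fn(params, steps=500, dt=0.01):
    """Integrated squared position error of a PD loop on a toy double integrator."""
    pos, vel, cost = 1.0, 0.0, 0.0
    for _ in range(steps):
        u = -params["kp"] * pos - params["kd"] * vel
        vel += u * dt           # semi-implicit Euler step
        pos += vel * dt
        cost += pos * pos * dt
    return cost


def random_search(cost_fn, bounds, n_iterations=50, seed=0):
    """Stand-in tuner: sample uniformly inside the bounds and keep the best."""
    rng = random.Random(seed)
    best_params, best_cost = None, float("inf")
    for _ in range(n_iterations):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        cost = cost_fn(params)
        if cost < best_cost:
            best_params, best_cost = params, cost
    return {"best_params": best_params, "best_cost": best_cost}


bounds = {"kp": (1, 500), "kd": (0.1, 50)}
result = random_search(cost_fn, bounds)
```

Any function with this signature can be handed to a tuner, so the same cost can be reused when switching search methods.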
Layout
src/tau_ctrl/
├── algorithms/ # base.py (interface + registry), off_policy.py (shared SAC/TD3 infra),
│ │ # pid.py, mppi.py (MPPI/CEM/ICEM), ilqr.py, cbf.py, ppo.py, sac.py, td3.py
│ │ # vec_env.py (TorchVecEnv), adapters.py, strategy.py (Trainer.auto)
│ └── envs/ # pure-Python toy envs for tests/examples
└── tuning/ # Bayesian & genetic auto-tuning
License
Apache 2.0 — see LICENSE.
Related
- tau-sim — robotics environment builder
- Stable-Baselines3 — RL algorithms (API inspiration)