OpenAI Gym equivalent for loops — create, run, benchmark, compare, evolve
LoopGym
Run any loop. Three ways. One API.
Compile LSS 1.1 YAML into executable environments — simulate for CI, call live models for production eval, or replay LoopNet trajectories without spending a token.
pip install loopgym
Quickstart · API docs · PyPI · LoopBench · Observability
🚀 The idea in one picture
flowchart TB
SPEC["Your LSS YAML"]
MAKE["loopgym.make(env_id)"]
SIM["SimEnv<br/><i>deterministic · free · CI-safe</i>"]
LIVE["LiveEnv<br/><i>real models · production eval</i>"]
REPLAY["ReplayEnv<br/><i>LoopNet trajectories · zero API cost</i>"]
SPEC --> MAKE
MAKE --> SIM
MAKE --> LIVE
MAKE --> REPLAY
LSS declares the loop. LoopGym runs it. LoopBench scores it. Clean separation — like Gym vs. benchmark suites in reinforcement learning.
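The dispatch in the diagram can be pictured as a small registry keyed by env ID. The sketch below is an illustration only, not LoopGym's implementation: `REGISTRY`, `register`, `StubEnv`, and this local `make` are hypothetical names, and the real `loopgym.make` may resolve backends differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StubEnv:
    """Hypothetical stand-in for a backend; it only records which one was chosen."""
    env_id: str
    backend: str


# Hypothetical registry: env ID -> constructor for the backend that serves it.
REGISTRY: Dict[str, Callable[[str], StubEnv]] = {}


def register(env_id: str, backend: str) -> None:
    """Associate an env ID with a backend name."""
    REGISTRY[env_id] = lambda eid: StubEnv(env_id=eid, backend=backend)


def make(env_id: str) -> StubEnv:
    """Look up the env ID and build its environment, failing loudly on unknown IDs."""
    try:
        return REGISTRY[env_id](env_id)
    except KeyError:
        raise ValueError(f"unknown env id: {env_id!r}") from None


register("loopbench/code-repair-v1", "sim")
register("replay/loopnet-v1", "replay")

print(make("loopbench/code-repair-v1").backend)
print(make("replay/loopnet-v1").backend)
```

Keeping the lookup behind one function is what lets calling code stay identical when the backend changes; only the env ID differs.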
Run cost vs fidelity — and everything else you get
Pick the backend that matches your stage. SimEnv and ReplayEnv cost $0 in API spend; LiveEnv calls real models and incurs their cost, for when you need production-fidelity results.
| Benefit | SimEnv / Replay | LiveEnv |
|---|---|---|
| API spend | $0 — run all night | Real model cost |
| Determinism | Fixed seeds · CI-safe | Stochastic production |
| LoopBench ready | Submit scores without keys | Production eval |
| LoopNet replay | Replay 545 trajectories offline | N/A |
| Safety / HITL drills | PerturbedSim perturbations | Full stack |
| One API | loopgym.make(env_id) · same code path | Same |
The unlock: develop, test, benchmark, and regress before you burn tokens in prod.
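The "fixed seeds, CI-safe" property comes down to drawing all randomness from a seeded generator, so two runs with the same seed can be compared exactly. A minimal sketch of that idea, assuming a toy episode; `run_simulated_episode` is a hypothetical helper and does not reflect how SimEnv is implemented:

```python
import random
from typing import List, Tuple


def run_simulated_episode(seed: int, max_steps: int = 5) -> List[Tuple[int, float]]:
    """Run a toy episode whose randomness comes only from a local seeded RNG."""
    rng = random.Random(seed)  # local RNG: no dependence on global random state
    trajectory = []
    for step in range(max_steps):
        reward = round(rng.random(), 6)
        trajectory.append((step, reward))
    return trajectory


first = run_simulated_episode(seed=0)
second = run_simulated_episode(seed=0)
other = run_simulated_episode(seed=1)

print(first == second)  # same seed, same trajectory
print(first == other)   # different seed, compared for contrast
```

A CI job can assert equality against a stored trajectory, which turns any unintended behavior change into a failing test.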
| Backend | API keys | Best for |
|---|---|---|
| SimEnv | No | CI, LoopBench submissions, local dev |
| ReplayEnv | No | LoopNet trajectory analysis |
| PerturbedSim | No | RAG / HITL / safety perturbations |
| LiveEnv | Yes | Production eval with real LLMs |
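One way to picture why a replay backend needs no API keys: it serves transitions that were recorded earlier instead of calling a model. The sketch below assumes a made-up record format; `RecordedStep` and `ToyReplayEnv` are hypothetical and are not the ReplayEnv or LoopNet schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class RecordedStep:
    """Hypothetical record format: one stored transition from an earlier run."""
    observation: str
    reward: float
    done: bool


class ToyReplayEnv:
    """Steps through a stored trajectory; the submitted action is logged, not executed."""

    def __init__(self, steps: List[RecordedStep]) -> None:
        if not steps:
            raise ValueError("trajectory must contain at least one step")
        self._steps = steps
        self._cursor = 0
        self.done = False
        self.actions_seen: List[Any] = []

    def reset(self) -> str:
        self._cursor = 0
        self.done = False
        self.actions_seen = []
        return "start"

    def step(self, action: Any) -> Tuple[str, float, bool, Dict[str, int]]:
        if self.done:
            raise RuntimeError("episode finished; call reset() first")
        record = self._steps[self._cursor]
        self.actions_seen.append(action)
        self._cursor += 1
        # Finish when the recording says so or when it runs out.
        self.done = record.done or self._cursor >= len(self._steps)
        return record.observation, record.reward, self.done, {"index": self._cursor - 1}


env = ToyReplayEnv([
    RecordedStep("patch applied", 0.0, False),
    RecordedStep("tests pass", 1.0, True),
])
env.reset()
total = 0.0
while not env.done:
    _, reward, _, _ = env.step("noop")
    total += reward
print(total)
```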
⚡ Three backends, one line of code
import loopgym as lg

env = lg.make("loopbench/code-repair-v1")
obs = env.reset(task_id="cr-001")
while not env.done:
    # your_agent is a placeholder for your own policy object
    action = your_agent.policy(obs)
    obs, reward, done, info = env.step(action)
| Backend | When to use | API keys? |
|---|---|---|
| SimEnv | CI, local dev, LoopBench submissions | No |
| LiveEnv | Production eval with real LLMs | OPENAI_API_KEY (pluggable) |
| ReplayEnv | Analyze historical runs from LoopNet | No |
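The loop above relies on only three members: `reset`, `step`, and `done`. The sketch below writes that contract down as a structural type and wraps the loop in a reusable runner with a step cap. `LoopEnv`, `run_episode`, and `CountdownEnv` are hypothetical names introduced for illustration; the signatures are inferred from the snippet, not taken from LoopGym's API reference.

```python
from typing import Any, Callable, Dict, Protocol, Tuple


class LoopEnv(Protocol):
    """Structural type for the members the loop relies on (an assumption)."""
    done: bool

    def reset(self, task_id: str) -> Any: ...
    def step(self, action: Any) -> Tuple[Any, float, bool, Dict[str, Any]]: ...


def run_episode(env: LoopEnv, policy: Callable[[Any], Any],
                task_id: str, max_steps: int = 100) -> float:
    """Drive one episode and return the summed reward, stopping at max_steps as a guard."""
    obs = env.reset(task_id=task_id)
    total = 0.0
    for _ in range(max_steps):
        if env.done:
            break
        obs, reward, _done, _info = env.step(policy(obs))
        total += reward
    return total


class CountdownEnv:
    """Toy environment: finishes after three steps, rewarding only the last one."""

    def __init__(self) -> None:
        self.done = False
        self._remaining = 3

    def reset(self, task_id: str) -> int:
        self.done = False
        self._remaining = 3
        return self._remaining

    def step(self, action: Any) -> Tuple[int, float, bool, Dict[str, Any]]:
        self._remaining -= 1
        self.done = self._remaining == 0
        return self._remaining, (1.0 if self.done else 0.0), self.done, {}


print(run_episode(CountdownEnv(), policy=lambda obs: "noop", task_id="toy-001"))
```

The `max_steps` cap is a defensive choice for agent code: a policy that never reaches a terminal state should fail a test rather than hang it.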
🛠️ Try it in 60 seconds
pip install loopgym
python -c "
import loopgym as lg
env = lg.make('loopbench/code-repair-v1')
obs = env.reset(task_id='cr-001')
print('task:', obs.task_id, '| step:', obs.step)
"
Full quickstart:
git clone https://github.com/KanakMalpani/LoopGym.git && cd LoopGym
pip install -e ".[dev]"
python examples/quickstart.py
pytest tests/ -q
📈 Validate and reproduce
Ran a replay or SimEnv episode? Follow REPRODUCE.md and post on Discussion #10. Export trajectories via loopnet COMMUNITY-SUBMISSION.
🗺️ Environments (v0.1.3)
| Env ID | Backend | Stress-tests / Perturbations |
|---|---|---|
| loopbench/code-repair-v1 | Sim | Verify-driven repair, iteration limits |
| loopbench/research-synthesis-v1 | Sim | Multi-step synthesis + rubric |
| loopbench/multi-agent-debate-v1 | Sim | Role-separated workers + evaluator |
| loopbench/composed-swarm-v1 | Sim | Composed parallel rehearsal (scenario-swarm-rehearsal) · LB-COMP-1 |
| loopbench/rag-retrieval-v1 | Perturbed Sim | RAG retrieval with missing/stale source perturbations · LB-RAG-1 |
| loopbench/hitl-gate-v1 | Perturbed Sim | Human-in-the-loop approval gate simulation (rejections) · LB-HITL-1 |
| loopbench/safety-constrained-v1 | Perturbed Sim | Tool allowlist / denylist safety termination · LB-SAFE-1 |
| replay/loopnet-v1 | Replay | Full trajectories from LoopNet v0.2 |
| sim/mock-llm-v1 | Sim | Generic sandbox for custom LSS specs |
Bundled specs under envs/loopbench/ — validated against Loop Core Engineering in CI.
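To make the "missing/stale source" perturbation concrete, here is a minimal sketch of a seeded perturbation applied to a list of retrieved sources. It is an assumption-laden illustration: `Source`, `perturb_sources`, and the two rate parameters are hypothetical and are not the PerturbedSim interface.

```python
import random
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Source:
    """Hypothetical retrieved document with a freshness flag."""
    doc_id: str
    text: str
    stale: bool = False


def perturb_sources(sources: List[Source], seed: int,
                    drop_rate: float = 0.3, stale_rate: float = 0.3) -> List[Source]:
    """Drop some sources and mark some survivors stale, reproducibly for a given seed."""
    rng = random.Random(seed)
    perturbed = []
    for source in sources:
        if rng.random() < drop_rate:
            continue  # simulate a missing source
        if rng.random() < stale_rate:
            source = replace(source, stale=True)  # simulate an outdated source
        perturbed.append(source)
    return perturbed


corpus = [Source(f"doc-{i}", f"passage {i}") for i in range(10)]
run_a = perturb_sources(corpus, seed=7)
run_b = perturb_sources(corpus, seed=7)
print(run_a == run_b)  # same seed, same perturbation
print(len(corpus) - len(run_a), "dropped;", sum(s.stale for s in run_a), "marked stale")
```

Seeding the perturbation keeps a failure reproducible: the same seed removes and ages the same documents on every run.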
🎯 Who this is for
| You want to… | LoopGym gives you… |
|---|---|
| Benchmark your loop design | Same env IDs LoopBench uses |
| Test without burning API budget | SimEnv + ReplayEnv |
| Ship production eval pipelines | LiveEnv with pluggable backends |
| Replay production-like runs | ReplayEnv + LoopNet corpus |
| Trace iterations & LES | loopotel LTF export |
👁️ Observability
Trace loop iterations without raw chat logs (LTF 0.1):
pip install loopotel loopgym
python -c "
import loopgym as lg
from loopotel.integrations.loopgym import run_traced_episode
env = lg.make('loopbench/code-repair-v1')
result, trace = run_traced_episode(env, task_id='cr-001', seed=0, enabled=True)
print(result['success'], len(trace['spans']), 'spans')
"
Full stack walkthrough: LoopNet end-to-end tutorial.
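Tracing "without raw chat logs" can be read as recording metadata about each iteration rather than the message text. The sketch below shows one such approach using a length and a digest in place of content. The span fields and `make_span` are hypothetical; they are not the LTF 0.1 schema or the loopotel API.

```python
import hashlib
import time
from typing import Dict, List


def make_span(iteration: int, message: str, started: float, ended: float) -> Dict[str, object]:
    """Build a span that keeps metadata about a message but not the message itself."""
    return {
        "iteration": iteration,
        "duration_s": ended - started,
        "message_chars": len(message),
        # A digest lets two runs be compared without storing the text.
        "message_sha256": hashlib.sha256(message.encode("utf-8")).hexdigest(),
    }


spans: List[Dict[str, object]] = []
for i, message in enumerate(["propose patch", "run tests", "tests pass"]):
    start = time.perf_counter()
    # ... the loop iteration's work would happen here ...
    end = time.perf_counter()
    spans.append(make_span(i, message, start, end))

print(len(spans), "spans")
print(all("propose" not in str(span) for span in spans))  # raw text is absent
```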
⚙️ Ecosystem
| Repo | Role |
|---|---|
| Loop Core Engineering | LSS / LES authority |
| LoopNet | Trajectory corpus |
| LoopGym | Runtime (this repo) |
| LoopBench | Public scoreboard |
| loop-observability | LTF traces (loopotel) |
Stack map: ECOSYSTEM.md
📝 Citation
@software{loopgym2026,
  title={LoopGym: OpenAI Gym for LSS-Defined Agent Loops},
  author={Malpani, Kanak},
  year={2026},
  url={https://pypi.org/project/loopgym/}
}
MIT · v0.1.3 · Contributing
File details
Details for the file loopgym-0.1.4.tar.gz.
File metadata
- Download URL: loopgym-0.1.4.tar.gz
- Upload date:
- Size: 81.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 38e0601820f432de6b94b616f7e8f74c958cb8e1f793a9246854acb9b22c5ad2 |
| MD5 | c10e350f9a44db4774b72da4548627d0 |
| BLAKE2b-256 | 3bba76a8e472675a27f2636b62f20150c4459c66fa20a55cbf4f1ceb16aa64cd |
File details
Details for the file loopgym-0.1.4-py3-none-any.whl.
File metadata
- Download URL: loopgym-0.1.4-py3-none-any.whl
- Upload date:
- Size: 46.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 84481e9b8fda12a7fb290bc1df0f6966e6ca52357f7651fb67b54f5043220461 |
| MD5 | 17a3b91b42c8189a0bdc50d55a1013da |
| BLAKE2b-256 | 25d404fe361a50aa17787a6671551a32aefc3c9c6d36610b1495e50b106b799a |
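The published digests can be checked against a downloaded file with the standard library. This is a generic sketch, not part of LoopGym: `sha256_of_file` and `matches_published_digest` are hypothetical helper names, and the demo uses a temporary file so it runs without a download.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest to a published value, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demo on a throwaway file; for a real check, pass the downloaded archive's path
# and the SHA256 value listed for that file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"hello")
    result = matches_published_digest(sample, hashlib.sha256(b"hello").hexdigest())
    mismatch = matches_published_digest(sample, "0" * 64)

print(result)
print(mismatch)
```

pip can also enforce this at install time through its hash-checking mode, where a requirements file pins `--hash=sha256:...` per package and the install is run with `--require-hashes`.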