RPX — wrap any robot training command with full end-to-end analytics.

robotrainx-agent (im)

Wrap any robot training command with one line and get full end-to-end analytics on robosynx.com.

What it collects automatically — zero code changes to your training script:

Signal                                     How
Live logs (stdout + stderr)                Streamed in real time
RL metrics (reward, KL, loss, entropy…)    Parsed from stdout: SB3, RSL-RL, Isaac Lab, CleanRL, generic
Termination reasons & reward components    Parsed from Episode_Termination/ and Episode_Reward/ lines
GPU / CPU / RAM                            nvidia-smi + psutil every 60 s
Checkpoints / artifacts                    File watcher: .pt, .pkl, .ckpt, .safetensors
Environment snapshot                       Python version, CUDA, GPU name, git SHA, detected simulator
Heartbeat                                  Every 30 s; keeps the run marked alive during Isaac Sim loading
Offline buffering                          Events spooled to disk when the backend is unreachable; replayed automatically
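
The checkpoint row above can be pictured as a periodic directory scan. The sketch below is an assumption about how such a watcher might work, not the agent's actual implementation: `find_new_checkpoints` and the `seen` set are hypothetical names, and only the four file extensions come from the table.

```python
from pathlib import Path

# Extensions listed in the table above.
CHECKPOINT_SUFFIXES = {".pt", ".pkl", ".ckpt", ".safetensors"}


def find_new_checkpoints(watch_dir, seen):
    """Return checkpoint files under watch_dir that are not yet in `seen`.

    `seen` is a set of Path objects that the caller keeps between scans,
    so each file is reported only once.
    """
    new_files = []
    for path in sorted(Path(watch_dir).rglob("*")):
        if path.is_file() and path.suffix in CHECKPOINT_SUFFIXES and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files
```

Calling the function repeatedly with the same `seen` set reports each checkpoint once, which is the behaviour a dashboard would need to avoid duplicate artifact entries.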

Install

pip install robotrainx-agent
# Optional extras for richer system telemetry:
pip install "robotrainx-agent[full]"   # adds psutil + PyYAML

Local development:

pip install -e ./robotrainx-agent

Flow 1: pip install + im run (recommended)

Three commands and your training run is instrumented:

# 1. Authenticate
im login --api-key YOUR_KEY

# 2. (optional) Initialise project config in your training directory
cd /path/to/your/robot/project
im init

# 3. Wrap your existing training command — nothing else changes
im run -- python train.py --num-envs 1024
im run -- python train_headless.py --agent AnymalC --headless
im run -- bash scripts/train.sh

Auto-detection: --task and --platform are inferred automatically. Pass either flag only when you want to override the inferred value.
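
The flag reference says the task name defaults to the script filename. A minimal sketch of one way that inference could work is shown here; the exact rule the agent uses is not documented, so `infer_task` and the list of script suffixes are assumptions made for illustration.

```python
from pathlib import Path

SCRIPT_SUFFIXES = (".py", ".sh")  # assumption: which arguments count as scripts


def infer_task(command):
    """Guess an experiment name from the first script-like argument."""
    for arg in command:
        if arg.endswith(SCRIPT_SUFFIXES):
            return Path(arg).stem  # filename without directory or extension
    return None  # nothing script-like found; a caller would need a fallback


print(infer_task(["python", "train.py", "--num-envs", "1024"]))  # train
print(infer_task(["bash", "scripts/train.sh"]))                  # train
```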

All flags for im run

--task TEXT          Experiment name (default: inferred from script filename)
--label TEXT         Human-readable label shown in dashboard
--platform TEXT      isaaclab | mujoco | gazebo | custom (auto-detected)
--run-id TEXT        Explicit run ID (auto-generated UUID if omitted)
--tags TEXT          Comma-separated tags
--watch-dir DIR      Extra dir to watch for checkpoints (repeatable)
--no-metrics         Disable stdout metric parsing
--no-sysinfo         Disable GPU/CPU telemetry
--no-artifacts       Disable checkpoint file detection
--batch-size N       Log events per HTTP batch (default: 80)
--flush-interval F   Seconds between log flushes (default: 1.5)
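
To illustrate how --batch-size and --flush-interval interact, here is a rough sketch of a buffer that sends a batch when either limit is reached. `LogBatcher` and its `send` callback are hypothetical; only the two default values (80 and 1.5) come from the flag list above.

```python
import time


class LogBatcher:
    """Collect log events and hand them to `send` in batches."""

    def __init__(self, send, batch_size=80, flush_interval=1.5, clock=time.monotonic):
        self.send = send                      # callable that receives a list of events
        self.batch_size = batch_size          # flush once this many events are buffered
        self.flush_interval = flush_interval  # or once this many seconds have passed
        self.clock = clock
        self.buffer = []
        self.last_flush = clock()

    def add(self, event):
        self.buffer.append(event)
        full = len(self.buffer) >= self.batch_size
        stale = self.clock() - self.last_flush >= self.flush_interval
        if full or stale:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(self.buffer)
            self.buffer = []
        self.last_flush = self.clock()
```

This sketch only checks the interval when a new event arrives. A real agent would presumably also flush from a timer so that a quiet process does not hold back its last few lines.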

Flow 2: SSH connect (HPC / shared clusters)

For machines where you can't install packages (SLURM, university HPC):

  1. In the RoboProtX dashboard → Remote Hosts → Add SSH host
  2. Paste your SSH key and remote training command
  3. RoboProtX SSHes in, runs your command, streams logs back
  4. Same analytics pipeline — failure intelligence, sim-to-real, promotion gate

Flow 3: Docker self-host (on-prem enterprise)

cp .env.example .env
# Set ROBOTRAINX_API_KEY, ISAACMONITOR_DB_URL, JWT_SECRET in .env
docker compose up -d postgres backend frontend

Then use im run pointed at your local backend:

export ROBOTRAINX_SERVER_URL=http://your-server:3001
im run -- python train.py

Project config (roboprotx.yaml)

im init creates this automatically. You can also write it manually:

project: anymal-locomotion
simulator: isaaclab
server_url: https://api.robosynx.com

watch_dirs:
  - .
  - logs/checkpoints

The agent searches for roboprotx.yaml from your current directory up to the filesystem root.
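
The upward search described above can be sketched in a few lines of standard-library Python. This is an illustration of the lookup order only; `find_config` is a hypothetical name, not a function exported by the agent.

```python
from pathlib import Path

CONFIG_NAME = "roboprotx.yaml"


def find_config(start=None):
    """Return the nearest roboprotx.yaml at or above `start`, or None."""
    current = Path(start or Path.cwd()).resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (current, *current.parents):
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate
    return None
```

Because the nearest file wins, a roboprotx.yaml in a subproject directory takes precedence over one at the repository root.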


Environment variables

Variable               Aliases                              Description
ROBOTRAINX_API_KEY     IM_API_KEY, ROBOPROTX_API_KEY        API key
ROBOTRAINX_SERVER_URL  IM_SERVER_URL, ROBOPROTX_SERVER_URL  Backend URL
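
Alias handling of this kind is usually a first-match lookup. The sketch below assumes the primary name is checked before its aliases; that precedence is a guess, since the table does not state it, and `first_set` is a hypothetical helper.

```python
import os

# Names taken from the table above; the lookup order is an assumption.
API_KEY_VARS = ("ROBOTRAINX_API_KEY", "IM_API_KEY", "ROBOPROTX_API_KEY")
SERVER_URL_VARS = ("ROBOTRAINX_SERVER_URL", "IM_SERVER_URL", "ROBOPROTX_SERVER_URL")


def first_set(names, environ=None):
    """Return the value of the first non-empty variable in `names`, else None."""
    environ = os.environ if environ is None else environ
    for name in names:
        value = environ.get(name)
        if value:
            return value
    return None


api_key = first_set(API_KEY_VARS)
server_url = first_set(SERVER_URL_VARS)
```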

Edge cases handled

  • Backend unreachable at start — runs offline, all events spooled to ~/.robotrainx-agent/spool/, replayed when connection restored
  • Binary / non-UTF-8 output (Isaac Sim OpenGL) — decoded with errors=replace, never crashes
  • Long lines > 8 KB — truncated with ...[truncated] marker
  • Orphan GPU processes on Ctrl+C — kills entire process group (SIGKILL on Linux, taskkill /T on Windows)
  • Command not found — friendly error with PATH hint, exits 127
  • Missing API key on production server — warns clearly with fix instructions
  • SLURM / multi-process training — wrap the srun or torchrun command directly
  • Isaac Sim long startup (5-10 min silent) — heartbeat thread keeps run alive
  • Rate limiting (HTTP 429) — automatic exponential backoff retry
  • 401/403 auth errors — clear message with im login instructions
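
Two of the cases above, non-UTF-8 output with long lines and retrying after HTTP 429, lend themselves to a short sketch. The code is illustrative only: `sanitize_line` and `backoff_delays` are hypothetical names, measuring the 8 KB limit in bytes is an assumption, and the `base` and `cap` values are invented for the example.

```python
MAX_LINE_BYTES = 8 * 1024            # "8 KB" from the list above (assumed to mean bytes)
TRUNCATION_MARKER = "...[truncated]"


def sanitize_line(raw):
    """Decode one raw output line without raising, and cap its length."""
    truncated = len(raw) > MAX_LINE_BYTES
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising.
    text = raw[:MAX_LINE_BYTES].decode("utf-8", errors="replace")
    return text + TRUNCATION_MARKER if truncated else text


def backoff_delays(retries, base=1.0, cap=60.0):
    """Yield exponentially growing wait times in seconds, limited to `cap`."""
    for attempt in range(retries):
        yield min(cap, base * 2 ** attempt)


print(sanitize_line(b"\xff\xfeok"))   # two replacement characters, then "ok"
print(list(backoff_delays(4)))        # [1.0, 2.0, 4.0, 8.0]
```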

Supported log formats (auto-parsed)

Framework             Detected signal
Stable Baselines3     | rollout/ep_rew_mean | 4.23 | table
RSL-RL / Isaac Lab    Learning iteration 100/1000 blocks
CleanRL               global_step=51200, episodic_return=4.23
Generic               reward=4.23 iteration=100 kl=0.012
IsaacMonitor patch    [IsaacMonitor] failure captured: base_contact rate=0.30
Episode labels        Episode_Termination/base_contact: 0.30
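
As a rough idea of what parsing these formats involves, the sketch below handles the key=value style (CleanRL, generic) and the Episode_Termination/ and Episode_Reward/ labels with two regular expressions. It is not the agent's parser: `parse_metrics` and both patterns are assumptions written for this example, and the SB3 table and RSL-RL block formats are not covered.

```python
import re

# key=value pairs such as "reward=4.23" (Generic and CleanRL rows above).
KEY_VALUE = re.compile(r"(\w+)=(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")
# Labelled lines such as "Episode_Termination/base_contact: 0.30".
EPISODE = re.compile(r"(Episode_(?:Termination|Reward)/\w+):\s*(-?\d+(?:\.\d+)?)")


def parse_metrics(line):
    """Extract numeric metrics from one stdout line as a {name: float} dict."""
    metrics = {name: float(value) for name, value in EPISODE.findall(line)}
    for name, value in KEY_VALUE.findall(line):
        metrics[name] = float(value)
    return metrics


print(parse_metrics("reward=4.23 iteration=100 kl=0.012"))
print(parse_metrics("Episode_Termination/base_contact: 0.30"))
```

Lines that match neither pattern yield an empty dict, so ordinary log output passes through without producing metrics.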

Publish to PyPI

python -m build ./robotrainx-agent
python -m twine upload robotrainx-agent/dist/*
