RPX — wrap any robot training command with full end-to-end analytics.

rpx-agent

Wrap any robot training command with one line and get full end-to-end analytics on robosynx.com.

What it collects automatically — zero code changes to your training script:

Signal                                     How
Live logs (stdout + stderr)                Streamed in real time
RL metrics (reward, KL, loss, entropy…)    Parsed from stdout (SB3, RSL-RL, Isaac Lab, CleanRL, generic)
Termination reasons & reward components    Parsed from Episode_Termination/ and Episode_Reward/ lines
GPU / CPU / RAM                            nvidia-smi + psutil every 60 s
Checkpoints / artifacts                    File watcher: .pt, .pkl, .ckpt, .safetensors
Environment snapshot                       Python version, CUDA, GPU name, git SHA, detected simulator
Heartbeat                                  Every 30 s; keeps the run marked alive while Isaac Sim loads
Offline buffering                          Events are spooled to disk when the backend is unreachable and replayed automatically

Install

pip install rpx-agent
# Optional extras for richer system telemetry:
pip install "rpx-agent[full]"   # adds psutil + PyYAML

Local development:

pip install -e ./robotrainx-agent

Flow 1: pip install + rpx-agent run (recommended)

3 commands, then training is instrumented:

# 1. Authenticate
rpx-agent login --api-key YOUR_KEY

# 2. (optional) Initialise project config in your training directory
cd /path/to/your/robot/project
rpx-agent init

# 3. Wrap your existing training command — nothing else changes
rpx-agent run -- python train.py --num-envs 1024
rpx-agent run -- python train_headless.py --agent AnymalC --headless
rpx-agent run -- bash scripts/train.sh

Auto-detection: --task and --platform are inferred automatically. Pass them only when you want to override the detected values.

rpx also works as a short alias: rpx run -- python train.py

All flags for rpx-agent run

--task TEXT          Experiment name (default: inferred from script filename)
--label TEXT         Human-readable label shown in dashboard
--platform TEXT      isaaclab | mujoco | gazebo | custom (auto-detected)
--run-id TEXT        Explicit run ID (auto-generated UUID if omitted)
--tags TEXT          Comma-separated tags
--watch-dir DIR      Extra dir to watch for checkpoints (repeatable)
--no-metrics         Disable stdout metric parsing
--no-sysinfo         Disable GPU/CPU telemetry
--no-artifacts       Disable checkpoint file detection
--batch-size N       Log events per HTTP batch (default: 80)
--flush-interval F   Seconds between log flushes (default: 1.5)
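
The --batch-size and --flush-interval defaults above suggest a "flush when full or when stale" policy. The following is a minimal sketch of such a policy, not rpx-agent's actual implementation: LogBatcher, send, and clock are hypothetical names introduced for illustration, and the sketch only checks the interval when a new line arrives (a real agent would more likely use a timer thread).

```python
import time
from typing import Callable, List


class LogBatcher:
    """Buffers log lines and flushes them by size or by age.

    Hypothetical helper for illustration; not part of rpx-agent's API.
    """

    def __init__(
        self,
        send: Callable[[List[str]], None],
        batch_size: int = 80,          # mirrors the --batch-size default
        flush_interval: float = 1.5,   # mirrors the --flush-interval default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._clock = clock
        self._buffer: List[str] = []
        self._last_flush = clock()

    def add(self, line: str) -> None:
        """Queue one line; flush if the batch is full or too old."""
        self._buffer.append(line)
        full = len(self._buffer) >= self._batch_size
        stale = self._clock() - self._last_flush >= self._flush_interval
        if full or stale:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered (if anything) and reset the timer."""
        if self._buffer:
            self._send(self._buffer)
            self._buffer = []
        self._last_flush = self._clock()
```

Under these assumptions, a larger batch size means fewer HTTP requests at the cost of latency, while a shorter flush interval keeps the dashboard closer to live output.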

Flow 2: SSH connect (HPC / shared clusters)

For machines where you can't install packages (SLURM, university HPC):

  1. In the RoboProtX dashboard → Remote Hosts → Add SSH host
  2. Paste your SSH key and remote training command
  3. RoboProtX connects over SSH, runs your command, and streams the logs back
  4. The run goes through the same analytics pipeline: failure intelligence, sim-to-real, promotion gate

Flow 3: Docker self-host (on-prem enterprise)

cp .env.example .env
# Set ROBOTRAINX_API_KEY, ISAACMONITOR_DB_URL, JWT_SECRET in .env
docker compose up -d postgres backend frontend

Then point rpx-agent run at your local backend:

export ROBOTRAINX_SERVER_URL=http://your-server:3001
rpx-agent run -- python train.py

Project config (roboprotx.yaml)

rpx-agent init creates this automatically. You can also write it manually:

project: anymal-locomotion
simulator: isaaclab
server_url: https://api.robosynx.com

watch_dirs:
  - .
  - logs/checkpoints

The agent searches for roboprotx.yaml from your current directory up to the filesystem root.
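
The upward search described above can be sketched as follows. This is an illustrative approximation, not the agent's real code: find_config and CONFIG_NAME are hypothetical names, and the sketch assumes the nearest roboprotx.yaml wins when several exist along the path.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = "roboprotx.yaml"


def find_config(start: Optional[Path] = None) -> Optional[Path]:
    """Return the nearest roboprotx.yaml at or above `start`, or None.

    Hypothetical helper; assumes the closest file takes precedence.
    """
    current = (start or Path.cwd()).resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (current, *current.parents):
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate
    return None
```

With this behaviour, running the agent from a subdirectory such as logs/ would still pick up a config placed at the project root.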


Environment variables

Variable                 Aliases                                 Description
ROBOTRAINX_API_KEY       IM_API_KEY, ROBOPROTX_API_KEY           API key
ROBOTRAINX_SERVER_URL    IM_SERVER_URL, ROBOPROTX_SERVER_URL     Backend URL
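
One way alias resolution like this could work is a first-match lookup over an ordered list of names. The sketch below is an assumption, not documented behaviour: the precedence order (primary name first, then the aliases in the order listed) and the first_set helper are hypothetical.

```python
import os
from typing import Mapping, Optional, Sequence

# Order is an assumption: primary name first, then aliases as listed above.
API_KEY_VARS = ("ROBOTRAINX_API_KEY", "IM_API_KEY", "ROBOPROTX_API_KEY")
SERVER_URL_VARS = ("ROBOTRAINX_SERVER_URL", "IM_SERVER_URL", "ROBOPROTX_SERVER_URL")


def first_set(names: Sequence[str], env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty value among `names`, or None if none is set."""
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


if __name__ == "__main__":
    print("API key configured:", first_set(API_KEY_VARS) is not None)
    print("Server URL:", first_set(SERVER_URL_VARS))
```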

Edge cases handled

  • Backend unreachable at start — runs offline, all events spooled to ~/.robotrainx-agent/spool/, replayed when connection restored
  • Binary / non-UTF-8 output (Isaac Sim OpenGL) — decoded with errors=replace, never crashes
  • Long lines > 8 KB — truncated with ...[truncated] marker
  • Orphan GPU processes on Ctrl+C — kills entire process group (SIGKILL on Linux, taskkill /T on Windows)
  • Command not found — friendly error with PATH hint, exits 127
  • Missing API key on production server — warns clearly with rpx-agent login instructions
  • SLURM / multi-process training — wrap the srun or torchrun command directly
  • Isaac Sim long startup (5-10 min silent) — heartbeat thread keeps run alive
  • Rate limiting (HTTP 429) — automatic exponential backoff retry
  • 401/403 auth errors — clear message with rpx-agent login instructions
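
The binary-output and long-line cases above can be illustrated with a small sanitizer. This is a sketch under stated assumptions rather than the agent's implementation: sanitize_line is a hypothetical name, the 8 KB limit is assumed to be measured in bytes, and truncation is assumed to happen before decoding.

```python
MAX_LINE_BYTES = 8 * 1024            # "8 KB" from the list above, assumed to mean bytes
TRUNCATION_MARKER = "...[truncated]"


def sanitize_line(raw: bytes, max_bytes: int = MAX_LINE_BYTES) -> str:
    """Decode one raw output line without ever raising on bad bytes.

    Hypothetical helper: invalid UTF-8 becomes U+FFFD, and lines longer
    than `max_bytes` are cut and suffixed with the truncation marker.
    """
    truncated = len(raw) > max_bytes
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    text = raw[:max_bytes].decode("utf-8", errors="replace")
    text = text.rstrip("\r\n")
    return text + TRUNCATION_MARKER if truncated else text
```

Cutting on a byte boundary can split a multi-byte character; with errors="replace" that only yields a replacement character at the cut point, which is acceptable for log display.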

Supported log formats (auto-parsed)

Framework                   Detected signal
Stable Baselines3           | rollout/ep_rew_mean | 4.23 | table
RSL-RL / Isaac Lab          Learning iteration 100/1000 blocks
CleanRL                     global_step=51200, episodic_return=4.23
Generic                     reward=4.23 iteration=100 kl=0.012
IsaacMonitor / rpx patch    [IsaacMonitor] failure captured: base_contact rate=0.30
Episode labels              Episode_Termination/base_contact: 0.30
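
To give a feel for how lines like these might be turned into metrics, here is a rough regex-based sketch. It covers only the key=value style (Generic, CleanRL) and the Episode_Termination/ and Episode_Reward/ label style; the SB3 table and RSL-RL block formats are not handled. parse_metrics and both patterns are hypothetical and are not taken from rpx-agent.

```python
import re
from typing import Dict

# A number with optional sign, fraction, and exponent, e.g. 4.23, -1, 1e-4.
_NUMBER = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"

# key=value pairs such as "reward=4.23" or "global_step=51200".
KEY_VALUE = re.compile(r"\b([A-Za-z_][\w/]*)=(" + _NUMBER + r")")

# Labels such as "Episode_Termination/base_contact: 0.30".
EPISODE_LABEL = re.compile(
    r"\b(Episode_(?:Termination|Reward)/[\w/]+):\s*(" + _NUMBER + r")"
)


def parse_metrics(line: str) -> Dict[str, float]:
    """Extract numeric metrics from one stdout line (hypothetical helper)."""
    metrics: Dict[str, float] = {}
    for pattern in (KEY_VALUE, EPISODE_LABEL):
        for name, value in pattern.findall(line):
            metrics[name] = float(value)
    return metrics


if __name__ == "__main__":
    print(parse_metrics("reward=4.23 iteration=100 kl=0.012"))
    print(parse_metrics("Episode_Termination/base_contact: 0.30"))
```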

Publish to PyPI

python -m build ./robotrainx-agent
python -m twine upload robotrainx-agent/dist/*

Or just push a version tag — GitHub Actions auto-publishes:

git tag v0.2.1; git push origin v0.2.1
