Online image augmentation toolkit for robot learning. 44 methods in 9 groups.
robo-augment
Online image augmentation toolkit for robot learning. 44 methods in 9 functional groups, designed for visuomotor policy training.
Why
Every major robot learning framework uses at most 3 augmentations. RAD (NeurIPS 2020) showed that random crop alone gives a 5.75x improvement, yet the field stopped there. No pip-installable, online, robotics-aware augmentation library exists.
robo-augment fills this gap with 44 transforms organized into 9 functional groups, per-camera intensity scaling, augmentation symmetry verification, and a speed-optimized pure-PyTorch implementation.
See docs/motivation.md for detailed research motivation and per-method evidence.
Install
pip install robo-augment
Quick Start
from roboaug import create_default_pipeline
aug = create_default_pipeline(camera_role="top", scale=0.6)
augmented = aug(image_tensor) # (C, H, W) float32 [0, 1]
Presets
from roboaug.presets import manipulation_preset, minimal_preset, sim2real_preset, navigation_preset
# Full 44-method pipeline for tabletop manipulation
aug = manipulation_preset()
# Multi-camera with per-camera scaling
augs = manipulation_preset(cameras={"top": 0.6, "wrist": 0.85, "side": 1.0})
# DrQ-style minimal (color jitter + crop + noise)
aug = minimal_preset()
# Heavy photometric for sim-to-real transfer
aug = sim2real_preset()
# Navigation-safe (no spatial transforms that break heading)
aug = navigation_preset()
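One way to picture the per-camera scale factors used above is as a shrink of each parameter range toward its no-op value. The sketch below is an illustration under that assumption, not the library's internals: `scale_range`, `scaled_ranges`, and the brightness range are hypothetical, while the 10° rotation bound and the camera scales come from this README.

```python
def scale_range(lo, hi, neutral, camera_scale):
    """Shrink the range [lo, hi] toward `neutral` (the no-op value)."""
    return (neutral + (lo - neutral) * camera_scale,
            neutral + (hi - neutral) * camera_scale)


def scaled_ranges(base_ranges, cameras):
    """Build one set of parameter ranges per camera.

    base_ranges: {param_name: (lo, hi, neutral)} at full intensity (scale 1.0)
    cameras:     {camera_name: scale}
    """
    return {
        cam: {name: scale_range(lo, hi, neutral, s)
              for name, (lo, hi, neutral) in base_ranges.items()}
        for cam, s in cameras.items()
    }


# Hypothetical full-intensity ranges; only the rotation bound is from the README.
BASE = {
    "rotation_deg": (-10.0, 10.0, 0.0),
    "brightness_factor": (0.7, 1.3, 1.0),
}

if __name__ == "__main__":
    per_camera = scaled_ranges(BASE, {"top": 0.6, "wrist": 0.85, "side": 1.0})
    for cam, ranges in per_camera.items():
        print(cam, ranges)
```

Under this reading, a scale of 1.0 leaves the ranges untouched and a scale of 0.0 disables the transform, so the scale acts as a single knob per camera.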
Custom Pipeline
from roboaug import GroupedAugment
from roboaug.transforms import GaussianNoise, MotionBlur, RandomShadow
pipeline = GroupedAugment(groups=[
    {"name": "noise", "mode": "exclusive", "p": 0.3, "transforms": [
        (GaussianNoise(std=(5, 25)), 3.0),
        (MotionBlur(kernel_size=(3, 11)), 2.0),
    ]},
    {"name": "lighting", "mode": "exclusive", "p": 0.3, "transforms": [
        (RandomShadow(opacity=(0.2, 0.5)), 1.0),
    ]},
], camera_scale=0.6)
augmented = pipeline(image_tensor)
9 Functional Groups
| Group | Methods | Mode | Purpose |
|---|---|---|---|
| Photometric | 9 | independent | Lighting, color temperature, exposure |
| Noise | 5 | exclusive | Sensor noise simulation |
| Blur | 5 | exclusive | Motion, defocus, vibration |
| Spatial | 5 | exclusive | Camera pose, lens distortion |
| Occlusion | 5 | exclusive | Partial view obstruction |
| Lighting | 6 | exclusive | Shadows, spotlights, flare |
| Anti-Shortcut | 4 | independent | Prevent landmark shortcuts |
| Color Channel | 4 | exclusive | Channel robustness |
| Compression | 3 | exclusive | Streaming artifacts |
Independent mode: each transform fires with its own probability, multiple can activate per sample.
Exclusive mode: at most one transform fires per sample from the group, preventing unrealistic compounding.
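The two modes can be sketched in a few lines of plain Python. This is a minimal illustration, not the library's code: `sample_group` is a hypothetical helper, and it assumes that in exclusive mode the group fires with probability `p` and then picks one transform by relative weight, which is one reading of the `(transform, weight)` pairs in the custom pipeline example.

```python
import random


def sample_group(transforms, mode, p=1.0, rng=random):
    """Return the names of the transforms that fire for one sample.

    transforms: list of (name, value) pairs. In "independent" mode the value
    is that transform's own firing probability; in "exclusive" mode it is a
    relative weight used to pick a single transform.
    p: probability that an exclusive group fires at all (assumed semantics).
    """
    if mode == "independent":
        # Each transform flips its own coin, so several can fire together.
        return [name for name, prob in transforms if rng.random() < prob]
    if mode == "exclusive":
        # The group fires with probability p, then exactly one member is drawn.
        if rng.random() >= p:
            return []
        names = [name for name, _ in transforms]
        weights = [weight for _, weight in transforms]
        return rng.choices(names, weights=weights, k=1)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    rng = random.Random(0)
    noise = [("gaussian_noise", 3.0), ("motion_blur", 2.0)]
    photometric = [("brightness", 0.5), ("contrast", 0.5), ("gamma", 0.3)]
    for _ in range(5):
        print(sample_group(noise, "exclusive", p=0.3, rng=rng),
              sample_group(photometric, "independent", rng=rng))
```

Exclusive sampling never returns more than one name, which is what keeps a sample from receiving, say, two kinds of blur at once.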
Framework Integration
LeRobot
from roboaug.presets import manipulation_preset
from lerobot.datasets.lerobot_dataset import LeRobotDataset
augs = manipulation_preset(cameras={"top": 0.6, "wrist": 0.85, "side": 1.0})
dataset = LeRobotDataset(..., image_transforms=augs)
Diffusion Policy
from roboaug import create_default_pipeline
aug = create_default_pipeline(camera_role="side")
# Apply in your dataset __getitem__:
obs["image"] = aug(obs["image"])
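If editing `__getitem__` directly is inconvenient, the same idea can be expressed as a thin wrapper around an existing dataset. The sketch below is an assumption about how one might wire this up, not part of robo-augment or Diffusion Policy: `AugmentedDataset` is a hypothetical class, and it only assumes that samples are dict-like and that each augmentation is a callable.

```python
class AugmentedDataset:
    """Wrap an indexable dataset and augment selected observation keys on read."""

    def __init__(self, base, augs):
        self.base = base  # any object supporting len() and integer indexing
        self.augs = augs  # maps observation key -> callable(image) -> image

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        # Shallow copy so the sample stored in the base dataset is not modified.
        obs = dict(self.base[idx])
        for key, aug in self.augs.items():
            if key in obs:
                obs[key] = aug(obs[key])
        return obs


# Usage with the pipeline from this README (requires robo-augment):
#   from roboaug import create_default_pipeline
#   train_set = AugmentedDataset(
#       base_dataset,
#       {"image": create_default_pipeline(camera_role="side")},
#   )
```

Because augmentation happens on read, each epoch sees a fresh random draw for every sample, which is what makes the pipeline "online".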
Design Principles
- Action-aware: spatial transforms bounded to preserve action semantics (max 10° rotation, 10% translation). No horizontal flip (breaks left-right actions).
- Per-camera scaling: different cameras tolerate different augmentation intensity. Top overview cameras get lighter augmentation than side context cameras.
- Symmetric augmentation: all transforms verified to not shift BatchNorm statistics (< 0.005 mean bias). Shadow, gradient, gamma, vignette all have symmetric brighten/darken modes.
- Pure PyTorch: zero cv2 dependency in the forward path. 44 methods at ~14ms per 480x640 image.
- Grouped sampling: functional groups prevent unrealistic effect compounding while maximizing diversity.
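The symmetry check described above can be approximated by applying a transform many times and measuring how far the average output mean drifts from the input mean. The sketch below uses plain Python lists instead of tensors to stay dependency-free; the function names, the brightness transform, and the trial counts are hypothetical, and only the 0.005 threshold comes from this README.

```python
import random


def symmetric_brightness(pixels, max_delta, rng):
    """Shift all pixels by a delta drawn symmetrically around zero."""
    delta = rng.uniform(-max_delta, max_delta)
    return [min(1.0, max(0.0, v + delta)) for v in pixels]


def darken_only(pixels, max_delta, rng):
    """One-sided counterpart: only ever darkens, so it biases the mean."""
    delta = rng.uniform(-max_delta, 0.0)
    return [min(1.0, max(0.0, v + delta)) for v in pixels]


def mean_bias(transform, pixels, trials, rng):
    """Average change in image mean over `trials` random applications."""
    base = sum(pixels) / len(pixels)
    total = 0.0
    for _ in range(trials):
        out = transform(pixels, rng)
        total += sum(out) / len(out) - base
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    # Mid-range test image in [0.3, 0.7] so clamping does not skew the result.
    pixels = [0.3 + 0.4 * i / 15 for i in range(16)]
    sym = mean_bias(lambda px, r: symmetric_brightness(px, 0.2, r), pixels, 20000, rng)
    one_sided = mean_bias(lambda px, r: darken_only(px, 0.2, r), pixels, 20000, rng)
    print(f"symmetric bias: {sym:+.4f}, darken-only bias: {one_sided:+.4f}")
    print("symmetric passes:", abs(sym) < 0.005)
```

A one-sided transform such as `darken_only` fails the same check by a wide margin, which is the kind of shift in normalization statistics the symmetric brighten/darken modes are meant to avoid.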
Speed
Benchmarked on Intel Xeon (single core, CPU):
| Configuration | Time per image | Training throughput |
|---|---|---|
| Full 44-method pipeline (side, 1.0x) | ~14ms | ~2+ steps/s (3 cameras, batch 16) |
| Minimal preset (3 methods) | ~1ms | ~3+ steps/s |
| No augmentation | ~0ms | ~3.2 steps/s |
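To check per-image timings on your own hardware, a small timing harness is enough. The helper below is a hypothetical sketch, not the benchmark script behind the table: it times any callable with `time.perf_counter` after a few warm-up calls, and the warm-up and iteration counts are arbitrary defaults.

```python
import time


def time_per_image_ms(fn, image, warmup=10, iters=100):
    """Return the mean wall-clock time of fn(image) in milliseconds."""
    for _ in range(warmup):
        fn(image)  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(iters):
        fn(image)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iters


# Usage with a pipeline from this README (requires robo-augment and PyTorch):
#   import torch
#   from roboaug import create_default_pipeline
#   aug = create_default_pipeline(camera_role="side")
#   image = torch.rand(3, 480, 640)
#   print(f"{time_per_image_ms(aug, image):.2f} ms per image")
```

Results depend on the CPU, thread settings, and image size, so numbers from this harness are only comparable to the table when those match.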
Citation
@software{roboaugment2026,
  title={robo-augment: Online Image Augmentation for Robot Learning},
  author={Li, Yuxian},
  year={2026},
  url={https://github.com/Liyux3/robo-augment}
}
License
Apache 2.0