Machine learning models optimized for robotics experimentation and deployment
RoboML is an aggregator package for quickly deploying open-source ML models for robots. It supports three main use cases:
- Rapid deployment of general-purpose models: Wraps around popular ML libraries like 🤗 Transformers, allowing fast deployment of models through scalable server endpoints.
- Deploy detection models with tracking: Supports deployment of all detection models in MMDetection with optional tracking integration.
- Aggregate robot-specific models from the robotics community: Intended as a platform for community-contributed multimodal models, usable in planning and control, especially with ROS components. See EmbodiedAgents.
Models And Wrappers
| Model Class | Description | Default Checkpoint / Resource | Key Init Parameters |
|---|---|---|---|
| TransformersLLM | General-purpose large language model (LLM) from 🤗 Transformers | microsoft/Phi-3-mini-4k-instruct | name, checkpoint, quantization, init_timeout |
| TransformersMLLM | Multimodal vision-language model (MLLM) from 🤗 Transformers | HuggingFaceM4/idefics2-8b | name, checkpoint, quantization, init_timeout |
| RoboBrain2 | Embodied planning + multimodal reasoning via RoboBrain 2.0 | BAAI/RoboBrain2.0-7B | name, checkpoint, init_timeout |
| Whisper | Multilingual speech-to-text (ASR) from OpenAI Whisper | small.en (checkpoint list) | name, checkpoint, compute_type, init_timeout |
| SpeechT5 | Text-to-speech model from Microsoft SpeechT5 | microsoft/speecht5_tts | name, checkpoint, voice, init_timeout |
| Bark | Text-to-speech model from SunoAI Bark | suno/bark-small, voice options | name, checkpoint, voice, attn_implementation, init_timeout |
| MeloTTS | Multilingual text-to-speech via MeloTTS | EN, EN-US | name, language, speaker_id, init_timeout |
| VisionModel | Detection + tracking via MMDetection | dino-4scale_r50_8xb2-12e_coco | name, checkpoint, setup_trackers, cache_dir, tracking_distance_function, tracking_distance_threshold, deploy_tensorrt, _num_trackers, init_timeout |
Installation
RoboML has been tested on Ubuntu 20.04 and later. A GPU with CUDA 12.1+ is recommended. If you encounter issues, please open an issue.
pip install roboml
From Source
git clone https://github.com/automatika-robotics/roboml.git && cd roboml
virtualenv venv && source venv/bin/activate
pip install pip-tools
pip install .
Vision Model Support
To use detection and tracking features via MMDetection:
- Install RoboML with the vision extras:
  pip install "roboml[vision]"
- Install mmcv using the appropriate CUDA and PyTorch versions as described in their docs. Example for PyTorch 2.1 with CUDA 12.1:
  pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.1/index.html
- Install mmdetection:
  git clone https://github.com/open-mmlab/mmdetection.git
  cd mmdetection
  pip install -v -e .
- If ffmpeg or libGL is missing:
  sudo apt-get update && sudo apt-get install ffmpeg libsm6 libxext6
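After the steps above, a quick sanity check can confirm that the vision dependencies are visible to the current Python environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example and is not part of RoboML, and the distribution names `mmcv` and `mmdet` are assumed to match the packages installed above.

```python
import importlib.metadata
import importlib.util
from typing import Optional


def installed_version(import_name: str, dist_name: Optional[str] = None) -> Optional[str]:
    """Return the installed version of a package, or None if it cannot be imported.

    `installed_version` is a hypothetical helper for this example only.
    """
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(import_name) is None:
        return None
    try:
        return importlib.metadata.version(dist_name or import_name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found under that name.
        return "unknown"


if __name__ == "__main__":
    # Assumed import/distribution names for the packages installed above.
    for name in ("torch", "mmcv", "mmdet"):
        version = installed_version(name)
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If `mmcv` or `mmdet` is reported as not installed, the most likely cause is that the install ran in a different virtual environment than the one used to launch RoboML.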
TensorRT-Based Model Deployment
RoboML vision models can optionally be accelerated with NVIDIA TensorRT on Linux x86_64 systems. For setup, follow the TensorRT installation guide.
Docker Build (Recommended)
Jetson users are especially encouraged to use Docker.
- Install Docker Desktop
- Install the NVIDIA Container Toolkit
git clone https://github.com/automatika-robotics/roboml.git && cd roboml
# Build container image
docker build --tag=automatika:roboml .
# For Jetson boards:
docker build --tag=automatika:roboml -f Dockerfile.Jetson .
# Run HTTP server
docker run --runtime=nvidia --gpus all --rm -p 8000:8000 automatika:roboml roboml
# Or run RESP server
docker run --runtime=nvidia --gpus all --rm -p 6379:6379 automatika:roboml roboml-resp
- (Optional) Mount your cache dir to persist downloaded models:
  -v ~/.cache:/root/.cache
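To make clear where the optional cache mount fits, here is a small sketch that assembles the full `docker run` invocation from the pieces shown above. The function `docker_run_command` is hypothetical and only builds the argument list; the flags, image tag, ports, and mount path are the ones from the commands in this section.

```python
import os
import shlex
from typing import List, Optional


def docker_run_command(
    server: str = "roboml",          # "roboml" (HTTP) or "roboml-resp" (RESP)
    port: int = 8000,                # 8000 for HTTP, 6379 for RESP, as above
    cache_dir: Optional[str] = "~/.cache",
    image: str = "automatika:roboml",
) -> List[str]:
    """Build the docker run argument list (hypothetical helper, not part of RoboML)."""
    cmd = ["docker", "run", "--runtime=nvidia", "--gpus", "all", "--rm",
           "-p", f"{port}:{port}"]
    if cache_dir:
        # Options such as -v must come before the image name.
        host_dir = os.path.expanduser(cache_dir)
        cmd += ["-v", f"{host_dir}:/root/.cache"]
    cmd += [image, server]
    return cmd


if __name__ == "__main__":
    # Print the commands so they can be inspected or pasted into a shell.
    print(shlex.join(docker_run_command()))
    print(shlex.join(docker_run_command(server="roboml-resp", port=6379)))
```

The resulting list can be passed to `subprocess.run` as-is; printing it first is a cheap way to verify the mount path before starting a container.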
Servers
RoboML uses Ray Serve to host models as scalable apps across various environments.
WebSocket Endpoint
WebSocket endpoints are exposed for streaming use cases (e.g., STT/TTS).
Experimental RESP Server
For ultra-low latency in robotics, RoboML also includes a RESP-based server compatible with any Redis client.
RESP (see spec) is a lightweight, binary-safe protocol. Combined with msgpack instead of JSON, it enables very fast I/O, ideal for binary data like images, audio, or video.
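To illustrate why a binary-safe protocol suits image or audio payloads, the sketch below encodes a command using the RESP array-of-bulk-strings framing and compares its size with the same bytes carried as base64 inside JSON. It uses only the standard library, so msgpack is not shown here. The command name `SET` and key `frame` are placeholders chosen for the example, not RoboML commands.

```python
import base64
import json


def encode_resp_command(*args: bytes) -> bytes:
    """Encode arguments as a RESP array of bulk strings.

    Each bulk string is length-prefixed, so the payload may contain
    any byte values, including \r\n, without escaping.
    """
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        parts.append(b"$%d\r\n" % len(arg))
        parts.append(arg + b"\r\n")
    return b"".join(parts)


if __name__ == "__main__":
    # Stand-in for raw image or audio bytes.
    payload = bytes(range(256)) * 4

    # Placeholder command and key, for illustration only.
    resp_frame = encode_resp_command(b"SET", b"frame", payload)

    # JSON cannot carry raw bytes, so they must be base64-encoded first.
    json_frame = json.dumps(
        {"frame": base64.b64encode(payload).decode("ascii")}
    ).encode("utf-8")

    print(f"payload:       {len(payload)} bytes")
    print(f"RESP frame:    {len(resp_frame)} bytes")
    print(f"JSON + base64: {len(json_frame)} bytes")
```

Base64 inflates binary data by roughly a third, and it adds an encode and decode step on each side, whereas the RESP frame adds only a few bytes of length prefixes.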
This work is inspired by @hansonkd’s Tino project.
Usage
Run the HTTP server:
roboml
Run the RESP server:
roboml-resp
Example usage in ROS clients is documented in ROS Agents.
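Before pointing a client at either server, it can help to confirm that something is listening. The sketch below is a plain TCP reachability check with the standard library. It assumes the servers listen on the same ports that the Docker commands publish (8000 for HTTP, 6379 for RESP); adjust these if your setup differs. `port_is_open` is a hypothetical helper, not a RoboML function.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports, taken from the -p mappings in the Docker section.
    for label, port in (("HTTP", 8000), ("RESP", 6379)):
        state = "reachable" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"{label} server on port {port}: {state}")
```

An open port only shows that a process accepted the connection; it does not confirm that models have finished loading.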
Running Tests
Install dev dependencies:
pip install ".[dev]"
Run tests from the project root:
python -m pytest
Copyright
Unless otherwise specified, all code is © 2024 Automatika Robotics. RoboML is released under the MIT License. See LICENSE for details.
Contributions
ROS Agents is developed in collaboration between Automatika Robotics and Inria. Community contributions are welcome!