Robotics Transformer Inference in TensorFlow. RT-1, RT-2, RT-X, PaLM-E.
Library for Robotic Transformers: RT-1 and RT-1-X.
Installation:
Requirements: Python >= 3.9
Recommended: Using PyPI
pip install robo-transformers
From Source
Clone this repo:
git clone https://github.com/sebbyjp/robo_transformers.git
cd robo_transformers
Use poetry
pip install poetry && poetry config virtualenvs.in-project true
Install dependencies
poetry install
Poetry has installed the dependencies in a virtualenv so we need to activate it.
source .venv/bin/activate
Run RT-1 Inference On Demo Images.
python -m robo_transformers.rt1.rt1_inference
See usage:
You can specify a custom checkpoint path, or a model key for one of the three checkpoints mentioned in the RT-1 paper or for RT-X.
python -m robo_transformers.rt1.rt1_inference --help
Run Inference Server
The inference server manages all internal state, so you only need to supply an instruction and an image. You can optionally pass in a reward and a termination signal. Batching is also supported.
from robo_transformers.inference_server import InferenceServer
import numpy as np
# Somewhere in your robot control stack code...
instruction = "pick block"
img = np.random.randn(256, 320, 3)  # Height, Width, RGB channels
inference = InferenceServer()
action = inference(instruction, img)
Data Types
action, next_policy_state = model.act(time_step, curr_policy_state)
policy_state is the internal state of the network.
In this case it is a 6-frame window of past observations and actions, plus the index in time:
{'action_tokens': ArraySpec(shape=(6, 11, 1, 1), dtype=dtype('int32'), name='action_tokens'),
'image': ArraySpec(shape=(6, 256, 320, 3), dtype=dtype('uint8'), name='image'),
'step_num': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='step_num'),
't': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='t')}
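The 6-frame window can be pictured as a fixed-length queue that drops its oldest entry on every new step. The sketch below only illustrates that idea and is not how the library stores its state: `RollingPolicyState` and its methods are hypothetical names, and plain strings and lists stand in for real image and token arrays.

```python
from collections import deque

WINDOW = 6  # window length, taken from the leading dimension of the specs above


class RollingPolicyState:
    """Hypothetical helper that keeps the last WINDOW images and action tokens."""

    def __init__(self, window=WINDOW):
        self.images = deque(maxlen=window)
        self.action_tokens = deque(maxlen=window)
        self.t = 0  # number of steps seen since the last reset

    def update(self, image, action_tokens):
        # A deque with maxlen discards its oldest item once it is full.
        self.images.append(image)
        self.action_tokens.append(action_tokens)
        self.t += 1

    def reset(self):
        # Call at the start of a new episode.
        self.images.clear()
        self.action_tokens.clear()
        self.t = 0


state = RollingPolicyState()
for step in range(8):
    state.update(image=f"frame-{step}", action_tokens=[0] * 11)

print(len(state.images), state.images[0], state.t)  # 6 frame-2 8
```

After eight steps only the six most recent frames remain, while `t` keeps counting the total number of steps.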
time_step is the input from the environment:
{'discount': BoundedArraySpec(shape=(), dtype=dtype('float32'), name='discount', minimum=0.0, maximum=1.0),
'observation': {'base_pose_tool_reached': ArraySpec(shape=(7,), dtype=dtype('float32'), name='base_pose_tool_reached'),
'gripper_closed': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closed'),
'gripper_closedness_commanded': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_commanded'),
'height_to_bottom': ArraySpec(shape=(1,), dtype=dtype('float32'), name='height_to_bottom'),
'image': ArraySpec(shape=(256, 320, 3), dtype=dtype('uint8'), name='image'),
'natural_language_embedding': ArraySpec(shape=(512,), dtype=dtype('float32'), name='natural_language_embedding'),
'natural_language_instruction': ArraySpec(shape=(), dtype=dtype('O'), name='natural_language_instruction'),
'orientation_box': ArraySpec(shape=(2, 3), dtype=dtype('float32'), name='orientation_box'),
'orientation_start': ArraySpec(shape=(4,), dtype=dtype('float32'), name='orientation_in_camera_space'),
'robot_orientation_positions_box': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='robot_orientation_positions_box'),
'rotation_delta_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta_to_go'),
'src_rotation': ArraySpec(shape=(4,), dtype=dtype('float32'), name='transform_camera_robot'),
'vector_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='vector_to_go'),
'workspace_bounds': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='workspace_bounds')},
'reward': ArraySpec(shape=(), dtype=dtype('float32'), name='reward'),
'step_type': ArraySpec(shape=(), dtype=dtype('int32'), name='step_type')}
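The scalar fields of a time_step (`step_type`, `reward`, `discount`) can be derived from an episode-start flag and a termination signal. The following is a minimal sketch that assumes the TF-Agents convention of FIRST = 0, MID = 1, LAST = 2 for `step_type`; `make_step_fields` is a hypothetical helper and not part of this library.

```python
# Step-type codes following the TF-Agents StepType convention (an assumption here).
FIRST, MID, LAST = 0, 1, 2


def make_step_fields(is_first, terminated, reward=0.0):
    """Hypothetical helper returning the scalar fields of a time_step."""
    if is_first:
        step_type = FIRST
    elif terminated:
        step_type = LAST
    else:
        step_type = MID
    # A discount of 0.0 on the last step signals that no future reward follows.
    discount = 0.0 if step_type == LAST else 1.0
    return {"step_type": step_type, "reward": float(reward), "discount": discount}


print(make_step_fields(is_first=True, terminated=False))
print(make_step_fields(is_first=False, terminated=True, reward=1.0))
```

The values stay within the spec bounds above: `discount` lies in [0.0, 1.0] and `step_type` is a plain integer. If both flags are set, this sketch treats the step as FIRST.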
action:
{'base_displacement_vector': BoundedArraySpec(shape=(2,), dtype=dtype('float32'), name='base_displacement_vector', minimum=-1.0, maximum=1.0),
'base_displacement_vertical_rotation': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='base_displacement_vertical_rotation', minimum=-3.1415927410125732, maximum=3.1415927410125732),
'gripper_closedness_action': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_action', minimum=-1.0, maximum=1.0),
'rotation_delta': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta', minimum=-1.5707963705062866, maximum=1.5707963705062866),
'terminate_episode': BoundedArraySpec(shape=(3,), dtype=dtype('int32'), name='terminate_episode', minimum=0, maximum=1),
'world_vector': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='world_vector', minimum=-1.0, maximum=1.0)}
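Because every action field is bounded, a controller may want to clamp values to those bounds before sending them to the robot. The sketch below copies the bounds from the spec above, using `math.pi` in place of the printed float32 values, which makes the limits marginally tighter. `ACTION_BOUNDS` and `clip_action` are hypothetical names, not part of this library.

```python
import math

# Bounds copied from the action spec above: name -> (minimum, maximum).
ACTION_BOUNDS = {
    "base_displacement_vector": (-1.0, 1.0),
    "base_displacement_vertical_rotation": (-math.pi, math.pi),
    "gripper_closedness_action": (-1.0, 1.0),
    "rotation_delta": (-math.pi / 2, math.pi / 2),
    "terminate_episode": (0, 1),
    "world_vector": (-1.0, 1.0),
}


def clip_action(action):
    """Hypothetical helper: clamp each component of each field to its bounds."""
    clipped = {}
    for name, values in action.items():
        low, high = ACTION_BOUNDS[name]
        clipped[name] = [min(max(v, low), high) for v in values]
    return clipped


raw = {"world_vector": [0.2, -1.7, 1.0], "rotation_delta": [2.0, 0.0, -0.1]}
safe = clip_action(raw)
print(safe["world_vector"])  # [0.2, -1.0, 1.0]
```

Fields that are absent from the input are simply left out of the result, so the helper also works on partial actions.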
TODO:
- Render the action, policy_state, and observation specs in a more readable form, such as a pandas DataFrame.