Skip to main content

Robotics Transformer Inference in Tensorflow. RT-1, RT-2, RT-X, PALME.

Project description

Code Coverage

Library for Robotic Transformers. RT-1 and RT-X-1.

Installation:

Requirements: python >= 3.9

Recommended: Using PyPI

pip install robo-transformers

From Source

Clone this repo:

git clone https://github.com/sebbyjp/robo_transformers.git

cd robo_transformers

Use poetry

pip install poetry && poetry config virtualenvs.in-project true

Install dependencies

poetry install

Poetry has installed the dependencies in a virtualenv so we need to activate it.

source .venv/bin/activate

Run RT-1 Inference On Demo Images.

python -m robo_transformers.rt1.rt1_inference

See usage:

You can specify a custom checkpoint path or the model_keys for the three mentioned in the RT-1 paper as well as RT-X.

python -m robo_transformers.rt1.rt1_inference --help

Run Inference Server

The inference server takes care of all the internal state so all you need to specify is an instruction and image. You may also pass in a reward and termination signal. Batching is also supported.

from robo_transformers.inference_server import InferenceServer
import numpy as np

# Somewhere in your robot control stack code...

instruction = "pick block"
img = np.random.randn(256, 320, 3) # Width, Height, RGB
inference = InferenceServer()

action = inference(instruction, img)

Data Types

action, next_policy_state = model.act(time_step, curr_policy_state)

policy state is internal state of network:

In this case it is a 6-frame window of past observations,actions and the index in time.

{'action_tokens': ArraySpec(shape=(6, 11, 1, 1), dtype=dtype('int32'), name='action_tokens'),
 'image': ArraySpec(shape=(6, 256, 320, 3), dtype=dtype('uint8'), name='image'),
 'step_num': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='step_num'),
 't': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='t')}

time_step is the input from the environment:

{'discount': BoundedArraySpec(shape=(), dtype=dtype('float32'), name='discount', minimum=0.0, maximum=1.0),
 'observation': {'base_pose_tool_reached': ArraySpec(shape=(7,), dtype=dtype('float32'), name='base_pose_tool_reached'),
                 'gripper_closed': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closed'),
                 'gripper_closedness_commanded': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_commanded'),
                 'height_to_bottom': ArraySpec(shape=(1,), dtype=dtype('float32'), name='height_to_bottom'),
                 'image': ArraySpec(shape=(256, 320, 3), dtype=dtype('uint8'), name='image'),
                 'natural_language_embedding': ArraySpec(shape=(512,), dtype=dtype('float32'), name='natural_language_embedding'),
                 'natural_language_instruction': ArraySpec(shape=(), dtype=dtype('O'), name='natural_language_instruction'),
                 'orientation_box': ArraySpec(shape=(2, 3), dtype=dtype('float32'), name='orientation_box'),
                 'orientation_start': ArraySpec(shape=(4,), dtype=dtype('float32'), name='orientation_in_camera_space'),
                 'robot_orientation_positions_box': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='robot_orientation_positions_box'),
                 'rotation_delta_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta_to_go'),
                 'src_rotation': ArraySpec(shape=(4,), dtype=dtype('float32'), name='transform_camera_robot'),
                 'vector_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='vector_to_go'),
                 'workspace_bounds': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='workspace_bounds')},
 'reward': ArraySpec(shape=(), dtype=dtype('float32'), name='reward'),
 'step_type': ArraySpec(shape=(), dtype=dtype('int32'), name='step_type')}

action:

{'base_displacement_vector': BoundedArraySpec(shape=(2,), dtype=dtype('float32'), name='base_displacement_vector', minimum=-1.0, maximum=1.0),
 'base_displacement_vertical_rotation': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='base_displacement_vertical_rotation', minimum=-3.1415927410125732, maximum=3.1415927410125732),
 'gripper_closedness_action': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_action', minimum=-1.0, maximum=1.0),
 'rotation_delta': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta', minimum=-1.5707963705062866, maximum=1.5707963705062866),
 'terminate_episode': BoundedArraySpec(shape=(3,), dtype=dtype('int32'), name='terminate_episode', minimum=0, maximum=1),
 'world_vector': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='world_vector', minimum=-1.0, maximum=1.0)}

TODO:

  • Render action, policy_state, observation specs in something prettier like pandas data frame.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

robo_transformers-0.1.10.tar.gz (4.1 MB view details)

Uploaded Source

Built Distribution

robo_transformers-0.1.10-py3-none-any.whl (4.1 MB view details)

Uploaded Python 3

File details

Details for the file robo_transformers-0.1.10.tar.gz.

File metadata

  • Download URL: robo_transformers-0.1.10.tar.gz
  • Upload date:
  • Size: 4.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.11.6 Darwin/23.1.0

File hashes

Hashes for robo_transformers-0.1.10.tar.gz
Algorithm Hash digest
SHA256 6c34a2d872d44a0485f51afaefd00ddc58bbb5eb59a38ad4e399868b78200723
MD5 9e87943c0d0ce609e08cdd8d7180ef83
BLAKE2b-256 a6d411f07c39e58508615abae47dad9dff1deb667178a55bf1cffae2ada28087

See more details on using hashes here.

Provenance

File details

Details for the file robo_transformers-0.1.10-py3-none-any.whl.

File metadata

File hashes

Hashes for robo_transformers-0.1.10-py3-none-any.whl
Algorithm Hash digest
SHA256 3c3f2904bbc10a30862a5456b43c29ad7fe355893eb24489f735362356f09707
MD5 da31464f0ecdd65ac783e6c3a726bb31
BLAKE2b-256 9050441a77ac3c9ad262d0002e28aa1b6a152b44ab7d6afde7b2462f2b8391e6

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page