Skip to main content

Robotics Transformer Inference in Tensorflow. RT-1, RT-2, RT-X, PALME.

Project description

Code Coverage

Library for Robotic Transformers. RT-1 and RT-X-1.

Installation:

Requirements: python >= 3.9

Recommended: Using PyPI

pip install robo-transformers

From Source

Clone this repo:

git clone https://github.com/sebbyjp/robo_transformers.git

cd robo_transformers

Use poetry

pip install poetry && poetry config virtualenvs.in-project true

Install dependencies

poetry install

Poetry has installed the dependencies in a virtualenv so we need to activate it.

source .venv/bin/activate

Run RT-1 Inference On Demo Images.

python -m robo_transformers.rt1.rt1_inference

See usage:

You can specify a custom checkpoint path or the model_keys for the three mentioned in the RT-1 paper as well as RT-X.

python -m robo_transformers.rt1.rt1_inference --help

Run Inference Server

The inference server takes care of all the internal state so all you need to specify is an instruction and image. You may also pass in a reward and termination signal. Batching is also supported.

from robo_transformers.inference_server import InferenceServer
import numpy as np

# Somewhere in your robot control stack code...

instruction = "pick block"
img = np.random.randn(256, 320, 3) # Width, Height, RGB
inference = InferenceServer()

action = inference(instruction, img)

Data Types

action, next_policy_state = model.act(time_step, curr_policy_state)

policy state is internal state of network:

In this case it is a 6-frame window of past observations,actions and the index in time.

{'action_tokens': ArraySpec(shape=(6, 11, 1, 1), dtype=dtype('int32'), name='action_tokens'),
 'image': ArraySpec(shape=(6, 256, 320, 3), dtype=dtype('uint8'), name='image'),
 'step_num': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='step_num'),
 't': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='t')}

time_step is the input from the environment:

{'discount': BoundedArraySpec(shape=(), dtype=dtype('float32'), name='discount', minimum=0.0, maximum=1.0),
 'observation': {'base_pose_tool_reached': ArraySpec(shape=(7,), dtype=dtype('float32'), name='base_pose_tool_reached'),
                 'gripper_closed': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closed'),
                 'gripper_closedness_commanded': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_commanded'),
                 'height_to_bottom': ArraySpec(shape=(1,), dtype=dtype('float32'), name='height_to_bottom'),
                 'image': ArraySpec(shape=(256, 320, 3), dtype=dtype('uint8'), name='image'),
                 'natural_language_embedding': ArraySpec(shape=(512,), dtype=dtype('float32'), name='natural_language_embedding'),
                 'natural_language_instruction': ArraySpec(shape=(), dtype=dtype('O'), name='natural_language_instruction'),
                 'orientation_box': ArraySpec(shape=(2, 3), dtype=dtype('float32'), name='orientation_box'),
                 'orientation_start': ArraySpec(shape=(4,), dtype=dtype('float32'), name='orientation_in_camera_space'),
                 'robot_orientation_positions_box': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='robot_orientation_positions_box'),
                 'rotation_delta_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta_to_go'),
                 'src_rotation': ArraySpec(shape=(4,), dtype=dtype('float32'), name='transform_camera_robot'),
                 'vector_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='vector_to_go'),
                 'workspace_bounds': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='workspace_bounds')},
 'reward': ArraySpec(shape=(), dtype=dtype('float32'), name='reward'),
 'step_type': ArraySpec(shape=(), dtype=dtype('int32'), name='step_type')}

action:

{'base_displacement_vector': BoundedArraySpec(shape=(2,), dtype=dtype('float32'), name='base_displacement_vector', minimum=-1.0, maximum=1.0),
 'base_displacement_vertical_rotation': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='base_displacement_vertical_rotation', minimum=-3.1415927410125732, maximum=3.1415927410125732),
 'gripper_closedness_action': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_action', minimum=-1.0, maximum=1.0),
 'rotation_delta': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta', minimum=-1.5707963705062866, maximum=1.5707963705062866),
 'terminate_episode': BoundedArraySpec(shape=(3,), dtype=dtype('int32'), name='terminate_episode', minimum=0, maximum=1),
 'world_vector': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='world_vector', minimum=-1.0, maximum=1.0)}

TODO:

  • Render action, policy_state, observation specs in something prettier like pandas data frame.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

robo_transformers-0.1.12.tar.gz (4.1 MB view details)

Uploaded Source

Built Distribution

robo_transformers-0.1.12-py3-none-any.whl (4.1 MB view details)

Uploaded Python 3

File details

Details for the file robo_transformers-0.1.12.tar.gz.

File metadata

  • Download URL: robo_transformers-0.1.12.tar.gz
  • Upload date:
  • Size: 4.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.11.6 Darwin/23.2.0

File hashes

Hashes for robo_transformers-0.1.12.tar.gz
Algorithm Hash digest
SHA256 f6a0ab41958757e3a3161ca4acb50d5720384c00ab4c6d57ff909129b07af938
MD5 48a5daf79d13d99ad2010f29e0ea9ff3
BLAKE2b-256 91abab0c4e7565a159d632ea13c1eea55c4051483c7f37e89c2ebfb3f77ec6c2

See more details on using hashes here.

Provenance

File details

Details for the file robo_transformers-0.1.12-py3-none-any.whl.

File metadata

File hashes

Hashes for robo_transformers-0.1.12-py3-none-any.whl
Algorithm Hash digest
SHA256 12b585c114ab5f11bf723b858dc6402e54cf6753d2488f5811af03820eab8355
MD5 4c7540408a5797a57322a4dec2aa8660
BLAKE2b-256 b03c582ee544717beccdb8a633f15146bbdee1bffd364050d04a53d623a462ae

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page