
Robotics Transformer Inference in TensorFlow. RT-1, RT-2, RT-X, PaLM-E.

Library for Robotic Transformers. RT-1 and RT-X-1.

Installation:

Requirements: Python >= 3.9

Recommended: Using PyPI

pip install robo-transformers
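
To confirm the install, a small check like the following may help. It is only a sketch: the helper names python_is_supported and installed_version are made up for illustration, and the distribution name is assumed to match the pip command above.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)  # from the stated requirement: Python >= 3.9


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(dist_name="robo-transformers"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("robo-transformers version:", installed_version())
```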

From Source

Clone this repo:

git clone https://github.com/sebbyjp/robo_transformers.git

cd robo_transformers

Use poetry

pip install poetry && poetry config virtualenvs.in-project true

Install dependencies

poetry install

Poetry has installed the dependencies in a virtualenv so we need to activate it.

source .venv/bin/activate

Run RT-1 Inference On Demo Images.

python -m robo_transformers.rt1.rt1_inference

See usage:

You can specify a custom checkpoint path, or pass the model_key of one of the three models mentioned in the RT-1 paper or of RT-X.

python -m robo_transformers.rt1.rt1_inference --help

Run Inference Server

The inference server takes care of all the internal state so all you need to specify is an instruction and image. You may also pass in a reward and termination signal. Batching is also supported.

from robo_transformers.inference_server import InferenceServer
import numpy as np

# Somewhere in your robot control stack code...

instruction = "pick block"
img = np.random.randint(0, 256, size=(256, 320, 3), dtype=np.uint8) # Height, Width, RGB (uint8, matching the image spec below)
inference = InferenceServer()

action = inference(instruction, img)
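
Because the server keeps the internal state, a control loop only needs to call it once per step. The sketch below shows one way such a loop might look. It assumes nothing about the library beyond the call shape shown above: run_episode, fake_camera, should_stop and dummy_policy are hypothetical names, and dummy_policy stands in for the server so the example runs without model weights.

```python
import numpy as np


def run_episode(policy, instruction, get_image, should_stop=None, max_steps=10):
    """Call `policy(instruction, image)` once per control step.

    `policy` is any callable with the same call shape as the
    InferenceServer example above. `should_stop` is an optional,
    caller-supplied predicate on the returned action.
    """
    actions = []
    for _ in range(max_steps):
        action = policy(instruction, get_image())
        actions.append(action)
        if should_stop is not None and should_stop(action):
            break
    return actions


def fake_camera():
    """Hypothetical camera: a random uint8 frame shaped like the image spec."""
    return np.random.randint(0, 256, size=(256, 320, 3), dtype=np.uint8)


def dummy_policy(instruction, image):
    """Stand-in for an InferenceServer instance; always returns a zero action."""
    return {
        "world_vector": np.zeros(3, dtype=np.float32),
        "gripper_closedness_action": np.zeros(1, dtype=np.float32),
    }


actions = run_episode(dummy_policy, "pick block", fake_camera, max_steps=4)
```

To drive a real model, pass an InferenceServer() instance as policy, assuming it is callable as in the example above.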

Data Types

action, next_policy_state = model.act(time_step, curr_policy_state)

policy_state is the internal state of the network:

In this case it is a 6-frame window of past observations and actions, plus the current index in time.

{'action_tokens': ArraySpec(shape=(6, 11, 1, 1), dtype=dtype('int32'), name='action_tokens'),
 'image': ArraySpec(shape=(6, 256, 320, 3), dtype=dtype('uint8'), name='image'),
 'step_num': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='step_num'),
 't': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='t')}
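
For intuition, the 6-frame window can be pictured as a buffer that drops its oldest entry each time a new frame arrives. The NumPy sketch below illustrates that idea using the shapes from the spec above. It is not the network's actual update rule: initial_policy_state and push_frame are hypothetical helpers, advancing t by one per step is an assumption, and step_num is copied through unchanged because the source does not say how it evolves.

```python
import numpy as np

WINDOW = 6  # frames kept in the policy state, per the spec above


def initial_policy_state():
    """Zero-filled state with the shapes and dtypes listed in the spec."""
    return {
        "action_tokens": np.zeros((WINDOW, 11, 1, 1), dtype=np.int32),
        "image": np.zeros((WINDOW, 256, 320, 3), dtype=np.uint8),
        "step_num": np.zeros((1, 1, 1, 1), dtype=np.int32),
        "t": np.zeros((1, 1, 1, 1), dtype=np.int32),
    }


def push_frame(state, image, action_tokens):
    """Return a new state with the oldest frame dropped and the newest appended."""
    new_state = {key: value.copy() for key, value in state.items()}
    new_state["image"] = np.concatenate([state["image"][1:], image[None]], axis=0)
    new_state["action_tokens"] = np.concatenate(
        [state["action_tokens"][1:], action_tokens[None]], axis=0
    )
    new_state["t"] = state["t"] + 1  # assumption: one tick per pushed frame
    return new_state
```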

time_step is the input from the environment:

{'discount': BoundedArraySpec(shape=(), dtype=dtype('float32'), name='discount', minimum=0.0, maximum=1.0),
 'observation': {'base_pose_tool_reached': ArraySpec(shape=(7,), dtype=dtype('float32'), name='base_pose_tool_reached'),
                 'gripper_closed': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closed'),
                 'gripper_closedness_commanded': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_commanded'),
                 'height_to_bottom': ArraySpec(shape=(1,), dtype=dtype('float32'), name='height_to_bottom'),
                 'image': ArraySpec(shape=(256, 320, 3), dtype=dtype('uint8'), name='image'),
                 'natural_language_embedding': ArraySpec(shape=(512,), dtype=dtype('float32'), name='natural_language_embedding'),
                 'natural_language_instruction': ArraySpec(shape=(), dtype=dtype('O'), name='natural_language_instruction'),
                 'orientation_box': ArraySpec(shape=(2, 3), dtype=dtype('float32'), name='orientation_box'),
                 'orientation_start': ArraySpec(shape=(4,), dtype=dtype('float32'), name='orientation_in_camera_space'),
                 'robot_orientation_positions_box': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='robot_orientation_positions_box'),
                 'rotation_delta_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta_to_go'),
                 'src_rotation': ArraySpec(shape=(4,), dtype=dtype('float32'), name='transform_camera_robot'),
                 'vector_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='vector_to_go'),
                 'workspace_bounds': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='workspace_bounds')},
 'reward': ArraySpec(shape=(), dtype=dtype('float32'), name='reward'),
 'step_type': ArraySpec(shape=(), dtype=dtype('int32'), name='step_type')}
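
When experimenting, it can be handy to build a placeholder time_step with the right shapes. The sketch below does that with plain NumPy arrays, copying shapes and dtypes from the spec above. The helper zero_time_step is hypothetical, and treating step_type 0 as the first step of an episode is an assumption based on the TF-Agents convention.

```python
import numpy as np

# (shape, dtype) pairs copied from the observation spec above.
OBSERVATION_SPEC = {
    "base_pose_tool_reached": ((7,), np.float32),
    "gripper_closed": ((1,), np.float32),
    "gripper_closedness_commanded": ((1,), np.float32),
    "height_to_bottom": ((1,), np.float32),
    "image": ((256, 320, 3), np.uint8),
    "natural_language_embedding": ((512,), np.float32),
    "orientation_box": ((2, 3), np.float32),
    "orientation_start": ((4,), np.float32),
    "robot_orientation_positions_box": ((3, 3), np.float32),
    "rotation_delta_to_go": ((3,), np.float32),
    "src_rotation": ((4,), np.float32),
    "vector_to_go": ((3,), np.float32),
    "workspace_bounds": ((3, 3), np.float32),
}


def zero_time_step(instruction="", image=None):
    """Build a zero-filled time_step dict, optionally with a real image."""
    obs = {
        name: np.zeros(shape, dtype=dtype)
        for name, (shape, dtype) in OBSERVATION_SPEC.items()
    }
    # The instruction is a scalar (shape ()) object array holding a string.
    obs["natural_language_instruction"] = np.array(instruction, dtype=object)
    if image is not None:
        image = np.asarray(image)
        if image.shape != (256, 320, 3) or image.dtype != np.uint8:
            raise ValueError("image must be uint8 with shape (256, 320, 3)")
        obs["image"] = image
    return {
        "discount": np.float32(1.0),
        "observation": obs,
        "reward": np.float32(0.0),
        "step_type": np.int32(0),  # assumption: 0 marks the first step
    }
```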

action is the output of the policy:

{'base_displacement_vector': BoundedArraySpec(shape=(2,), dtype=dtype('float32'), name='base_displacement_vector', minimum=-1.0, maximum=1.0),
 'base_displacement_vertical_rotation': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='base_displacement_vertical_rotation', minimum=-3.1415927410125732, maximum=3.1415927410125732),
 'gripper_closedness_action': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_action', minimum=-1.0, maximum=1.0),
 'rotation_delta': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta', minimum=-1.5707963705062866, maximum=1.5707963705062866),
 'terminate_episode': BoundedArraySpec(shape=(3,), dtype=dtype('int32'), name='terminate_episode', minimum=0, maximum=1),
 'world_vector': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='world_vector', minimum=-1.0, maximum=1.0)}
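
Since every action field is bounded, a controller may want to clamp values before sending them to hardware. The following is a minimal sketch of that, with bounds copied from the spec above. ACTION_BOUNDS and clip_action are hypothetical names and are not part of the library.

```python
import numpy as np

# (minimum, maximum) pairs copied from the action spec above.
ACTION_BOUNDS = {
    "base_displacement_vector": (-1.0, 1.0),
    "base_displacement_vertical_rotation": (-3.1415927410125732, 3.1415927410125732),
    "gripper_closedness_action": (-1.0, 1.0),
    "rotation_delta": (-1.5707963705062866, 1.5707963705062866),
    "terminate_episode": (0, 1),
    "world_vector": (-1.0, 1.0),
}


def clip_action(action):
    """Return a copy of `action` with every field clamped to its spec bounds."""
    return {
        name: np.clip(value, *ACTION_BOUNDS[name])
        for name, value in action.items()
    }


raw = {
    "world_vector": np.array([2.0, -3.0, 0.5], dtype=np.float32),
    "terminate_episode": np.array([0, 5, 1], dtype=np.int32),
}
safe = clip_action(raw)
```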

TODO:

  • Render the action, policy_state, and observation specs in something prettier, like a pandas DataFrame.
