Robotics Transformer Inference in TensorFlow. RT-1, RT-2, RT-X, PaLM-E.

Library for Robotics Transformers. Supports RT-1 and RT-1-X.

Installation:

Requirements: Python >= 3.9

Recommended: Using PyPI

pip install robo-transformers

From Source

Clone this repo:

git clone https://github.com/sebbyjp/robo_transformers.git

cd robo_transformers

Install Poetry and configure it to create the virtualenv inside the project

pip install poetry && poetry config virtualenvs.in-project true

Install dependencies

poetry install

Poetry installs the dependencies into a virtualenv (.venv inside the project, given the config above), so activate it before running anything.

source .venv/bin/activate

Run RT-1 Inference On Demo Images.

python -m robo_transformers.rt1.rt1_inference

See usage:

You can specify a custom checkpoint path, or a model key selecting one of the three models mentioned in the RT-1 paper or RT-X. The available options are listed by --help.

python -m robo_transformers.rt1.rt1_inference --help

Run Inference Server

The inference server manages all internal state, so you only need to pass an instruction and an image. You can also pass a reward and a termination signal. Batching is supported.

from robo_transformers.inference_server import InferenceServer
import numpy as np

# Somewhere in your robot control stack code...

instruction = "pick block"
img = np.random.randn(256, 320, 3)  # Height, Width, RGB (matches the image spec below)
inference = InferenceServer()

action = inference(instruction, img)
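Because the server keeps its own state, a control loop only has to call it once per step with the current frame. The following is a minimal sketch of such a loop, not code from this library: `run_episode`, `fake_camera`, and `stub_policy` are hypothetical names, and the stub stands in for `InferenceServer()` so the example runs without model weights. It assumes only the call shape shown above, `inference(instruction, img)`.

```python
import numpy as np

def run_episode(policy, instruction, get_frame, max_steps=5):
    """Call policy(instruction, image) once per control step and collect the actions."""
    actions = []
    for _ in range(max_steps):
        image = get_frame()
        actions.append(policy(instruction, image))
    return actions

def fake_camera():
    # Hypothetical camera: a random uint8 RGB frame shaped (256, 320, 3).
    return np.random.randint(0, 256, size=(256, 320, 3), dtype=np.uint8)

def stub_policy(instruction, image):
    # Stand-in for InferenceServer(); returns a zero action so the sketch is runnable.
    return {"world_vector": np.zeros(3, dtype=np.float32)}

actions = run_episode(stub_policy, "pick block", fake_camera, max_steps=3)
print(len(actions))
```

In a real stack you would presumably pass an `InferenceServer()` instance as `policy` and your camera driver as `get_frame`.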

Data Types

action, next_policy_state = model.act(time_step, curr_policy_state)

policy_state is the internal state of the network:

In this case it is a 6-frame window of past observations and actions, plus the current time index.

{'action_tokens': ArraySpec(shape=(6, 11, 1, 1), dtype=dtype('int32'), name='action_tokens'),
 'image': ArraySpec(shape=(6, 256, 320, 3), dtype=dtype('uint8'), name='image'),
 'step_num': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='step_num'),
 't': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='t')}
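To make the 6-frame window concrete, here is a rough sketch that builds a zero-filled state with the shapes and dtypes listed above and slides one new frame into it. The helper names `initial_policy_state` and `push_frame` are hypothetical, and the update rule is only an illustration of a fixed-length window; how the network actually updates its state is internal to the model and is not shown in this README.

```python
import numpy as np

WINDOW = 6  # number of frames kept, per the spec above

def initial_policy_state():
    """Zero-filled arrays with the shapes and dtypes listed in the spec."""
    return {
        "action_tokens": np.zeros((WINDOW, 11, 1, 1), dtype=np.int32),
        "image": np.zeros((WINDOW, 256, 320, 3), dtype=np.uint8),
        "step_num": np.zeros((1, 1, 1, 1), dtype=np.int32),
        "t": np.zeros((1, 1, 1, 1), dtype=np.int32),
    }

def push_frame(state, frame):
    """Illustrative update: drop the oldest frame, append the new one, advance t."""
    images = np.concatenate([state["image"][1:], frame[np.newaxis]], axis=0)
    return {**state, "image": images, "t": state["t"] + 1}

state = initial_policy_state()
frame = np.full((256, 320, 3), 255, dtype=np.uint8)
state = push_frame(state, frame)
print(state["image"].shape, state["t"].item())
```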

time_step is the input from the environment:

{'discount': BoundedArraySpec(shape=(), dtype=dtype('float32'), name='discount', minimum=0.0, maximum=1.0),
 'observation': {'base_pose_tool_reached': ArraySpec(shape=(7,), dtype=dtype('float32'), name='base_pose_tool_reached'),
                 'gripper_closed': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closed'),
                 'gripper_closedness_commanded': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_commanded'),
                 'height_to_bottom': ArraySpec(shape=(1,), dtype=dtype('float32'), name='height_to_bottom'),
                 'image': ArraySpec(shape=(256, 320, 3), dtype=dtype('uint8'), name='image'),
                 'natural_language_embedding': ArraySpec(shape=(512,), dtype=dtype('float32'), name='natural_language_embedding'),
                 'natural_language_instruction': ArraySpec(shape=(), dtype=dtype('O'), name='natural_language_instruction'),
                 'orientation_box': ArraySpec(shape=(2, 3), dtype=dtype('float32'), name='orientation_box'),
                 'orientation_start': ArraySpec(shape=(4,), dtype=dtype('float32'), name='orientation_in_camera_space'),
                 'robot_orientation_positions_box': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='robot_orientation_positions_box'),
                 'rotation_delta_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta_to_go'),
                 'src_rotation': ArraySpec(shape=(4,), dtype=dtype('float32'), name='transform_camera_robot'),
                 'vector_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='vector_to_go'),
                 'workspace_bounds': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='workspace_bounds')},
 'reward': ArraySpec(shape=(), dtype=dtype('float32'), name='reward'),
 'step_type': ArraySpec(shape=(), dtype=dtype('int32'), name='step_type')}
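When assembling observations by hand, shape and dtype mismatches (for example a transposed image) are easy to introduce. The sketch below checks an observation dict against a subset of the spec above using plain NumPy. `OBSERVATION_SPEC` and `check_observation` are hypothetical names for illustration, not part of this library.

```python
import numpy as np

# Subset of the observation spec above: name -> (shape, dtype)
OBSERVATION_SPEC = {
    "image": ((256, 320, 3), np.uint8),
    "natural_language_embedding": ((512,), np.float32),
    "gripper_closed": ((1,), np.float32),
    "base_pose_tool_reached": ((7,), np.float32),
}

def check_observation(obs, spec=OBSERVATION_SPEC):
    """Return a list of readable mismatches; an empty list means obs fits the spec."""
    problems = []
    for name, (shape, dtype) in spec.items():
        if name not in obs:
            problems.append(f"{name}: missing")
            continue
        value = np.asarray(obs[name])
        if value.shape != shape:
            problems.append(f"{name}: shape {value.shape}, expected {shape}")
        if value.dtype != np.dtype(dtype):
            problems.append(f"{name}: dtype {value.dtype}, expected {np.dtype(dtype)}")
    return problems

good = {name: np.zeros(shape, dtype=dtype) for name, (shape, dtype) in OBSERVATION_SPEC.items()}
bad = dict(good, image=np.zeros((320, 256, 3), dtype=np.float64))  # transposed, wrong dtype
print(check_observation(good))
print(check_observation(bad))
```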

action:

{'base_displacement_vector': BoundedArraySpec(shape=(2,), dtype=dtype('float32'), name='base_displacement_vector', minimum=-1.0, maximum=1.0),
 'base_displacement_vertical_rotation': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='base_displacement_vertical_rotation', minimum=-3.1415927410125732, maximum=3.1415927410125732),
 'gripper_closedness_action': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_action', minimum=-1.0, maximum=1.0),
 'rotation_delta': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta', minimum=-1.5707963705062866, maximum=1.5707963705062866),
 'terminate_episode': BoundedArraySpec(shape=(3,), dtype=dtype('int32'), name='terminate_episode', minimum=0, maximum=1),
 'world_vector': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='world_vector', minimum=-1.0, maximum=1.0)}
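The bounds in the action spec can be used as a safety clamp before sending commands to hardware. The following is a sketch under that assumption: `ACTION_BOUNDS` and `clip_action` are hypothetical names, the bounds are copied from the spec above (its rotation limits are the float32 roundings of pi/2 and pi), and fields without float bounds listed here, such as terminate_episode, are passed through unchanged.

```python
import numpy as np

# Bounds copied from the action spec above: name -> (minimum, maximum)
ACTION_BOUNDS = {
    "world_vector": (-1.0, 1.0),
    "rotation_delta": (-np.pi / 2, np.pi / 2),
    "gripper_closedness_action": (-1.0, 1.0),
    "base_displacement_vector": (-1.0, 1.0),
    "base_displacement_vertical_rotation": (-np.pi, np.pi),
}

def clip_action(action):
    """Clamp each bounded field to its range; leave other fields untouched."""
    clipped = {}
    for name, value in action.items():
        if name in ACTION_BOUNDS:
            low, high = ACTION_BOUNDS[name]
            clipped[name] = np.clip(np.asarray(value, dtype=np.float32), low, high)
        else:
            clipped[name] = value
    return clipped

raw = {"world_vector": [2.0, -3.0, 0.5], "terminate_episode": [0, 1, 0]}
safe = clip_action(raw)
print(safe["world_vector"])
```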

TODO:

  • Render action, policy_state, observation specs in something prettier like pandas data frame.
