A robot teleoperation server for XR headsets

XR Robot Teleop Server

A high-performance Python server that streams 360° panoramic video to XR headsets (such as Quest 3 or Vision Pro) and receives full-body tracking data from them for immersive robot teleoperation. It uses WebRTC for low-latency communication and has a modular architecture for easy customization.

Key Features

  • Low-Latency Streaming: Built on aiortc and FastAPI for efficient WebRTC communication.
  • 360° Video Reprojection: Dynamically transforms 360° equirectangular video into a perspective view based on the user's head orientation, providing an immersive FPV experience.
  • Full Body Tracking: Receives, deserializes, and processes full-body, upper-body, and hand skeleton data from XR clients (e.g., a Unity app).
  • Modular & Extensible: Easily replace video sources (e.g., from a file, a live camera, or a GoPro stream) and video transformations with your own custom classes.
  • Hardware Acceleration: Utilizes FFmpeg's hardware acceleration capabilities (videotoolbox, d3d11va, vaapi, etc.) for efficient video decoding, minimizing CPU load.
  • Coordinate System Handling: Includes utilities to convert coordinate systems between different platforms (e.g., Unity's left-handed Y-up to a standard right-handed Z-up).
  • Rich Data Schemas: Provides clear Python enumerations for OpenXR and OVR skeleton bone IDs, simplifying data interpretation.
  • Integrated Visualization: Comes with an example to visualize the received body pose data in 3D using the Rerun SDK.
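As an illustration of the coordinate-system handling listed above, the sketch below maps a position from Unity's left-handed, Y-up frame (X right, Y up, Z forward) to a right-handed, Z-up frame. The target axis convention (X forward, Y left, Z up, as is common in robotics) and the function name `unity_to_right_handed_z_up` are assumptions made for this example; the package's own utilities may use a different target frame or API.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def unity_to_right_handed_z_up(p: Vec3) -> Vec3:
    """Map a Unity position (X right, Y up, Z forward; left-handed)
    to an assumed right-handed frame (X forward, Y left, Z up)."""
    x_u, y_u, z_u = p
    # forward <- Unity Z, left <- minus Unity X, up <- Unity Y
    return (z_u, -x_u, y_u)


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Standard cross product, used here only to check handedness."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


if __name__ == "__main__":
    forward = unity_to_right_handed_z_up((0.0, 0.0, 1.0))
    left = unity_to_right_handed_z_up((-1.0, 0.0, 0.0))
    up = unity_to_right_handed_z_up((0.0, 1.0, 0.0))
    # In a right-handed frame, forward x left must equal up.
    print(cross(forward, left) == up)
```

Positions convert with a simple axis permutation and one sign flip. Rotations (quaternions or Euler angles) need a matching conversion of their own, which this sketch does not cover.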

System Architecture

The server acts as a bridge between an XR client (e.g., a VR headset running a Unity application) and a robot control system. The client sends head pose and body tracking data to the server, which in turn processes it and sends back a reprojected video stream. The processed body pose can then be used to command a robot or a simulation.

graph TD
    subgraph XRClient["XR Client (e.g., Unity on Headset)"]
        A[User Input: Head Pose + Body Tracking] --> B{WebRTC Connection};
        B --> C["Send Data Channels (body_pose, camera)"];
        D[Receive Video Stream] --> E[Display to User];
        B --> D;
    end

    subgraph Server["Python Server (xr-robot-teleop-server)"]
        F{WebRTCServer};
        G[Data Channel Handlers];
        H[Video Pipeline];

        F <--> G;
        F <--> H;

        subgraph G [Data Channel Handlers]
            G1["'body_pose' handler"];
            G2["'camera' handler"];
            G1 --> G3[Deserialize Pose Data];
            G2 --> G4[Update Head Pose State];
        end

        subgraph H [Video Pipeline]
            H1[VideoSource: 360° Video];
            H2[VideoTransform: Reprojection];
            H3[Reprojected Video Track];
            H1 --> H2;
            H2 --> H3;
        end

        G4 --> H2;
    end

    subgraph Robot["Robot Control System (Application Layer)"]
        I["Robot/Simulation (e.g. MuJoCo)"];
        G3 --> I;
    end

    C --> F;
    H3 --> D;

    style XRClient fill:#d4fcd7,stroke:#333,stroke-width:2px
    style Server fill:#cde4f7,stroke:#333,stroke-width:2px
    style Robot fill:#f7e8cd,stroke:#333,stroke-width:2px
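
To make the "VideoTransform: Reprojection" stage of the diagram more concrete, here is a minimal pure-Python sketch of the geometry behind equirectangular-to-perspective reprojection. For one output pixel it builds a view ray, rotates it by the head orientation, and converts the result to a longitude/latitude lookup in the 360° frame. This is not the package's or equilib's implementation (those work on whole arrays); the function name, the axis conventions (camera looks along +Z, X right, Y down), the pitch-then-yaw rotation order, and the omission of roll are all assumptions for illustration.

```python
import math
from typing import Tuple


def perspective_pixel_to_equirect(
    u: float,
    v: float,
    out_w: int,
    out_h: int,
    fov_x_deg: float,
    yaw: float,
    pitch: float,
    equi_w: int,
    equi_h: int,
) -> Tuple[float, float]:
    """Return the (x, y) sample position in the equirectangular image
    for output pixel (u, v). yaw and pitch are in radians."""
    # Focal length in pixels, derived from the horizontal field of view.
    f = (out_w / 2.0) / math.tan(math.radians(fov_x_deg) / 2.0)

    # View ray through the pixel in the camera frame (X right, Y down, Z forward).
    x, y, z = u - out_w / 2.0, v - out_h / 2.0, f

    # Pitch: rotate about the X axis (positive pitch tilts the view up).
    cp, sp = math.cos(pitch), math.sin(pitch)
    y, z = y * cp - z * sp, y * sp + z * cp

    # Yaw: rotate about the vertical axis (positive yaw turns the view right).
    cy, sy = math.cos(yaw), math.sin(yaw)
    x, z = x * cy + z * sy, -x * sy + z * cy

    # Ray direction as longitude (0 at image centre) and latitude (up positive).
    lon = math.atan2(x, z)
    lat = math.atan2(-y, math.hypot(x, z))

    # Map longitude [-pi, pi] and latitude [-pi/2, pi/2] to pixel coordinates.
    ex = (lon / (2.0 * math.pi) + 0.5) * equi_w
    ey = (0.5 - lat / math.pi) * equi_h
    return ex, ey


if __name__ == "__main__":
    # Centre pixel with no rotation samples the centre of the 360° frame.
    print(perspective_pixel_to_equirect(640, 360, 1280, 720, 90.0, 0.0, 0.0, 3840, 1920))
```

A real transform evaluates this mapping for every output pixel at once, wraps the horizontal coordinate at the image seam, and interpolates between source pixels.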

Getting Started

Installation

pip install xr-robot-teleop-server

Development

  1. Clone the repository:

    git clone https://github.com/your-username/xr-robot-teleop-server.git
    cd xr-robot-teleop-server
    
  2. Install the package in editable mode for basic usage:

    pip install -e .
    

    To include dependencies for the visualization example (rerun), install with the [viz] extra:

    pip install -e ".[viz]"
    

Running the Examples

This project includes examples to demonstrate its capabilities.

Record and Visualize Full Body Pose

This example starts a WebRTC server that listens for full-body tracking data from a client, deserializes it, and visualizes the skeleton in 3D using Rerun.

  1. Run the server:

    python examples/record_full_body_pose.py --visualize
    

    The server will start and print: INFO Starting xr-robot-teleop-server...

  2. Connect your client: Connect your XR client to the server at http://<your-server-ip>:8080. Once the WebRTC connection is established, the server will start receiving data.

  3. Visualize: A Rerun viewer window will spawn, displaying the received skeleton poses in real time.

Usage

The core of the server is the WebRTCServer class. You can instantiate it with factories for creating video tracks and handlers for data channels.

import json

from xr_robot_teleop_server.streaming import WebRTCServer
from xr_robot_teleop_server.sources import FFmpegFileSource
from xr_robot_teleop_server.transforms import EquilibEqui2Pers
from aiortc import MediaStreamTrack
import numpy as np

# 1. Define a shared state object (optional)
# This object is shared between the video track and data channel handlers
# for a single peer connection.
class AppState:
    def __init__(self):
        self.pitch = 0.0
        self.yaw = 0.0
        self.roll = 0.0

# 2. Define a handler for an incoming data channel
def on_camera_message(message: str, state: AppState):
    # Parse message and update the shared state
    data = json.loads(message)
    state.pitch = np.deg2rad(float(data.get("pitch", 0.0)))
    state.yaw = np.deg2rad(float(data.get("yaw", 0.0)))

# 3. Define a factory for the video track
# This factory creates the video source, the transform, and the track itself.
class ReprojectionTrack(MediaStreamTrack):
    kind = "video"

    def __init__(self, state: AppState):
        super().__init__()
        self.state = state
        self.source = FFmpegFileSource("path/to/your/360_video.mp4")
        self.transform = EquilibEqui2Pers(output_width=1280, output_height=720, fov_x=90.0)

    async def recv(self):
        equi_frame_rgb = next(self.source)
        rot = {"pitch": self.state.pitch, "yaw": self.state.yaw, "roll": self.state.roll}
        perspective_frame = self.transform.transform(frame=equi_frame_rgb, rot=rot)
        # ... create and return av.VideoFrame ...

# 4. Configure and run the server
if __name__ == "__main__":
    data_handlers = {
        "camera": on_camera_message,
        # Add other handlers for "body_pose", "left_hand", etc.
    }

    server = WebRTCServer(
        state_factory=AppState,
        video_track_factory=ReprojectionTrack,
        datachannel_handlers=data_handlers,
    )

    server.run()
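
The `camera` handler in the example above trusts whatever the client sends. A more defensive variant is sketched below: it ignores malformed or non-finite input and also reads `roll`. It keeps the `(message, state)` handler signature from the example and assumes the client sends a JSON object with angles in degrees. The `roll` key and the boolean return value are hypothetical additions, and the actual wire format is whatever your client defines.

```python
import json
import math
from dataclasses import dataclass


@dataclass
class AppState:
    """Per-connection head orientation, stored in radians."""
    pitch: float = 0.0
    yaw: float = 0.0
    roll: float = 0.0


def on_camera_message(message: str, state: AppState) -> bool:
    """Update `state` from a JSON message; return False if it was rejected.

    Unlike the basic example, keys missing from the message keep their
    previous value instead of being reset to zero.
    """
    try:
        data = json.loads(message)
        if not isinstance(data, dict):
            return False
        updates = {
            key: math.radians(float(data[key]))
            for key in ("pitch", "yaw", "roll")
            if key in data
        }
    except (ValueError, TypeError):
        # Covers invalid JSON and values that cannot be converted to float.
        return False

    # Reject NaN or infinite angles before touching the shared state.
    if not all(math.isfinite(angle) for angle in updates.values()):
        return False

    for key, angle in updates.items():
        setattr(state, key, angle)
    return True


if __name__ == "__main__":
    state = AppState()
    on_camera_message('{"pitch": 10.0, "yaw": -45.0}', state)
    print(state)
```

Validating before assignment means a bad message never leaves the state half-updated, which matters because the video track reads the same object on every frame.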

Project Structure

.
├── examples/                 # Example scripts showing how to use the server
├── src/
│   └── xr_robot_teleop_server/
│       ├── schemas/          # Data schemas for skeletons (OpenXR) and poses
│       ├── sources/          # Pluggable video sources (FFmpeg, OpenCV)
│       ├── streaming/        # Core WebRTC server logic
│       └── transforms/       # Pluggable video transformations (e.g., reprojection)
├── tests/                    # Unit and integration tests
└── pyproject.toml            # Project metadata and dependencies

Dependencies

  • Core: aiortc, fastapi, uvicorn, numpy, loguru
  • Transforms: equilib
  • Video Sources: opencv-python (optional), ffmpeg (must be on the system PATH for FFmpegFileSource)
  • Visualization: rerun-sdk

See pyproject.toml for detailed dependency information.
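
Because FFmpegFileSource needs the `ffmpeg` binary on the PATH, a short preflight check can save debugging time. The sketch below is not part of the package and its helper names are hypothetical. It locates the binary with the standard library and lists the hardware acceleration methods the local build reports through `ffmpeg -hwaccels`.

```python
import shutil
import subprocess
from typing import List, Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path to ffmpeg, or None if it is not on the PATH."""
    return shutil.which("ffmpeg")


def parse_hwaccels(output: str) -> List[str]:
    """Extract method names from the text printed by `ffmpeg -hwaccels`."""
    names: List[str] = []
    seen_header = False
    for line in output.splitlines():
        line = line.strip()
        if line.lower().startswith("hardware acceleration methods"):
            seen_header = True
            continue
        if seen_header and line:
            names.append(line)
    return names


def list_hwaccels() -> List[str]:
    """Ask the local ffmpeg build which hardware decoders it supports."""
    exe = find_ffmpeg()
    if exe is None:
        return []
    result = subprocess.run(
        [exe, "-hide_banner", "-hwaccels"],
        capture_output=True,
        text=True,
        check=False,
        timeout=10,
    )
    return parse_hwaccels(result.stdout)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg not found on PATH; FFmpegFileSource will not work.")
    else:
        print(f"ffmpeg found at {path}")
        print("hardware acceleration methods:", list_hwaccels())
```

An empty list means either that ffmpeg is missing or that the build has no hardware decoders, in which case decoding falls back to the CPU.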

License

This project is licensed under the terms of the LICENSE file.
