
alpamayo-tools

Community tools for NVIDIA's Alpamayo-R1 and PhysicalAI-AV ecosystem.

Overview

This package provides:

  • PhysicalAIDataset — PyTorch Dataset that handles video decoding, egomotion interpolation, and coordinate transformation to ego-frame. Useful for training your own models on PhysicalAI-AV without writing the data loading boilerplate.

  • alpamayo-generate-labels — CLI for running Alpamayo-R1 inference at scale. Supports checkpointing, resume, and multi-GPU sharding. Useful for distillation workflows where you need teacher labels for thousands of clips.

  • CoCEmbedder — Sentence embedding for Chain-of-Cognition reasoning text. Useful for retrieval, clustering, or analyzing Alpamayo's reasoning outputs.

All trajectory data is automatically transformed to the ego vehicle's local frame at t0 (the coordinate system Alpamayo-R1 expects).
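For reference, the world-to-ego transform at t0 amounts to subtracting the ego position and rotating by the inverse of the ego orientation. A minimal sketch of that convention (the function name and row-vector layout are illustrative, not part of this package's API):

```python
import numpy as np

def world_to_ego(points_world, R0, t0_pos):
    """Express world-frame (N, 3) points in the ego frame at t0.

    R0 is the 3x3 world-from-ego rotation at t0; t0_pos is the ego
    position at t0. Column-vector form: p_ego = R0^T @ (p_world - t0_pos);
    written here for row-vector points.
    """
    return (points_world - t0_pos) @ R0
```

PhysicalAIDataset applies this transform for you; the sketch only makes the coordinate convention concrete.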

Installation

pip install alpamayo-tools

With optional dependencies:

pip install alpamayo-tools[embeddings]  # CoC embeddings
pip install alpamayo-tools[inference]   # Alpamayo inference wrapper
pip install alpamayo-tools[all]         # Everything

For inference, also install alpamayo_r1:

pip install git+https://github.com/NVlabs/alpamayo.git

Usage

DataLoader

from alpamayo_tools import PhysicalAIDataset, DatasetConfig, collate_fn
from torch.utils.data import DataLoader

config = DatasetConfig(
    clip_ids=["clip_001", "clip_002"],
    cameras=("camera_front_wide_120fov", "camera_front_tele_30fov"),
    num_frames=4,
)

dataset = PhysicalAIDataset(config)

# Recommended: Download all data upfront (MUCH faster than streaming)
dataset.download()  # Downloads by chunk, uses parallel workers

loader = DataLoader(dataset, batch_size=4, collate_fn=collate_fn)

for batch in loader:
    frames = batch["frames"]  # (B, N_cam, T, 3, H, W)
    history = batch["ego_history_xyz"]  # (B, 16, 3)
    future = batch["ego_future_xyz"]  # (B, 64, 3)

Why download first? The PhysicalAI-AV dataset is organized into chunks, each containing multiple clips. Downloading by chunk is much faster than streaming clip-by-clip (one HTTP request per chunk instead of one per clip). The download() method groups your clips by chunk and fetches them with parallel workers.
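The grouping step download() performs can be sketched in a few lines. Here chunk_of is a hypothetical stand-in for however clip IDs map to chunks; the real mapping is internal to the package:

```python
from collections import defaultdict

def group_by_chunk(clip_ids, chunk_of):
    """Group clip IDs by their containing chunk so each chunk is
    fetched once, rather than issuing one request per clip."""
    groups = defaultdict(list)
    for cid in clip_ids:
        groups[chunk_of(cid)].append(cid)
    return dict(groups)
```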

For quick testing with a few clips, you can still stream:

config = DatasetConfig(clip_ids=["clip_001"], stream=True)

Inference

from alpamayo_tools.inference import AlpamayoPredictor
import torch

predictor = AlpamayoPredictor.from_pretrained("nvidia/Alpamayo-R1-10B", dtype=torch.bfloat16)
result = predictor.predict_from_clip("clip_001", t0_us=5_100_000)

print(result.trajectory_xyz.shape)  # (64, 3)
print(result.reasoning_text)
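
Since the predicted trajectory is 64 waypoints spanning 6.4 s (0.1 s apart, matching the dataset's future horizon), per-step speed falls out of a finite difference. A small sketch (the helper name and dt default are illustrative):

```python
import numpy as np

def speeds_from_trajectory(traj_xyz, dt=0.1):
    """Finite-difference speed (m/s) between consecutive (N, 3) waypoints."""
    deltas = np.diff(traj_xyz, axis=0)  # (N-1, 3) displacement per step
    return np.linalg.norm(deltas, axis=1) / dt
```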

Generate Teacher Labels

alpamayo-generate-labels \
    --clip-ids-file train_clips.parquet \
    --output-dir ./labels

# Multi-GPU
CUDA_VISIBLE_DEVICES=0 alpamayo-generate-labels --clip-ids-file clips.parquet --output-dir ./labels --shard 0/4
CUDA_VISIBLE_DEVICES=1 alpamayo-generate-labels --clip-ids-file clips.parquet --output-dir ./labels --shard 1/4

# Resume after interruption
alpamayo-generate-labels --clip-ids-file clips.parquet --output-dir ./labels --resume

CoC Embeddings

from alpamayo_tools import CoCEmbedder

embedder = CoCEmbedder()
embeddings = embedder.embed(["The vehicle ahead is braking."])  # (1, 384)
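
With embeddings in hand, retrieval and clustering reduce to vector math. A minimal cosine-similarity sketch over the returned arrays (pure NumPy, independent of the embedder itself):

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T
```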

Dataset Output

Key              Shape                Description
frames           (N_cam, T, 3, H, W)  Camera frames (uint8)
ego_history_xyz  (16, 3)              Past 1.6 s trajectory in ego frame
ego_history_rot  (16, 3, 3)           Past rotations in ego frame
ego_future_xyz   (64, 3)              Future 6.4 s trajectory in ego frame
ego_future_rot   (64, 3, 3)           Future rotations in ego frame
clip_id          str                  Clip identifier (clip_ids when batched)
t0_us            int                  Reference timestamp (microseconds)

Requirements

  • Python 3.12+
  • PyTorch 2.0+
  • For inference: GPU with 24GB+ VRAM

License

MIT
