PyTorch-based room impulse response (RIR) simulation toolkit for static and dynamic scenes.

TorchRIR

PyTorch-based room impulse response (RIR) simulation toolkit focused on a clean, modern API with GPU support. Development of this project was substantially assisted by AI (Codex).

License

Apache-2.0. See LICENSE and NOTICE.

Installation

pip install torchrir

Current Capabilities

  • ISM-based static and dynamic RIR simulation (2D/3D shoebox rooms).
  • Directivity patterns: omni, cardioid, hypercardioid, subcardioid, bidir with orientation handling.
  • Acoustic parameters: beta or t60 (Sabine), optional diffuse tail via tdiff.
  • Dynamic convolution via DynamicConvolver (trajectory or hop modes).
  • GPU acceleration for ISM accumulation (CUDA/MPS; MPS disables LUT).
  • Dataset utilities with CMU ARCTIC support and example pipelines.
  • Plotting utilities for static and dynamic scenes.
  • Metadata export helpers for time axis, DOA, and array attributes (JSON-ready).
  • Unified CLI with JSON/YAML config and deterministic flag support.
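
The t60 option above refers to Sabine's formula. As a rough illustration of that relationship, the sketch below converts a target T60 into one uniform wall reflection coefficient for a 3D shoebox room. It is not torchrir's internal code: the helper name sabine_beta is hypothetical, and it assumes all walls share the same absorption coefficient, which the library may handle differently.

```python
import math


def sabine_beta(size, t60, c=343.0):
    """Uniform wall reflection coefficient from T60 via Sabine's formula.

    Hypothetical helper for illustration only. Assumes every wall has the
    same absorption coefficient alpha and that beta = sqrt(1 - alpha).
    """
    lx, ly, lz = size
    volume = lx * ly * lz
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)
    # Sabine: T60 = (24 ln 10 / c) * V / (S * alpha), solved for alpha.
    alpha = 24.0 * math.log(10.0) / c * volume / (surface * t60)
    if not 0.0 < alpha < 1.0:
        raise ValueError(f"t60={t60} s is not reachable for this room (alpha={alpha:.3f})")
    return math.sqrt(1.0 - alpha)


beta = sabine_beta([6.0, 4.0, 3.0], t60=0.5)
```

Very short T60 values in a large room would need an absorption coefficient above 1, so the sketch rejects them rather than returning a meaningless value.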

Example Usage

# CMU ARCTIC + static RIR (fixed sources/mics)
uv run python examples/static.py --plot

# Dynamic RIR demos
uv run python examples/dynamic_mic.py --plot
uv run python examples/dynamic_src.py --plot
uv run python examples/dynamic_mic.py --gif
uv run python examples/dynamic_src.py --gif

# Unified CLI
uv run python examples/cli.py --mode static --plot
uv run python examples/cli.py --mode dynamic_mic --plot
uv run python examples/cli.py --mode dynamic_src --plot
uv run python examples/cli.py --mode dynamic_mic --gif
uv run python examples/dynamic_mic.py --gif --gif-fps 12

# Config + deterministic
uv run python examples/cli.py --mode static --deterministic --seed 123 --config-out outputs/cli.json
uv run python examples/cli.py --config-in outputs/cli.json

GIF FPS is auto-derived from signal duration and RIR steps unless overridden with --gif-fps. For 3D rooms, an additional *_3d.gif is saved. YAML configs are supported when PyYAML is installed.

# YAML config
uv run python examples/cli.py --mode static --config-out outputs/cli.yaml
uv run python examples/cli.py --config-in outputs/cli.yaml

examples/cli_example.yaml provides a ready-to-use template. Examples also save *_metadata.json alongside audio outputs.
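
For readers who want to load the same config files in their own scripts, the following is a minimal sketch of a loader that picks JSON or YAML by file extension and only imports PyYAML when it is needed. The function name load_config is hypothetical and is not part of torchrir; the actual config keys written by the CLI are not shown in this README.

```python
import json
from pathlib import Path


def load_config(path):
    """Load a JSON or YAML config file into a dict (illustrative helper)."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix.lower() in {".yaml", ".yml"}:
        try:
            import yaml  # provided by the optional PyYAML package
        except ModuleNotFoundError as exc:
            raise ModuleNotFoundError(
                "YAML configs require PyYAML (pip install pyyaml)"
            ) from exc
        return yaml.safe_load(text)
    # Anything else is treated as JSON.
    return json.loads(text)
```

Importing yaml lazily keeps JSON-only workflows free of the extra dependency.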

from torchrir import MicrophoneArray, Room, Source, simulate_rir

room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
sources = Source.positions([[1.0, 2.0, 1.5]])
mics = MicrophoneArray.positions([[2.0, 2.0, 1.5]])

rir = simulate_rir(
    room=room,
    sources=sources,
    mics=mics,
    max_order=6,
    tmax=0.3,
    device="auto",
)

from torchrir import DynamicConvolver

# `signal` is the source audio; `rirs` is the output of simulate_dynamic_rir.

# Trajectory-mode dynamic convolution
y = DynamicConvolver(mode="trajectory").convolve(signal, rirs)

# Hop-mode dynamic convolution
y = DynamicConvolver(mode="hop", hop=1024).convolve(signal, rirs)

Dynamic convolution is exposed via DynamicConvolver only (no legacy function wrappers).
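
The README does not describe the algorithm behind each mode. As a conceptual aid only, here is one plausible reading of hop-style processing in pure Python: the signal is cut into blocks of hop samples, each block is filtered with the RIR for its time step, and the results are overlap-added. The function names are hypothetical, and DynamicConvolver may differ in details such as interpolation between RIRs or input shapes.

```python
def convolve(x, h):
    """Direct-form linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def hop_convolve(signal, rirs, hop):
    """Filter each hop-sized block with its own RIR and overlap-add the tails."""
    rir_len = len(rirs[0])
    out = [0.0] * (len(signal) + rir_len - 1)
    for start in range(0, len(signal), hop):
        # Reuse the last RIR if the signal outlasts the RIR sequence.
        k = min(start // hop, len(rirs) - 1)
        block = signal[start:start + hop]
        for n, value in enumerate(convolve(block, rirs[k])):
            out[start + n] += value
    return out


static = hop_convolve([1.0, 2.0, 3.0, 4.0, 5.0], [[1.0, 0.5]] * 3, hop=2)
moving = hop_convolve([1.0, 1.0, 1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], hop=2)
```

When every RIR is identical, the result matches ordinary convolution, which is a convenient sanity check for any dynamic convolver.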

Limitations and Potential Errors

  • Ray tracing and FDTD simulators are placeholders and raise NotImplementedError.
  • TemplateDataset methods are not implemented and will raise NotImplementedError.
  • simulate_rir/simulate_dynamic_rir require max_order (or SimulationConfig.max_order) and either nsample or tmax.
  • Non-omni directivity requires orientation; mismatched shapes raise ValueError.
  • beta must have 4 (2D) or 6 (3D) elements; invalid sizes raise ValueError.
  • simulate_dynamic_rir requires src_traj and mic_traj to have matching time steps.
  • Dynamic simulation currently loops per time step; very long trajectories can be slow.
  • MPS disables the sinc LUT path (falls back to direct sinc), which can be slower and slightly different numerically.
  • Deterministic mode is best-effort; some backends may still be non-deterministic.
  • YAML configs require PyYAML; otherwise a ModuleNotFoundError is raised.
  • CMU ARCTIC downloads require network access.
  • GIF animation output requires Pillow (via matplotlib animation writer).
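
Several of the constraints above can be checked before calling the simulator. The sketch below mirrors two of them using plain Python sequences. The helper names are hypothetical, and raising ValueError for mismatched trajectories is an assumption, since the README only states that matching time steps are required.

```python
def check_beta(beta, dim):
    """Require 4 reflection coefficients for 2D rooms and 6 for 3D rooms."""
    expected = {2: 4, 3: 6}[dim]
    if len(beta) != expected:
        raise ValueError(f"beta needs {expected} elements for a {dim}D room, got {len(beta)}")


def check_trajectories(src_traj, mic_traj):
    """Require source and microphone trajectories to share the same number of time steps."""
    if len(src_traj) != len(mic_traj):
        raise ValueError(
            f"src_traj has {len(src_traj)} steps but mic_traj has {len(mic_traj)}"
        )


check_beta([0.9] * 6, dim=3)  # passes silently
check_trajectories([[[1.0, 2.0, 1.5]]] * 4, [[[2.0, 2.0, 1.5]]] * 4)  # passes silently
```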

Dataset-agnostic utilities

import random

from torchrir import (
    CmuArcticDataset,
    binaural_mic_positions,
    clamp_positions,
    load_dataset_sources,
    sample_positions,
)

def dataset_factory(speaker: str | None):
    spk = speaker or "bdl"
    return CmuArcticDataset("datasets/cmu_arctic", speaker=spk, download=True)

signals, fs, info = load_dataset_sources(
    dataset_factory=dataset_factory,
    num_sources=2,
    duration_s=10.0,
    rng=random.Random(0),
)

Dataset template (for future extension)

TemplateDataset provides a minimal stub to implement new datasets later.

Logging

from torchrir import LoggingConfig, get_logger, setup_logging

setup_logging(LoggingConfig(level="INFO"))
logger = get_logger("examples")
logger.info("running torchrir example")

Scene container

from torchrir import Scene

scene = Scene(room=room, sources=sources, mics=mics, src_traj=src_traj, mic_traj=mic_traj)
scene.validate()

Immutable geometry helpers

Room, Source, and MicrophoneArray are immutable; use .replace() to update fields.
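
To illustrate the general pattern (not torchrir's actual classes), the sketch below uses a frozen dataclass with a replace method that returns a modified copy. The class Box and its fields are hypothetical stand-ins.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Box:
    """Hypothetical immutable geometry object used only for illustration."""

    size: tuple
    fs: int

    def replace(self, **changes):
        # Returns a new instance; the original is left untouched.
        return dataclasses.replace(self, **changes)


box = Box(size=(6.0, 4.0, 3.0), fs=16000)
box_48k = box.replace(fs=48000)
```

Direct assignment such as box.fs = 48000 raises dataclasses.FrozenInstanceError, which is why updates go through replace.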

Result container

from torchrir import RIRResult

result = RIRResult(rirs=rirs, scene=scene, config=config)

Simulation strategies

from torchrir import ISMSimulator

sim = ISMSimulator()
result = sim.simulate(scene, config)

Device Selection

  • device="cpu": CPU execution.
  • device="cuda": NVIDIA GPU (CUDA) if available; otherwise falls back to CPU.
  • device="mps": Apple Silicon GPU via Metal (MPS) if available; otherwise falls back to CPU.
  • device="auto": prefers CUDA → MPS → CPU.

from torchrir import DeviceSpec

device, dtype = DeviceSpec(device="auto").resolve()
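
The fallback rules listed above can be written as a small pure function. This is a sketch of the described behaviour, not DeviceSpec's implementation: the name resolve_device is hypothetical, and availability is passed in as flags so the example runs without PyTorch (in practice the flags would come from torch.cuda.is_available() and torch.backends.mps.is_available()).

```python
def resolve_device(requested, cuda_available, mps_available):
    """Map a requested device name to one that is actually usable."""
    available = {"cpu": True, "cuda": cuda_available, "mps": mps_available}
    if requested == "auto":
        # Preference order: CUDA, then MPS, then CPU.
        for name in ("cuda", "mps", "cpu"):
            if available[name]:
                return name
    if requested not in available:
        raise ValueError(f"unknown device: {requested!r}")
    # Explicit requests fall back to CPU when the backend is missing.
    return requested if available[requested] else "cpu"
```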

Specification (Current)

Purpose

  • Provide room impulse response (RIR) simulation on PyTorch with CPU/CUDA/MPS support.
  • Support static and dynamic scenes with a maintainable, modern API.

Room Model

  • Shoebox (rectangular) room model.
  • 2D or 3D.
  • Image Source Method (ISM) implementation.
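
For readers new to the Image Source Method, the sketch below enumerates image sources for a shoebox room following the classic Allen and Berkley construction and converts a path length into a delay in samples. It is a conceptual illustration in pure Python; the function names are hypothetical and torchrir's vectorised implementation is not shown here.

```python
import itertools
import math


def axis_images(x, length, max_order):
    """(position, reflection count) pairs for image sources along one axis."""
    images = []
    for n in range(-max_order, max_order + 1):
        for q in (0, 1):
            order = abs(2 * n - q)  # number of wall reflections on this axis
            if order <= max_order:
                images.append(((1 - 2 * q) * x + 2 * n * length, order))
    return images


def image_sources(src, room_size, max_order):
    """All image positions whose total reflection count is at most max_order."""
    per_axis = [axis_images(x, length, max_order) for x, length in zip(src, room_size)]
    result = []
    for combo in itertools.product(*per_axis):
        order = sum(o for _, o in combo)
        if order <= max_order:
            result.append((tuple(p for p, _ in combo), order))
    return result


def delay_in_samples(image_pos, mic_pos, fs, c=343.0):
    """Propagation delay from an image source to a microphone, in samples."""
    return math.dist(image_pos, mic_pos) / c * fs


images = image_sources([1.0, 2.0, 1.5], [6.0, 4.0, 3.0], max_order=2)
direct = next(pos for pos, order in images if order == 0)
direct_delay = delay_in_samples(direct, [2.0, 2.0, 1.5], fs=16000)
```

The number of images grows quickly with max_order, which is the main reason GPU accumulation helps for higher reflection orders.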

Inputs

Scene Geometry

  • Room size: [Lx, Ly, Lz] (2D uses [Lx, Ly]).
  • Source positions: (n_src, dim).
  • Microphone positions: (n_mic, dim).
  • Reflection order: max_order.

Acoustic Parameters

  • Sample rate: fs.
  • Speed of sound: c (default 343.0 m/s).
  • Wall reflection coefficients: beta (4 faces for 2D, 6 for 3D) or t60 (Sabine).

Output Length

  • Specify nsample (samples) or tmax (seconds).
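
The relation between the two options is simply the sample rate. The sketch below shows one way to resolve the output length. The helper name is hypothetical, and both the rounding rule and the precedence of nsample over tmax are assumptions, since the README does not state them.

```python
def resolve_nsample(fs, nsample=None, tmax=None):
    """Return the RIR length in samples from either nsample or tmax."""
    if nsample is not None:
        return int(nsample)
    if tmax is not None:
        # Assumption: round to the nearest sample.
        return int(round(tmax * fs))
    raise ValueError("specify either nsample or tmax")


n = resolve_nsample(16000, tmax=0.3)
```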

Directivity

  • Patterns: omni, cardioid, hypercardioid, subcardioid, bidir.
  • Orientation specified by vector or angles.
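
The named patterns correspond to the textbook first-order family g(θ) = a + (1 - a) cos θ, where θ is the angle between the orientation and the direction of propagation. The sketch below uses the standard coefficients for each name. It is an illustration with a hypothetical function name; torchrir's internal coefficients and sign handling are not documented here and may differ.

```python
import math

# Standard first-order coefficients (assumed, not taken from torchrir).
ALPHA = {
    "omni": 1.0,
    "subcardioid": 0.75,
    "cardioid": 0.5,
    "hypercardioid": 0.25,
    "bidir": 0.0,
}


def directivity_gain(pattern, orientation, direction):
    """Gain of a first-order pattern pointing along `orientation` toward `direction`."""
    a = ALPHA[pattern]
    dot = sum(o * d for o, d in zip(orientation, direction))
    cos_theta = dot / (math.hypot(*orientation) * math.hypot(*direction))
    # The sign is kept, so rear lobes come out negative (phase inverted).
    return a + (1.0 - a) * cos_theta


front = directivity_gain("cardioid", [1.0, 0.0, 0.0], [2.0, 0.0, 0.0])
back = directivity_gain("cardioid", [1.0, 0.0, 0.0], [-1.0, 0.0, 0.0])
```

This also shows why non-omni patterns need an orientation: without it, cos θ is undefined.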

Configuration

  • SimulationConfig controls algorithm settings (e.g., max_order, tmax, directivity, device, seed, fractional delay length, LUT, chunk sizes, compile path).
  • Passed explicitly via simulate_rir(..., config=...) or simulate_dynamic_rir(..., config=...).

Outputs

  • Static RIR shape: (n_src, n_mic, nsample).
  • Dynamic RIR shape: (T, n_src, n_mic, nsample).
  • Preserves dtype/device.

Core APIs

Static RIR

from torchrir import MicrophoneArray, Room, Source, simulate_rir

room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
sources = Source.positions([[1.0, 2.0, 1.5], [4.5, 1.0, 1.2]])
mics = MicrophoneArray.positions([[2.0, 2.0, 1.5], [3.0, 2.0, 1.5]])

rir = simulate_rir(
    room=room,
    sources=sources,
    mics=mics,
    max_order=8,
    tmax=0.4,
    directivity="omni",
    device="auto",
)

Dynamic RIRs + Convolution

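# Assumes simulate_dynamic_rir is importable from the package top level, like simulate_rir.
from torchrir import DynamicConvolver, simulate_dynamic_rir

rirs = simulate_dynamic_rir(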
    room=room,
    src_traj=src_traj,   # (T, n_src, dim)
    mic_traj=mic_traj,   # (T, n_mic, dim)
    max_order=8,
    tmax=0.4,
    device="auto",
)

y = DynamicConvolver(mode="trajectory").convolve(signal, rirs)
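
The snippet above leaves src_traj and mic_traj undefined. As a minimal sketch of how such trajectories might be built, the code below interpolates one source along a straight line and repeats a fixed microphone position, producing nested lists with the (T, n_src, dim) and (T, n_mic, dim) layouts. The helper name is hypothetical, and converting the lists to tensors (for example with torch.tensor) before passing them to torchrir is an assumption.

```python
def linear_trajectory(start, end, steps):
    """Positions moving linearly from start to end, laid out as (T, 1, dim)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    traj = []
    for t in range(steps):
        w = t / (steps - 1)  # 0.0 at the first step, 1.0 at the last
        pos = [s + w * (e - s) for s, e in zip(start, end)]
        traj.append([pos])  # one source per time step
    return traj


T = 16
src_traj = linear_trajectory([1.0, 2.0, 1.5], [4.0, 2.0, 1.5], steps=T)
mic_traj = [[[2.0, 1.0, 1.5]] for _ in range(T)]  # static microphone repeated per step
```

Building both trajectories from the same T keeps their time steps aligned, which simulate_dynamic_rir requires.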

Device Control

  • device is one of "cpu", "cuda", "mps", or "auto"; an unavailable backend falls back to CPU.

Future Work

  • Ray tracing backend: implement RayTracingSimulator with frequency-dependent absorption/scattering.
  • FDTD backend: implement FDTDSimulator with configurable grid resolution and boundary conditions.
  • Dataset expansion: add additional dataset integrations beyond CMU ARCTIC (see TemplateDataset).
  • Enhanced acoustics: frequency-dependent absorption and more advanced diffuse tail models.
  • Add microphone and source directivity models similar to gpuRIR/pyroomacoustics.
  • Add regression tests comparing generated RIRs against gpuRIR outputs.
