Dynamic acoustic simulation for robotic navigation and localization
Acoustix
Acoustix is a comprehensive Python library for dynamic acoustic simulation designed specifically for robotics research. It enables realistic simulation of reverberant acoustic environments with multiple sound sources and microphone arrays, making it ideal for developing and testing sound-driven navigation, source localization, and audio-based robotic perception systems.
This library was developed during my PhD in the RobotLearn team at Inria Grenoble, under the supervision of Dr. Xavier Alameda, Prof. Laurent Girin and Dr. Chris Reinke. You can learn more about the library, its motivations, its applications and the underlying scientific and technical decisions in my PhD manuscript:
- Chapter 2: Introduction to acoustics in reverberant environments, and presentation of the Acoustix library
- Chapter 3: Deep-learning-based sound source localization
- Chapter 4: Active sound source localization
- Chapter 5: Deep reinforcement learning for sound-driven navigation
The code for all my experiments can be found here.
✨ Key Features
- 🏠 Realistic Room Acoustics: Simulate reverberant environments with customizable room dimensions and acoustic properties (RT60, absorption coefficients)
- 🎤 Multiple Microphone Arrays: Support for various array geometries including binaural, linear, square, and triangular configurations
- 🔊 Diverse Sound Sources: Speech sources (LibriSpeech integration), white noise, and custom audio sources with spatial positioning
- 🚀 High-Performance Backends: Leverages both gpuRIR and Pyroomacoustics for fast Room Impulse Response (RIR) generation
- 🧠 Spatial Audio Processing: Built-in STFT computation, DOA (Direction of Arrival) estimation, and ILD/IPD analysis
- 🗺️ Egocentric Audio Maps: Generate spatial representations of the acoustic environment from the agent's perspective
- 🎮 Dynamic Simulation: Real-time agent movement and source repositioning during simulation
- 📊 Rich Visualization: Integrated plotting capabilities for room geometry, source positions, and audio signals
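For readers new to the ILD/IPD cues mentioned in the feature list, here is a minimal sketch of how they are commonly defined for a single time-frequency bin of a two-microphone signal. It uses only the standard library and does not call Acoustix; the function name `ild_ipd` is hypothetical, and the library's own implementation may differ (for instance in sign convention or smoothing).

```python
import cmath
import math


def ild_ipd(left: complex, right: complex, eps: float = 1e-12) -> tuple[float, float]:
    """Return (ILD in dB, IPD in radians) for one STFT bin of a two-microphone array."""
    # ILD: ratio of the two magnitudes, expressed in decibels
    ild_db = 20.0 * math.log10((abs(left) + eps) / (abs(right) + eps))
    # IPD: phase of the left channel relative to the right one, in [-pi, pi]
    ipd = cmath.phase(left * right.conjugate())
    return ild_db, ipd


# Left bin has twice the magnitude of the right one and a phase 90 degrees ahead
left = cmath.rect(2.0, math.pi / 2)
right = cmath.rect(1.0, 0.0)
ild, ipd = ild_ipd(left, right)
print(f"ILD = {ild:.2f} dB, IPD = {ipd:.3f} rad")
```

Applied to every bin of a two-channel STFT, this yields one ILD map and one IPD map per frame.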
🚀 Quick Start
Installation
pip install acoustix
For speech sources, download the LibriSpeech dataset (optional):
# Install dependencies
sudo apt install tar curl parallel ffmpeg
# Download LibriSpeech train-clean-100 subset
./acoustix/datasets/download_librispeech.sh
Basic Usage
WARNING! The origin of the coordinate system is always in the top-left!
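The warning above does not spell out the axis directions. Assuming it follows the usual image-style convention (x grows to the right, y grows downward when the room is viewed from above), positions expressed in a bottom-left-origin frame can be converted by flipping the y axis. The helper below is a hypothetical illustration of that assumption, not part of Acoustix.

```python
def bottom_left_to_top_left(x: float, y: float, size_y: float) -> tuple[float, float]:
    """Flip the y axis so that y = 0 is the top wall instead of the bottom wall."""
    return x, size_y - y


# In a room that is 6 m deep, a point 2 m from the bottom wall is 4 m from the top wall
point = bottom_left_to_top_left(3.5, 2.0, size_y=6.0)
print(point)
```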
import numpy as np
from acoustix import GpuRirRoom, AudioSimulator
from acoustix.microphone_arrays import BinauralArray
# Create a reverberant room
room = GpuRirRoom(
    size_x=8.0,  # Room dimensions in meters
    size_y=6.0,
    height=3.0,
    rt_60=0.5,  # Reverberation time in seconds
    sampling_freq=16_000,  # Sampling frequency
)
# Set up a binaural microphone array (robot's "ears")
array = BinauralArray(
    mic_dist=10,  # Distance between microphones in cm
    position=np.array([3.5, 2.0, 1.2]),  # Agent position (x, y, z)
    orientation=np.array([0, 1, 0]),  # Agent orientation
    mic_pattern="card",  # Microphone pattern
)
# Initialize the simulator with multiple speech sources
simulator = AudioSimulator(
    room=room,
    mic_array=array,
    n_speech_sources=2,  # Number of speech sources
    max_audio_samples=4 * room.sampling_frequency,  # 4 seconds of audio
)
# Run simulation
simulator.step()
# Get the multi-channel audio signal
audio = simulator.get_agent_audio() # Shape: (n_mics, n_samples)
# Get spectral representation
stft = simulator.get_agent_stft() # Shape: (n_mics, n_freq, n_frames)
# Extract spatial information
doa = simulator.get_doa(source_name="speech_1") # Direction of arrival
distance = simulator.get_source_array_dist(source_name="speech_1") # Source-array distance
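To make the last two quantities concrete, the sketch below computes a horizontal direction of arrival and a source-array distance from plain 2D geometry. It is an illustration only: `relative_doa_and_distance` is a hypothetical helper, and the angle convention (measured from the array heading, positive toward increasing mathematical angle) is an assumption that may not match what `get_doa` returns.

```python
import math


def relative_doa_and_distance(
    array_pos: tuple[float, float],
    array_heading: tuple[float, float],
    source_pos: tuple[float, float],
) -> tuple[float, float]:
    """Return (angle in radians, distance in meters) of a source seen from the array."""
    dx = source_pos[0] - array_pos[0]
    dy = source_pos[1] - array_pos[1]
    distance = math.hypot(dx, dy)
    heading_angle = math.atan2(array_heading[1], array_heading[0])
    bearing = math.atan2(dy, dx)
    # Wrap the difference between bearing and heading to [-pi, pi]
    diff = bearing - heading_angle
    angle = math.atan2(math.sin(diff), math.cos(diff))
    return angle, distance


# Array at (3.5, 2.0) heading along +y, source 3 m away along +x
angle, dist = relative_doa_and_distance((3.5, 2.0), (0.0, 1.0), (6.5, 2.0))
print(f"DOA = {math.degrees(angle):.1f} deg, distance = {dist:.1f} m")
```

Wrapping through `atan2(sin, cos)` keeps the angle continuous when the agent rotates past the ±180° boundary.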
📚 Core Components
AudioSimulator
The main interface that orchestrates room simulation, source management, and audio processing:
simulator = AudioSimulator(
    room=room,
    mic_array=array,
    n_speech_sources=3,
    source_continuous=True,  # Continuous speech streams
    max_audio_samples=160_000,  # 10 seconds at 16kHz
)
# Dynamic agent movement
simulator.move_agent(
    new_position=np.array([5.0, 3.0, 1.2]),
    new_orientation=np.array([1, 0, 0]),
)
# Step simulation
simulator.step()
Room Models
Choose between two backends:
# GPU-accelerated RIR generation (recommended)
from acoustix import GpuRirRoom
room = GpuRirRoom(size_x=10, size_y=8, height=3, rt_60=0.6)
# CPU-based alternative
from acoustix import PyRoomAcousticsRoom
room = PyRoomAcousticsRoom(size_x=10, size_y=8, height=3, rt_60=0.6)
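Both room models take a target `rt_60`. To give a feel for what that value implies physically, the sketch below applies Sabine's classical formula, RT60 = 0.161 · V / (α · S), to recover the average wall absorption coefficient of a shoebox room. This is a textbook approximation shown for intuition; `sabine_absorption` is a hypothetical helper, and how Acoustix or its backends derive absorption from `rt_60` is not described here.

```python
def sabine_absorption(size_x: float, size_y: float, height: float, rt_60: float) -> float:
    """Average absorption coefficient predicted by Sabine's formula for a shoebox room."""
    volume = size_x * size_y * height
    surface = 2.0 * (size_x * size_y + size_x * height + size_y * height)
    # Sabine: RT60 = 0.161 * V / (alpha * S)  =>  alpha = 0.161 * V / (S * RT60)
    return 0.161 * volume / (surface * rt_60)


# Same dimensions and reverberation time as the rooms created above
alpha = sabine_absorption(size_x=10.0, size_y=8.0, height=3.0, rt_60=0.6)
print(f"Average absorption coefficient: {alpha:.3f}")
```

Shorter reverberation times in the same room require larger absorption coefficients, which is why very small `rt_60` values can be unreachable for a given geometry.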
Microphone Arrays
Multiple array geometries for different robotic platforms:
from acoustix.microphone_arrays import (
    MonoArray,  # Single microphone
    BinauralArray,  # 2 microphones (human-like hearing)
    UniformLinearArray,  # Linear array with N microphones
    SquareArray,  # 2x2 square configuration
    TriangleArray,  # 3-microphone triangular setup
)
# Linear array with 4 microphones
linear_array = UniformLinearArray(
    n_mics=4,
    mic_spacing=5,  # 5cm spacing
    position=np.array([2, 2, 1.5]),
)
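As an illustration of what a uniform linear geometry means, the sketch below places N microphones at equal spacing, centered on the array position. It assumes the microphones lie along the x axis and that spacing is given in centimeters, as in the example above; `linear_array_positions` is a hypothetical helper, and the axis Acoustix actually uses depends on the array orientation.

```python
def linear_array_positions(
    n_mics: int, spacing_cm: float, center: tuple[float, float, float]
) -> list[tuple[float, float, float]]:
    """Place n_mics along the x axis, evenly spaced and centered on `center` (meters)."""
    spacing_m = spacing_cm / 100.0
    # Offsets are symmetric around zero, e.g. -1.5, -0.5, 0.5, 1.5 spacings for 4 mics
    half_span = (n_mics - 1) / 2.0
    return [
        (center[0] + (i - half_span) * spacing_m, center[1], center[2])
        for i in range(n_mics)
    ]


mics = linear_array_positions(n_mics=4, spacing_cm=5, center=(2.0, 2.0, 1.5))
for mic in mics:
    print(mic)
```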
Sound Sources
Various source types for different scenarios:
from acoustix.room import SpeechSource, WhiteNoiseSource, MusicNoiseSource
# Speech source (uses LibriSpeech)
speech = SpeechSource(
    name="speech_1",
    position=np.array([6, 4, 1.6]),
)
# White noise source
noise = WhiteNoiseSource(
    name="ambient_noise",
    position=np.array([1, 1, 2.5]),
    num_samples=160_000,
)
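The `num_samples` argument fixes the length of the noise signal, so its duration depends on the sampling frequency. The standalone sketch below generates Gaussian white noise with the standard library and checks the resulting duration at 16 kHz. The `white_noise` function is hypothetical and only illustrates the idea; it is not how `WhiteNoiseSource` is implemented.

```python
import random


def white_noise(num_samples: int, std: float = 1.0, seed: int = 0) -> list[float]:
    """Draw independent zero-mean Gaussian samples (flat spectrum on average)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(num_samples)]


sampling_freq = 16_000  # Hz, matching the rooms used in this README
samples = white_noise(num_samples=160_000)
duration_s = len(samples) / sampling_freq
print(f"{len(samples)} samples = {duration_s:.1f} s at {sampling_freq} Hz")
```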
Egocentric Audio Maps
Generate spatial representations from the agent's perspective:
import numpy as np
from acoustix.egocentric_map import EgocentricMap, PolarRelativePosition
# NOTE: `encode_sources` and `Tensor` are used below, but their imports are not
# shown in this snippet (`Tensor` is presumably `torch.Tensor`, given `.numpy()`).
em: EgocentricMap = EgocentricMap(
    size=6,
    size_pixel=128,
    doa_res=360,
)
doas: list[float] = [
    -np.pi / 2,
    np.pi / 4,
    np.pi / 5,
]
encoded_doas: Tensor = encode_sources(sources_doas=doas)
em.apply_doa(doas=encoded_doas.numpy())
em.sources_positions = [
    PolarRelativePosition(
        dist=0.4 * em.size,
        angle=angle,
    )
    for angle in doas
]
em.plot()
# Move the agent, then plot the updated map
em.move(
    angle=0.1,
    dist=0.5,
)
em.plot()
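The geometry behind an egocentric update can be sketched without the library. The example below recomputes the polar position of a static source after the agent turns and then steps forward. It assumes that `angle` is a rotation applied before a forward translation of `dist`, with angles measured from the agent's heading; these conventions and the helper `update_relative_position` are assumptions for illustration, not the documented behavior of `EgocentricMap.move`.

```python
import math


def update_relative_position(
    dist: float, angle: float, turn: float, step: float
) -> tuple[float, float]:
    """Polar position of a static source after the agent turns by `turn`, then advances `step`."""
    # Source in the agent's old frame (x axis = old heading)
    sx, sy = dist * math.cos(angle), dist * math.sin(angle)
    # Agent displacement expressed in the old frame
    ax, ay = step * math.cos(turn), step * math.sin(turn)
    rx, ry = sx - ax, sy - ay
    new_dist = math.hypot(rx, ry)
    # Express the bearing relative to the new heading and wrap it to [-pi, pi]
    raw = math.atan2(ry, rx) - turn
    new_angle = math.atan2(math.sin(raw), math.cos(raw))
    return new_dist, new_angle


# A source 2.4 m straight ahead gets 0.5 m closer when the agent walks toward it
new_dist, new_angle = update_relative_position(dist=2.4, angle=0.0, turn=0.0, step=0.5)
print(f"distance = {new_dist:.2f} m, angle = {new_angle:.2f} rad")
```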
🎯 Applications
Acoustix is particularly well-suited for:
- 🤖 Sound-Driven Navigation: Training robots to navigate using audio cues
- 🎯 Sound Source Localization: Developing DOA estimation algorithms
- 🔊 Audio Scene Analysis: Understanding complex acoustic environments
- 🧠 Machine Learning: Generating training data for deep learning models
- 📡 Multi-modal Robotics: Integrating audio with other sensor modalities
📖 Advanced Examples
Multi-Source Scenario with Noise
import numpy as np
from acoustix import GpuRirRoom, AudioSimulator
from acoustix.microphone_arrays import SquareArray
# Complex acoustic scene
room = GpuRirRoom(size_x=12, size_y=10, height=4, rt_60=0.8)
array = SquareArray(center_to_mic_dist=4, position=np.array([6, 5, 1.8]))
simulator = AudioSimulator(
    room=room,
    mic_array=array,
    n_speech_sources=3,
    noise_source=True,
    noise_source_type="white_noise",
    source_continuous=True,
)
# Simulate agent movement through the environment
positions = [
    np.array([2, 2, 1.8]),
    np.array([6, 5, 1.8]),
    np.array([10, 8, 1.8]),
]
for pos in positions:
    # Move the agent, advance the simulation, then read the new audio
    simulator.move_agent(new_position=pos)
    simulator.step()
    audio = simulator.get_agent_audio()
Real-time Audio Processing
import matplotlib.pyplot as plt
# Run simulation and visualize results
simulator.step()
# Get time-domain signals
audio = simulator.get_agent_audio()
# Plot microphone signals
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
for i, ax in enumerate(axes.flat):
    if i < audio.shape[0]:
        ax.plot(audio[i])
        ax.set_title(f'Microphone {i+1}')
        ax.set_xlabel('Sample')
        ax.set_ylabel('Amplitude')
plt.tight_layout()
plt.show()
# Get and plot spectrograms
stft = simulator.get_agent_stft()
# ... visualization code
🧪 Testing
Run the test suite to verify your installation:
pytest tests/
📄 Citation
If you use Acoustix in your research, please cite:
@phdthesis{acoustix_phd,
title={From Sound to Action: Deep Learning for Audio-Based Localization and Navigation in Robotics},
author={Lepage, Gaétan},
school={Université Grenoble Alpes},
year={2025},
url={https://theses.fr/s253609}
}
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
📜 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- This work was funded by the SPRING European project.
- This simulator is built upon the gpuRIR and Pyroomacoustics RIR generation libraries.