
A simple framework for room acoustics and audio processing in Python.

Project description


Summary

Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the package can be divided into three main components: an intuitive Python object-oriented interface to quickly construct different simulation scenarios involving multiple sound sources and microphones in 2D and 3D rooms; a fast C implementation of the image source model for general polyhedral rooms to efficiently generate room impulse responses and simulate the propagation between sources and receivers; and finally, reference implementations of popular algorithms for beamforming, direction finding, and adaptive filtering. Together, they form a package with the potential to shorten the time to market of new algorithms by significantly reducing the implementation overhead in the performance evaluation step.

Room Acoustics Simulation

Consider the following scenario.

Suppose, for example, you wanted to produce a radio crime drama, and it so happens that, according to the scriptwriter, the story line absolutely must culminate in a satanic mass that quickly degenerates into a violent shootout, all taking place right around the altar of the highly reverberant acoustic environment of Oxford’s Christ Church cathedral. To ensure that it sounds authentic, you asked the Dean of Christ Church for permission to record the final scene inside the cathedral, but somehow he fails to be convinced of the artistic merit of your production, and declines to give you permission. But recorded in a conventional studio, the scene sounds flat. So what do you do?

—Schnupp, Nelken, and King, Auditory Neuroscience, 2010

Faced with this difficult situation, pyroomacoustics can save the day by simulating the environment of the Christ Church cathedral!

At the core of the package is a room impulse response (RIR) generator based on the image source model that can handle

  • Convex and non-convex rooms

  • 2D/3D rooms

Both a pure Python implementation and a C accelerator are included for maximum speed and compatibility.
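To make the idea behind the image source model concrete, here is a minimal NumPy sketch for an axis-aligned 2D shoebox room. It is not the package's implementation: the helper names `first_order_images` and `simple_rir`, the single wall reflection coefficient, and the rounding of delays to the nearest sample are all assumptions made for illustration, and only first-order reflections are considered.

```python
import numpy as np

def first_order_images(source, room_dim):
    """Mirror a source across each wall of an axis-aligned shoebox room.

    The room is assumed to span [0, room_dim[d]] along each axis d.
    """
    source = np.asarray(source, dtype=float)
    images = []
    for d, size in enumerate(room_dim):
        for wall in (0.0, float(size)):
            img = source.copy()
            img[d] = 2.0 * wall - source[d]  # reflect across the plane x_d = wall
            images.append(img)
    return np.array(images)

def simple_rir(source, mic, room_dim, fs=16000, c=343.0,
               reflection=0.8, length=1024):
    """Direct path plus first-order reflections, delays rounded to samples."""
    mic = np.asarray(mic, dtype=float)
    # Each path is (apparent source position, amplitude gain from reflections)
    paths = [(np.asarray(source, dtype=float), 1.0)]
    paths += [(img, reflection) for img in first_order_images(source, room_dim)]
    rir = np.zeros(length)
    for pos, gain in paths:
        dist = np.linalg.norm(pos - mic)
        n = int(round(dist / c * fs))  # propagation delay in samples
        if n < length:
            rir[n] += gain / (4.0 * np.pi * dist)  # spherical spreading
    return rir

room_dim = [4.0, 6.0]   # hypothetical 4 m by 6 m room
source = [2.5, 4.5]
mic = [2.0, 1.5]
images = first_order_images(source, room_dim)
rir = simple_rir(source, mic, room_dim)
```

Each image source stands in for one wall reflection, so the impulse response is a sum of delayed and attenuated impulses. Higher reflection orders would repeat the mirroring on the images themselves, and non-convex rooms additionally need a visibility check that this sketch omits.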

The philosophy of the package is to abstract all necessary elements of an experiment using object-oriented programming concepts. Each of these elements is represented by a class, and an experiment can be designed by combining these elements just as one would in a real experiment.

Let’s imagine we want to simulate a delay-and-sum beamformer that uses a linear array with four microphones in a shoebox-shaped room that contains only one sound source. First, we create a room object, to which we add a microphone array object and a sound source object. The room object then provides methods to compute the RIR between source and receiver. The beamformer object extends the microphone array class and has different methods to compute the weights, for example delay-and-sum weights. See the example below to get an idea of what the code looks like.

The Room class also allows one to process sound samples emitted by sources, effectively simulating the propagation of sound between sources and microphones. At the input of the microphones composing the beamformer, an STFT (short-time Fourier transform) engine makes it possible to quickly process the signals through the beamformer and evaluate the output.
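As a rough illustration of what delay-and-sum weights look like in the frequency domain, the following NumPy sketch computes them for a single frequency bin. It does not use the package's Beamformer class: the function `delay_and_sum_weights`, the free-field point-source model, and the array geometry are assumptions chosen for the illustration.

```python
import numpy as np

def delay_and_sum_weights(mics, source, freq, c=343.0):
    """Frequency-domain delay-and-sum weights for a point source.

    mics: (dim, n_mics) array of microphone positions, one column per mic.
    Returns the weights w and the steering vector a, with w^H a = 1.
    """
    mics = np.asarray(mics, dtype=float)
    source = np.asarray(source, dtype=float).reshape(-1, 1)
    dist = np.linalg.norm(mics - source, axis=0)
    # Steering vector: phase accumulated from the source to each microphone
    steering = np.exp(-2j * np.pi * freq * dist / c)
    # Conjugating the weights at the output undoes these phases, so the
    # microphone signals add coherently for the target source
    weights = steering / mics.shape[1]
    return weights, steering

# Four microphones on a horizontal line centred at (2, 1.5), 4 cm apart
spacing = 0.04
offsets = (np.arange(4) - 1.5) * spacing
mics = np.vstack([2.0 + offsets, np.full(4, 1.5)])

w, a = delay_and_sum_weights(mics, [2.5, 4.5], freq=1000.0)
gain_at_source = np.vdot(w, a)  # np.vdot conjugates its first argument: w^H a
```

In an STFT-based pipeline, one such weight vector would be computed per frequency bin and applied to the corresponding bin of every microphone's spectrum.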

Reference Implementations

In addition to its core image source model simulation, pyroomacoustics also contains a number of reference implementations of popular audio processing algorithms for

  • beamforming

  • direction of arrival (DOA) finding

  • adaptive filtering (NLMS, RLS)

  • blind source separation (AuxIVA, Trinicon)

We use an object-oriented approach to abstract the details of specific algorithms, making them easy to compare. Each algorithm can be tuned through optional parameters. We have tried to pre-set values for the tuning parameters so that a run with the default values will in general produce reasonable results.
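For readers unfamiliar with the adaptive filters listed above, here is a minimal NumPy sketch of the NLMS update, applied to identifying a short unknown filter. It is a textbook-style illustration and not the package's reference implementation: the function name `nlms`, its parameters, and the test signal are assumptions.

```python
import numpy as np

def nlms(x, d, length, mu=0.5, eps=1e-8):
    """Estimate an FIR filter w such that d[n] is close to w . [x[n], x[n-1], ...]."""
    w = np.zeros(length)
    buf = np.zeros(length)  # most recent input samples, newest first
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e = d[n] - w @ buf                     # a priori estimation error
        w += mu * e * buf / (buf @ buf + eps)  # step normalized by input energy
    return w

rng = np.random.default_rng(0)
h_true = np.array([0.5, -0.3, 0.2, 0.1])  # hypothetical unknown system
x = rng.standard_normal(5000)             # white excitation signal
d = np.convolve(x, h_true)[: len(x)]      # noise-free system output
h_est = nlms(x, d, length=len(h_true))
```

The step size `mu` and the regularization `eps` play the role of the tuning parameters mentioned above: a larger `mu` converges faster but is more sensitive to noise in `d`.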

Datasets

In an effort to simplify the use of datasets, we provide a few wrappers that allow you to quickly load and sort through some popular speech corpora. At the moment we support the following.

Quick Install

Install the package with pip:

$ pip install pyroomacoustics

The requirements are:

* numpy
* scipy
* matplotlib

Example

import numpy as np
import matplotlib.pyplot as plt
import pyroomacoustics as pra

# Create a 4 by 6 metres shoe box room
room = pra.ShoeBox([4,6])

# Add a source somewhere in the room
room.add_source([2.5, 4.5])

# Create a linear array beamformer with 4 microphones
# with angle 0 degrees and inter mic distance 4 cm
R = pra.linear_2D_array([2, 1.5], 4, 0, 0.04)
room.add_microphone_array(pra.Beamformer(R, room.fs))

# Now compute the delay and sum weights for the beamformer
room.mic_array.rake_delay_and_sum_weights(room.sources[0][:1])

# plot the room and resulting beamformer
room.plot(freq=[1000, 2000, 4000, 8000], img_order=0)
plt.show()

Authors

  • Robin Scheibler

  • Ivan Dokmanić

  • Sidney Barthe

  • Eric Bezzam

  • Hanjie Pan

How to contribute

If you would like to contribute, please clone the repository and send a pull request.

Academic publications

This package was developed to support academic publications. The package contains implementations for DOA algorithms and acoustic beamformers introduced in the following papers.

  • H. Pan, R. Scheibler, I. Dokmanić, E. Bezzam and M. Vetterli. FRIDA: FRI-based DOA estimation for arbitrary array layout, ICASSP 2017, New Orleans, USA, 2017.

  • I. Dokmanić, R. Scheibler and M. Vetterli. Raking the Cocktail Party, IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 5, pp. 825-836, 2015.

  • R. Scheibler, I. Dokmanić and M. Vetterli. Raking Echoes in the Time Domain, ICASSP 2015, Brisbane, Australia, 2015.

If you use this package in your own research, please cite our paper describing it.

R. Scheibler, E. Bezzam, I. Dokmanić, Pyroomacoustics: A Python package for audio room simulations and array processing algorithms, Proc. IEEE ICASSP, Calgary, Canada, 2018.

License

Copyright (c) 2014-2017 EPFL-LCAV

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

