AudioSample is an optimized numpy-like audio manipulation library, created for researchers, used by developers.

AudioSample

by deepdub.ai

AudioSample provides researchers and developers with efficient, numpy-like tools for audio processing. It supports complex audio operations and offers a familiar syntax for those accustomed to numpy.

AudioSample is well suited to data loading and ETL pipelines because it is fast and, thanks to lazy operations, has a low memory footprint.

Features

  • Seamless Audio Operations: Perform a wide range of audio manipulations, including mixing, filtering, and transformations.
  • Integration with Numpy: Leverage numpy's syntax and capabilities for intuitive audio handling.
  • Integration with Torch: Export audio directly to and from torch tensors.
  • High Performance: Optimized for speed and efficiency, suitable for research and production environments. Most actions are lazy, so no operation is performed until absolutely necessary.
  • Extensive I/O Support: Easily read from and write to various audio formats. Uses PyAV to support a wide range of formats.
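
The lazy behavior mentioned in the feature list can be illustrated with a minimal sketch. This is not AudioSample's implementation; `LazyClip` and its methods are hypothetical names, chosen only to show how operations can be queued and then applied once the data is actually requested.

```python
from typing import Callable, List


class LazyClip:
    """Hypothetical illustration of lazy operations (not AudioSample's internals)."""

    def __init__(self, samples: List[float]):
        self._samples = samples
        self._pending: List[Callable[[List[float]], List[float]]] = []
        self.evaluations = 0  # counts how many queued operations have actually run

    def scale(self, factor: float) -> "LazyClip":
        # Record the operation instead of running it now.
        self._pending.append(lambda s: [x * factor for x in s])
        return self

    def reverse(self) -> "LazyClip":
        self._pending.append(lambda s: s[::-1])
        return self

    def materialize(self) -> List[float]:
        # Apply all queued operations in order, only when data is requested.
        out = self._samples
        for op in self._pending:
            out = op(out)
            self.evaluations += 1
        self._samples, self._pending = out, []
        return out


clip = LazyClip([1.0, 2.0, 3.0]).scale(0.5).reverse()
print(clip.evaluations)    # no operation has run at this point
print(clip.materialize())  # queued operations run here
```

Chaining `scale` and `reverse` costs almost nothing until `materialize` is called, which is the general reason lazy pipelines can keep memory use low.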

Release notes 2.2.1

  • Support up to numpy 2.2.0
  • Streaming input, streaming output:
    • AudioSample now accepts a Python generator as input: Generator[Union[bytes,numpy,AudioSample]]
    • Warning: Streamed data is currently still held in memory, so a stream cannot run indefinitely.
    • Plugin functionality is not supported in streaming mode.
    • Streaming mode requires PyAV (see the streaming example under Examples).
  • The constructor supports numpy buffers (same as calling AudioSample.from_numpy); use force_read_sample_rate to set the sample rate.

Installation

To install AudioSample with all prerequisites, use pip (the same command applies on Linux/WSL):

pip install audiosample[all]

#Possible extras are:
[av] - only av
[torch] - add torch
[tests] - include everything for tests.
[noui] - install without jupyter support.

[play] - bare, with ability to play audio in console (uses pyaudio, which requires portaudio):
#Mac OS:
brew install portaudio
#linux/WSL:
apt-get install portaudio19-dev

Usage

Here's a quick example of how to load, process, and save audio using AudioSample:

import audiosample as ap
import numpy as np

# Create 1 second of stereo noise: 2 channels at 48000 samples per second
au = ap.AudioSample.from_numpy(np.random.rand(2, 48000), rate=48000)
# Create a beep and convert it to stereo
beep = ap.AudioSample().beep(1).to_stereo()
# * mixes (overlays) two sources
out = au.gain(-12) * beep
out.write("beep_with_overlayed_noise.mp3")
# + concatenates sources one after another
out = au.gain(-10) + au.silence(1) + beep
out.write("noise_then_silence_then_beep.mp3")
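
If `gain()` takes its argument in decibels (an assumption; the unit is not stated here), the values -12 and -10 in the example correspond to the linear amplitude factors computed below. `db_to_amplitude` is a hypothetical helper for illustration and is not part of AudioSample.

```python
def db_to_amplitude(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor: 10 ** (dB / 20)."""
    return 10 ** (db / 20)


# Show the factor for the two gains used in the example, plus 0 dB as a reference.
for db in (-12, -10, 0):
    print(f"{db:+} dB -> x{db_to_amplitude(db):.3f}")
```

Under that assumption, -12 dB scales the signal to roughly a quarter of its original amplitude.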

Additional Operations

  • Resampling: Fast resampling of audio.
  • Normalization: Easily normalize audio levels.
  • Mixing: Mix multiple audio sources together using the * operator.
  • Concatenation: Concatenate audio sources using the + operator.
  • Playback: Play audio directly in Jupyter notebooks or from the command line.
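
The difference between mixing and concatenation can be sketched with a toy class. `Clip` is a hypothetical name, and mixing is modelled here as a sample-wise sum with the shorter clip padded with silence; that is an assumption for illustration, and AudioSample's actual mixing rules may differ.

```python
from itertools import zip_longest
from typing import List


class Clip:
    """Hypothetical mono clip used only to illustrate the operator semantics."""

    def __init__(self, samples: List[float]):
        self.samples = list(samples)

    def __add__(self, other: "Clip") -> "Clip":
        # Concatenation: other plays after self.
        return Clip(self.samples + other.samples)

    def __mul__(self, other: "Clip") -> "Clip":
        # Mixing: both play together; the shorter clip is padded with silence.
        pairs = zip_longest(self.samples, other.samples, fillvalue=0.0)
        return Clip([a + b for a, b in pairs])


a = Clip([0.1, 0.2])
b = Clip([0.5, 0.5, 0.5])
print((a + b).samples)  # 5 samples: a followed by b
print((a * b).samples)  # 3 samples: a and b summed
```

Concatenation adds the durations, while mixing yields a result as long as the longest input.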

Documentation

Benchmarks

AudioSample outperforms PyDub

Open, concatenate, and save.

  • longbeep.wav is a 100 s WAV file of a beep.
import pydub
from audiosample import AudioSample
def test_audiosample():
    au = AudioSample()
    for i in range(0, 100):
        au += AudioSample("longbeep.wav")[50:51]
    au.write("out.wav")

def test_pydub():
    au = pydub.AudioSegment.empty()
    for i in range(0, 100):
        au += pydub.AudioSegment.from_file("longbeep.wav")[50:51]
    au.export("out.wav")

%timeit test_audiosample()
#52.9 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit test_pydub()
#376 ms ± 15.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
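
`%timeit` is an IPython magic. Outside IPython, a comparable measurement can be taken with the standard-library `timeit` module. In the sketch below, `work` is a placeholder standing in for `test_audiosample` or `test_pydub`, and `best_ms` is a hypothetical helper; it reports the best run rather than the mean and standard deviation that `%timeit` prints.

```python
import timeit


def work() -> int:
    # Placeholder workload; substitute the benchmark function to measure.
    return sum(range(10_000))


def best_ms(func, repeat: int = 7, number: int = 10) -> float:
    """Return the best per-call time in milliseconds over `repeat` runs."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number * 1000


print(f"{best_ms(work):.3f} ms per loop (best of 7 runs, 10 loops each)")
```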

AudioSample mix vs. PyDub overlay

def test_audiosample():
    au = AudioSample().silence(1)
    for i in range(0, 100):
        au *= AudioSample("longbeep.wav")[50:51]
    au.write("out.wav")
def test_pydub():
    au = pydub.AudioSegment.silent(1)
    for i in range(0, 100):
        au = au.overlay(pydub.AudioSegment.from_file("longbeep.wav")[50:51], 0)
    au.export("out.wav")

In [3]: %timeit test_audiosample()
12.7 ms ± 265 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [4]: %timeit test_pydub()
398 ms ± 26.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

AudioSample outperforms SoundFile

verylongbeep.wav is a 3200 s file (293 MB).

import soundfile as sf
from audiosample import AudioSample

def test_audiosample():
    out = AudioSample("verylongbeep.wav")[1500:1501].as_numpy()

def test_soundfile():
    with sf.SoundFile("verylongbeep.wav") as f:
        f.seek(48000*1500)
        out = f.read(48000)

In [5]: %timeit test_audiosample()
35.8 μs ± 1.69 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [6]: %timeit test_soundfile()
140 μs ± 8.89 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
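
One plausible reason that slicing a long file can be fast is that only the requested region is read from disk. The sketch below shows that seek-then-read idea with the standard-library `wave` module on a small generated file. It does not reflect how AudioSample or SoundFile are implemented; the file name and the `write_ramp` and `read_second` helpers are hypothetical.

```python
import os
import struct
import tempfile
import wave

RATE = 8000  # samples per second (small, to keep the demo file tiny)


def write_ramp(path: str, seconds: int) -> None:
    """Write a mono 16-bit WAV where every sample in second N has value N."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(RATE)
        for sec in range(seconds):
            w.writeframes(struct.pack("<h", sec) * RATE)


def read_second(path: str, start_sec: int) -> bytes:
    """Seek to start_sec and read exactly one second of frames."""
    with wave.open(path, "rb") as r:
        r.setpos(start_sec * r.getframerate())
        return r.readframes(r.getframerate())


path = os.path.join(tempfile.mkdtemp(), "ramp.wav")
write_ramp(path, seconds=5)
chunk = read_second(path, start_sec=3)
first_sample = struct.unpack("<h", chunk[:2])[0]
print(len(chunk), first_sample)  # bytes read, and the value stored in second 3
```

The read touches one second of data regardless of how long the file is, which mirrors the `[1500:1501]` slice in the benchmark.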

For detailed instructions and API references, type help(AudioSample)

Examples

Explore the examples notebook to see practical applications of AudioSample in action.

Streaming code example below:


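from audiosample import AudioSample

def chunkify(buffer: bytes):
    # Yield the buffer in fixed-size chunks to simulate a stream
    CHUNK_SIZE = 1000
    for i in range(0, len(buffer), CHUNK_SIZE):
        yield buffer[i:i+CHUNK_SIZE]

with open('test.mp3', 'rb') as f:
    testmp3 = f.read()

collect = b''
for chunk in AudioSample(chunkify(testmp3), force_read_sample_rate=48000, force_sample_rate=8000).as_data_stream(force_out_format='mulaw'):
    collect += chunk

# Write everything that was collected, not just the last chunk
with open('test.mulaw', 'wb') as f:
    f.write(collect)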
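
The `mulaw` output format in the streaming example refers to μ-law companding, which compresses the dynamic range of each sample logarithmically before it is quantized to 8 bits. The sketch below shows the continuous μ-law formula and its inverse. Real G.711 encoders use a segmented approximation, so this illustrates the idea and is not a bit-exact codec; the function names are hypothetical and unrelated to AudioSample's API.

```python
import math

MU = 255  # value used for 8-bit mu-law


def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] to [-1, 1] with logarithmic compression."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


# Print each input, its compressed value, and the value recovered by expanding.
for x in (0.01, 0.1, 1.0):
    y = mulaw_compress(x)
    print(f"{x:5.2f} -> {y:.3f} -> {mulaw_expand(y):.3f}")
```

Quiet samples are spread over a larger share of the output range than loud ones, which is why 8 bits can be enough for speech.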

License

AudioSample is released under the MIT License.

Contributing

Contributions are welcome! Please follow the contributing guidelines to submit changes.

About Deepdub

AudioSample is developed by Deepdub, a company specializing in AI-driven audio solutions. Deepdub focuses on enhancing media experiences through cutting-edge technology, enabling content creators to reach global audiences with high-quality, localized audio.

Support

If you have questions or need help, please open an issue on GitHub.
