
Audioperm, a Python library for generating different permutations of audible segments from audio files.



Audioperm

A Python library for generating different permutations of audible segments from audio files.

pip install audioperm

Use cases:

  • Silence removal from audio
  • Audio / speech augmentation
  • Word segmentation
  • Word-level permutation generation
  • Generating new synthetic data for deep learning
  • Speaker recognition, speaker verification, audio classification, audio fingerprinting

Documentation: https://zabir-nabil.github.io/audioperm/

Source Code: https://github.com/zabir-nabil/audioperm


Word segmentation

from audioperm import AudioPerm
from audioperm.utils import save_audio

ap = AudioPerm("i_love_cats.m4a")
label = "i love cats"

words = ap.word_segments()
label_words = label.split()

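# Assumes the segmenter finds exactly one segment per word in the label.
for i, w in enumerate(words):
    save_audio(w, label_words[i] + ".wav")

Files in the working directory afterwards:

cats.wav  i_love_cats.m4a  i.wav  love.wav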
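
To give a rough idea of what splitting audio into audible segments can involve, here is a minimal sketch of threshold-based segmentation in plain Python. It is not audioperm's implementation, which this page does not describe. The function name `audible_segments`, the mean-absolute-amplitude loudness measure, and the `frame_size` and `threshold` parameters are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def audible_segments(
    samples: Sequence[float],
    frame_size: int,
    threshold: float,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for runs of frames above threshold.

    Hypothetical helper for illustration only; not part of audioperm.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for offset in range(0, len(samples), frame_size):
        frame = samples[offset:offset + frame_size]
        # Mean absolute amplitude serves as a crude loudness measure.
        loudness = sum(abs(x) for x in frame) / len(frame)
        if loudness >= threshold:
            if start is None:
                start = offset  # an audible run begins here
        elif start is not None:
            segments.append((start, offset))  # the run ended at this frame
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # run reaches the end
    return segments


# Toy signal: silence, a loud burst, silence, a quieter burst.
signal = [0.0] * 4 + [0.5, -0.5, 0.4, -0.4] + [0.0] * 4 + [0.3, -0.3, 0.3, -0.3]
segments = audible_segments(signal, frame_size=4, threshold=0.1)
print(segments)
```

Slicing the signal with each returned pair yields the audible parts with the silence removed. Real speech usually needs smoothing and a minimum silence duration so that short pauses inside a word do not split it.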

Word-level permutation

from audioperm import AudioPerm
from audioperm.utils import save_audio

ap = AudioPerm("i_love_cats.m4a")
ap.word_segments(return_words=False)
perm_sentences = ap.permute(n_permutations = 5)

for i, s in enumerate(perm_sentences):
    save_audio(s, f"perm_{i}.wav")

Files in the working directory afterwards:

cats.wav  i.wav  i_love_cats.m4a  love.wav  perm_0.wav  perm_1.wav  perm_2.wav  perm_3.wav  perm_4.wav
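
Conceptually, a word-level permutation reorders the word segments and joins them back into one signal. The sketch below shows that idea in plain Python, under the assumption that each segment is a list of samples. The function `permute_segments` and its `seed` parameter are hypothetical and are not audioperm's API; how `permute` chooses its orderings is not documented on this page.

```python
import itertools
import random
from typing import List, Sequence


def permute_segments(
    segments: Sequence[Sequence[float]],
    n_permutations: int,
    seed: int = 0,
) -> List[List[float]]:
    """Return up to n_permutations distinct reorderings, each concatenated.

    Hypothetical helper for illustration only; not part of audioperm.
    """
    rng = random.Random(seed)
    # Enumerate every ordering of the segment indices, then sample without
    # replacement so that no ordering is returned twice.
    orders = list(itertools.permutations(range(len(segments))))
    chosen = rng.sample(orders, min(n_permutations, len(orders)))
    return [[x for idx in order for x in segments[idx]] for order in chosen]


# Three toy "words" of different lengths.
words = [[1, 1], [2], [3, 3, 3]]
sentences = permute_segments(words, n_permutations=5)
for sentence in sentences:
    print(sentence)
```

Enumerating all orderings is only practical for short utterances, because the number of orderings grows factorially with the number of segments. For longer inputs, shuffling a copy of the index list repeatedly would be a cheaper alternative.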

Permutations on multiple files

from audioperm import read_audio, word_segments, permutations

ap = read_audio(["bangla_demo.wav", "i_love_cats.m4a"])
out = word_segments(ap)
perms = permutations(out, n_permutations = 5)

Fixed-length segments

  • Generate fixed-length audible segments (with permutation/augmentation)

from audioperm import fixed_len_segments

# Save the segments under "fls_out" instead of returning them.
fixed_len_segments("bangla_demo.wav", return_segments=False, save_path="fls_out", save=True, segment_size=0.5)
# Return permuted segments (capped by max_segments) without saving them.
out = fixed_len_segments("bangla_demo.wav", return_segments=True, max_segments=5, permute=True, save=False, segment_size=0.5)
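
As a rough illustration of fixed-length segmentation, the sketch below cuts a list of samples into equal chunks. It assumes that a segment size is given in seconds and that a trailing chunk shorter than the segment size is dropped. Neither assumption is confirmed for `fixed_len_segments` on this page, and the helper `fixed_length_chunks` is hypothetical.

```python
from typing import List, Sequence


def fixed_length_chunks(
    samples: Sequence[float],
    sample_rate: int,
    segment_size: float,
) -> List[Sequence[float]]:
    """Split samples into chunks of segment_size seconds each.

    Hypothetical helper for illustration only; not part of audioperm.
    Assumption: an incomplete final chunk is discarded.
    """
    chunk_len = int(round(sample_rate * segment_size))
    if chunk_len <= 0:
        raise ValueError("segment_size is too small for this sample rate")
    n_full = len(samples) // chunk_len  # number of complete chunks
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_full)]


# Ten samples at a toy rate of 8 Hz, cut into 0.5-second chunks.
samples = list(range(10))
chunks = fixed_length_chunks(samples, sample_rate=8, segment_size=0.5)
print(chunks)
```

Dropping the remainder keeps every chunk the same length, which is convenient when the chunks feed a model that expects fixed-size input. Padding the last chunk with zeros would be the other common choice.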

Others

To run the code: Google Colab

Any contribution is welcome.

Tested with:

  • Python 3.7
  • Python 3.8

Internal audio representation:

  • PCM 16
  • float32
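
The two representations differ mainly in scale: PCM 16 stores signed 16-bit integers, while floating-point audio is conventionally kept in the range [-1.0, 1.0]. The sketch below shows one common conversion convention in plain Python. It is an assumption, not audioperm's documented behaviour, and both function names are hypothetical. Python floats are 64-bit, so a real pipeline would typically use an array library with an explicit float32 type instead.

```python
from typing import List, Sequence


def pcm16_to_float(samples: Sequence[int]) -> List[float]:
    """Scale signed 16-bit integers to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def float_to_pcm16(samples: Sequence[float]) -> List[int]:
    """Clip floats to [-1.0, 1.0] and scale to signed 16-bit integers."""
    out = []
    for x in samples:
        clipped = max(-1.0, min(1.0, x))  # avoid integer overflow on loud input
        out.append(int(round(clipped * 32767)))
    return out


as_float = pcm16_to_float([0, 16384, -32768])
as_pcm = float_to_pcm16([0.0, 1.0, -1.0, 2.0])
print(as_float)
print(as_pcm)
```

Dividing by 32768 and multiplying by 32767 is one of several conventions in use, so a round trip through both functions can differ from the input by one quantization step.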

TO-DO:

  • multi-channel audio
  • augmentation
  • multiprocessing
  • GPU support
