Audioperm, a Python library for generating different permutations of audible segments from audio files.
pip install audioperm
Use cases:
- Silence removal from audio
- Audio / speech augmentation
- Word segmentation
- Word-level permutation generation
- Generating new synthetic data for deep learning
- Speaker recognition, speaker verification, audio classification, audio fingerprinting
Documentation: https://zabir-nabil.github.io/audioperm/
Source Code: https://github.com/zabir-nabil/audioperm
Word segmentation
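from audioperm import AudioPerm
from audioperm.utils import save_audio

# Load the recording and split it into word segments
ap = AudioPerm("i_love_cats.m4a")
label = "i love cats"
words = ap.word_segments()
label_words = label.split()

# Save each segment under the matching transcript word, e.g. "love.wav"
for i, w in enumerate(words):
    save_audio(w, label_words[i] + ".wav")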
Resulting files:
cats.wav i.wav i_love_cats.m4a love.wav
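How `word_segments` locates word boundaries is not described here. One common approach, shown below as a minimal sketch and not as audioperm's actual implementation, is to cut the signal wherever it stays quiet for long enough. The function name `split_on_silence`, the amplitude `threshold`, and `min_silence` (counted in samples) are all hypothetical.

```python
def split_on_silence(samples, threshold=0.05, min_silence=3):
    """Split float samples into audible runs.

    A run ends once `min_silence` consecutive samples fall below
    `threshold` in absolute value. Shorter pauses stay inside the run.
    Hypothetical helper, not part of audioperm.
    """
    segments = []
    current = []   # samples of the segment being built
    pending = []   # quiet samples seen since the last loud sample
    for x in samples:
        if abs(x) >= threshold:
            if current:
                current.extend(pending)  # short pause inside a word: keep it
            pending = []
            current.append(x)
        elif current:
            pending.append(x)
            if len(pending) >= min_silence:
                segments.append(current)  # long pause: close the segment
                current, pending = [], []
    if current:
        segments.append(current)
    return segments


signal = [0.0, 0.0, 0.5, 0.6, 0.0, 0.4, 0.0, 0.0, 0.0, 0.0, 0.9, 0.8, 0.0]
words = split_on_silence(signal)
```

Leading, trailing, and long internal silences are dropped, which is also the essence of silence removal: concatenating the returned segments gives the audible part only.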
Word-level permutation
from audioperm import AudioPerm
from audioperm.utils import save_audio

ap = AudioPerm("i_love_cats.m4a")
# Segment the words without returning them
ap.word_segments(return_words=False)
# Generate 5 permuted sentences and save each one
perm_sentences = ap.permute(n_permutations=5)
for i, s in enumerate(perm_sentences):
    save_audio(s, f"perm_{i}.wav")
Resulting files:
cats.wav i.wav i_love_cats.m4a love.wav perm_0.wav perm_1.wav perm_2.wav perm_3.wav perm_4.wav
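The idea behind word-level permutation can be sketched in a few lines: pick distinct orderings of the word segments and concatenate each ordering into a new "sentence". This is an illustration under assumptions, not audioperm's own code; `sample_permutations` and its `seed` parameter are hypothetical, and segments are modelled as plain lists of samples.

```python
import math
import random


def sample_permutations(segments, n_permutations, seed=0):
    """Return up to `n_permutations` distinct reorderings of `segments`,
    each flattened into one list of samples.

    Hypothetical helper, not part of audioperm.
    """
    rng = random.Random(seed)
    n = len(segments)
    # There are only n! distinct orderings, so never ask for more than that
    target = min(n_permutations, math.factorial(n))
    orders = set()
    while len(orders) < target:
        orders.add(tuple(rng.sample(range(n), n)))
    # Concatenate the segments of each ordering into a single sample list
    return [[x for i in order for x in segments[i]] for order in sorted(orders)]


words = [[1, 1], [2], [3, 3, 3]]   # stand-ins for three word segments
sentences = sample_permutations(words, n_permutations=5)
```

Drawing random orderings until enough distinct ones are found works well when the requested count is small compared with n!. For requests close to n!, enumerating with `itertools.permutations` would be the better choice.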
Permutations on multiple files
from audioperm import read_audio, word_segments, permutations
ap = read_audio(["bangla_demo.wav", "i_love_cats.m4a"])
out = word_segments(ap)
perms = permutations(out, n_permutations = 5)
Fixed-length segments
- Generate fixed-length audible segments (with permutation/augmentation)
from audioperm import fixed_len_segments
# Save the segments to "fls_out" without returning them
fixed_len_segments("bangla_demo.wav", return_segments=False, save_path="fls_out", save=True, segment_size=0.5)
# Return at most 5 permuted segments without saving them
out = fixed_len_segments("bangla_demo.wav", return_segments=True, max_segments=5, permute=True, save=False, segment_size=0.5)
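As a rough picture of what fixed-length segmentation involves, the sketch below cuts a sample sequence into equal chunks. It assumes `segment_size` is a duration in seconds, which the text above does not state, and it introduces a hypothetical `sample_rate` argument and function name `fixed_len_chunks`. It does not reproduce audioperm's handling of silence, permutation, or saving.

```python
def fixed_len_chunks(samples, sample_rate, segment_size):
    """Cut `samples` into consecutive chunks of `segment_size` seconds.

    A trailing chunk shorter than the requested length is dropped.
    Hypothetical helper, not part of audioperm.
    """
    step = int(round(segment_size * sample_rate))
    if step <= 0:
        raise ValueError("segment_size * sample_rate must be at least 1 sample")
    return [samples[i:i + step] for i in range(0, len(samples) - step + 1, step)]


samples = list(range(10))   # stand-in for 10 audio samples
chunks = fixed_len_chunks(samples, sample_rate=4, segment_size=0.5)
```

To get fixed-length audible segments, silence would be removed first and the chunking applied to what remains.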
Others
To run the code: Google Colab
Any contribution is welcome.
Tested with:
- Python 3.7
- Python 3.8
Internal audio representation:
- PCM 16
- float32
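The two representations are related by a simple scaling: 16-bit PCM stores integers in [-32768, 32767], while float32 audio is conventionally kept in [-1.0, 1.0]. The conversion below is a minimal standard-library sketch of that convention. The function names are hypothetical, and whether audioperm scales in exactly this way is an assumption.

```python
from array import array


def pcm16_to_float32(pcm):
    """Scale 16-bit PCM integers to 32-bit floats in [-1.0, 1.0)."""
    return array("f", (s / 32768.0 for s in pcm))


def float32_to_pcm16(samples):
    """Clip floats to [-1.0, 1.0] and scale them to 16-bit PCM integers."""
    return array("h", (int(max(-1.0, min(1.0, s)) * 32767.0) for s in samples))


as_float = pcm16_to_float32(array("h", [0, 16384, -32768]))
as_pcm = float32_to_pcm16(array("f", [0.0, 1.0, -1.0, 2.0]))
```

Clipping before the conversion back to PCM avoids integer overflow when a processed signal exceeds full scale.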
TO-DO:
- multi-channel audio
- augmentation
- multi-processing
- gpu-support