Audioperm, a Python library for generating different permutations of audible segments from audio files.
Audioperm
A Python library for generating different permutations of audible segments from audio files.
pip install audioperm
Use cases:
- Silence removal from audio
- Audio / Speech augmentation
- Word segmentation
- Word level permutation generation
- Add new synthetic data for deep learning
- Speaker recognition, Speaker verification, Audio classification, Audio fingerprinting
Documentation: https://zabir-nabil.github.io/audioperm/
Source Code: https://github.com/zabir-nabil/audioperm
Word segmentation
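from audioperm import AudioPerm
from audioperm.utils import save_audio

# Load the recording and split it into word segments.
ap = AudioPerm("i_love_cats.m4a")
label = "i love cats"
words = ap.word_segments()
label_words = label.split()

# Save each segment under the matching word of the transcript.
# zip() stops at the shorter sequence, so a mismatch between the number of
# segments and the number of label words cannot raise an IndexError.
for word_audio, word_label in zip(words, label_words):
    save_audio(word_audio, word_label + ".wav")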
Resulting files: cats.wav i_love_cats.m4a i.wav love.wav
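For intuition about what splitting a recording into audible segments involves, here is a minimal, self-contained sketch of energy-based segmentation. It is not audioperm's implementation (the source does not describe the algorithm). The function name `audible_segments`, the frame size, and the threshold are all hypothetical, and samples are assumed to be floats in [-1, 1].

```python
import math


def audible_segments(samples, frame_size=4, threshold=0.1):
    """Return (start, end) sample ranges whose frame RMS exceeds threshold.

    Hypothetical helper for illustration; not part of audioperm.
    """
    segments = []
    start = None  # start index of the segment currently being built
    for pos in range(0, len(samples), frame_size):
        frame = samples[pos:pos + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms > threshold:
            if start is None:
                start = pos  # a loud frame opens a new segment
        elif start is not None:
            segments.append((start, pos))  # a quiet frame closes it
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # audio ended mid-segment
    return segments


# Toy signal: silence, a "word", silence, another "word".
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8 + [0.8, -0.8] * 2
print(audible_segments(signal))  # [(8, 16), (24, 28)]
```

Real speech would need a much larger frame size (tens of milliseconds of samples) and a tuned threshold; the tiny values here only keep the toy signal readable.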
Word-level permutation
from audioperm import AudioPerm
from audioperm.utils import save_audio

ap = AudioPerm("i_love_cats.m4a")
# Segment the words without returning them (return_words=False).
ap.word_segments(return_words=False)
perm_sentences = ap.permute(n_permutations=5)

# Write each permuted sentence to its own file.
for i, s in enumerate(perm_sentences):
    save_audio(s, f"perm_{i}.wav")
Resulting files: cats.wav i.wav i_love_cats.m4a love.wav perm_0.wav perm_1.wav perm_2.wav perm_3.wav perm_4.wav
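The idea behind word-level permutation can be sketched without the library: reorder the word segments and concatenate them into new sentences. The helper below is a hypothetical illustration using plain Python lists as stand-ins for audio samples. It is not how audioperm implements `permute`, and the `seed` parameter is an assumption added for reproducibility.

```python
import itertools
import random


def permuted_sentences(words, n_permutations, seed=0):
    """Return up to n_permutations distinct reorderings of words,
    each joined into one flat sample list.

    Hypothetical helper for illustration; not part of audioperm.
    """
    # Enumerating every ordering is only practical for short sentences,
    # because the number of orderings grows factorially.
    orderings = list(itertools.permutations(range(len(words))))
    random.Random(seed).shuffle(orderings)
    sentences = []
    for order in orderings[:n_permutations]:
        sentence = []
        for idx in order:
            sentence.extend(words[idx])  # concatenate segments in the new order
        sentences.append(sentence)
    return sentences


# Three "words" of different lengths, standing in for audio segments.
words = [[1, 1], [2, 2, 2], [3]]
perms = permuted_sentences(words, n_permutations=5)
print(len(perms))  # 5
```

A sentence of three words has only 3! = 6 distinct orderings, so asking this sketch for more than six permutations returns six.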
Permutations on multiple files
from audioperm import read_audio, word_segments, permutations

# Load several files at once, then segment and permute them.
ap = read_audio(["bangla_demo.wav", "i_love_cats.m4a"])
out = word_segments(ap)
perms = permutations(out, n_permutations=5)
Fixed-length segments
- Generate fixed-length audible segments (with permutation/augmentation)
from audioperm import fixed_len_segments

# Save the segments under the save path "fls_out" without returning them.
fixed_len_segments("bangla_demo.wav", return_segments=False, save_path="fls_out", save=True, segment_size=0.5)

# Return up to 5 permuted segments without saving anything.
out = fixed_len_segments("bangla_demo.wav", return_segments=True, max_segments=5, permute=True, save=False, segment_size=0.5)
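As a rough illustration of fixed-length splitting, the sketch below cuts a sample list into consecutive chunks. It assumes `segment_size` is a duration in seconds (the source does not state the unit) and that a trailing chunk shorter than `segment_size` is dropped. Both are assumptions, and `fixed_len_chunks` is a hypothetical name, not an audioperm function.

```python
def fixed_len_chunks(samples, sample_rate, segment_size=0.5):
    """Split samples into consecutive chunks of segment_size seconds.

    Hypothetical helper for illustration; not part of audioperm.
    A trailing chunk shorter than segment_size is dropped (an assumption).
    """
    chunk_len = int(round(sample_rate * segment_size))
    if chunk_len <= 0:
        raise ValueError("segment_size is too small for this sample rate")
    n_full = len(samples) // chunk_len  # number of complete chunks
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_full)]


# 20 samples at a toy rate of 16 samples per second: 0.5 s is 8 samples.
samples = list(range(20))
chunks = fixed_len_chunks(samples, sample_rate=16, segment_size=0.5)
print(len(chunks), len(chunks[0]))  # 2 8
```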
Others
To run the code: Google Colab
Any contribution is welcome.
Tested with:
Python 3.7, Python 3.8
Internal audio representation:
PCM 16, float32
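The two representations named above are commonly converted into each other by scaling with 32768. The sketch below shows that convention using only the standard library. The source does not document how audioperm converts between them, so the scaling factor, the clipping behaviour, and both function names are assumptions.

```python
from array import array


def pcm16_to_float(pcm):
    """Map 16-bit signed integers to 32-bit floats in [-1.0, 1.0)."""
    return array("f", (s / 32768.0 for s in pcm))


def float_to_pcm16(samples):
    """Map floats to 16-bit signed integers, clipping out-of-range values."""
    out = array("h")
    for x in samples:
        scaled = int(round(x * 32768.0))
        out.append(max(-32768, min(32767, scaled)))  # clip to the int16 range
    return out


pcm = array("h", [0, 16384, -32768, 32767])
as_float = pcm16_to_float(pcm)
print(list(as_float)[:3])              # [0.0, 0.5, -1.0]
print(list(float_to_pcm16(as_float)))  # [0, 16384, -32768, 32767]
```

Clipping matters on the way back: a float of exactly 1.0 would scale to 32768, which does not fit in 16 bits, so it is clamped to 32767.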
TO-DO:
- multi-channel audio
- augmentation
- multi-processing
- GPU support
File details
Details for the file audioperm-0.0.5.tar.gz.
File metadata
- Download URL: audioperm-0.0.5.tar.gz
- Upload date:
- Size: 6.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.6.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 446842ff423197f255e009628ffb58db2485d9509c6761a351b2769651bc737f |
| MD5 | d70c343789b586c9001282be7fec7ad6 |
| BLAKE2b-256 | 8850b59cc5f3fcb14c2316d6cc5da2cc1f1bf9b83f7f3cd0d0e7405a4235c0bc |
File details
Details for the file audioperm-0.0.5-py3-none-any.whl.
File metadata
- Download URL: audioperm-0.0.5-py3-none-any.whl
- Upload date:
- Size: 8.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.6.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cc4a48a7aeb81f3900c5851fa1c849572d087a021872dea4b8f3a9a701c06aaf |
| MD5 | 05fcaa509ce0c9a81f654dd16eea251e |
| BLAKE2b-256 | e624f5fdaf518f68d1ad1363c3d6a8989581d16145a60977cfa4a933a3af3ad8 |