Audio processing
ProcessAudio
Python library for audio data augmentation and/or audio feature extraction.
Installation
pip install ProcessAudio
Description
Create a ProcessAudio object and use its methods.
This library provides the following main components:
- Features: Extract features from audio
- AudioAugmentation: Augment audio in different ways
- AllDataAugmentation: Augment audio in different ways and extract features
- Util: Read audio and denoise audio
- Split: Split audio in n seconds or at desired cut points
- Graph: Graph spectrogram or log_mel_spectrogram for an audio
Features methods
- set_data(data_audio: str = "<path_audio_file>"): Set the audio data to extract features from
- get_croma(): Extract chroma features
- get_mfcc(): Extract MFCC features
- get_rmse(): Extract RMSE features
- get_centroide_espectral(): Extract spectral centroid features
- get_rolloff(): Extract spectral rolloff features
- get_cruce_por_cero(): Extract zero crossing rate features
- get_ancho_banda_espectral(): Extract spectral bandwidth features
- get_tonnetz(): Extract tonnetz features
- build_basic(): Extract a basic set of features as a list
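To give a feel for what two of these features measure, here is a minimal pure-Python sketch of a zero crossing rate and an RMS energy computation. It is only an illustration under simplifying assumptions (whole-signal values instead of per-frame values); the function names `zero_crossing_rate` and `rms_energy` are hypothetical and are not part of ProcessAudio, whose internal implementation may differ.

```python
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ (hypothetical helper)."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def rms_energy(samples):
    """Root-mean-square amplitude of the whole signal (hypothetical helper)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Synthetic test signal: one second of a 5 Hz sine wave sampled at 1000 Hz
sr = 1000
signal = [math.sin(2 * math.pi * 5 * n / sr) for n in range(sr)]

zcr = zero_crossing_rate(signal)
rms = rms_energy(signal)
print(f"zero crossing rate: {zcr:.4f}, RMS: {rms:.4f}")
```

A low zero crossing rate indicates a slowly oscillating signal, and the RMS of a full-scale sine wave is close to 0.707.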
AudioAugmentation methods
- loudness(): Apply a loudness change to the audio file, creating new data
- add_mask(): Apply a mask to the audio file, creating new data
- pitch(): Apply a pitch change to the audio file, creating new data
- get_original(): Get the original audio file
- add_crop(): Apply a crop to the audio file, creating new data
- add_noise(): Apply noise to the audio file, creating new data
- add_noise2(): Apply noise to the audio file, creating new data
- shift(): Apply a shift to the audio file, creating new data
- stretch(): Apply a stretch to the audio file, creating new data
- speed(): Apply a speed change to the audio file, creating new data
- normalizer(): Apply a normalizer to the audio file, creating new data
- polarizer(): Apply a polarizer to the audio file, creating new data
- write_audio_file(): Write the audio file
- plot_time_series(): Plot the time series
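The general idea behind a few of these augmentations can be sketched on a plain list of samples. The code below is an assumption-laden illustration, not ProcessAudio's implementation: the helper names `with_noise`, `circular_shift` and `apply_gain`, their parameters, and the choice of Gaussian noise and circular shifting are all hypothetical.

```python
import random


def with_noise(samples, noise_factor=0.05, seed=None):
    """Add zero-mean Gaussian noise scaled by noise_factor (hypothetical helper)."""
    rng = random.Random(seed)
    return [s + noise_factor * rng.gauss(0.0, 1.0) for s in samples]


def circular_shift(samples, offset):
    """Rotate the signal; a positive offset moves samples later in time."""
    if not samples:
        return []
    k = offset % len(samples)
    return samples[-k:] + samples[:-k] if k else list(samples)


def apply_gain(samples, gain):
    """Scale the amplitude and clip the result to the range [-1, 1]."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


clean = [0.0, 0.5, -0.5, 1.0]
noisy = with_noise(clean, noise_factor=0.05, seed=0)
shifted = circular_shift(clean, 1)
louder = apply_gain(clean, 2.0)
print(shifted)  # [1.0, 0.0, 0.5, -0.5]
print(louder)   # [0.0, 1.0, -1.0, 1.0]
```

Each helper returns a new list and leaves the input untouched, which mirrors the "creating new data" wording used in the method list above.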
AllDataAugmentation methods
- build_all(extract_features: bool): Augment audio and, if extract_features is True, extract features
Util methods
- read_audio(file_path: str, force_convert_wav: bool): Read audio; if the file is not in WAV format, it is converted to WAV before being read
- audio_convert_wav(audio_path: str, output_path: str): Convert audio to WAV format
- denoise_audio(data: np.array, sr: int): Remove noise from audio data
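For readers curious about what reading a WAV file involves, the following sketch uses only Python's standard `wave` and `struct` modules. It is not how ProcessAudio reads files; `write_wav` and `read_wav` are hypothetical helpers, and the sketch assumes 16-bit mono PCM only.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(path, samples, sr):
    """Write floats in [-1, 1] as 16-bit mono PCM (hypothetical helper)."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wf.setframerate(sr)
        wf.writeframes(frames)


def read_wav(path):
    """Return (samples as floats, sample rate) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit mono WAV only")
        sr = wf.getframerate()
        n = wf.getnframes()
        raw = wf.readframes(n)
    ints = struct.unpack(f"<{n}h", raw)
    return [i / 32768.0 for i in ints], sr


# Round trip: write one second of a 440 Hz tone, then read it back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tone.wav")
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
    write_wav(path, tone, 8000)
    data, sr = read_wav(path)

print(len(data), sr)  # 8000 8000
```

Other formats (for example MP3) need a decoder, which is presumably why the library converts them to WAV first.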
Split methods
- split(self, start: int, end: int, save: bool): Cut the audio from start to end; to cut at whole seconds, start and end must be multiples of 1000
- split_by_seconds(self, seconds: int, save: bool): Cut the audio into segments of the given number of seconds and save each one
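The "multiples of 1000" note suggests that start and end are expressed in milliseconds; that reading is an assumption. Under it, the two operations can be sketched on a list of samples as follows. The helpers `cut_ms` and `chunk_by_seconds` are hypothetical names, not the library's methods.

```python
def cut_ms(samples, sr, start_ms, end_ms):
    """Return the samples between two positions given in milliseconds."""
    start = start_ms * sr // 1000
    end = end_ms * sr // 1000
    return samples[start:end]


def chunk_by_seconds(samples, sr, seconds):
    """Split the signal into consecutive segments of `seconds` each.

    The last segment may be shorter when the length is not an exact multiple.
    """
    step = seconds * sr
    if step <= 0:
        raise ValueError("seconds and sr must be positive")
    return [samples[i:i + step] for i in range(0, len(samples), step)]


sr = 100                  # toy sample rate: 100 samples per second
audio = list(range(250))  # 2.5 seconds of placeholder samples

segment = cut_ms(audio, sr, 1000, 2000)  # from second 1 to second 2
chunks = chunk_by_seconds(audio, sr, 1)
print(len(segment), [len(c) for c in chunks])  # 100 [100, 100, 50]
```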
Graph methods
- spectrogram(data: np.array, sr: int, output_path: str, title: str): Create the spectrogram for audio data
- log_mel_spectrogram(data: np.array, sr: int, output_path: str, title: str): Create the log mel spectrogram for audio data
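As background on what these plots represent, the sketch below computes a small magnitude spectrogram with a naive DFT and shows the common HTK-style Hz-to-mel conversion. It is an illustration only: `magnitude_spectrogram` and `hz_to_mel` are hypothetical helpers, the frame and hop sizes are arbitrary, no mel filter bank is applied, and the library's own plotting code may work differently.

```python
import cmath
import math


def hz_to_mel(freq_hz):
    """HTK-style mel scale conversion."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Naive STFT: Hann-windowed frames, DFT magnitudes for bins 0..frame_size/2."""
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
        for n in range(frame_size)
    ]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        spectrum = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic 128 Hz tone sampled at 1024 Hz, so each DFT bin spans 16 Hz
sr = 1024
tone = [math.sin(2 * math.pi * 128 * n / sr) for n in range(512)]
spec = magnitude_spectrogram(tone)

# Log compression, as used in log spectrograms (small offset avoids log(0))
log_spec = [[10 * math.log10(m * m + 1e-10) for m in row] for row in spec]

peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sr / 64
print(len(spec), len(spec[0]), peak_hz)  # 15 33 128.0
```

In practice an FFT routine replaces the inner loops; the naive DFT is used here only to keep the sketch dependency-free.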
Usage
Example Features
import os

from ProcessAudio.Features import Features

# Path to the demo audio file, relative to this script
filepath = os.path.dirname(os.path.abspath(__file__)) + os.sep
path_file = filepath + "demo" + os.sep + "dat_92.wav"

features = Features()
features.set_data(path_file)
DATA = features.build_basic()  # Extract all basic features as a list
print(DATA)
print(len(DATA))
Example AudioAugmentation
import os

from ProcessAudio.AudioAugmentation import AudioAugmentation

# Input file and output folder, both relative to this script
filepath = os.path.dirname(os.path.abspath(__file__)) + os.sep
path_file = filepath + "demo" + os.sep + "dat_92.wav"
folder_save = filepath + "new_audios" + os.sep

augmentation = AudioAugmentation(audio_file=path_file, save=folder_save)
audio_with_noise = augmentation.add_noise(factor_ruido=0.05)  # factor_ruido: noise factor
audio_normalizer = augmentation.normalizer()
audio_loudness = augmentation.loudness()
Example AllDataAugmentation
import os

from ProcessAudio.AllDataAugmentation import AllDataAugmentation

# Input file and output folder, both relative to this script
filepath = os.path.dirname(os.path.abspath(__file__)) + os.sep
path_file = filepath + "demo" + os.sep + "dat_92.wav"
folder_save = filepath + "new_audios" + os.sep

augmentation = AllDataAugmentation(path_file, path_save=folder_save, label=["cero", "uno"])
data, label = augmentation.build_all(extract_features=True)  # Augment and extract features
print(len(data), len(label))
print(len(data[0]), label[0])
Citing
If you want to cite ProcessAudio in an academic paper, there are two ways to do it.
- APA:
WISROVI, W.S.R.V. (2022). Python library to augment audio data and/or extract audio features (Version 0.22.11) [Computer Software]. https://github.com/wisrovi/ProcessAudio
- BibTeX:
@software{WISROVI_Instrument_Classifier_2022,
  author = {WISROVI, William Steve Rodríguez Villamizar},
  month = {10},
  title = {{Python library to augment audio data and/or extract audio features}},
  URL = {https://github.com/wisrovi/ProcessAudio},
  version = {0.22.11},
  year = {2022}
}
License
GPLv3 License