Audio processing
ProcessAudio
Python library for audio data augmentation and/or audio feature extraction.
Installation
pip install ProcessAudio
Description
Create an object from one of the ProcessAudio classes and use its attributes and methods.
This library has six main components:
- Features: Extract features from audio
- AudioAugmentation: Augment audio in different ways
- AllDataAugmentation: Augment audio in different ways and extract features
- Util: Read audio and denoise audio
- Split: Split audio into n-second segments or at desired cut points
- Graph: Graph the spectrogram or log_mel_spectrogram of an audio signal
Features methods
- set_data(data_audio:str="<path_audio_file>"): Set the audio data to extract features from
- get_croma(): Extract chroma features
- get_mfcc(): Extract MFCC features
- get_rmse(): Extract RMSE features
- get_centroide_espectral(): Extract spectral centroid features
- get_rolloff(): Extract spectral rolloff features
- get_cruce_por_cero(): Extract zero-crossing rate features
- get_ancho_banda_espectral(): Extract spectral bandwidth features
- get_tonnetz(): Extract tonnetz features
- build_basic(): Extract a basic set of features as a list
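To make two of the features above concrete, here is a minimal pure-Python sketch of what RMS energy and zero-crossing rate measure. The helper names `rms` and `zero_crossing_rate` are hypothetical and this is not the library's implementation, which may frame the signal and rely on other packages; it only illustrates the quantities that `get_rmse()` and `get_cruce_por_cero()` refer to.

```python
import math


def rms(samples):
    """Root-mean-square energy of a sequence of samples (hypothetical helper)."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ (hypothetical helper)."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


# One second of a 5 Hz sine wave sampled at 1000 Hz (synthetic test signal).
sr = 1000
tone = [math.sin(2 * math.pi * 5 * n / sr) for n in range(sr)]
print(round(rms(tone), 3))
print(zero_crossing_rate(tone))
```

A higher-pitched or noisier signal crosses zero more often, so the second value grows with the amount of high-frequency content.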
AudioAugmentation methods
- loudness(): Apply loudness to the audio file, creating new data
- add_mask(): Apply a mask to the audio file, creating new data
- pitch(): Apply pitch to the audio file, creating new data
- get_original(): Get the original audio file
- add_crop(): Apply a crop to the audio file, creating new data
- add_noise(): Apply noise to the audio file, creating new data
- add_noise2(): Apply noise to the audio file, creating new data
- shift(): Apply a shift to the audio file, creating new data
- stretch(): Apply a stretch to the audio file, creating new data
- speed(): Apply a speed change to the audio file, creating new data
- normalizer(): Apply a normalizer to the audio file, creating new data
- polarizer(): Apply a polarizer to the audio file, creating new data
- write_audio_file(): Write the audio file
- plot_time_series(): Plot the time series
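As a rough illustration of what augmentations of this kind usually do to the raw samples, the sketch below implements additive noise, a circular time shift, and peak normalization in plain Python. The function names and parameters are assumptions made for this example, and the library's own methods may use different algorithms and defaults.

```python
import random


def add_gaussian_noise(samples, noise_factor, seed=None):
    """Add zero-mean Gaussian noise scaled by noise_factor (hypothetical helper)."""
    rng = random.Random(seed)
    return [s + noise_factor * rng.gauss(0.0, 1.0) for s in samples]


def circular_shift(samples, offset):
    """Rotate the signal to the right by offset samples (hypothetical helper)."""
    if not samples:
        return []
    k = offset % len(samples)
    return samples[-k:] + samples[:-k] if k else list(samples)


def peak_normalize(samples):
    """Scale the signal so its largest absolute value is 1.0 (hypothetical helper)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty input: nothing to scale
    return [s / peak for s in samples]


signal = [0.0, 0.25, 0.5, 0.25, 0.0, -0.25, -0.5, -0.25]
noisy = add_gaussian_noise(signal, noise_factor=0.05, seed=0)
shifted = circular_shift(signal, 2)
normalized = peak_normalize(signal)
print(len(noisy), shifted[:3], max(normalized))
```

Each function returns a new list and leaves the input untouched, which mirrors the "creating new data" wording in the method list.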
AllDataAugmentation methods
- build_all(extract_features: bool): Augment audio and, if extract_features is True, extract features
Util methods
- read_audio(file_path: str, force_convert_wav: bool): Read audio; if the format isn't WAV, the method converts it before reading
- audio_convert_wav(audio_path: str, output_path: str): Convert audio to WAV format
- denoise_audio(data: np.array, sr: int): Remove noise from audio data
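For readers who want to see what reading a WAV file involves at the lowest level, here is a small sketch using only the Python standard library. `write_wav` and `read_wav` are hypothetical helpers written for this example; they handle 16-bit mono PCM only and are not how `read_audio` is implemented.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(path, samples, sr):
    """Write floats in [-1, 1] as 16-bit mono PCM (hypothetical helper)."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16 bits
        wf.setframerate(sr)
        wf.writeframes(struct.pack("<%dh" % len(ints), *ints))


def read_wav(path):
    """Return (samples, sr) for a 16-bit mono PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV")
        sr = wf.getframerate()
        n = wf.getnframes()
        raw = wf.readframes(n)
    ints = struct.unpack("<%dh" % n, raw)
    return [i / 32767 for i in ints], sr


# Round trip: write a short 440 Hz tone to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    wav_path = os.path.join(tmp, "tone.wav")
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
    write_wav(wav_path, tone, 8000)
    data, sample_rate = read_wav(wav_path)

print(len(data), sample_rate)
```

Other formats such as MP3 need a decoder, which is presumably why the library offers a separate conversion step to WAV.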
Split methods
- split(self, start: int, end: int, save: bool): Split audio from start to end; to work in whole seconds, start and end must be multiples of 1000
- split_by_seconds(self, seconds: int, save: bool): Cut audio into segments of the given length in seconds and save each one
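The index arithmetic behind both kinds of split can be sketched as follows. This assumes that `start` and `end` are expressed in milliseconds, which the "multiples of 1000" note suggests but the source does not state outright. The helpers `slice_ms` and `chunk_by_seconds` are hypothetical and operate on a plain list of samples rather than on the library's objects.

```python
def slice_ms(samples, sr, start_ms, end_ms):
    """Return the samples between start_ms and end_ms (assumed milliseconds)."""
    if start_ms < 0 or end_ms < start_ms:
        raise ValueError("expected 0 <= start_ms <= end_ms")
    first = start_ms * sr // 1000  # convert milliseconds to a sample index
    last = end_ms * sr // 1000
    return samples[first:last]


def chunk_by_seconds(samples, sr, seconds):
    """Cut the signal into consecutive segments of the given length in seconds."""
    step = int(seconds * sr)  # samples per segment
    if step <= 0:
        raise ValueError("seconds * sr must be at least one sample")
    return [samples[i:i + step] for i in range(0, len(samples), step)]


# Ten samples at 2 samples per second, i.e. five seconds of "audio".
samples = list(range(10))
print(slice_ms(samples, sr=2, start_ms=1000, end_ms=3000))
print(chunk_by_seconds(samples, sr=2, seconds=2))
```

Note that the last segment is shorter when the duration is not an exact multiple of the segment length; whether the library keeps or drops such a remainder is not specified in the source.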
Graph methods
- spectrogram(data: np.array, sr: int, output_path: str, title: str): Create the spectrogram for audio data
- log_mel_spectrogram(data: np.array, sr: int, output_path: str, title: str): Create the log mel spectrogram for audio data
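A spectrogram is a grid of frequency magnitudes computed over short, overlapping windows of the signal. The sketch below computes that grid with a deliberately naive discrete Fourier transform so that it runs with the standard library alone. `stft_magnitudes` is a hypothetical helper, not the library's plotting code, and a real implementation would normally use an FFT and then render the result as an image.

```python
import cmath
import math


def stft_magnitudes(samples, frame_size, hop):
    """Return one list of magnitude bins (0..frame_size // 2) per frame."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
        for n in range(frame_size)
    ]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        bins = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


# One second of a 100 Hz sine wave sampled at 800 Hz (synthetic test signal).
sr = 800
frame_size = 64
tone = [math.sin(2 * math.pi * 100 * n / sr) for n in range(sr)]
spec = stft_magnitudes(tone, frame_size=frame_size, hop=32)

# Convert the strongest bin of the first frame back to a frequency in Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin * sr / frame_size)
```

A log mel spectrogram adds two further steps on top of this grid: pooling the frequency bins into mel-spaced bands and taking the logarithm of the resulting energies.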
Usage
Example Features
import os

from ProcessAudio.Features import Features

# Path to the demo audio file, relative to this script
filepath = os.path.dirname(os.path.abspath(__file__))
path_file = os.path.join(filepath, "demo", "dat_92.wav")

features = Features()
features.set_data(path_file)
DATA = features.build_basic()  # Extract the basic set of features as a list
print(DATA)
print(len(DATA))
Example AudioAugmentation
import os

from ProcessAudio.AudioAugmentation import AudioAugmentation

# Input file and output folder, relative to this script
filepath = os.path.dirname(os.path.abspath(__file__))
path_file = os.path.join(filepath, "demo", "dat_92.wav")
folder_save = os.path.join(filepath, "new_audios") + os.sep

augmentation = AudioAugmentation(audio_file=path_file, save=folder_save)
audio_con_ruido = augmentation.add_noise(factor_ruido=0.05)  # audio with added noise
audio_normalizer = augmentation.normalizer()
audio_loudness = augmentation.loudness()
Example AllDataAugmentation
import os

from ProcessAudio.AllDataAugmentation import AllDataAugmentation

# Input file and output folder, relative to this script
filepath = os.path.dirname(os.path.abspath(__file__))
path_file = os.path.join(filepath, "demo", "dat_92.wav")
folder_save = os.path.join(filepath, "new_audios") + os.sep

augmentation = AllDataAugmentation(path_file, path_save=folder_save, label=["cero", "uno"])
data, label = augmentation.build_all(extract_features=True)  # augment, then extract features
print(len(data), len(label))
print(len(data[0]), label[0])
Citing
If you want to cite ProcessAudio in an academic paper, there are two ways to do it.
- APA:
  WISROVI, W.S.R.V. (2022). Python library to augment audio data and/or extract audio features (Version 0.22.11) [Computer Software]. https://github.com/wisrovi/ProcessAudio
- BibTeX:
  @software{WISROVI_Instrument_Classifier_2022,
    author = {WISROVI, William Steve Rodríguez Villamizar},
    month = {10},
    title = {{Python library to augment audio data and/or extract audio features}},
    URL = {https://github.com/wisrovi/ProcessAudio},
    version = {0.22.11},
    year = {2022}
  }
License
GPLv3 License