A toolbox of audio models and algorithms based on MindSpore.
Introduction
MindAudio is a toolbox of audio models and algorithms based on MindSpore. It provides a series of APIs for common audio data processing, data enhancement, and feature extraction, so that users can preprocess data conveniently. It also provides examples that show how to build audio deep learning models with MindAudio.
Data processing
# read audio
>>> import mindaudio.data.io as io
>>> audio_data, sr = io.read(data_file)
# feature extraction
>>> import mindaudio.data.features as features
>>> feats = features.fbanks(audio_data)
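The snippet above leaves `data_file` undefined. As a minimal sketch, assuming only the Python standard library, the following writes a short synthetic sine tone to a WAV file that can stand in for `data_file`. The helper name `write_sine_wav`, the file name, the 440 Hz tone, and the 16 kHz sample rate are illustrative assumptions, not part of MindAudio.

```python
import math
import os
import struct
import tempfile
import wave


def write_sine_wav(path, freq_hz=440.0, duration_s=1.0, sample_rate=16000):
    """Write a mono 16-bit PCM sine tone to `path` and return the sample count.

    Hypothetical helper for illustration; it is not a MindAudio API.
    """
    n_samples = int(duration_s * sample_rate)
    frames = bytearray()
    for n in range(n_samples):
        # Half of full scale keeps the tone well clear of clipping.
        value = int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        frames += struct.pack("<h", value)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return n_samples


data_file = os.path.join(tempfile.gettempdir(), "mindaudio_demo_tone.wav")
n_written = write_sine_wav(data_file)

# Read the header back with the standard library to confirm the file is valid.
with wave.open(data_file, "rb") as wav_file:
    wav_info = (wav_file.getnchannels(), wav_file.getframerate(), wav_file.getnframes())

# With MindAudio installed, the file can then be passed to the call shown above:
# audio_data, sr = io.read(data_file)
```

Any real recording can be used in place of the generated tone; the synthetic file only makes the example reproducible without test data.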
Installation
Install with PyPI
The released version of MindAudio can be installed via PyPI as follows:
pip install mindaudio
Install from Source
The latest version of MindAudio can be installed from source as follows:
git clone https://github.com/mindspore-lab/mindaudio.git
cd mindaudio
pip install -r requirements/requirements.txt
python setup.py install
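After either installation route, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the standard library `importlib.metadata` module; the helper name `installed_version` is an illustrative assumption, and the check does not import MindAudio or MindSpore themselves.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("mindaudio")
if version is None:
    print("mindaudio is not installed in this environment")
else:
    print(f"mindaudio {version} is installed")
```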
Get started with audio data analysis
MindAudio provides a set of commonly used audio data processing APIs, which can be easily invoked for data analysis and feature extraction.
>>> import mindaudio.data.io as io
>>> import mindaudio.data.spectrum as spectrum
>>> import numpy as np
>>> import matplotlib.pyplot as plt
# read audio
>>> audio_data, sr = io.read("./tests/samples/ASR/BAC009S0002W0122.wav")
# feature extraction
>>> n_fft = 512
>>> matrix = spectrum.stft(audio_data, n_fft=n_fft)
>>> magnitude, _ = spectrum.magphase(matrix, 1)
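# display: build the time axis (seconds) and frequency axis (Hz),
# assuming a hop of 256 samples between STFT frames
>>> x = [i * 256 / sr for i in range(magnitude.shape[1])]
>>> f = [i / n_fft * sr for i in range(n_fft // 2 + 1)]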
>>> plt.pcolormesh(x,f,magnitude, shading='gouraud', vmin=0, vmax=np.percentile(magnitude, 98))
>>> plt.title('STFT Magnitude')
>>> plt.ylabel('Frequency [Hz]')
>>> plt.xlabel('Time [sec]')
>>> plt.show()
Result presentation: [Figure: STFT magnitude spectrogram, with time on the x-axis and frequency in Hz on the y-axis]
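The axis arithmetic in the plotting example can be sketched on its own with NumPy. The function name `stft_axes` is hypothetical, and the values passed to it (750 frames, a hop of 256 samples, a 16 kHz sample rate) are assumptions chosen to mirror the example rather than values read from MindAudio.

```python
import numpy as np


def stft_axes(n_frames, n_fft, hop_length, sample_rate):
    """Return (times_s, freqs_hz) for a one-sided STFT magnitude matrix.

    Hypothetical helper for illustration; it is not a MindAudio API.
    """
    # Frame k is offset by hop_length * k samples from the first frame.
    times_s = np.arange(n_frames) * hop_length / sample_rate
    # A one-sided FFT of size n_fft has n_fft // 2 + 1 bins from 0 Hz to Nyquist.
    freqs_hz = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    return times_s, freqs_hz


# Assumed values: 750 frames, hop of 256 samples, 16 kHz audio.
times_s, freqs_hz = stft_axes(n_frames=750, n_fft=512, hop_length=256, sample_rate=16000)
```

The two arrays have one entry per column and per row of the magnitude matrix, which is the shape `plt.pcolormesh` expects when `shading='gouraud'` is used.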
What's New
- 2023/06/24: version 0.1.1, bug fixes and README update
- 2023/03/30: version 0.1.0, including 50+ data processing APIs and 5 supported models
- 2022/09/30: beta, 33 data APIs + 3 models
Contributing
We appreciate all contributions to improve MindSpore Audio. Please refer to CONTRIBUTING.md for the contributing guideline.
License
This project is released under the Apache License 2.0.
Citation
If you find this project useful in your research, please consider citing:
@misc{MindSporeAudio2022,
title={{MindSpore Audio}: MindSpore Audio Toolbox and Benchmark},
author={MindSpore Audio Contributors},
howpublished = {\url{https://github.com/mindspore-lab/mindaudio}},
year={2022}
}