An audio classification toolkit based on PaddlePaddle for detecting abnormal sounds
ppaudio
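A simple audio classification library based on PaddlePaddle.
Features
- Simple, easy-to-use audio classification API
- Support for a range of audio classification scenarios:
  - Industrial noise detection
  - Fan noise classification
  - Engine sound analysis
  - Animal sound classification
  - Music genre classification
- Multiple feature extraction methods
- A variety of neural network architectures
- Visualization tools for audio analysis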
Installation
pip install ppaudio
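Quick Start
- Prepare the data in CSV format:
  - train_data.csv (training data)
  - val_data.csv (validation data)
  - test_data.csv (test data)
  Each CSV file should contain two columns: the audio file path and the label (0 or 1).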
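As a minimal sketch of the layout described above, the following writes and re-reads a two-column CSV with Python's standard `csv` module. The column order (path first, label second), the absence of a header row, the comma delimiter, and the helper names `write_split`, `read_split`, and `label_counts` are assumptions made for illustration; check them against what ppaudio actually expects.

```python
import csv
from collections import Counter


def write_split(csv_path, samples):
    """Write (audio_path, label) pairs, one per row, with no header row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for audio_path, label in samples:
            if label not in (0, 1):
                raise ValueError(f"label must be 0 or 1, got {label!r}")
            writer.writerow([audio_path, label])


def read_split(csv_path):
    """Read the rows back as (audio_path, int_label) tuples."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [(row[0], int(row[1])) for row in csv.reader(f) if row]


def label_counts(samples):
    """Count how many samples carry each label, to spot class imbalance."""
    return Counter(label for _, label in samples)
```

Running `label_counts(read_split('dataset/train_data.csv'))` before training is a cheap way to confirm that both classes are present in each split.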
- Create a configuration file (for example, motor.yaml):
# Dataset parameters
dataset_conf:
  dataset:
    adjust_audio_length: True
    use_dB_normalization: True
    is_full_path: False
  sampler:
    batch_size: 64
    shuffle: True
    drop_last: True
  train_data: 'dataset/train_data.csv'
  val_data: 'dataset/val_data.csv'
  test_data: 'dataset/test_data.csv'
# Preprocessing parameters
preprocess_conf:
  feature_method: 'LogMelSpectrogram'
  method_args:
    sr: 48000
    n_fft: 1024
    hop_length: 512
    win_length: 1024
    window: 'hann'
    f_min: 50
    f_max: 14000
    n_mels: 64
# Loss function configuration
loss_conf:
  criterion: 'CrossEntropyLoss'
# Optimizer configuration
optimizer_conf:
  optimizer: 'Adam'
  optimizer_args:
    learning_rate: 0.001
# System configuration
sys_conf:
  use_GPU: True
  max_epoch: 60
  show_train_process: True
  save_train_process: False
  model_save_name: 'model'
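To make the `preprocess_conf` values concrete, the arithmetic below derives the frame timing and feature shape they imply. It is a sketch under stated assumptions: a centered STFT that yields `1 + n_samples // hop_length` frames (a common convention in librosa-style implementations; ppaudio's own framing may differ), and a hypothetical helper named `spectrogram_geometry` that is not part of ppaudio.

```python
def spectrogram_geometry(sr, n_fft, hop_length, win_length, n_mels, f_max,
                         duration_s):
    """Derive timing and shape figures from log-mel spectrogram settings."""
    if f_max > sr / 2:
        raise ValueError("f_max must not exceed the Nyquist frequency sr / 2")
    n_samples = int(round(sr * duration_s))
    # Assumption: centered STFT, so the frame count is 1 + n_samples // hop_length.
    n_frames = 1 + n_samples // hop_length
    return {
        "window_ms": 1000.0 * win_length / sr,   # length of one analysis window
        "hop_ms": 1000.0 * hop_length / sr,      # spacing between frames
        "bin_width_hz": sr / n_fft,              # width of one FFT bin
        "feature_shape": (n_mels, n_frames),     # (mel bands, time frames)
    }


geometry = spectrogram_geometry(
    sr=48000, n_fft=1024, hop_length=512, win_length=1024,
    n_mels=64, f_max=14000, duration_s=1.0,
)
print(geometry)
```

With the values from `motor.yaml`, each analysis window spans about 21.3 ms, consecutive frames are about 10.7 ms apart, each FFT bin covers 46.875 Hz, and a one-second clip becomes a 64 x 94 feature matrix under the assumed framing. The `f_max` of 14000 Hz sits below the 24000 Hz Nyquist limit for a 48 kHz sample rate.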
- Train the model:
import ppaudio
# Train the model
ppaudio.train(config_path='motor.yaml')
- Test the model:
# Test on a dataset
ppaudio.test('model', 'dataset/test_data.csv')
# Test a single audio file
ppaudio.test_single('model', '123.wav')
# Compare two audio files
ppaudio.compare('model', '123.wav', '456.wav', show_pic=True)
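Data Labeling Tool
ppaudio provides a convenient data labeling function that generates labels automatically from file names:
from ppaudio.core import label_dataset
# Label the dataset
label_dataset(
    root_dir='your_audio_folder',  # path to the audio folder
    output_path='labels.csv',      # path of the output CSV file
    ok_pattern='OK',               # file name pattern for OK samples
    ng_pattern='NG',               # file name pattern for NG samples
    ok_label=1,                    # label value for OK samples
    ng_label=0                     # label value for NG samples
)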
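The idea of filename-based labeling can be sketched with the standard library alone. This is not ppaudio's implementation of `label_dataset`: the function name `label_by_filename`, the `.wav`-only filter, the plain substring match, the recursive directory walk, and the output column order are all assumptions chosen for illustration.

```python
import csv
import os


def label_by_filename(root_dir, output_path, ok_pattern="OK", ng_pattern="NG",
                      ok_label=1, ng_label=0):
    """Write a CSV of (relative_path, label) rows inferred from file names."""
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if not name.lower().endswith(".wav"):
                continue  # assumption: only WAV files are labeled
            if ok_pattern in name:
                label = ok_label  # the OK pattern wins if both patterns match
            elif ng_pattern in name:
                label = ng_label
            else:
                continue  # skip files that match neither pattern
            rel_path = os.path.relpath(os.path.join(dirpath, name), root_dir)
            rows.append((rel_path, label))
    rows.sort()  # deterministic row order regardless of filesystem order
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return rows
```

Files whose names match neither pattern are skipped rather than guessed at, so an unexpected naming scheme shows up as missing rows instead of wrong labels.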
Usage Examples
- Industrial noise detection:
import ppaudio
# Label the data
ppaudio.core.label_dataset('audio_samples/', 'dataset.csv')
# Train the model
ppaudio.train(config_path='motor.yaml')
# Test a new sample
ppaudio.test_single('model', 'new_sample.wav')
- Music genre classification:
import ppaudio
# Train the model (using a different configuration)
ppaudio.train(config_path='music.yaml')
# Compare two music samples
ppaudio.compare('model', 'rock.wav', 'jazz.wav', show_pic=True)
Contributing
Pull requests that help improve this project are welcome!
License
This project is released under the MIT License. See the LICENSE file for details.