A toolbox for audio dataset processing and augmentation.
Datasets Toolbox
A toolbox for creating, processing, and inspecting audio/image datasets through a simple command-line interface.
Installation
pip install datasets-toolbox
Usage
The goal of datasets-toolbox is to build audio/image datasets from the command line.
All commands support the --config [config-name] and --split [split-name] options to specify the target, where config-name is the configuration name (e.g. a language) and split-name is a split such as train, validation, or test.
Add More Data
datasets import --config [data] --split [train] <sources>
Imports data into the dataset structure.
If no configuration or split is given, the command defaults to the default configuration and the train split.
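The defaulting rule above can be illustrated with a small argparse sketch. This is not the package's actual parser: the function name build_import_parser and the help strings are hypothetical, and only the --config, --split, and <sources> arguments come from the command shown above.

```python
import argparse


def build_import_parser() -> argparse.ArgumentParser:
    # Hypothetical parser that mirrors only the documented options.
    parser = argparse.ArgumentParser(prog="datasets import")
    # Assumption: the fallback values are the literal names "default" and "train".
    parser.add_argument("--config", default="default",
                        help="configuration name, e.g. a language")
    parser.add_argument("--split", default="train",
                        help="split name, e.g. train, validation, test")
    parser.add_argument("sources", nargs="+",
                        help="files or directories to import")
    return parser


# No --config/--split given, so both fall back to their defaults.
args = build_import_parser().parse_args(["clips/"])
print(args.config, args.split, args.sources)
```

Passing --config en --split test before the sources would override both defaults in this sketch.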
Modify Dataset
datasets modify <action> --config [data] --split [train] --other-params
If no configuration or split is given, the action runs recursively on all configurations and all splits.
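As a rough illustration of "all configurations and all splits", the sketch below expands an optional config/split pair into the list of targets to process. The resolve_targets helper and the dictionary layout are assumptions made for this example; the tool's real on-disk structure and internals are not described here.

```python
from typing import Dict, List, Optional, Tuple


def resolve_targets(
    available: Dict[str, List[str]],
    config: Optional[str] = None,
    split: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return the (config, split) pairs an action would run on.

    `available` maps each configuration name to its split names
    (a hypothetical in-memory stand-in for the dataset layout).
    """
    # An omitted config means every configuration is visited.
    configs = [config] if config is not None else sorted(available)
    targets = []
    for name in configs:
        for split_name in available.get(name, []):
            # An omitted split means every split of that configuration.
            if split is None or split_name == split:
                targets.append((name, split_name))
    return targets


layout = {"en": ["train", "test"], "zh": ["train"]}
print(resolve_targets(layout))                 # every config and split
print(resolve_targets(layout, config="en"))    # every split of "en"
print(resolve_targets(layout, split="train"))  # "train" in every config
```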
Audio Slicer
datasets modify slice --config [data] --split [train] --min-length [ms] --hop-size [n]
Audio Resample
datasets modify resample --config [data] --split [train] --sr [16000] --mono
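To make the effect of --sr and --mono concrete, here is a minimal pure-Python sketch of downmixing to mono and changing the sample rate. It is an illustration only: the helper names are hypothetical, the resampler is plain linear interpolation without an anti-aliasing filter, and the package may well use a different, higher-quality method.

```python
from typing import List, Sequence


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame into a single sample."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples: Sequence[float], src_sr: int, dst_sr: int) -> List[float]:
    """Resample by linear interpolation (no anti-aliasing filter)."""
    if not samples or src_sr == dst_sr:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_sr / src_sr))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        # Position of output sample i on the input time axis.
        pos = i * src_sr / dst_sr
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# Four stereo frames at 32 kHz, reduced to mono at 16 kHz.
stereo = [(0.0, 1.0), (1.0, 1.0), (0.5, -0.5), (0.0, 0.0)]
mono_16k = resample_linear(to_mono(stereo), src_sr=32000, dst_sr=16000)
print(mono_16k)
```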
Audio Transcription
datasets modify transcribe --model [openai/whisper-large-v3-turbo]
Inspect Dataset
datasets inspect <action> --config [data] --split [train] --other-params
If no configuration or split is given, the action runs recursively on all configurations and all splits.
Audio Hours
datasets inspect hours --config [data] --split [train]
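For intuition about what an hours report measures, the sketch below totals the duration of all WAV files under a directory using only Python's standard library. It is a stand-in under stated assumptions, not the tool's implementation: the helper names are hypothetical, and only uncompressed .wav files are handled.

```python
import wave
from pathlib import Path


def wav_seconds(path: Path) -> float:
    """Duration of one WAV file in seconds (frames / sample rate)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_hours(root: Path) -> float:
    """Sum the duration of every .wav file under root, in hours."""
    seconds = sum(wav_seconds(p) for p in sorted(root.rglob("*.wav")))
    return seconds / 3600.0
```

Calling total_hours(Path("some/dataset/dir")) walks the directory recursively, so nested per-split folders are included in the total.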
Download files
File details
Details for the file datasets_toolbox-0.1.0.tar.gz.
File metadata
- Download URL: datasets_toolbox-0.1.0.tar.gz
- Upload date:
- Size: 9.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c059d470f2472631d329658b50d01012290e5ba9f98e6af0100f38c9b0593ab2 |
| MD5 | 12fc22f999f334cabfe7d4096357e912 |
| BLAKE2b-256 | 9705a302628710e8e302f89230aec50975ec6276facee1ba205af2b3e8c5f833 |
File details
Details for the file datasets_toolbox-0.1.0-py3-none-any.whl.
File metadata
- Download URL: datasets_toolbox-0.1.0-py3-none-any.whl
- Upload date:
- Size: 13.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0f29df8f68962204096038a0183d3bb1005869ca0db0d5ecc7091be445531bff |
| MD5 | 2810ddfca41ab63aadb45e0e4913c617 |
| BLAKE2b-256 | 779561288c2bda302d603260e410704042518437acf0a2dc5f0206b4ae0cace3 |