A toolbox for audio dataset processing and augmentation.
Datasets Toolbox
A toolbox for creating, processing and inspecting audio/image datasets through a simple CLI.
Installation
pip install datasets-toolbox
Usage
The goal of datasets-toolbox is to build audio/image datasets from the command line.
All commands accept the --config [config-name] and --split [split-name] options to specify the target, where config-name is the configuration name (e.g. a language) and split-name is a split such as train, validation, or test.
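As a minimal sketch of how the two options combine, the hypothetical Python helper below (build_command is not part of datasets-toolbox) assembles an argument list from a subcommand plus the --config and --split flags shown in this README. It only builds the list and does not run anything.

```python
from typing import List, Optional


def build_command(*subcommand: str,
                  config: Optional[str] = None,
                  split: Optional[str] = None) -> List[str]:
    """Assemble a `datasets` invocation as an argv list (hypothetical helper)."""
    argv = ["datasets", *subcommand]
    # Each option is appended only when the caller supplies a value,
    # so the CLI's own defaults apply otherwise.
    if config is not None:
        argv += ["--config", config]
    if split is not None:
        argv += ["--split", split]
    return argv


print(build_command("inspect", "hours", config="en", split="train"))
```

The resulting list could be passed to subprocess.run once the package is installed.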
Add More Data
datasets import --config [data] --split [train] <sources>
Imports data into the dataset structure.
If the configuration or split is not specified, the command defaults to the default configuration and the train split.
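The import fallback can be pictured with a small sketch. The function name and constants below are illustrative assumptions, not the package's internals; only the two fallback values come from the text above.

```python
from typing import Optional, Tuple

DEFAULT_CONFIG = "default"  # fallback configuration named in the README
DEFAULT_SPLIT = "train"     # fallback split named in the README


def resolve_import_target(config: Optional[str] = None,
                          split: Optional[str] = None) -> Tuple[str, str]:
    """Return the (config, split) pair an import would target (illustrative only)."""
    return (config or DEFAULT_CONFIG, split or DEFAULT_SPLIT)
```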
Modify Dataset
datasets modify <action> --config [data] --split [train] --other-params
If the configuration or split is not specified, the command runs recursively on all configurations and all splits.
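One way to picture the "all configurations and all splits" behavior is a walk over the dataset tree. The sketch below assumes a hypothetical <root>/<config>/<split>/ directory layout, which this README does not document. The function name iter_targets is also an assumption.

```python
from pathlib import Path
from typing import Iterator, List, Optional, Tuple


def _subdirs(path: Path) -> List[Path]:
    """Return the immediate subdirectories of `path`, sorted by name."""
    return sorted(p for p in path.iterdir() if p.is_dir())


def iter_targets(root: Path,
                 config: Optional[str] = None,
                 split: Optional[str] = None) -> Iterator[Tuple[str, str]]:
    """Yield (config, split) pairs, expanding any omitted option to every match.

    Assumes a hypothetical <root>/<config>/<split>/ layout.
    """
    config_dirs = [root / config] if config else _subdirs(root)
    for config_dir in config_dirs:
        if not config_dir.is_dir():
            continue  # requested configuration does not exist
        split_dirs = [config_dir / split] if split else _subdirs(config_dir)
        for split_dir in split_dirs:
            if split_dir.is_dir():
                yield config_dir.name, split_dir.name
```

Omitting both arguments yields every pair, while passing only split="train" yields the train split of each configuration that has one.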
Audio Slicer
datasets modify slice --config [data] --split [train] --min-length [ms] --hop-size [n]
Audio Resample
datasets modify resample --config [data] --split [train] --sr [16000] --mono
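To illustrate what resampling to a target rate with mono downmixing means, here is a deliberately naive pure-Python sketch. It is not the tool's implementation, which this README does not describe. It uses plain linear interpolation with no anti-aliasing filter, and the function names are assumptions.

```python
from typing import List, Sequence


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame into a single sample."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples: Sequence[float], src_sr: int, dst_sr: int) -> List[float]:
    """Naive linear-interpolation resampler (no anti-aliasing filter)."""
    if not samples:
        return []
    n_out = max(1, round(len(samples) * dst_sr / src_sr))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_sr / dst_sr      # fractional index into the source signal
        left = min(int(pos), last)
        right = min(left + 1, last)    # clamp at the final sample
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out
```

Production resamplers apply band-limited filtering before decimation, so treat this only as a picture of the sample-rate arithmetic.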
Audio Transcription
datasets modify transcribe --model [openai/whisper-large-v3-turbo]
Inspect Dataset
datasets inspect <action> --config [data] --split [train] --other-params
If the configuration or split is not specified, the command runs recursively on all configurations and all splits.
Audio Hours
datasets inspect hours --config [data] --split [train]
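The quantity reported here is presumably the total audio duration in hours. A minimal sketch of that calculation, assuming the data are uncompressed WAV files and using only the standard-library wave module, might look like the following. The total_hours function is hypothetical and not part of datasets-toolbox.

```python
import wave
from pathlib import Path


def total_hours(directory: Path) -> float:
    """Sum the durations of all .wav files under `directory`, in hours."""
    seconds = 0.0
    for path in sorted(directory.rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # duration = number of frames / frames per second
            seconds += wav.getnframes() / wav.getframerate()
    return seconds / 3600.0
```

Other formats (FLAC, MP3, and so on) would need a different reader, since wave only handles WAV.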
File details
Details for the file datasets_toolbox-0.1.0.tar.gz.
File metadata
- Download URL: datasets_toolbox-0.1.0.tar.gz
- Upload date:
- Size: 9.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.15
File hashes
Algorithm | Hash digest
---|---
SHA256 | c059d470f2472631d329658b50d01012290e5ba9f98e6af0100f38c9b0593ab2
MD5 | 12fc22f999f334cabfe7d4096357e912
BLAKE2b-256 | 9705a302628710e8e302f89230aec50975ec6276facee1ba205af2b3e8c5f833
File details
Details for the file datasets_toolbox-0.1.0-py3-none-any.whl.
File metadata
- Download URL: datasets_toolbox-0.1.0-py3-none-any.whl
- Upload date:
- Size: 13.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.15
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0f29df8f68962204096038a0183d3bb1005869ca0db0d5ecc7091be445531bff
MD5 | 2810ddfca41ab63aadb45e0e4913c617
BLAKE2b-256 | 779561288c2bda302d603260e410704042518437acf0a2dc5f0206b4ae0cace3