ValidDataSet - TTS LJ Speech Dataset Validator
ValidDataSet
About
ValidDataSet (VDS) was created to help validate datasets modeled on the LJ Speech Dataset (for Tacotron, Flowtron, WaveGlow, or RadTTS).
VDS is plugin-based (in the future, users will be able to add plugins dynamically).
Descriptions of the current plugins can be found in the Plugins section.
Plugins
Below is a list of currently used plugins (new ones will be added over time).
ID | Name | Version | Description |
---|---|---|---|
F001 | WavsTranscriptionChecker | 23.3.9 | Check if all files have been added to the transcription files |
F002 | WavPropertiesChecker | 23.3.9 | Check if all files are mono, 22050 Hz with length between 2 and 10 seconds |
T001 | DatasetStructureChecker | 23.3.9 | Check if the "wavs" folder and transcription files exist in the dataset |
T002 | EmptyLineChecker | 23.3.9 | Check if there are empty lines in the transcriptions |
T003 | FilesInTranscriptionChecker | 23.3.9 | Check if all files added to transcription exist |
T004 | ExistingWavFileTranscriptionChecker | 23.3.9 | Check if all files added to transcription have a transcription |
T005 | PunctuationMarksChecker | 23.3.9 | Check if all transcriptions end with punctuation marks: ".", "?" or "!" |
T006 | PunctuationMarksChecker | 23.3.9 | Check if all lines have the same number of PIPE characters |
T007 | DuplicatedTranscriptionChecker | 23.3.9 | Check if there are any duplicate paths to WAV files in the transcriptions |
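The transcription checks in the table (T002, T005, T006, T007) can be illustrated with a minimal sketch. This is not the VDS implementation: `check_transcription_lines` is a hypothetical helper, and it assumes each line has the form `path|text`, with the WAV path in the first field and the transcription in the last.

```python
def check_transcription_lines(lines, number_of_pipes=1):
    """Return (line_number, message) tuples for problems found in `lines`.

    Hypothetical helper, not part of VDS. Assumes lines such as
    "wavs/file.wav|Transcribed text."
    """
    problems = []
    seen_paths = set()
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        # T002-style check: empty (or whitespace-only) lines.
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        # T006-style check: every line has the expected number of pipes.
        if line.count("|") != number_of_pipes:
            problems.append((number, "unexpected number of pipe characters"))
            continue
        fields = line.split("|")
        path, text = fields[0], fields[-1].strip()
        # T007-style check: the same WAV path must not appear twice.
        if path in seen_paths:
            problems.append((number, f"duplicate path: {path}"))
        seen_paths.add(path)
        # T005-style check: transcription ends with ".", "?" or "!".
        if not text.endswith((".", "?", "!")):
            problems.append((number, "missing final punctuation mark"))
    return problems


sample = [
    "wavs/a.wav|Hello there.",
    "",
    "wavs/b.wav|No punctuation",
    "wavs/a.wav|Hello again!",
]
for number, message in check_transcription_lines(sample):
    print(f"line {number}: {message}")
```

The sketch reports one problem per rule so that a single bad line does not hide later ones; how VDS itself groups and reports findings is not covered here.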
Installation
To install ValidDataSet, use the following command:
pip install vds
Usage
Command in Linux: vds or vds-win
Command in Windows: vds-win
List of parameters supported by VDS:
-v, --verbose Print additional information
-o, --output Save output to file
--plugins.list List plugins
--plugins.disable List of plugins to disable like: F001,T002,T006
--args.path Path to dataset
--args.files Set transcription file names like: train.txt,val.txt
--args.dir-name wavs folder name (default: wavs)
--args.sample-rate Set sample rate (default: 22050)
--args.number-of-channels Set number of channels (default: 1 [mono])
--args.min-duration Set minimum duration in milliseconds (1000 ms = 1 second)
--args.max-duration Set maximum duration in milliseconds (1000 ms = 1 second)
--args.number-of-pipes Set number of pipes (|) (default: 1)
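To make the sample rate, channel, and duration parameters concrete, here is a minimal sketch of a per-file check built on Python's standard `wave` module. It is not the VDS code: `wav_problems` is a hypothetical helper, and its default limits (2000 ms and 10000 ms) are an assumption taken from the WavPropertiesChecker description rather than documented defaults of the options above.

```python
import wave


def wav_problems(path, sample_rate=22050, number_of_channels=1,
                 min_duration_ms=2000, max_duration_ms=10000):
    """Return a list of human-readable problems for one PCM WAV file.

    Hypothetical helper, not part of VDS. Duration limits are given in
    milliseconds, like --args.min-duration and --args.max-duration.
    """
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()

    # Duration follows from the frame count and the frame rate.
    duration_ms = frames * 1000 / rate

    problems = []
    if rate != sample_rate:
        problems.append(f"sample rate is {rate} Hz, expected {sample_rate} Hz")
    if channels != number_of_channels:
        problems.append(f"{channels} channel(s), expected {number_of_channels}")
    if duration_ms < min_duration_ms:
        problems.append(f"too short: {duration_ms:.0f} ms")
    if duration_ms > max_duration_ms:
        problems.append(f"too long: {duration_ms:.0f} ms")
    return problems
```

Calling `wav_problems("wavs/file.wav")` returns an empty list when the file matches all expected properties. The `wave` module only reads uncompressed PCM files, so other encodings would need a different reader.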
Sample commands and their descriptions:
List all plugins:
vds --plugins.list
Run VDS with all plugins, without additional information:
vds --args.path /media/username/Disk/Dataset_name/
Run VDS with all plugins, with additional information:
vds --args.path /media/username/Disk/Dataset_name/ -v
Run VDS without plugins F001, T002, and T006, with additional information:
vds --args.path /media/username/Disk/Dataset_name/ --plugins.disable F001,T002,T006 -v
Run VDS without plugins F001, T002, and T006, with custom transcription file names and with additional information:
vds --args.path /media/username/Disk/Dataset_name/ --plugins.disable F001,T002,T006 --args.files train.txt,val.txt -v
Run VDS and print files that are longer than 20 seconds, shorter than 2 seconds, or not stereo (2 channels):
vds --args.path /media/username/Disk/Dataset_name/ --args.min-duration 2000 --args.max-duration 20000 --args.number-of-channels 2 -v
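The sample commands above can also be assembled from a script, for example in an automated dataset pipeline. The following is a minimal sketch that uses only the options listed in this README; `build_vds_command` and `run_vds` are hypothetical helper names, and since the exit codes of VDS are not documented here, the sketch simply returns the process result without interpreting it.

```python
import subprocess


def build_vds_command(dataset_path, disabled_plugins=(), files=(),
                      verbose=False, executable="vds"):
    """Build the argument list for a VDS run (hypothetical helper).

    Use executable="vds-win" on Windows, as described in the Usage section.
    """
    command = [executable, "--args.path", str(dataset_path)]
    if disabled_plugins:
        # Plugin IDs are passed as one comma-separated value, e.g. F001,T002.
        command += ["--plugins.disable", ",".join(disabled_plugins)]
    if files:
        # Transcription file names are also comma-separated.
        command += ["--args.files", ",".join(files)]
    if verbose:
        command.append("-v")
    return command


def run_vds(dataset_path, **options):
    """Run VDS and return the CompletedProcess; requires VDS to be installed."""
    return subprocess.run(build_vds_command(dataset_path, **options),
                          capture_output=True, text=True, check=False)
```

For example, `run_vds("/media/username/Disk/Dataset_name/", disabled_plugins=["F001", "T002", "T006"], verbose=True)` corresponds to the third sample command, and the captured text is available as `.stdout` on the returned object.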