A package for crawling and processing audio and captions from YouTube
Audio, Caption Crawler and Processor -TTS Data Generator-
Downloads and processes audio and captions (subtitles) from YouTube videos for speech AI.
Generates audio data from YouTube for TTS (text-to-speech).
Requirements
- Python 3.6 (currently requires python == 3.6)
- FFmpeg
- youtube_dl
- pydub
- youtube_transcript_api
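The requirements above can be checked before installing. The following is a minimal sketch, not part of vctube: the helper name `missing_requirements` and its parameters are hypothetical, and it only checks the interpreter version and whether an `ffmpeg` executable is on the PATH.

```python
import shutil
import sys


def missing_requirements(required_python=(3, 6), executables=("ffmpeg",)):
    """Return a list of human-readable problems (empty if none were found).

    Hypothetical helper for illustration; vctube does not ship it.
    """
    problems = []
    # Compare only (major, minor), since the README pins python == 3.6.
    if tuple(sys.version_info[:2]) != tuple(required_python):
        problems.append(
            "Python %d.%d expected, found %d.%d"
            % (required_python[0], required_python[1],
               sys.version_info[0], sys.version_info[1])
        )
    # shutil.which returns None when the executable is not on the PATH.
    for exe in executables:
        if shutil.which(exe) is None:
            problems.append("%s not found on PATH" % exe)
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print("missing:", problem)
```

The Python packages in the list (youtube_dl, pydub, youtube_transcript_api) are not checked here.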
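To Use
Install the package:
pip3 install vctube
Then, in Python:
from vctube import VCtube
playlist_name = ""  # name of the dataset folder created under datasets
playlist_url = ""  # URL of the YouTube playlist
lang = ""  # caption language code, e.g. ko, en, fr, de
vc = VCtube(playlist_name, playlist_url, lang)
vc.download_audio()  # download audio from YouTube
vc.download_captions()  # download captions from YouTube
vc.audio_split()  # split the audio according to the captions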
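To illustrate what splitting audio with captions involves, here is a minimal sketch that turns caption entries into millisecond spans. It is not vctube's implementation. It assumes each caption is a dict with `text`, `start` and `duration` keys in seconds (the shape older releases of youtube_transcript_api return), and the function name `caption_spans_ms` is hypothetical.

```python
def caption_spans_ms(captions):
    """Convert caption entries to (start_ms, end_ms, text) tuples.

    Assumes 'start' and 'duration' are given in seconds.
    """
    spans = []
    for entry in captions:
        start_ms = int(round(entry["start"] * 1000))
        end_ms = int(round((entry["start"] + entry["duration"]) * 1000))
        spans.append((start_ms, end_ms, entry["text"]))
    return spans


captions = [
    {"text": "hello", "start": 0.5, "duration": 1.25},
    {"text": "world", "start": 2.0, "duration": 0.75},
]
for start_ms, end_ms, text in caption_spans_ms(captions):
    print(start_ms, end_ms, text)
```

pydub's `AudioSegment` is sliced by milliseconds, so a span like these could be cut with `audio[start_ms:end_ms]` and exported as a WAV file.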
Results
datasets
|- playlist name
   |- metadata.csv
   |- alignment.json
   |- wavs
      ├── 1.wav
      ├── 2.wav
      ├── 3.wav
      └── ...
metadata.csv should look like:
{
"0001.wav|그래서 사람들도 날 핍이라고 불렀다.",
"0002.wav|크리스마스 덕분에 부엌에 먹을게 가득했다.",
"0003.wav|조가 자신이 그 사람이라고 나섰다.",
...
}
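A minimal sketch of reading entries in this `filename|transcript` form follows. It assumes one entry per line and tolerates the surrounding quotes, trailing commas and braces shown in the sample, since it is not clear whether they appear in the real file. The function name `parse_metadata_lines` is hypothetical.

```python
def parse_metadata_lines(lines):
    """Return (wav_name, transcript) pairs from 'name|text' lines."""
    pairs = []
    for raw in lines:
        # Drop whitespace, an optional trailing comma and optional quotes.
        line = raw.strip().rstrip(",").strip('"')
        if "|" not in line:
            continue  # skips braces, "..." and blank lines
        # Split on the first pipe only, so pipes in the transcript survive.
        name, _, text = line.partition("|")
        pairs.append((name.strip(), text.strip()))
    return pairs


sample = [
    "{",
    "0001.wav|first sentence.",
    '"0002.wav|second sentence.",',
    "...",
    "}",
]
print(parse_metadata_lines(sample))
```

For a real file, the lines could come from `open(path, encoding="utf-8")`, where `path` points at the generated metadata.csv.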
alignment.json should look like:
{
"./datasets/playlist name/wavs/0001.wav": "그래서 사람들도 날 핍이라고 불렀다.",
"./datasets/playlist name/wavs/0002.wav": "크리스마스 덕분에 부엌에 먹을게 가득했다.",
"./datasets/playlist name/wavs/0003.wav": "조가 자신이 그 사람이라고 나섰다.",
...
}
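The two outputs carry the same pairs in different shapes. As a minimal sketch, assuming alignment.json is a valid JSON object mapping WAV paths to transcripts (the `...` in the sample stands for further entries), the mapping can be turned into `filename|transcript` lines. The function name `alignment_to_metadata` is hypothetical.

```python
import json
import posixpath


def alignment_to_metadata(alignment_json_text):
    """Convert an alignment JSON string into 'name|text' lines."""
    alignment = json.loads(alignment_json_text)
    # posixpath.basename keeps only the file name of the '/'-separated path.
    return [
        "%s|%s" % (posixpath.basename(path), text)
        for path, text in alignment.items()
    ]


sample_json = """
{
  "./datasets/playlist name/wavs/0001.wav": "first sentence.",
  "./datasets/playlist name/wavs/0002.wav": "second sentence."
}
"""
for line in alignment_to_metadata(sample_json):
    print(line)
```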