A package for crawling and processing audio and captions from YouTube

These details have not been verified by PyPI

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Audio, Caption Crawler and Processor

Downloads and processes audio and captions (subtitles) from YouTube videos to build datasets for speech AI.

Requirements

Currently requires:
- Python >= 3.6
- FFmpeg
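
Both requirements can be checked before running the crawler. The following is a minimal sketch and not part of accp: the function name `check_requirements` and its parameters are hypothetical, and it assumes FFmpeg is expected to be reachable on the PATH under the name `ffmpeg`.

```python
import shutil
import sys


def check_requirements(min_python=(3, 6), ffmpeg_binary="ffmpeg"):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # Compare the running interpreter against the documented minimum version.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python >= {wanted} required, found {found}")
    # shutil.which returns None when the executable is not on the PATH.
    if shutil.which(ffmpeg_binary) is None:
        problems.append(f"{ffmpeg_binary} not found on PATH")
    return problems
```

Calling `check_requirements()` with no arguments applies the defaults listed above and returns the problems as plain strings, so they can be printed or logged as needed.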

To Use

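  from accp import ACCP

  playlist_name = ""  # name of the playlist (see the folder name under Results)
  playlist_url = ""   # URL of the YouTube playlist

  accp = ACCP(playlist_name, playlist_url)
  accp.download_audio()    # download audio from YouTube

  accp.download_caption()  # download captions from YouTube

  accp.audio_split()       # split the downloaded audio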
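
The three calls above can be wrapped so that a failure reports which step broke. This is a hedged sketch, not part of accp: `run_pipeline` and `STEPS` are hypothetical names, and the helper relies only on the three method names shown above. Whether `audio_split()` needs the captions to be downloaded first is not stated here, so the sketch simply keeps the documented order.

```python
# Method names taken from the usage example, in the order they are shown.
STEPS = ("download_audio", "download_caption", "audio_split")


def run_pipeline(crawler, steps=STEPS):
    """Call each step on the crawler in order and return the completed step names."""
    completed = []
    for step in steps:
        try:
            getattr(crawler, step)()
        except Exception as error:
            # Re-raise with the failing step name and what had already finished.
            raise RuntimeError(f"step {step!r} failed after {completed}") from error
        completed.append(step)
    return completed
```

With the package installed, this would be used as `run_pipeline(ACCP(playlist_name, playlist_url))`.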

Results

  datasets
  └── playlist name
      ├── metadata.csv
      ├── alignment.json
      └── wavs
          ├── 1.wav
          ├── 2.wav
          ├── 3.wav
          └── ...

metadata.csv should look like:

{
    0001.wav|그래서 사람들도 날 핍이라고 불렀다.,
    0002.wav|크리스마스 덕분에 부엌에 먹을게 가득했다.,
    0003.wav|조가 자신이 그 사람이라고 나섰다.,
    ...
}
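
Each entry in the sample pairs a file name with its transcript, separated by `|`. A minimal sketch of reading such lines follows. The function name `parse_metadata_lines` is hypothetical, and the sketch assumes that the braces, trailing commas, and `...` in the sample should be skipped or stripped; the real file may contain only the pipe-separated lines.

```python
def parse_metadata_lines(lines):
    """Turn 'name.wav|transcript' lines into a list of (name, transcript) tuples."""
    entries = []
    for raw in lines:
        line = raw.strip()
        # Skip blanks, the braces, and the '...' placeholder from the sample.
        if not line or line in ("{", "}", "...") or "|" not in line:
            continue
        # Split on the first '|' only, in case the transcript contains one.
        name, text = line.split("|", 1)
        # Drop the trailing comma shown at the end of each sample line.
        entries.append((name.strip(), text.strip().rstrip(",").strip()))
    return entries


sample = [
    "{",
    "    0001.wav|그래서 사람들도 날 핍이라고 불렀다.,",
    "    ...",
    "}",
]
entries = parse_metadata_lines(sample)
# entries == [("0001.wav", "그래서 사람들도 날 핍이라고 불렀다.")]
```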

alignment.json should look like:

{
    "./datasets/playlist name/wavs/0001.wav": "그래서 사람들도 날 핍이라고 불렀다.",
    "./datasets/playlist name/wavs/0002.wav": "크리스마스 덕분에 부엌에 먹을게 가득했다.",
    "./datasets/playlist name/wavs/0003.wav": "조가 자신이 그 사람이라고 나섰다.",
}
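
The keys in the sample follow the pattern `./datasets/<playlist name>/wavs/<file>`. The sketch below builds the same mapping from (file name, transcript) pairs. The helper names are hypothetical and this is not necessarily how accp writes the file. Strict JSON parsers such as Python's `json` module reject a trailing comma after the last entry, so the sketch emits none.

```python
import json


def build_alignment(entries, playlist_name, root="./datasets"):
    """Map each wav path under the playlist folder to its transcript."""
    return {
        f"{root}/{playlist_name}/wavs/{name}": text
        for name, text in entries
    }


def dump_alignment(alignment):
    """Serialize the mapping; ensure_ascii=False keeps Korean text readable."""
    return json.dumps(alignment, ensure_ascii=False, indent=4)


alignment = build_alignment(
    [("0001.wav", "그래서 사람들도 날 핍이라고 불렀다.")],
    "playlist name",
)
text = dump_alignment(alignment)
```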

Release history

This version

0.0.1

Apr 8, 2020

Download files

Source Distribution

accp-0.0.1.tar.gz (7.3 kB)

Uploaded Apr 8, 2020 Source

Built Distribution

accp-0.0.1-py3-none-any.whl (9.1 kB)

Uploaded Apr 8, 2020 Python 3

File details

Details for the file accp-0.0.1.tar.gz.

File metadata

Download URL: accp-0.0.1.tar.gz
Upload date: Apr 8, 2020
Size: 7.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.23.0 setuptools/40.2.0 requests-toolbelt/0.9.1 tqdm/4.26.0 CPython/3.7.0

File hashes

Hashes for accp-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`b5fb30b39138d1c2af690599484500959746fd02e14cdd0b9fa18738dde4783d`
MD5	`e0d9e1eb02283de8bd3852fd0f1e2324`
BLAKE2b-256	`c8cf6b270eaeefeb074a6e972ef651fb44b33fe3edd36e46cbb46652b69a0452`
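
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch: the helper names are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution this would be called as `matches_published_hash("accp-0.0.1.tar.gz", "b5fb30b39138d1c2af690599484500959746fd02e14cdd0b9fa18738dde4783d")`, assuming the archive sits in the current directory.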

File details

Details for the file accp-0.0.1-py3-none-any.whl.

File metadata

Download URL: accp-0.0.1-py3-none-any.whl
Upload date: Apr 8, 2020
Size: 9.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.23.0 setuptools/40.2.0 requests-toolbelt/0.9.1 tqdm/4.26.0 CPython/3.7.0

File hashes

Hashes for accp-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`85d6c900d9b36e77b341095d9da79474aa7bd6ff43b6d0683688a32b887743bf`
MD5	`129087f3c124486af9d90b0e9caf74e1`
BLAKE2b-256	`821e1455cadb4f549be6ba7187cf738cf26eddf62207a654d00373c811565152`
