Convert YouTube URLs to text with speech recognition

Project status: active. Supported language: English.

What does the library do?

  • Youtube -> Text: Transcribes YouTube URLs to a text file (csv)
  • Youtube -> Audio: Downloads YouTube URLs as an audio file (wav, flac)
  • Audio -> Text: Transcribes an audio file (wav, flac) to a text file (csv)

Three folders will be created to store the output files.

<Own Path> or <HOME_DIRECTORY>/youtube2text
│
├── audio/
│   └── 2022Jan02_011802.flac
│
├── audio-chunks/
│   └── 2022Jan02_011802
│       ├── chunk1.flac
│       ├── chunk2.flac
│       └── chunk3.flac
│   
└── text/
    └── 2022Jan02_011802.csv
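
The layout above can be sketched as a small path helper, which is handy for locating the files of one run. This is an illustration only, not part of the library: `expected_output_paths` is a hypothetical name, and the timestamp pattern `%Y%b%d_%H%M%S` is inferred from the example name `2022Jan02_011802` rather than documented.

```python
from datetime import datetime
from pathlib import Path


def expected_output_paths(base_dir: Path, stem: str, audioformat: str = "flac") -> dict:
    """Return the paths the layout above suggests for one conversion run."""
    return {
        "audio": base_dir / "audio" / f"{stem}.{audioformat}",
        "chunks": base_dir / "audio-chunks" / stem,
        "text": base_dir / "text" / f"{stem}.csv",
    }


# Assumption: the default location is <HOME_DIRECTORY>/youtube2text.
base_dir = Path.home() / "youtube2text"

# Assumption: file stems follow the pattern seen in 2022Jan02_011802.
stem = datetime(2022, 1, 2, 1, 18, 2).strftime("%Y%b%d_%H%M%S")

paths = expected_output_paths(base_dir, stem)
for name, path in paths.items():
    print(f"{name}: {path}")
```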

How to install

Install and update using pip

pip install youtube2text

Build from source

git clone <this_repo>
cd <this_repo>
python setup.py install

How to use

  • Using the library requires an internet connection, both for downloading YouTube videos and for speech recognition.

from youtube2text import Youtube2Text

converter = Youtube2Text()

converter.url2text(urlpath="https://www.youtube.com/watch?v=Ad9Q8rM0Am0&t=114s")

Check out more at howtouse.ipynb
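
Once a conversion has finished, the transcript is a CSV file in the text/ folder. The column layout is not documented here, so the sketch below reads the file generically with the standard csv module and makes no assumption about headers. `read_transcript_rows` is a hypothetical helper, not a library function.

```python
import csv
from pathlib import Path


def read_transcript_rows(csv_path):
    """Read every row of a transcript CSV as a list of strings."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


# Typical use, assuming a transcript already exists at this (example) path:
# rows = read_transcript_rows(Path.home() / "youtube2text" / "text" / "2022Jan02_011802.csv")
# for row in rows:
#     print(row)
```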

Functions

  • Supports audio output as
    • wav
    • flac
  • Supports automatic speech recognition with the following backends
    • Native Python backend
    • Hugging Face

Youtube -> Text

def url2text(self, urlpath, outfile = None, audioformat = "flac", audiosamplingrate=16000):
    '''
    Convert youtube url to text

    Parameters:
        urlpath (str): Youtube url
        outfile (str, optional): File path/name of output file (.csv)
        audioformat (str, optional): Audio format; must be one listed in self.__audioextension
        audiosamplingrate (int, optional): Audio sampling rate
    '''
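
A minimal sketch of calling url2text with its optional parameters, using only the keyword names from the signature above. The wrapper `convert_url` and the constant `SUPPORTED_AUDIO_FORMATS` are hypothetical; the format check mirrors the wav/flac list in this README and may not match what the library itself enforces.

```python
SUPPORTED_AUDIO_FORMATS = ("wav", "flac")  # formats listed in this README


def convert_url(converter, urlpath, outfile=None, audioformat="flac", audiosamplingrate=16000):
    """Validate the audio format, then delegate to converter.url2text."""
    if audioformat not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {audioformat!r}")
    return converter.url2text(
        urlpath=urlpath,
        outfile=outfile,
        audioformat=audioformat,
        audiosamplingrate=audiosamplingrate,
    )


# Typical use (needs the package installed and an internet connection):
# from youtube2text import Youtube2Text
# convert_url(
#     Youtube2Text(),
#     "https://www.youtube.com/watch?v=Ad9Q8rM0Am0&t=114s",
#     audioformat="wav",
# )
```

Failing early on an unsupported format avoids starting a download that cannot be transcribed.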

Youtube -> Audio

def url2audio(self, urlpath, audiofile = None, audiosamplingrate=16000):
    '''
    Convert youtube url to audiofile

    Parameters:
        urlpath (str): Youtube url
        audiofile (str, optional): File path/name to save audio file
        audiosamplingrate (int, optional): Audio sampling rate
    '''

Audio -> Text

def audio2text(self, audiofile, textfile = None):
    '''
    Convert audio to csv file

    Parameters:
        audiofile (str): File path/name of audio file
        textfile (str, optional): File path/name of text file (*.csv)
    '''
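
The two functions above can also be chained by hand, which keeps the downloaded audio on disk so it can be transcribed again later without a second download. This is a sketch under that assumption and does not claim to reproduce what url2text does internally; `url_to_text_in_two_steps` is a hypothetical helper and the file names in the usage comment are examples.

```python
def url_to_text_in_two_steps(converter, urlpath, audiofile, textfile):
    """Download the audio first, then transcribe the saved file."""
    converter.url2audio(urlpath=urlpath, audiofile=audiofile)
    converter.audio2text(audiofile=audiofile, textfile=textfile)
    return textfile


# Typical use (needs the package installed and an internet connection):
# from youtube2text import Youtube2Text
# url_to_text_in_two_steps(
#     Youtube2Text(),
#     "https://www.youtube.com/watch?v=Ad9Q8rM0Am0&t=114s",
#     audiofile="talk.flac",
#     textfile="talk.csv",
# )
```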
