Convert YouTube URLs to text with speech recognition

Project description

Converts YouTube URLs to Text with Speech Recognition

Project status: active. Supported language: English.

What does the library do?

  • YouTube -> Text: Transcribes YouTube URLs to a text file (csv)
  • YouTube -> Audio: Downloads YouTube URLs as audio files (wav, flac)
  • Audio -> Text: Transcribes an audio file (wav, flac) to a text file (csv)

Three folders will be created to store the output files.

<Own Path> or <HOME_DIRECTORY>/youtube2text
│
├── audio/
│   └── 2022Jan02_011802.flac
│
├── audio-chunks/
│   └── 2022Jan02_011802
│       ├── chunk1.flac
│       ├── chunk2.flac
│       └── chunk3.flac
│   
└── text/
    └── 2022Jan02_011802.csv
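
The layout above can be sketched with pathlib. This is only an illustration of where files are expected to land, not the library's own code. The helper `output_paths` is hypothetical, and the timestamp format `%Y%b%d_%H%M%S` is an assumption inferred from the example name `2022Jan02_011802`.

```python
from datetime import datetime
from pathlib import Path


def output_paths(root, stamp, audioformat="flac"):
    """Return the expected output locations for one conversion run.

    Hypothetical helper: it mirrors the folder tree shown above and is
    not part of youtube2text.
    """
    root = Path(root)
    return {
        "audio": root / "audio" / f"{stamp}.{audioformat}",
        "chunks": root / "audio-chunks" / stamp,
        "text": root / "text" / f"{stamp}.csv",
    }


# Assumption: output names are timestamps such as 2022Jan02_011802.
stamp = datetime(2022, 1, 2, 1, 18, 2).strftime("%Y%b%d_%H%M%S")
paths = output_paths(Path.home() / "youtube2text", stamp)
for name, path in paths.items():
    print(f"{name}: {path}")
```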

How to install

Install and update using pip

pip install youtube2text

Build from source

git clone <this_repo>
cd <this_repo>
python setup.py install

How to use

  • Using the library requires an internet connection, both for downloading YouTube videos and for speech recognition.

from youtube2text import Youtube2Text

converter = Youtube2Text()

converter.url2text(urlpath="https://www.youtube.com/watch?v=Ad9Q8rM0Am0&t=114s")

Check out more at howtouse.ipynb

Functions

  • Supported audio output formats:
    • wav
    • flac
  • Supported automatic speech recognition backends:
    • Native Python backend
    • Huggingface
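
Since only wav and flac are listed as supported, it may help to check a file's extension before passing it to the library. The following is a minimal sketch under that assumption; `audio_format_of` and `SUPPORTED_AUDIO_FORMATS` are hypothetical names and not part of youtube2text.

```python
from pathlib import Path

# Formats taken from the list above; the constant itself is hypothetical.
SUPPORTED_AUDIO_FORMATS = {"wav", "flac"}


def audio_format_of(path):
    """Return the lowercase audio format of `path`, or raise ValueError."""
    suffix = Path(path).suffix.lower().lstrip(".")
    if suffix not in SUPPORTED_AUDIO_FORMATS:
        supported = ", ".join(sorted(SUPPORTED_AUDIO_FORMATS))
        raise ValueError(f"unsupported audio format {suffix!r}; expected one of: {supported}")
    return suffix


print(audio_format_of("audio/2022Jan02_011802.flac"))
```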

YouTube -> Text

def url2text(self, urlpath, outfile=None, audioformat="flac", audiosamplingrate=16000):
    '''
    Convert a YouTube URL to text

    Parameters:
        urlpath (str): YouTube URL
        outfile (str, optional): File path/name of the output file (.csv)
        audioformat (str, optional): Audio format; must be one of the formats in self.__audioextension
        audiosamplingrate (int, optional): Audio sampling rate
    '''
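
As a sketch of how the parameters above fit together, several URLs could be converted in a loop with an explicit output file for each. The helper `convert_many` and the `video<N>.csv` naming are hypothetical. The converter is passed in as an argument, so any object exposing `url2text` with the signature shown above will work.

```python
from pathlib import Path


def convert_many(converter, urls, outdir):
    """Call converter.url2text once per URL, writing one CSV per video.

    Hypothetical helper: `converter` is expected to be a Youtube2Text
    instance, and the output file names are an arbitrary choice.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for index, url in enumerate(urls, start=1):
        outfile = outdir / f"video{index}.csv"
        converter.url2text(
            urlpath=url,
            outfile=str(outfile),
            audioformat="flac",
            audiosamplingrate=16000,
        )
        outputs.append(outfile)
    return outputs
```

With the library installed, this could be called as `convert_many(Youtube2Text(), urls, "transcripts")`.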

YouTube -> Audio

def url2audio(self, urlpath, audiofile=None, audiosamplingrate=16000):
    '''
    Convert a YouTube URL to an audio file

    Parameters:
        urlpath (str): YouTube URL
        audiofile (str, optional): File path/name to save the audio file
        audiosamplingrate (int, optional): Audio sampling rate
    '''

Audio -> Text

def audio2text(self, audiofile, textfile=None):
    '''
    Convert an audio file to a csv file

    Parameters:
        audiofile (str): File path/name of the audio file
        textfile (str, optional): File path/name of the text file (*.csv)
    '''
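
The two lower-level methods can presumably be chained to get the same result as url2text while keeping control over where the audio file is stored. The sketch below assumes only the signatures documented above; `transcribe_in_two_steps` is a hypothetical wrapper, and `converter` stands for a Youtube2Text instance.

```python
def transcribe_in_two_steps(converter, url, audiofile, textfile, audiosamplingrate=16000):
    """Download the audio first, then transcribe the saved file.

    Hypothetical wrapper around the documented url2audio and audio2text
    methods; it returns the path of the text file for convenience.
    """
    # Step 1: YouTube -> Audio
    converter.url2audio(
        urlpath=url,
        audiofile=audiofile,
        audiosamplingrate=audiosamplingrate,
    )
    # Step 2: Audio -> Text
    converter.audio2text(audiofile=audiofile, textfile=textfile)
    return textfile
```

Splitting the steps this way lets the audio file be reused, for example to transcribe it again later without downloading the video a second time.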

Download files

Source Distribution

youtube2text-0.0.9.tar.gz (29.5 kB, source)

Built Distribution

youtube2text-0.0.9-py3-none-any.whl (6.9 kB, Python 3)
