
A simple YouTube video transcriber and summarizer


# Audiutor

This project was developed for the Computer Programming course at the University of Bolzano, Ed Faculty. It automatically transcribes and summarizes an English-language YouTube video using the IBM Watson Speech to Text API.

YouTube is full of interesting lectures and educational videos, but not everyone has time to watch them. This program was developed with one goal in mind: to let students and working people retrieve the information in such videos faster. At the moment, the program supports English videos only; it also works with larger files.

Before running this program, you will need to create your own Watson Speech to Text API key and URL, which takes about two minutes. You can do so by (last updated June 2021):

  1. Registering for free here: https://cloud.ibm.com/registration

  2. Creating your Watson Speech to Text API key and URL. Just click the CREATE button (in the lower right-hand corner) on this page: https://cloud.ibm.com/catalog/services/speech-to-text
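Once you have the key and URL, one way to keep them out of your source code is to read them from environment variables. The sketch below is an assumption about how you might do this, not how Audiutor itself handles credentials: the variable names `WATSON_STT_APIKEY` and `WATSON_STT_URL` and both helper functions are hypothetical. The client construction uses `SpeechToTextV1`, `IAMAuthenticator`, and `set_service_url` from the IBM Watson SDK listed in the dependencies.

```python
import os

# Hypothetical environment variable names; Audiutor does not define them.
API_KEY_VAR = "WATSON_STT_APIKEY"
URL_VAR = "WATSON_STT_URL"


def load_watson_credentials(env=None):
    """Return (api_key, url) from the environment, or raise if one is missing."""
    env = os.environ if env is None else env
    api_key = env.get(API_KEY_VAR, "").strip()
    url = env.get(URL_VAR, "").strip()
    missing = [
        name for name, value in ((API_KEY_VAR, api_key), (URL_VAR, url))
        if not value
    ]
    if missing:
        raise ValueError("Missing Watson credentials: " + ", ".join(missing))
    return api_key, url


def make_speech_to_text_client(api_key, url):
    """Build a Speech to Text client with the IBM Watson SDK."""
    # Imported lazily so the credential check above works without the SDK.
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    client = SpeechToTextV1(authenticator=IAMAuthenticator(api_key))
    client.set_service_url(url)
    return client
```

With both variables set, `make_speech_to_text_client(*load_watson_credentials())` would return a client ready for recognition requests.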

Call the following functions:

  • audiutor() to run the program
  • transcript_wordcloud() to create a wordcloud
  • kws_extraction() to extract keywords
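To give a rough idea of what a word cloud is built from, the sketch below counts word frequencies in a transcript using only the standard library. It is an illustration, not the implementation of `transcript_wordcloud()`: the function name `word_frequencies`, the tiny stopword set, and the sample sentence are all assumptions (the package itself lists NLTK stopwords among its dependencies).

```python
import re
from collections import Counter

# A deliberately tiny stopword set for illustration only.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that", "this"}
)


def word_frequencies(transcript, top_n=10):
    """Return the top_n (word, count) pairs, ignoring stopwords and case."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(
        word for word in words if word not in STOPWORDS and len(word) > 1
    )
    return counts.most_common(top_n)


sample = "The cell is the unit of life. Cell biology studies the cell."
top_words = word_frequencies(sample, top_n=2)
```

A dictionary built from such counts is the kind of input that `WordCloud.generate_from_frequencies` in the `wordcloud` library accepts.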

For the program to function correctly, please do not change the filenames.

Used dependencies:

    from pytube import YouTube
    import os
    import moviepy.editor as mp
    from ibm_watson import SpeechToTextV1
    from ibm_watson.websocket import RecognizeCallback, AudioSource
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
    import subprocess
    import nltk
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize
    nltk.download('stopwords')
    nltk.download('punkt')
    from wordcloud import WordCloud
    from string import punctuation
    import re
    import string
    from wordcloud import WordCloud
    import matplotlib.pyplot as plt
    from gensim.summarization import summarize
    from random import randrange
    from time import sleep
    import pprint
    import spacy
    import textacy.ke
    from textacy import *
    from docx import Document
    import sys
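Because several of these imports come from third-party packages, it can help to check which ones are installed before running the program. The sketch below is an assumption about how you might do that, not part of Audiutor: the function name `missing_packages` is hypothetical, and the mapping pairs each import name with the distribution name you would likely pass to pip. Note that `gensim.summarization` was removed in gensim 4.0, so the `summarize` import above needs an older gensim release.

```python
from importlib.util import find_spec

# Import name -> PyPI distribution name (the two differ for some packages).
REQUIRED = {
    "pytube": "pytube",
    "moviepy": "moviepy",
    "ibm_watson": "ibm-watson",
    "ibm_cloud_sdk_core": "ibm-cloud-sdk-core",
    "nltk": "nltk",
    "wordcloud": "wordcloud",
    "matplotlib": "matplotlib",
    "gensim": "gensim",
    "spacy": "spacy",
    "textacy": "textacy",
    "docx": "python-docx",
}


def missing_packages(required=REQUIRED):
    """Return the sorted pip names of required modules that cannot be found."""
    return sorted(
        pip_name
        for module, pip_name in required.items()
        if find_spec(module) is None  # None means the module is not installed
    )
```

Calling `missing_packages()` returns a list of names to install; an empty list means every top-level package was found.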



File details

Details for the file Audiutor-PatBe-0.0.5.tar.gz.

File metadata

  • Download URL: Audiutor-PatBe-0.0.5.tar.gz
  • Upload date:
  • Size: 3.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.10

File hashes

Hashes for Audiutor-PatBe-0.0.5.tar.gz:

  • SHA256: 80a9a251ba7dc3309f832bbeaf9887b6ea72a4181bda9bc902d3597334dc1902
  • MD5: ddfbd306597a3a46421ff9479afab368
  • BLAKE2b-256: 22bd81c797ad46cf0610cfba516fc597446e169dd2a526ae581ba47c508aba01


File details

Details for the file Audiutor_PatBe-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: Audiutor_PatBe-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 3.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.10

File hashes

Hashes for Audiutor_PatBe-0.0.5-py3-none-any.whl:

  • SHA256: d681742cb557aebb9363ea3911f7df8058ca1a7d449777ea1c8d9ac94aaa1fb9
  • MD5: 0c495066c2f1915ad736da0da7741ce1
  • BLAKE2b-256: 51873ac9360ba7aee880ebe63d75fa22006652967346c4223aa3f223433cd003

