A simple YouTube video transcriber and summarizer
# Audiutor
This project was developed for the Computer Programming course at the University of Bolzano, Ed Faculty. It automatically transcribes and summarizes an English-language YouTube video using the IBM Watson Speech to Text API.
YouTube is full of interesting lectures and educational videos, but not everyone has time to watch them. This small program was developed with one goal in mind: to help students and working people retrieve the information contained in such videos faster. At the moment, the program supports English videos only; it also works with larger files.
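As a rough illustration of the transcription step: the Watson Speech to Text service returns JSON in which each entry of `results` holds a list of `alternatives`, each with a `transcript` string. The sketch below joins the top alternative of every final result into one text. The function name `join_transcript` and the sample response are hypothetical and are not taken from Audiutor's source.

```python
def join_transcript(response: dict) -> str:
    """Concatenate the top alternative of each final result into one string."""
    pieces = []
    for result in response.get("results", []):
        # Interim results carry final=False; keep only finished segments.
        if not result.get("final", True):
            continue
        alternatives = result.get("alternatives", [])
        if alternatives:
            pieces.append(alternatives[0]["transcript"].strip())
    return " ".join(pieces)


# Hand-written sample shaped like a Speech to Text response (not real output).
sample = {
    "result_index": 0,
    "results": [
        {"final": True,
         "alternatives": [{"transcript": "welcome to the lecture ", "confidence": 0.93}]},
        {"final": True,
         "alternatives": [{"transcript": "today we talk about python ", "confidence": 0.90}]},
    ],
}

print(join_transcript(sample))
```

The joined text is what a summarizer or keyword extractor would then take as input.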
Before running this program you will need to create your own Watson Speech to Text API key and URL, which takes about two minutes. You can do so as follows (last updated June 2021):
- Register for free here: https://cloud.ibm.com/registration
- Create your Watson Speech to Text API key and URL. Just hit the CREATE button (in the lower right-hand corner) on this page: https://cloud.ibm.com/catalog/services/speech-to-text
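Once you have the key and URL, one way to keep them out of your code is to read them from environment variables. The sketch below is a minimal example under assumptions: the variable names `WATSON_STT_APIKEY` and `WATSON_STT_URL` and both helper functions are hypothetical and not part of Audiutor, which may ask for the credentials differently. The client is built with the `SpeechToTextV1` and `IAMAuthenticator` classes that the project lists among its dependencies.

```python
import os


def load_watson_credentials(env=None):
    """Return (api_key, url) from an environment-style mapping, or raise ValueError."""
    env = os.environ if env is None else env
    # Hypothetical variable names; choose whatever suits your setup.
    api_key = env.get("WATSON_STT_APIKEY", "").strip()
    url = env.get("WATSON_STT_URL", "").strip()
    if not api_key or not url:
        raise ValueError("Set WATSON_STT_APIKEY and WATSON_STT_URL first.")
    if not url.startswith("https://"):
        raise ValueError("The service URL should start with https://")
    return api_key, url


def make_speech_to_text_client(api_key, url):
    """Build a Speech to Text client; the SDK is imported only when this is called."""
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    client = SpeechToTextV1(authenticator=IAMAuthenticator(api_key))
    client.set_service_url(url)
    return client
```

Validating the values up front gives a clear error message before any audio is downloaded or uploaded.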
Call one of the following functions:
- audiutor() to run the program
- transcript_wordcloud() to create a wordcloud
- kws_extraction() to extract keywords
In order for the program to function correctly, please do not change the filenames.
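To give an idea of what a word cloud is built from, the sketch below counts word frequencies in a transcript after dropping common words. It is not the code behind `transcript_wordcloud()`: the function `word_frequencies` and the tiny `STOPWORDS` set are hypothetical stand-ins, whereas the project itself imports NLTK's English stopword list.

```python
import re
from collections import Counter

# Tiny illustrative stopword set; NLTK's English list is far more complete.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "it", "we"}


def word_frequencies(transcript, top_n=5):
    """Return the top_n (word, count) pairs, ignoring stopwords and punctuation."""
    # Lowercase the text and keep only runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 1)
    return counts.most_common(top_n)


text = "The model learns. The model is trained, and the data is cleaned."
print(word_frequencies(text, top_n=1))
```

A dictionary built from such pairs, for example `dict(word_frequencies(text))`, can be passed to `WordCloud().generate_from_frequencies(...)` from the `wordcloud` package to render an image.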
Used dependencies:
    from pytube import YouTube
    import os
    import moviepy.editor as mp
    from ibm_watson import SpeechToTextV1
    from ibm_watson.websocket import RecognizeCallback, AudioSource
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
    import subprocess
    import nltk
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize
    nltk.download('stopwords')
    nltk.download('punkt')
    from wordcloud import WordCloud
    from string import punctuation
    import re
    import string
    import matplotlib.pyplot as plt
    from gensim.summarization import summarize
    from random import randrange
    from time import sleep
    import pprint
    import spacy
    import textacy.ke
    from textacy import *
    from docx import Document
    import sys
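Since the program needs many third-party packages, it can help to check which ones are importable before running it. The sketch below is a suggestion rather than part of Audiutor; `REQUIRED_MODULES` and `find_missing` are hypothetical names, and the module list is derived from the imports above. Two points to keep in mind: the `docx` module is provided by the `python-docx` distribution, and `gensim.summarization` was removed in gensim 4.0, so an older 3.x release of gensim is presumably required.

```python
from importlib.util import find_spec

# Top-level module names taken from the import list above.
REQUIRED_MODULES = [
    "pytube", "moviepy", "ibm_watson", "ibm_cloud_sdk_core", "nltk",
    "wordcloud", "matplotlib", "gensim", "spacy", "textacy", "docx",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

Note that `find_spec` only checks that a module can be located; it does not verify versions, so the gensim caveat above still has to be checked by hand.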
File details
Details for the file Audiutor-PatBe-0.0.5.tar.gz.
File metadata
- Download URL: Audiutor-PatBe-0.0.5.tar.gz
- Upload date:
- Size: 3.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 80a9a251ba7dc3309f832bbeaf9887b6ea72a4181bda9bc902d3597334dc1902
MD5 | ddfbd306597a3a46421ff9479afab368
BLAKE2b-256 | 22bd81c797ad46cf0610cfba516fc597446e169dd2a526ae581ba47c508aba01
File details
Details for the file Audiutor_PatBe-0.0.5-py3-none-any.whl.
File metadata
- Download URL: Audiutor_PatBe-0.0.5-py3-none-any.whl
- Upload date:
- Size: 3.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | d681742cb557aebb9363ea3911f7df8058ca1a7d449777ea1c8d9ac94aaa1fb9
MD5 | 0c495066c2f1915ad736da0da7741ce1
BLAKE2b-256 | 51873ac9360ba7aee880ebe63d75fa22006652967346c4223aa3f223433cd003