Podcast Transcript Summarisation

T5 Podcast Summariser

This project demonstrates the capabilities of Google's T5 model, fine-tuned on Spotify's Podcasts Dataset, for automatic podcast summarisation. Pass a podcast's transcript to the summariser, and it outputs a summary capturing the gist of the podcast.

T5 and Spotify Podcast Dataset

Usage

This package relies on the Hugging Face Transformers library, so you need to install that as well:

pip install transformers
pip install t5-podcast-summariser

You can now use the package as follows:

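from t5_podcast_summariser import Summariser

# Create the summariser
summariser = Summariser()

# Replace the placeholder with the full text of the podcast transcript
transcript = """
Full Transcript of the podcast.....
"""

# Generate the summary and print it
summary = summariser.summarise(transcript)
print(summary)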
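
Podcast transcripts are often much longer than the input a T5 model is usually run with (512 tokens in the original T5 setup), and this page does not say whether the package truncates or splits long inputs internally. As a hedged sketch, assuming `summarise` accepts a string and returns a string, one way to handle very long transcripts is to split them into word-based chunks and summarise each chunk in turn. The helper names `split_into_chunks` and `summarise_long_transcript`, and the default of 350 words per chunk, are illustrative assumptions and not part of the package.

```python
from typing import Callable, List


def split_into_chunks(transcript: str, max_words: int = 350) -> List[str]:
    """Split a transcript into consecutive chunks of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = transcript.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


def summarise_long_transcript(
    transcript: str,
    summarise_fn: Callable[[str], str],
    max_words: int = 350,
) -> str:
    """Summarise each chunk with summarise_fn and join the partial summaries.

    summarise_fn is any callable mapping text to a summary string; passing
    summariser.summarise here is an assumption about the package's behaviour.
    """
    chunks = split_into_chunks(transcript, max_words)
    return " ".join(summarise_fn(chunk).strip() for chunk in chunks)
```

With the objects from the usage example, this could be called as `summarise_long_transcript(transcript, summariser.summarise)`. Splitting on words is only a rough stand-in for token counts, so the chunk size may need tuning.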

Below is a sample summary generated from a podcast transcript:

This week on the podcast, we talk to Ayodeji Ogunnia (@Ayodeji_Ogunnia) about his life as a bricklayer in Baltimore. We also hear from one of the most famous people in the world, and how he came to be a successful bricklayer. This is a great story for anyone who wants to learn more about what it means to be a good bricklayer. If you like what we do, please leave us a review on Apple Podcasts! Thanks for listening!

Future Work

In the next update, I plan to add an option to extract the top-n sentences from the transcript before the summary is generated.
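
The extraction method for this planned step is not specified here. Purely as an illustration of what "top-n sentences" could mean, the sketch below scores each sentence by the average transcript-wide frequency of its words and keeps the n best in their original order. The function names, the naive sentence splitter, and the scoring rule are all assumptions, not the package's implementation.

```python
import re
from collections import Counter
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def top_n_sentences(text: str, n: int = 5) -> List[str]:
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole transcript (lowercased).
    freqs = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(sentences[i]),
        reverse=True,
    )
    keep = sorted(ranked[:n])  # restore original sentence order
    return [sentences[i] for i in keep]
```

The selected sentences could then be joined with spaces and passed to the summariser in place of the full transcript. A real implementation would likely filter stop words and use a more robust sentence splitter.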
