
An Open Source alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI

Project description

Podcastfy.ai 🎙️🤖

An Open Source API alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI

https://github.com/user-attachments/assets/5d42c106-aabe-44c1-8498-e9c53545ba40

Paper | Python Package | CLI | REST API | Web App | Feedback


Podcastfy is an open-source Python package that transforms multimodal content (text, images) into engaging, multilingual audio conversations using GenAI. Input content can include websites, PDFs, images, YouTube videos, and user-provided topics.

Unlike closed-source UI-based tools focused primarily on research synthesis (e.g. NotebookLM ❤️), Podcastfy focuses on open source, programmatic and bespoke generation of engaging, conversational content from a multitude of multi-modal sources, enabling customization and scale.


Audio Examples 🔊

This sample collection was generated using this Python Notebook.

Images

Sample 1: Senecio, 1922 (Paul Klee) and Connection of Civilizations (2017) by Gheorghe Virtosu




Sample 2: The Great Wave off Kanagawa, 1831 (Hokusai) and Takiyasha the Witch and the Skeleton Spectre, c. 1844 (Kuniyoshi)




Sample 3: Pop culture icon Taylor Swift and Mona Lisa, 1503 (Leonardo da Vinci)



Text

| Audio | Description | Source |
|---|---|---|
| | Person Website | Website |
| Audio (longform=True) | Lex Fridman Podcast: Dario Amodei Anthropic's CEO | Youtube |
| Audio (longform=True) | Benjamin Franklin's Autobiography | Book |

Multi-Lingual Text

| Language | Content Type | Description | Audio | Source |
|---|---|---|---|---|
| French | Website | Agroclimate research information | Audio | Website |
| Portuguese-BR | News Article | Election polls in São Paulo | Audio | Website |

Features ✨

  • Generate conversational content from multiple sources and formats (images, text, websites, YouTube, and PDFs).
  • Generate short (2-5 minute) or longform (30+ minute) podcasts.
  • Customize transcript and audio generation (e.g., style, language, structure).
  • Generate transcripts using 100+ LLMs (OpenAI, Anthropic, Google, etc.).
  • Use local LLMs for transcript generation to increase privacy and control.
  • Integrate with advanced text-to-speech models (OpenAI, Google, ElevenLabs, and Microsoft Edge).
  • Provide multi-language support for global content creation.
  • Integrate seamlessly with CLI and Python packages for automated workflows.

Built with Podcastfy 🚀

Updates 🚀🚀

v0.4.0+ release

  • Released a new multi-speaker TTS model (is it the one NotebookLM uses?!?)
  • Generate short or longform podcasts
  • Generate podcasts from an input topic using grounded real-time web search
  • Integrate with 100+ LLMs (OpenAI, Anthropic, Google, etc.) for transcript generation

See CHANGELOG for more details.

Quickstart 💻

Prerequisites

  • Python 3.11 or higher
  • ffmpeg (for audio processing), installed as a system binary and available on your PATH
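
A quick way to confirm both prerequisites before installing is sketched below. It assumes ffmpeg is expected as an executable on your PATH; the helper names `python_ok` and `ffmpeg_path` are illustrative and not part of Podcastfy.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version listed in the prerequisites


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= minimum


def ffmpeg_path():
    """Return the path of the ffmpeg executable on PATH, or None if absent."""
    return shutil.which("ffmpeg")


print("Python 3.11 or higher:", python_ok())
print("ffmpeg executable:", ffmpeg_path())
```

If `ffmpeg_path()` returns `None`, the executable is not visible to the current shell.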

Setup

  1. Install from PyPI: $ pip install podcastfy

  2. Set up your API keys
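
The sketch below reports which API keys are missing from the environment before you run anything. The variable names in `ASSUMED_KEYS` are assumptions for illustration only; check the Podcastfy configuration documentation for the names your chosen LLM and TTS providers actually require.

```python
import os

# Assumed variable names; confirm them against the Podcastfy configuration docs.
ASSUMED_KEYS = ("GEMINI_API_KEY", "OPENAI_API_KEY", "ELEVENLABS_API_KEY")


def missing_keys(names=ASSUMED_KEYS, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


print("Missing keys:", missing_keys())
```

You likely need only the keys for the providers you use, so a non-empty result is not necessarily an error.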

Python

from podcastfy.client import generate_podcast

audio_file = generate_podcast(urls=["<url1>", "<url2>"])
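
For scripted use, it can help to validate inputs before calling the library. The following is a minimal sketch, not Podcastfy's own code: `podcast_kwargs` and `make_podcast` are hypothetical helpers, and `longform` is assumed to be the keyword that the `longform=True` samples above refer to.

```python
def podcast_kwargs(urls, longform=False):
    """Validate inputs and build keyword arguments for generate_podcast."""
    cleaned = [u.strip() for u in urls if u and u.strip()]
    if not cleaned:
        raise ValueError("at least one URL is required")
    kwargs = {"urls": cleaned}
    if longform:
        # Assumption: generate_podcast accepts a longform flag.
        kwargs["longform"] = True
    return kwargs


def make_podcast(urls, longform=False):
    """Call Podcastfy; imported lazily so this module loads without it."""
    from podcastfy.client import generate_podcast

    return generate_podcast(**podcast_kwargs(urls, longform=longform))
```

Calling `make_podcast(["<url1>", "<url2>"])` then mirrors the quickstart call, with empty or whitespace-only entries filtered out first.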

CLI

python -m podcastfy.client --url <url1> --url <url2>
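
When driving the CLI from another script, the command above can be assembled programmatically. This sketch uses only the `--url` flag shown in the command; `cli_command` is an illustrative helper, not part of Podcastfy.

```python
import sys


def cli_command(urls, python=sys.executable):
    """Build the argument list for the CLI, repeating --url once per source."""
    cmd = [python, "-m", "podcastfy.client"]
    for url in urls:
        cmd += ["--url", url]
    return cmd


print(cli_command(["<url1>", "<url2>"]))
```

Passing the resulting list to `subprocess.run(..., check=True)` avoids shell quoting issues with URLs that contain `&` or `?`.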

Usage 💻

Experience Podcastfy with our HuggingFace 🤗 Spaces app. (Note: This UI app is less extensively tested than the Python package.)

Customization 🔧

Podcastfy offers a range of customization options to tailor your AI-generated podcasts.

License

This software is licensed under Apache 2.0. See instructions if you would like to use podcastfy in your software.

Contributing 🤝

We welcome contributions! See Guidelines for more details.

Example Use Cases 🎧🎶

  • Content Creators can use Podcastfy to convert blog posts, articles, or multimedia content into podcast-style audio, enabling them to reach broader audiences. By transforming content into an audio format, creators can cater to users who prefer listening over reading.

  • Educators can transform lecture notes, presentations, and visual materials into audio conversations, making educational content more accessible to students with different learning preferences. This is particularly beneficial for students with visual impairments or those who have difficulty processing written information.

  • Researchers can convert research papers, visual data, and technical content into conversational audio. This makes it easier for a wider audience, including those with disabilities, to consume and understand complex scientific information. Researchers can also create audio summaries of their work to enhance accessibility.

  • Accessibility Advocates can use Podcastfy to promote digital accessibility by providing a tool that converts multimodal content into auditory formats. This helps individuals with visual impairments, dyslexia, or other disabilities that make it challenging to consume written or visual content.



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

podcastfy-0.4.1.tar.gz (932.9 kB view details)

Uploaded Source

Built Distribution

podcastfy-0.4.1-py3-none-any.whl (937.9 kB view details)

Uploaded Python 3

File details

Details for the file podcastfy-0.4.1.tar.gz.

File metadata

  • Download URL: podcastfy-0.4.1.tar.gz
  • Upload date:
  • Size: 932.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.11.10 Linux/6.8.0-48-generic

File hashes

Hashes for podcastfy-0.4.1.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1800be6b19026cf970e209ad9e7cbd3c2599d37b82cc6fbf3b6ef41e3f85a169 |
| MD5 | 5e404ab3f8338cb7393ea954b9a2e178 |
| BLAKE2b-256 | 8f7ff4bf76dc4db77bafd427a7d3ac69c6541ca044f9916118169b16ef0acff7 |

See more details on using hashes here.
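
As a minimal sketch of how a published digest can be checked locally, the helpers below compute a file's SHA256 with the standard library and compare it to an expected value. The function names are illustrative, and the file path is whatever you downloaded.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, `matches("podcastfy-0.4.1.tar.gz", "<SHA256 from the listing>")` should return `True` for an intact download.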

File details

Details for the file podcastfy-0.4.1-py3-none-any.whl.

File metadata

  • Download URL: podcastfy-0.4.1-py3-none-any.whl
  • Upload date:
  • Size: 937.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.11.10 Linux/6.8.0-48-generic

File hashes

Hashes for podcastfy-0.4.1-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a4b5544a2b4dbd871ac06fbb51567c5828aa7805c1a45f13ae5bad2e3cfd9f2 |
| MD5 | 4f5fa31cfbeed190b69a0c1263da9429 |
| BLAKE2b-256 | af55c66aab5fa7dbc56f036de26d6d85235b819c161403164b7f60779cc4f328 |

