Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI

Podcastfy.ai 🎙️🤖

https://github.com/user-attachments/assets/f1559e70-9cf9-4576-b48b-87e7dad1dd0b

Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Supported inputs include websites, PDFs, YouTube videos, and images.

Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM ❤️), Podcastfy focuses on the programmatic, bespoke generation of engaging conversational transcripts and audio from a multitude of multi-modal sources, enabling customization and scale.

Audio Examples 🔊

This sample collection is also available at audio.com.

Images

Description | Audio
Senecio, 1922 (Paul Klee) and Connection of Civilizations (2017) by Gheorghe Virtosu | 🔊
The Great Wave off Kanagawa, 1831 (Hokusai) and Takiyasha the Witch and the Skeleton Spectre, c. 1844 (Kuniyoshi) | 🔊
Pop culture icon Taylor Swift and Mona Lisa, 1503 (Leonardo da Vinci) | 🔊

Text

Content Type | Description | Audio | Source
YouTube Video | YCombinator on LLMs | Audio | YouTube
PDF | Book: Networks, Crowds, and Markets | Audio | book pdf
Research Paper | Climate Change in France | Audio | PDF
Website | My Personal Website | Audio | Website
Website + YouTube | My Personal Website + YouTube Video on AI | Audio | Website, YouTube

Multi-Lingual Text

Language | Content Type | Description | Audio | Source
French | Website | Agroclimate research information | Audio | Website
Portuguese-BR | News Article | Election polls in São Paulo | Audio | Website

Features ✨

  • Generate AI-powered conversational content from multiple sources and formats (images, websites, YouTube, and PDFs)
  • Customizable transcript and audio generation (e.g. style, language, structure, length)
  • Create podcasts from pre-existing or edited transcripts
  • Support for advanced text-to-speech models (OpenAI and ElevenLabs)
  • Seamless CLI and Python package integration for automated workflows
  • Multi-language support for global content creation (experimental!)

Updates 🚀

v0.2.1 release

  • Podcastfy is now multi-modal! Users can now generate audio from images.

v0.2.0 release

  • Users can now customize podcast style, structure, and content
  • Integration with LangChain for better LLM management
  • and more...

Quickstart 💻

Prerequisites

  • Python 3.11 or higher
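  • ffmpeg (for audio processing). The ffmpeg executable must be available on your PATH; it is typically installed with your system package manager rather than with pip.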
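
Before installing, it can help to confirm both prerequisites from Python. The following is a minimal sketch, not part of Podcastfy: the helper name check_prerequisites is hypothetical, and it only assumes that the ffmpeg executable should be discoverable on PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version listed in the prerequisites


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or higher is required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg executable not found on PATH")
    return problems


for problem in check_prerequisites():
    print("Prerequisite missing:", problem)
```

The version and lookup function are parameters only so the check can be exercised without changing the local machine.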

Installation

  1. Install from PyPI: $ pip install podcastfy

  2. Set up your API keys
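
A quick way to catch a missing key before a long generation run is to check the environment up front. This is a sketch under assumptions: the variable names below are illustrative guesses, not confirmed by this page, and missing_api_keys is a hypothetical helper. Use the names given in the project's API key setup instructions.

```python
import os

# Assumed variable names, for illustration only; replace them with the
# names from the project's API key setup instructions.
ASSUMED_KEY_NAMES = ("GEMINI_API_KEY", "OPENAI_API_KEY")


def missing_api_keys(names=ASSUMED_KEY_NAMES, env=None):
    """Return the names that are unset or blank in the environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


missing = missing_api_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
```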

Python

from podcastfy.client import generate_podcast

audio_file = generate_podcast(urls=["<url1>", "<url2>"])
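
In automated workflows it may be worth validating the URL list before calling generate_podcast, so that a typo fails fast instead of partway through generation. The sketch below is one possible approach: clean_urls and make_podcast are hypothetical helpers, and only the generate_podcast(urls=...) call comes from the example above.

```python
from urllib.parse import urlparse


def clean_urls(urls):
    """Strip whitespace, drop duplicates (keeping order), reject non-HTTP(S) URLs."""
    seen, cleaned = set(), []
    for raw in urls:
        url = raw.strip()
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an http(s) URL: {raw!r}")
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


def make_podcast(urls):
    """Validate the inputs, then hand them to Podcastfy."""
    # Imported lazily so clean_urls works even without the package installed.
    from podcastfy.client import generate_podcast

    return generate_podcast(urls=clean_urls(urls))
```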

CLI

python -m podcastfy.client --url <url1> --url <url2>
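
The same CLI call can be driven from a script, for example inside a scheduled job. This is a minimal sketch: build_command and run_cli are hypothetical helpers, and the only CLI flag used is the --url option shown above.

```python
import subprocess
import sys


def build_command(urls):
    """Build the argument list for `python -m podcastfy.client --url ...`."""
    command = [sys.executable, "-m", "podcastfy.client"]
    for url in urls:
        command += ["--url", url]  # one --url flag per source
    return command


def run_cli(urls):
    """Run the CLI; raises subprocess.CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(urls), check=True)
```

Using sys.executable keeps the subprocess in the same interpreter (and virtual environment) as the calling script, and passing a list avoids shell quoting issues with URLs.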

Usage 💻

Experience Podcastfy with our HuggingFace 🤗 Spaces app for a simple URL-to-Audio demo. (Note: This UI app is less extensively tested than the Python package.)

Customization 🔧

Podcastfy offers a range of Conversation Customization options to tailor your AI-generated podcasts. Whether you're creating educational content, storytelling experiences, or anything in between, these configuration options allow you to fine-tune your podcast's tone, length, and format.
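
As an illustration of how such options might be managed in code, the sketch below keeps a base configuration and applies per-episode overrides. Everything specific here is an assumption: the key names, the conversation_config keyword argument, and both helper functions are illustrative and should be checked against the Conversation Customization documentation for your installed version.

```python
# Key names are assumptions for illustration; consult the Conversation
# Customization documentation for the options your version supports.
BASE_CONFIG = {
    "conversation_style": ["engaging", "educational"],
    "word_count": 800,
    "output_language": "English",
}


def build_conversation_config(**overrides):
    """Return a copy of BASE_CONFIG with overrides applied; reject unknown keys."""
    unknown = sorted(set(overrides) - set(BASE_CONFIG))
    if unknown:
        raise KeyError(f"unknown option(s): {', '.join(unknown)}")
    return {**BASE_CONFIG, **overrides}


def make_custom_podcast(urls, **overrides):
    """Generate a podcast with a customized conversation (assumed API)."""
    from podcastfy.client import generate_podcast

    # `conversation_config` is an assumed keyword argument, not confirmed here.
    return generate_podcast(
        urls=urls, conversation_config=build_conversation_config(**overrides)
    )
```

Rejecting unknown keys up front turns a misspelled option into an immediate error rather than a silently ignored setting.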

Contributing 🤝

We welcome contributions! Please submit Issues or Pull Requests. Feel free to fork the repo and create your own applications. We're excited to learn about your use cases!

Example Use Cases 🎧🎶

  1. Content Summarization: Busy professionals can stay informed on industry trends by listening to concise audio summaries of multiple articles, saving time and gaining knowledge efficiently.

  2. Language Localization: Non-native English speakers can access English content in their preferred language, breaking down language barriers and expanding access to global information.

  3. Website Content Marketing: Companies can increase engagement by repurposing written website content into audio format, providing visitors with the option to read or listen.

  4. Personal Branding: Job seekers can create unique audio-based personal presentations from their CV or LinkedIn profile, making a memorable impression on potential employers.

  5. Research Paper Summaries: Graduate students and researchers can quickly review multiple academic papers by listening to concise audio summaries, speeding up the research process.

  6. Long-form Podcast Summarization: Podcast enthusiasts with limited time can stay updated on their favorite shows by listening to condensed versions of lengthy episodes.

  7. News Briefings: Commuters can stay informed about daily news during travel time with personalized audio news briefings compiled from their preferred sources.

  8. Educational Content Creation: Educators can enhance learning accessibility by providing audio versions of course materials, catering to students with different learning preferences.

  9. Book Summaries: Avid readers can preview books efficiently through audio summaries, helping them make informed decisions about which books to read in full.

  10. Conference and Event Recaps: Professionals can stay updated on important industry events they couldn't attend by listening to audio recaps of conference highlights and key takeaways.

License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Disclaimer

This tool is designed for personal or educational use. Please ensure you have the necessary rights or permissions before using content from external sources for podcast creation. All audio content is AI-generated and is not intended to clone real-life humans!
