An Open Source alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI

These details have not been verified by PyPI

Project description

Podcastfy.ai 🎙️🤖


https://github.com/user-attachments/assets/f1559e70-9cf9-4576-b48b-87e7dad1dd0b

Podcastfy is an open-source Python package that transforms multimodal content (text, images) into engaging, multilingual audio conversations using GenAI. Supported inputs include websites, PDFs, YouTube videos, and images.

Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM ❤️), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources, enabling customization and scale.

Podcastfy is available as a Python package, CLI, REST API and Web App.

Audio Examples 🔊

This sample collection is also available at audio.com.

Images

Image Set	Description	Audio
	Senecio, 1922 (Paul Klee) and Connection of Civilizations (2017) by Gheorghe Virtosu	🔊
	The Great Wave off Kanagawa, 1831 (Hokusai) and Takiyasha the Witch and the Skeleton Spectre, c. 1844 (Kuniyoshi)	🔊
	Pop culture icon Taylor Swift and Mona Lisa, 1503 (Leonardo da Vinci)	🔊

Text

Content Type	Description	Audio	Source
YouTube Video	YCombinator on LLMs	Audio	YouTube
PDF	Book: Networks, Crowds, and Markets	Audio	book pdf
Research Paper	Climate Change in France	Audio	PDF
Website	My Personal Website	Audio	Website
Website + YouTube	My Personal Website + YouTube Video on AI	Audio	Website, YouTube

Multi-Lingual Text

Language	Content Type	Description	Audio	Source
French	Website	Agroclimate research information	Audio	Website
Portuguese-BR	News Article	Election polls in São Paulo	Audio	Website

Features ✨

Generate conversational content from multiple sources and formats (images, websites, YouTube, and PDFs)
Customize transcript and audio generation (e.g. style, language, structure, length)
Create podcasts from pre-existing or edited transcripts
Support for advanced text-to-speech models (OpenAI, ElevenLabs and Edge)
Support for running local LLMs for transcript generation (increased privacy and control)
Seamless CLI and Python package integration for automated workflows
Multi-language support for global content creation (experimental!)

Updates 🚀

v0.2.3 release

Add support for running LLMs locally
Enable config for running Podcastfy with no API keys
and more...

v0.2.2 release

Podcastfy is now multi-modal! Users can generate audio from images + text inputs!

v0.2.0 release

Users can now customize podcast style, structure, and content
Integration with LangChain for better LLM management

Quickstart 💻

Prerequisites

Python 3.11 or higher
$ pip install ffmpeg (for audio processing)
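
The prerequisites above can be checked from a script before installing. The following is a minimal sketch and not part of Podcastfy: the helper names `python_ok` and `ffmpeg_on_path` are hypothetical, and it assumes that audio processing relies on an `ffmpeg` executable being discoverable on the PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the prerequisites


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def ffmpeg_on_path():
    """Return True if an `ffmpeg` executable is found on PATH.

    Assumption: audio processing needs the ffmpeg executable itself.
    """
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("ffmpeg found on PATH:", ffmpeg_on_path())
```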

Setup

Install from PyPI: $ pip install podcastfy
Set up your API keys
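
The setup steps above do not list which keys are needed; that depends on the LLM and text-to-speech providers you choose. Below is a minimal sketch for failing early when keys are missing, assuming they are supplied as environment variables. The variable names (`GEMINI_API_KEY`, `OPENAI_API_KEY`) and the helper `missing_api_keys` are assumptions for illustration, so check the project's setup documentation for the names your providers actually require.

```python
import os

# Assumed variable names, for illustration only; the names your chosen
# providers require may differ.
ASSUMED_KEY_NAMES = ("GEMINI_API_KEY", "OPENAI_API_KEY")


def missing_api_keys(names=ASSUMED_KEY_NAMES, env=None):
    """Return the names that are unset or blank in the environment mapping."""
    if env is None:
        env = os.environ
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All expected API keys are set.")
```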

Python

from podcastfy.client import generate_podcast

audio_file = generate_podcast(urls=["<url1>", "<url2>"])
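
Because the placeholders `<url1>` and `<url2>` must be replaced with real addresses, it can help to validate inputs before starting a generation run. The sketch below wraps the call shown above; `is_http_url` and `podcast_from_urls` are hypothetical helper names, and the only Podcastfy call used is `generate_podcast(urls=...)` exactly as in the quickstart.

```python
from urllib.parse import urlparse


def is_http_url(value):
    """Return True if value looks like an absolute http(s) URL."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def podcast_from_urls(urls):
    """Validate the URLs, then call generate_podcast as in the quickstart."""
    bad = [u for u in urls if not is_http_url(u)]
    if bad:
        raise ValueError(f"Not valid http(s) URLs: {bad}")
    # Imported lazily so the validation helper works without podcastfy installed.
    from podcastfy.client import generate_podcast

    return generate_podcast(urls=list(urls))
```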

CLI

python -m podcastfy.client --url <url1> --url <url2>
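
For automated workflows, the same command line can be assembled and launched from Python. This is a sketch under the assumption that only the repeated `--url` flag shown above is needed; `build_cli_command` and `run_cli` are hypothetical helper names.

```python
import subprocess
import sys


def build_cli_command(urls, python=sys.executable):
    """Build the argument list for `python -m podcastfy.client --url ...`."""
    cmd = [python, "-m", "podcastfy.client"]
    for url in urls:
        cmd += ["--url", url]  # one --url flag per source, as in the CLI example
    return cmd


def run_cli(urls):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_cli_command(urls), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with URLs that contain characters such as `&` or `?`.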

Usage 💻

Experience Podcastfy with our HuggingFace 🤗 Spaces app. (Note: This UI app is less extensively tested than the Python package.)

Customization 🔧

Podcastfy offers a range of customization options to tailor your AI-generated podcasts:

Customize podcast conversation (e.g. format, style, voices)
Choose to run Local LLMs (156+ HuggingFace models)
Set System Settings (e.g. output directory settings)
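
Customization of this kind is commonly expressed as a small set of overrides applied on top of defaults. The sketch below illustrates that pattern only: the option names (`style`, `language`, `length_words`), their default values, and the helper `merge_config` are hypothetical and are not Podcastfy's actual configuration schema.

```python
# Hypothetical defaults, for illustration only; not Podcastfy's real schema.
DEFAULT_CONVERSATION = {
    "style": ["engaging"],
    "language": "English",
    "length_words": 2000,
}


def merge_config(overrides, defaults=DEFAULT_CONVERSATION):
    """Return a new dict with overrides applied; reject unknown option names."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown options: {sorted(unknown)}")
    merged = dict(defaults)  # copy so the defaults are never mutated
    merged.update(overrides)
    return merged
```

Rejecting unknown names catches typos early instead of silently ignoring a misspelled option.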

License

This software is licensed under Apache 2.0. Here are a few instructions if you would like to use podcastfy in your software.

Contributing 🤝

We welcome contributions! See Guidelines for more details.

Example Use Cases 🎧🎶

Content Summarization: Busy professionals can stay informed on industry trends by listening to concise audio summaries of multiple articles, saving time and gaining knowledge efficiently.
Language Localization: Non-native English speakers can access English content in their preferred language, breaking down language barriers and expanding access to global information.
Website Content Marketing: Companies can increase engagement by repurposing written website content into audio format, providing visitors with the option to read or listen.
Personal Branding: Job seekers can create unique audio-based personal presentations from their CV or LinkedIn profile, making a memorable impression on potential employers.
Research Paper Summaries: Graduate students and researchers can quickly review multiple academic papers by listening to concise audio summaries, speeding up the research process.
Long-form Podcast Summarization: Podcast enthusiasts with limited time can stay updated on their favorite shows by listening to condensed versions of lengthy episodes.
News Briefings: Commuters can stay informed about daily news during travel time with personalized audio news briefings compiled from their preferred sources.
Educational Content Creation: Educators can enhance learning accessibility by providing audio versions of course materials, catering to students with different learning preferences.
Book Summaries: Avid readers can preview books efficiently through audio summaries, helping them make informed decisions about which books to read in full.
Conference and Event Recaps: Professionals can stay updated on important industry events they couldn't attend by listening to audio recaps of conference highlights and key takeaways.

Disclaimer

This tool is designed for personal or educational use. Please ensure you have the necessary rights or permissions before using content from external sources for podcast creation. All audio content is AI-generated and it is not intended to clone real-life humans!

Project details

Release history

Version	Release date
0.4.1	Nov 16, 2024
0.4.0	Nov 16, 2024
0.3.6	Nov 13, 2024
0.3.5	Nov 8, 2024
0.3.4	Nov 8, 2024
0.3.3	Nov 8, 2024
0.3.2	Nov 7, 2024
0.3.1	Nov 7, 2024
0.3.0	Nov 6, 2024
0.2.19	Nov 6, 2024
0.2.18	Oct 31, 2024
0.2.17 (this version)	Oct 31, 2024
0.2.16	Oct 31, 2024
0.2.15	Oct 27, 2024
0.2.14	Oct 27, 2024
0.2.13	Oct 27, 2024
0.2.12	Oct 27, 2024
0.2.11	Oct 26, 2024
0.2.10	Oct 25, 2024
0.2.9	Oct 25, 2024
0.2.8	Oct 25, 2024
0.2.7	Oct 24, 2024
0.2.6	Oct 16, 2024
0.2.5	Oct 16, 2024
0.2.3	Oct 15, 2024
0.2.2	Oct 13, 2024
0.2.1	Oct 12, 2024
0.2.0	Oct 10, 2024
0.1.13	Oct 8, 2024
0.1.12	Oct 7, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

podcastfy-0.2.17.tar.gz (32.4 kB)

Uploaded Oct 31, 2024 Source

Built Distribution

podcastfy-0.2.17-py3-none-any.whl (37.0 kB)

Uploaded Oct 31, 2024 Python 3

File details

Details for the file podcastfy-0.2.17.tar.gz.

File metadata

Download URL: podcastfy-0.2.17.tar.gz
Upload date: Oct 31, 2024
Size: 32.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.11.10 Linux/6.8.0-47-generic

File hashes

Hashes for podcastfy-0.2.17.tar.gz
Algorithm	Hash digest
SHA256	`bbf74b760f2b0e3569bb83911c7da82bf84dab53af3d90f71617fe94de051716`
MD5	`83f42456aaf8492b6f77fb2c4fa2a4da`
BLAKE2b-256	`01dbbcfbcc2ae6e23d9301fad33f855a3626528420e1f6d19263133142f97190`

See more details on using hashes here.
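
The SHA256 digest listed above can be used to confirm that a downloaded file is intact. Below is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_sha256` are hypothetical, and the path is wherever you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_sha256("podcastfy-0.2.17.tar.gz", "bbf74b760f2b0e3569bb83911c7da82bf84dab53af3d90f71617fe94de051716")` should return True for an unmodified copy of the source distribution.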

File details

Details for the file podcastfy-0.2.17-py3-none-any.whl.

File metadata

Download URL: podcastfy-0.2.17-py3-none-any.whl
Upload date: Oct 31, 2024
Size: 37.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.11.10 Linux/6.8.0-47-generic

File hashes

Hashes for podcastfy-0.2.17-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b96e42de891e5254e0fdadfab083790e46583e62784b972656a4037eb546bc57`
MD5	`2a867f7b08da3ba2c6d4b071a247edc7`
BLAKE2b-256	`5365f5dc16b3e41d41ba40b47bf1fb26e245c8b084565572919a5d12107d039e`