AI-powered tool to convert blog articles into podcast audio with optional voice cloning
Blog2Podcasts
An AI-powered tool that converts any blog article into a podcast audio file with optional voice cloning from YouTube.
Features
- Web Scraping: Extracts main content from any blog URL using trafilatura
- AI Summarization: Converts articles into engaging podcast scripts using local LLMs (Ollama)
- Text-to-Speech: Generates high-quality audio using Microsoft Edge TTS (free)
- Voice Cloning: Clone voices from YouTube videos using Coqui TTS (XTTS-v2)
Installation
From PyPI
pip install blog2podcasts
From Source
git clone https://github.com/QuantBender/blog2podcasts.git
cd blog2podcasts
pip install -e .
With Voice Cloning Support
# Quote the extra so shells such as zsh do not expand the brackets
pip install "blog2podcasts[voice-cloning]"
Prerequisites
1. Install Ollama
# Linux
curl -fsSL https://ollama.ai/install.sh | sh
# macOS
brew install ollama
# Start Ollama service
ollama serve
2. Pull an LLM Model
# Recommended: Llama 3.2 (fast and capable)
ollama pull llama3.2
# Alternative options:
ollama pull mistral # Fast, good quality
ollama pull llama3.1 # More capable, slower
ollama pull phi3 # Small, fast
3. Install ffmpeg (for audio processing)
# Ubuntu/Debian
sudo apt install ffmpeg
# macOS
brew install ffmpeg
# Windows
# Download from https://ffmpeg.org/download.html
Usage
Command Line
# Convert a blog to podcast
blog2podcasts https://example.com/blog-article
# Use a different voice
blog2podcasts https://example.com/blog --voice en-GB-RyanNeural
# Use a different LLM model
blog2podcasts https://example.com/blog --model mistral
# Adjust script length (words)
blog2podcasts https://example.com/blog --length 1200
# Preview script without generating audio
blog2podcasts https://example.com/blog --preview
# List available voices
blog2podcasts --list-voices
# Custom output name
blog2podcasts https://example.com/blog -o my_podcast
# Adjust speech rate
blog2podcasts https://example.com/blog --rate "+10%"
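To get a rough feel for how `--length` and `--rate` interact, the sketch below estimates running time from a target word count. It assumes a baseline pace of about 150 spoken words per minute and that a rate such as "+10%" scales speed linearly. Both are illustrative assumptions, and `rate_to_factor` and `estimate_minutes` are hypothetical helpers, not part of blog2podcasts.

```python
import re

# Assumed baseline speaking pace; real TTS voices vary.
BASE_WORDS_PER_MINUTE = 150


def rate_to_factor(rate: str) -> float:
    """Turn a rate string such as "+10%" or "-5%" into a speed multiplier."""
    match = re.fullmatch(r"([+-])(\d+)%", rate)
    if match is None:
        raise ValueError(f"expected a rate like '+10%', got {rate!r}")
    sign = 1 if match.group(1) == "+" else -1
    factor = 1 + sign * int(match.group(2)) / 100
    if factor <= 0:
        raise ValueError(f"rate {rate!r} would stop or reverse playback")
    return factor


def estimate_minutes(words: int, rate: str = "+0%") -> float:
    """Estimate audio length in minutes for a script of `words` words."""
    return words / (BASE_WORDS_PER_MINUTE * rate_to_factor(rate))


if __name__ == "__main__":
    print(f"{estimate_minutes(1200):.1f} min at normal speed")
    print(f"{estimate_minutes(1200, '+10%'):.1f} min at +10%")
```

Under these assumptions, a 1200-word script runs about eight minutes at normal speed and a little over seven at "+10%". Treat the numbers as a planning aid only.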
Voice Cloning from YouTube
Clone any voice from YouTube videos and use it for your podcasts!
# Clone voice from a YouTube video
blog2podcasts --clone-voice "https://www.youtube.com/watch?v=VIDEO_ID" --voice-name "my_host"
# Generate podcast with cloned voice
blog2podcasts https://example.com/blog --use-cloned-voice my_host
Python API
from blog2podcasts import BlogScraper, ContentSummarizer, AudioGenerator
from blog2podcasts.cli import BlogToPodcastAgent, PodcastConfig
# Create agent with custom config
config = PodcastConfig(
    voice="en-US-JennyNeural",  # Female US voice
    model="llama3.2",           # Ollama model
    script_length=1000,         # Target words
    output_dir="podcasts",      # Output folder
)
agent = BlogToPodcastAgent(config)
# Convert blog to podcast
result = agent.convert("https://example.com/interesting-article")
print(f"Audio: {result['audio_path']}")
print(f"Script: {result['script_path']}")
Use Individual Components
from blog2podcasts import BlogScraper, ContentSummarizer, AudioGenerator
import asyncio
# Just scrape a blog
scraper = BlogScraper()
content = scraper.scrape("https://example.com/blog")
print(content.title, content.text)
# Just create a podcast script
summarizer = ContentSummarizer(model="llama3.2")
script = summarizer.generate_podcast_script(content.text, content.title)
# Just generate audio
generator = AudioGenerator(voice="en-US-GuyNeural")
asyncio.run(generator.generate_audio(script, "output.mp3"))
Available Voices
Recommended Podcast Voices
| Voice | ID | Style |
|---|---|---|
| 🇺🇸 Guy (Male) | en-US-GuyNeural | Professional, clear |
| 🇺🇸 Jenny (Female) | en-US-JennyNeural | Friendly, warm |
| 🇬🇧 Ryan (Male) | en-GB-RyanNeural | British, authoritative |
| 🇬🇧 Sonia (Female) | en-GB-SoniaNeural | British, professional |
| 🇦🇺 William (Male) | en-AU-WilliamNeural | Australian, casual |
| 🇦🇺 Natasha (Female) | en-AU-NatashaNeural | Australian, friendly |
Run blog2podcasts --list-voices to see all available voices.
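If you choose a voice programmatically, a small lookup keyed by locale and gender can be handy. The sketch below copies the recommended voice IDs from the table. `RECOMMENDED_VOICES` and `pick_voice` are hypothetical names for illustration and are not exported by blog2podcasts.

```python
# Recommended voices from the table: (locale, gender) -> voice ID.
RECOMMENDED_VOICES = {
    ("en-US", "male"): "en-US-GuyNeural",
    ("en-US", "female"): "en-US-JennyNeural",
    ("en-GB", "male"): "en-GB-RyanNeural",
    ("en-GB", "female"): "en-GB-SoniaNeural",
    ("en-AU", "male"): "en-AU-WilliamNeural",
    ("en-AU", "female"): "en-AU-NatashaNeural",
}


def pick_voice(locale: str, gender: str, default: str = "en-US-GuyNeural") -> str:
    """Return the recommended voice ID for a locale and gender, or a default."""
    return RECOMMENDED_VOICES.get((locale, gender.lower()), default)


if __name__ == "__main__":
    print(pick_voice("en-GB", "male"))
    print(pick_voice("fr-FR", "female"))  # not in the table, falls back to the default
```

The returned ID can then be passed to `--voice` or to `PodcastConfig(voice=...)` as shown earlier.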
Tech Stack
| Component | Tool | Why |
|---|---|---|
| Scraping | Trafilatura | Best-in-class article extraction |
| LLM | Ollama | Free, local, private LLM inference |
| TTS | Edge-TTS | High-quality, free Microsoft voices |
| Voice Cloning | Coqui TTS | Open-source XTTS-v2 voice cloning |
| YouTube Download | yt-dlp | Extract audio from YouTube videos |
Project Structure
blog2podcasts/
├── pyproject.toml          # Package configuration
├── LICENSE                 # MIT License
├── README.md               # This file
├── CHANGELOG.md            # Version history
├── blog2podcasts/
│   ├── __init__.py         # Package exports
│   ├── cli.py              # Command-line interface
│   ├── scraper.py          # Blog content extraction
│   ├── summarizer.py       # LLM-based script generation
│   ├── audio_generator.py  # Text-to-speech (Edge TTS)
│   └── voice_cloner.py     # YouTube voice extraction & cloning
├── voices/                 # Saved voice profiles
└── output/                 # Generated podcasts
How It Works
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  Blog URL   │ -> │   Scraper   │ -> │ Summarizer  │ -> │  Edge TTS   │
│             │    │(trafilatura)│    │  (Ollama)   │    │             │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                          │                  │                  │
                          v                  v                  v
                    Blog Content      Podcast Script      Audio (.mp3)
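The three stages form a simple chain, with each stage consuming the previous stage's output. The sketch below shows that shape with stand-in callables so it runs without network access, Ollama, or a TTS engine. `run_pipeline` and `PipelineResult` are hypothetical names; this is not the package's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    text: str     # extracted blog content
    script: str   # podcast script produced from the content
    audio: bytes  # synthesized audio (placeholder bytes in this sketch)


def run_pipeline(
    url: str,
    scrape: Callable[[str], str],
    summarize: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> PipelineResult:
    """Run the three stages in order, passing each output to the next stage."""
    text = scrape(url)
    script = summarize(text)
    audio = synthesize(script)
    return PipelineResult(text=text, script=script, audio=audio)


if __name__ == "__main__":
    # Stand-in stages; real ones would scrape, call an LLM, and call a TTS engine.
    result = run_pipeline(
        "https://example.com/blog",
        scrape=lambda url: f"Article fetched from {url}",
        summarize=lambda text: f"Welcome to the show. {text}",
        synthesize=lambda script: script.encode("utf-8"),
    )
    print(result.script)
```

Passing the stages in as callables keeps each one replaceable, which mirrors how the README lets you swap the model or the voice independently.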
Troubleshooting
"Ollama not available"
# Start Ollama service
ollama serve
# Check if running
curl http://localhost:11434/api/tags
"Model not found"
# Pull the model
ollama pull llama3.2
# List available models
ollama list
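The same two checks can be run from Python before starting a conversion. This sketch queries the `/api/tags` endpoint used in the curl command above and looks for a model by name. It assumes the response is JSON with a top-level `models` list whose entries carry a `name` field such as `llama3.2:latest`. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import List, Optional

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def parse_model_names(payload: str) -> List[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    data = json.loads(payload)
    return [model["name"] for model in data.get("models", [])]


def has_model(names: List[str], wanted: str) -> bool:
    """True if `wanted` matches a name exactly or ignoring its ':tag' suffix."""
    return any(n == wanted or n.split(":", 1)[0] == wanted for n in names)


def installed_models(url: str = OLLAMA_TAGS_URL, timeout: float = 3.0) -> Optional[List[str]]:
    """Return installed model names, or None if Ollama cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return parse_model_names(response.read().decode("utf-8"))
    except (OSError, ValueError):  # connection refused, timeout, or bad JSON
        return None


if __name__ == "__main__":
    names = installed_models()
    if names is None:
        print("Ollama not available: try `ollama serve`")
    elif not has_model(names, "llama3.2"):
        print("Model not found: try `ollama pull llama3.2`")
    else:
        print("Ollama is running and llama3.2 is installed")
```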
"Content extraction failed"
- Some sites block scraping - try a different blog
- Check if the URL is accessible
- If the primary extraction fails, the fallback scraper tries BeautifulSoup
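The fallback idea can be sketched in general terms: try a primary extractor, and if it raises or returns nothing, fall back to a cruder text extraction. The example below is standard-library only. `basic_text` is a simple stand-in for a BeautifulSoup-based fallback, `primary` is any callable you supply, and none of these names come from blog2podcasts.

```python
from html.parser import HTMLParser
from typing import Callable, List, Optional


class _TextCollector(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def basic_text(html: str) -> str:
    """Crude fallback: join all visible text nodes with single spaces."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join(collector.chunks)


def extract_with_fallback(html: str, primary: Callable[[str], Optional[str]]) -> str:
    """Use `primary` when it yields text; otherwise fall back to `basic_text`."""
    try:
        text = primary(html)
    except Exception:
        text = None  # treat any extractor error as "no result"
    return text if text else basic_text(html)


if __name__ == "__main__":
    page = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
    print(extract_with_fallback(page, lambda html: None))
```

The fallback output is noisier than a dedicated article extractor because it also keeps navigation and footer text, so it is best treated as a last resort.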
License
MIT License - Use freely for personal and commercial projects.
Contributing
Pull requests welcome! See CONTRIBUTING.md for guidelines.