Generate audio from SRT files using Microsoft Edge's text-to-speech service
cakesrt2audio
🎵 Generate synchronized audio from SRT subtitle files using Microsoft Edge's text-to-speech service.
This tool converts SRT subtitle files into audio files in which each spoken line is timed to its SRT timestamps. It can also overlay the generated audio onto an existing video file.
✨ Features
- 🎯 Perfect Timing: Synchronizes audio with SRT timestamps
- 🎭 Multiple Voices: Supports 100+ voices in various languages
- ⚡ Concurrent Processing: Fast generation with configurable concurrency
- 🎬 Video Support: Can overlay audio onto existing videos
- 📊 Rich Progress Display: Beautiful progress bars and voice listings
- 🔧 Flexible Output: Supports both audio (MP3) and video (MP4) output
🚀 Installation
pip install cakesrt2audio
📖 Usage
🎵 Generate Audio from SRT
Basic usage (generates output.mp3):
cakesrt2audio your_subtitles.srt
Custom voice and output file:
cakesrt2audio your_subtitles.srt --voice zh-CN-XiaoxiaoNeural --output my_audio.mp3
🎬 Generate Video with Audio Overlay
Add audio to existing video:
cakesrt2audio your_subtitles.srt --video your_video.mp4 --output final_video.mp4
⚡ Advanced Options
cakesrt2audio your_subtitles.srt \
    --voice en-US-AvaMultilingualNeural \
    --output output.mp3 \
    --concurrency 20
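The --concurrency option caps how many TTS requests are in flight at once. A common way to get that behaviour with asyncio is a semaphore. The sketch below illustrates the idea only: `fake_synthesize` is a hypothetical stand-in that just sleeps, and nothing here is taken from the cakesrt2audio source.

```python
import asyncio


async def fake_synthesize(text: str) -> str:
    """Hypothetical stand-in for a TTS request; it only sleeps briefly."""
    await asyncio.sleep(0.01)
    return f"audio({text})"


async def synthesize_all(texts, concurrency=10):
    """Run one task per subtitle line, with at most `concurrency` active at once."""
    semaphore = asyncio.Semaphore(concurrency)

    async def worker(text):
        # Each worker waits here until one of the `concurrency` slots is free.
        async with semaphore:
            return await fake_synthesize(text)

    # gather() returns results in input order, regardless of completion order.
    return await asyncio.gather(*(worker(t) for t in texts))


if __name__ == "__main__":
    results = asyncio.run(synthesize_all(["Hello", "World"], concurrency=2))
    print(results)
```

A higher limit finishes sooner on a fast connection, while a lower one is gentler on a slow or unreliable link.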
📋 Command Line Parameters
| Parameter | Description | Default |
|---|---|---|
| srt_file | Path to the SRT subtitle file | Required |
| --voice | Voice ID for speech synthesis | en-US-AvaMultilingualNeural |
| --output | Output file path | output.mp3 or output.mp4 |
| --video | Source video file (optional) | None |
| --concurrency | Number of concurrent TTS requests | 10 |
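When the tool is driven from a script, the parameters above can be assembled into an argument list. The following is a minimal sketch that uses only the flags listed in the table; `build_command` is a hypothetical helper, not part of the package.

```python
import shlex


def build_command(srt_file, voice=None, output=None, video=None, concurrency=None):
    """Assemble a cakesrt2audio command line from the documented parameters.

    Options left as None are omitted so the tool falls back to its defaults.
    """
    cmd = ["cakesrt2audio", srt_file]
    if voice is not None:
        cmd += ["--voice", voice]
    if output is not None:
        cmd += ["--output", output]
    if video is not None:
        cmd += ["--video", video]
    if concurrency is not None:
        cmd += ["--concurrency", str(concurrency)]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "your_subtitles.srt",
        voice="zh-CN-XiaoxiaoNeural",
        output="my_audio.mp3",
    )
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the package is installed.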
🎭 Available Voices
To see all available Chinese and English voices with descriptions:
cakesrt2audio --help
Popular voice options:
- en-US-AvaMultilingualNeural - English (US), Female
- en-US-BrianMultilingualNeural - English (US), Male
- zh-CN-XiaoxiaoNeural - Chinese (Mainland), Female
- zh-CN-YunyangNeural - Chinese (Mainland), Male
- en-GB-SoniaNeural - English (UK), Female
🐍 Python API Usage
Basic Audio Generation
import asyncio
from cakesrt2audio import create_audio_from_srt
# Generate audio file
asyncio.run(create_audio_from_srt(
    srt_file="subtitles.srt",
    voice="en-US-AvaMultilingualNeural",
    output_file="output.mp3"
))
Generate Video with Audio
import asyncio
from cakesrt2audio import create_audio_from_srt
# Generate video with audio overlay
asyncio.run(create_audio_from_srt(
    srt_file="subtitles.srt",
    voice="zh-CN-XiaoxiaoNeural",
    output_file="final_video.mp4",
    video_path="source_video.mp4",
    concurrency=15
))
📄 SRT File Format
Your SRT file should follow the standard format:
1
00:00:01,000 --> 00:00:03,500
Welcome to our presentation

2
00:00:04,000 --> 00:00:07,200
This is the second subtitle

3
00:00:08,000 --> 00:00:10,500
And this is the third one
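To make the format concrete, here is a minimal sketch of how such a file can be read into (index, start, end, text) cues with times in seconds. It assumes cues are separated by blank lines and silently skips malformed blocks; `Cue` and `parse_srt` are illustrative names, and the package's own parser may work differently.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start: float  # seconds from the beginning of the file
    end: float    # seconds from the beginning of the file
    text: str


# Matches a timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(hours: str, minutes: str, seconds: str, millis: str) -> float:
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues, skipping blocks that do not fit the format."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3 or not lines[0].strip().isdigit():
            continue
        match = TIMING.match(lines[1].strip())
        if match is None:
            continue
        parts = match.groups()
        cues.append(
            Cue(
                index=int(lines[0].strip()),
                start=_seconds(*parts[:4]),
                end=_seconds(*parts[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:03,500\nWelcome to our presentation\n"
    for cue in parse_srt(sample):
        print(cue)
```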
🔧 Requirements
- Python 3.8+
- FFmpeg (for video processing)
- Internet connection (for Microsoft Edge TTS)
Installing FFmpeg
macOS:
brew install ffmpeg
Ubuntu/Debian:
sudo apt update
sudo apt install ffmpeg
Windows: Download from the official FFmpeg website
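After installing, one quick way to confirm that FFmpeg is reachable is to look it up on PATH from Python. This is a generic check built on the standard library's `shutil.which`; `find_ffmpeg` is an illustrative helper name, not something the package provides.

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg was not found on PATH; install it before processing video.")
    else:
        print(f"FFmpeg found at {path}")
```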
🎯 Use Cases
- 📚 Educational Content: Convert lecture notes to audio
- 🎬 Video Production: Add voiceovers to silent videos
- 🌐 Accessibility: Create audio versions of text content
- 🎧 Podcast Creation: Generate spoken content from scripts
- 🎮 Game Development: Create character dialogue audio
⚠️ Notes
- Requires an active internet connection for TTS generation
- Large SRT files may take time to process
- Adjust --concurrency based on your internet speed
- Output timing matches SRT timestamps precisely
🐛 Troubleshooting
Common issues:
- FFmpeg not found: Install FFmpeg and ensure it's in your PATH
- TTS fails: Check your internet connection and try reducing --concurrency
- Audio sync issues: Verify your SRT file format is correct
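For audio sync issues, it can help to look for cues that end before they start or that overlap the previous cue. The sketch below is a generic sanity check over (start, end) times in seconds; `find_timing_problems` is a hypothetical helper, and how cakesrt2audio itself treats overlapping cues is not documented here.

```python
from typing import List, Tuple


def find_timing_problems(cues: List[Tuple[float, float]]) -> List[str]:
    """Report cues that end before they start or overlap an earlier cue.

    `cues` is a list of (start, end) times in seconds, in file order.
    """
    problems = []
    previous_end = 0.0
    for number, (start, end) in enumerate(cues, start=1):
        if end <= start:
            problems.append(f"cue {number}: end {end} is not after start {start}")
        if start < previous_end:
            problems.append(
                f"cue {number}: starts at {start}, before an earlier cue ends at {previous_end}"
            )
        # Track the latest end time seen so far, so later overlaps are caught too.
        previous_end = max(previous_end, end)
    return problems


if __name__ == "__main__":
    for line in find_timing_problems([(1.0, 3.5), (3.0, 5.0)]):
        print(line)
```

An empty result means the timestamps are ordered and non-overlapping. It does not guarantee that each spoken line fits inside its time slot.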
📄 License
MIT License - see LICENSE file for details.