Unlimited text-to-speech generation with chunking and seamless merging
🎙️ AudioMaker
AudioMaker is a Python package for generating seamless, long-form audio from massive text inputs.
Unlike traditional TTS tools, AudioMaker can handle book-length content (even 4+ hours) by splitting text into chunks, synthesizing each chunk, and merging them into a single audio file.
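The chunking step described above can be sketched as follows. This is a minimal illustration, not AudioMaker's actual implementation: the helper name `split_into_chunks` is hypothetical, and it assumes chunks are measured in whitespace-separated words, as the `chunk_size` description suggests.

```python
def split_into_chunks(text: str, chunk_size: int = 3000) -> list[str]:
    """Split text into chunks of at most chunk_size words (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    # str.split() with no arguments splits on any whitespace run,
    # so line breaks in the source text are collapsed into single spaces.
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


chunks = split_into_chunks("one two three four five", chunk_size=2)
print(chunks)  # ['one two', 'three four', 'five']
```

Each chunk would then be synthesized separately and the resulting audio parts joined in order. A real splitter might prefer to break at sentence boundaries so that speech does not cut off mid-sentence.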
✨ Features
- 📚 Handles huge text – turn entire books into one MP3
- 🧩 Chunking system – bypasses TTS length limits automatically
- 🔗 Seamless merging – no awkward pauses or breaks
- 🎙️ Custom voices – choose from Microsoft Edge-TTS voices
- 🛠 Flexible usage – CLI or Python API
- ⏱ Progress bars – real-time status with tqdm
📦 Installation
pip install audiomaker
# or
git clone https://github.com/ankushrathour/audiomaker.git
cd audiomaker
pip install -e .
🚀 Usage
1️⃣ Command-Line Interface (CLI)
audiomaker --input file.txt --output file.mp3 --chunk_size 3000 --voice en-US-AriaNeural
Arguments:
| Flag | Description | Default |
|---|---|---|
| --input | Path to input text file | Required |
| --output | Path to save final audio | output.mp3 |
| --chunk_size | Number of words per TTS chunk | 3000 |
| --voice | Edge-TTS voice name | en-US-AriaNeural |
| --temp_dir | Directory for temporary audio chunks | audio_parts |
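To make the flags and defaults concrete, here is a sketch of an `argparse` parser that mirrors the documented arguments. It is an assumption about how such a CLI could be declared, not the package's actual source, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="audiomaker")
    parser.add_argument("--input", required=True,
                        help="Path to input text file")
    parser.add_argument("--output", default="output.mp3",
                        help="Path to save final audio")
    parser.add_argument("--chunk_size", type=int, default=3000,
                        help="Number of words per TTS chunk")
    parser.add_argument("--voice", default="en-US-AriaNeural",
                        help="Edge-TTS voice name")
    parser.add_argument("--temp_dir", default="audio_parts",
                        help="Directory for temporary audio chunks")
    return parser


# Only --input is required; every other flag falls back to its default.
args = build_parser().parse_args(["--input", "file.txt"])
print(args.output, args.chunk_size, args.voice, args.temp_dir)
# output.mp3 3000 en-US-AriaNeural audio_parts
```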
2️⃣ Python API
from audiomaker import text_to_audio
# Load text from file
with open("file.txt", "r", encoding="utf-8") as f:
    text = f.read()
# Convert to audio
text_to_audio(
    text=text,
    output_path="output.mp3",
    chunk_size=3000,
    voice="en-US-AriaNeural",
    temp_dir="audio_parts",
)
🎨 Example Voices
Some popular Microsoft Edge-TTS voices you can use:
- en-US-AriaNeural
- en-GB-RyanNeural
- en-IN-NeerjaNeural
- en-AU-NatashaNeural
For a complete list of available voices, refer to the Edge-TTS voice list.
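Voice names can also be looked up programmatically. The sketch below assumes the `edge-tts` package is installed and that its `list_voices()` coroutine returns dictionaries with `ShortName` and `Locale` keys. The helper names `filter_voices`, `fetch_english_voices`, and `print_english_voices` are hypothetical and are not part of AudioMaker.

```python
import asyncio


def filter_voices(voices: list[dict], locale_prefix: str) -> list[str]:
    """Return sorted ShortName values whose Locale starts with locale_prefix."""
    return sorted(
        v["ShortName"]
        for v in voices
        if v.get("Locale", "").startswith(locale_prefix)
    )


async def fetch_english_voices() -> list[str]:
    # Imported here so filter_voices stays usable without edge-tts installed.
    import edge_tts

    voices = await edge_tts.list_voices()  # needs an internet connection
    return filter_voices(voices, "en-")


def print_english_voices() -> None:
    """Print one English voice name per line."""
    for name in asyncio.run(fetch_english_voices()):
        print(name)
```

Calling `print_english_voices()` prints names that can be passed to the `voice` argument. The `edge-tts` command-line tool offers a similar listing through `edge-tts --list-voices`.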
⚠️ Notes
- Edge-TTS requires an internet connection to access Microsoft’s speech services.
- Chunk size may need to be adjusted depending on the voice and text formatting.
- Intermediate audio files are stored in temp_dir and can be deleted after processing.
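Since the intermediate files can be deleted once the final audio exists, a small cleanup step may be useful. This is a sketch using only the standard library; `remove_temp_dir` is a hypothetical helper, and it assumes nothing else of value is stored in the directory.

```python
import shutil
from pathlib import Path


def remove_temp_dir(temp_dir: str = "audio_parts") -> bool:
    """Delete temp_dir and everything in it; return True if it was removed."""
    path = Path(temp_dir)
    if not path.is_dir():
        return False  # nothing to clean up
    shutil.rmtree(path)
    return True
```

Run it only after confirming that the merged output file was written successfully, because the deletion cannot be undone.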
💡 Tagline
AudioMaker – Unlimited text, one seamless voice.
File details
Details for the file audiomaker-1.0.0.tar.gz.
File metadata
- Download URL: audiomaker-1.0.0.tar.gz
- Upload date:
- Size: 4.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.14.0b2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f19cf6890c0be86b7644a7136cbc3c6a85c3955045cd820979da06b408f38aaa |
| MD5 | cc2819b03ea5576c222449178e3e3997 |
| BLAKE2b-256 | db131cc78c1b2954787edef19771453d0b6f007b16cca5a5b0c7653eda8be198 |
File details
Details for the file audiomaker-1.0.0-py3-none-any.whl.
File metadata
- Download URL: audiomaker-1.0.0-py3-none-any.whl
- Upload date:
- Size: 5.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.14.0b2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 18bd644e50c7292a144c2d145e6d00d5bf2c64d07efadaf89af7aaddc819f7b6 |
| MD5 | 72625393f7c7536bcb9434f3d7693a35 |
| BLAKE2b-256 | 817bd36fce2b1674019fb00a515bfa2389237d99ac5c0cacdf8d8bf8cba6b66a |