Unlimited text-to-speech generation with chunking and seamless merging

🎙️ AudioMaker

License: MIT

AudioMaker is a Python package for generating seamless, long-form audio from massive text inputs.
Unlike traditional TTS tools, AudioMaker can handle book-length content (even 4+ hours) by splitting text into chunks, synthesizing each chunk, and merging them into a single audio file.
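The chunking step described above can be illustrated with a minimal sketch. This is not AudioMaker's actual implementation; `split_into_chunks` is a hypothetical helper that assumes chunks are counted in whitespace-separated words, as the `chunk_size` description suggests.

```python
def split_into_chunks(text: str, chunk_size: int = 3000) -> list[str]:
    """Split text into pieces of at most `chunk_size` words.

    Hypothetical helper for illustration only; the real package may
    split on sentence boundaries or use a different strategy.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()  # splits on any run of whitespace
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


chunks = split_into_chunks("one two three four five", chunk_size=2)
print(chunks)  # ['one two', 'three four', 'five']
```

Note that this naive approach collapses line breaks and may cut a sentence in half at a chunk boundary, which is one reason the chunk size can need tuning.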


✨ Features

  • 📚 Handles huge text – turn entire books into one MP3
  • 🧩 Chunking system – bypasses TTS length limits automatically
  • 🔗 Seamless merging – no awkward pauses or breaks
  • 🎙️ Custom voices – choose from Microsoft Edge-TTS voices
  • 🛠 Flexible usage – CLI or Python API
  • Progress bars – real-time status with tqdm

📦 Installation

pip install audiomaker

# or

git clone https://github.com/ankushrathour/audiomaker.git
cd audiomaker
pip install -e .

🚀 Usage

1️⃣ Command-Line Interface (CLI)

audiomaker --input file.txt --output file.mp3 --chunk_size 3000 --voice en-US-AriaNeural

Arguments:

Flag          Description                            Default
--input       Path to input text file                Required
--output      Path to save final audio               output.mp3
--chunk_size  Number of words per TTS chunk          3000
--voice       Edge-TTS voice name                    en-US-AriaNeural
--temp_dir    Directory for temporary audio chunks   audio_parts
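To script the CLI from Python, for example when converting several files, one option is to assemble the argument list programmatically. The sketch below only builds and prints the command. `build_command` is a hypothetical helper, and the `--temp_dir` spelling is an assumption based on the style of the other flags.

```python
import shlex


def build_command(
    input_path: str,
    output_path: str = "output.mp3",
    chunk_size: int = 3000,
    voice: str = "en-US-AriaNeural",
    temp_dir: str = "audio_parts",
) -> list[str]:
    """Build an argument list for the audiomaker CLI (hypothetical helper)."""
    return [
        "audiomaker",
        "--input", input_path,
        "--output", output_path,
        "--chunk_size", str(chunk_size),
        "--voice", voice,
        "--temp_dir", temp_dir,  # assumed flag spelling
    ]


cmd = build_command("book.txt", output_path="book.mp3")
print(shlex.join(cmd))

# To actually run it (requires audiomaker installed and network access):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.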

2️⃣ Python API

from audiomaker import text_to_audio

# Load text from file
with open("file.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Convert to audio
text_to_audio(
    text=text, output_path="output.mp3",
    chunk_size=3000, voice="en-US-AriaNeural", temp_dir="audio_parts"
)

🎨 Example Voices

Some popular Microsoft Edge-TTS voices you can use:

  • en-US-AriaNeural
  • en-GB-RyanNeural
  • en-IN-NeerjaNeural
  • en-AU-NatashaNeural

For the complete list of available voices, refer to the Edge-TTS voice list. If the edge-tts command-line tool is installed, running edge-tts --list-voices prints it.

⚠️ Notes

  • Edge-TTS requires an internet connection to access Microsoft’s speech services.
  • Chunk size may need to be adjusted depending on the voice and text formatting.
  • Intermediate audio files are stored in temp_dir and can be deleted after processing.
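Cleaning up the intermediate files can be done by hand or with a few lines of standard-library Python. The sketch below assumes the default temp_dir name audio_parts and that nothing else is stored in that directory; `remove_temp_dir` is a hypothetical helper, not part of the package.

```python
import shutil
from pathlib import Path


def remove_temp_dir(temp_dir: str = "audio_parts") -> bool:
    """Delete the temporary chunk directory and everything in it.

    Returns True if a directory was removed, False if none existed.
    Hypothetical helper; run it only after the final audio is written.
    """
    path = Path(temp_dir)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True


if remove_temp_dir("audio_parts"):
    print("Temporary audio chunks removed.")
else:
    print("Nothing to clean up.")
```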

💡 Tagline

AudioMaker – Unlimited text, one seamless voice.
