Automate chunking long texts to produce a single audio file from text-to-speech APIs

tts-joinery

tts-joinery is a Python library and CLI tool to work around length limitations in text-to-speech APIs.

Because currently popular TTS APIs accept at most 4096 characters per request, this library will:

  • Chunk the input text into sentences using the NLTK Punkt module (for better audio by avoiding segments split in the middle of a word or sentence).
  • Run each chunk through the TTS API.
  • Join together the resulting output to produce a single MP3 file.
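
The chunking step above can be illustrated with a minimal sketch. This is not the library's actual implementation: `split_sentences` is a naive regex stand-in for NLTK Punkt, and `pack_chunks` and `MAX_CHARS` are hypothetical names chosen for illustration. It shows one plausible way to pack whole sentences greedily into chunks that stay within the character limit.

```python
import re

MAX_CHARS = 4096  # per-request limit mentioned above


def split_sentences(text):
    """Naive stand-in for NLTK Punkt: split after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]


def pack_chunks(sentences, max_chars=MAX_CHARS):
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    chunks = []
    current = ''
    for sentence in sentences:
        # Hard-split any single sentence that exceeds the limit on its own.
        pieces = [sentence[i:i + max_chars] for i in range(0, len(sentence), max_chars)]
        for piece in pieces:
            candidate = f'{current} {piece}' if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                # current is never empty here, because a lone piece always fits.
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == '__main__':
    for chunk in pack_chunks(split_sentences('One. Two! Three?'), max_chars=9):
        print(repr(chunk))
```

Packing greedily keeps the number of API requests low while never cutting a sentence, unless a single sentence is longer than the limit by itself.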

Currently only the OpenAI API is supported, with the intent to add more in the future.

Installation

pip install tts-joinery

or use pipx to install as a standalone tool.

Requires ffmpeg for the audio file processing.

Installing ffmpeg varies by system. On Linux you can use your system package manager. On macOS, brew install ffmpeg should work.
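
To give a sense of why ffmpeg is needed, here is a hedged sketch of how several MP3 segments could be joined with ffmpeg's concat demuxer. This is an assumption about one possible approach, not a description of how tts-joinery does it internally; the helper names `concat_list_text` and `ffmpeg_concat_command` are hypothetical.

```python
def concat_list_text(paths):
    """Build the list file that ffmpeg's concat demuxer reads, one entry per segment."""
    lines = []
    for path in paths:
        # Inside single quotes, a literal quote is written as '\''
        escaped = str(path).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return '\n'.join(lines) + '\n'


def ffmpeg_concat_command(list_path, output_path):
    """Return the argument list for joining the listed files without re-encoding."""
    return [
        'ffmpeg',
        '-f', 'concat',   # use the concat demuxer
        '-safe', '0',     # allow arbitrary paths in the list file
        '-i', str(list_path),
        '-c', 'copy',     # copy the audio stream as-is
        str(output_path),
    ]


if __name__ == '__main__':
    print(concat_list_text(['chunk_0.mp3', 'chunk_1.mp3']), end='')
    print(' '.join(ffmpeg_concat_command('segments.txt', 'output.mp3')))
```

The command list can be passed to `subprocess.run` once the list file has been written to disk.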

Usage

Command-Line Interface (CLI)

The CLI expects to find an OpenAI API key in an OPENAI_API_KEY environment variable, or in a .env file.
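
If you want the same lookup behavior in your own scripts, the following sketch checks the environment first and then falls back to a simple KEY=value line in a .env file. The order of precedence and the parsing rules are assumptions, not the CLI's documented behavior, and `find_api_key` is a hypothetical helper.

```python
import os
from pathlib import Path


def find_api_key(name='OPENAI_API_KEY', dotenv_path='.env', environ=None):
    """Return the key from the environment, else from a KEY=value line in dotenv_path."""
    environ = os.environ if environ is None else environ
    if environ.get(name):
        return environ[name]
    path = Path(dotenv_path)
    if not path.is_file():
        return None
    for line in path.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.partition('=')
        if key.strip() == name:
            # Drop optional surrounding quotes.
            return value.strip().strip('"\'')
    return None


if __name__ == '__main__':
    print('key found' if find_api_key() else 'no key found')
```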

Syntax

ttsjoin [OPTIONS] [COMMAND]

Options

Options:
--input-file FILENAME   Plaintext file to process into speech, otherwise stdin
--output-file FILENAME  MP3 result, otherwise stdout
--model TEXT            Slug of the text-to-speech model to be used (tts-1, tts-1-hd or gpt-4o-mini-tts)
--service TEXT          API service (currently only supports openai)
--voice TEXT            Slug of the voice to be used
--instructions TEXT     Voice instructions (only for gpt-4o-mini-tts model)
--no-cache BOOLEAN      Disable caching
--help                  Show this message and exit.

Commands:
  cache [clear, show]
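
For scripts that drive the CLI from Python, the options listed above can be assembled into an argument list. The sketch below is illustrative only: `build_ttsjoin_command` and `run_ttsjoin` are hypothetical helpers, and `--no-cache` is passed as a bare flag, which is an assumption based on how the flag is used in this README's examples.

```python
import subprocess


def build_ttsjoin_command(input_file, output_file, model=None, voice=None,
                          instructions=None, no_cache=False):
    """Assemble a ttsjoin argument list from the documented options."""
    command = ['ttsjoin', '--input-file', str(input_file),
               '--output-file', str(output_file)]
    if model:
        command += ['--model', model]
    if voice:
        command += ['--voice', voice]
    if instructions:
        command += ['--instructions', instructions]
    if no_cache:
        command.append('--no-cache')
    return command


def run_ttsjoin(*args, **kwargs):
    """Run ttsjoin; needs the tool on PATH and an OpenAI API key configured."""
    subprocess.run(build_ttsjoin_command(*args, **kwargs), check=True)


if __name__ == '__main__':
    print(build_ttsjoin_command('input.txt', 'output.mp3', model='tts-1', voice='onyx'))
```

Passing a list rather than a shell string avoids quoting problems when the instructions text contains spaces or punctuation.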

Examples

  1. Using an input file and specifying an output file:
# Basic usage with tts-1 model
ttsjoin --input-file input.txt --output-file output.mp3 --model tts-1 --service openai --voice onyx

# Using the new gpt-4o-mini-tts model with instructions
ttsjoin --input-file input.txt --output-file output.mp3 --model gpt-4o-mini-tts --voice ballad --instructions "Speak in a calm, soothing voice"
  2. Using stdin and stdout with default options:
echo "Your text to be processed" | ttsjoin > output.mp3
  3. Each chunk of text is cached to speed up repeated runs on the same text. Caching can be disabled:
ttsjoin --input-file input.txt --output-file output.mp3 --no-cache
  4. Clearing the cache directory:
ttsjoin cache clear
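
To illustrate why per-chunk caching helps when the same text is processed again, here is a hedged sketch of how a cache key could be derived. The real cache layout and key scheme of tts-joinery are not described in this README; `chunk_cache_key` is a hypothetical function and the choice of SHA-256 over a JSON payload is an assumption.

```python
import hashlib
import json


def chunk_cache_key(text, model, voice, instructions=None):
    """Derive a stable key from everything that affects the generated audio."""
    payload = json.dumps(
        {'text': text, 'model': model, 'voice': voice, 'instructions': instructions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()


if __name__ == '__main__':
    key = chunk_cache_key('This is only a test!', 'tts-1', 'onyx')
    print(f'{key}.mp3')
```

Because the key covers the model, voice, and instructions as well as the text, changing any of them would produce a different key and therefore a fresh API call, while an unchanged chunk could be served from disk.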

Python Library

You can also use tts-joinery as part of your Python project:

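import os

import nltk

from joinery.op import JoinOp
from joinery.api.openai import OpenAIApi

# Only need to download once, handled for you automatically in the CLI
nltk.download('punkt_tab', quiet=True)

# Read the API key from the environment rather than hard-coding it
OPENAI_API_KEY = os.environ['OPENAI_API_KEY']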

tts = JoinOp(
    text='This is only a test!',
    api=OpenAIApi(
        model='tts-1-hd',  # or 'gpt-4o-mini-tts' for the new model
        voice='onyx',
        api_key=OPENAI_API_KEY,
        instructions='Speak in a calm, soothing voice',  # Optional, only for gpt-4o-mini-tts
    ),
)

tts.process_to_file('output.mp3')

Changelog

v1.0.5 (2025-03-20)

  • Added support for OpenAI's gpt-4o-mini-tts model and instructions parameter

v1.0.4 (2024-10-11)

  • Fixed issue with nltk dependency #4
  • Model, voice, and service CLI params are now case-insensitive

v1.0.3 (2024-10-05)

  • Added cache management commands to cli
  • Fixed a bug when running
  • Added end-to-end tests

v1.0.2 (2024-10-03)

  • Fixed crash when running with caching disabled (#3)

Contributing

Contributions are welcome, particularly support for other TTS APIs. Check the issues beforehand and feel free to open a PR. Code is formatted with Black.

Tests can be run manually. The suite includes end-to-end tests that make live API calls, so ensure you have an OPENAI_API_KEY set in .env.test, then run pytest. You can install development dependencies with pip install -e .[test]

Contributors

Special thanks to:

License

This project is licensed under the MIT License.

Copyright 2024, Adrien Delessert
