openai-tts
A powerful and easy-to-use Python library for generating natural-sounding speech using OpenAI's text-to-speech capabilities.
Features
- Convert text to high-quality speech using OpenAI's TTS API
- Multiple voice options (Alloy, Ash, Ballad, Coral, Echo, Fable, Onyx, Nova, Sage, Shimmer, Verse)
- Concurrent processing for faster generation of audio files
- Modular and extensible architecture for adding new providers
- Intelligent sentence splitting for natural-sounding speech
- Comprehensive error handling and retry mechanism
Disclaimer
IMPORTANT: This library is developed for educational and research purposes only. It is not affiliated with, endorsed by, or connected to OpenAI in any way. Using this library to circumvent API restrictions, terms of service, or to access services without proper authorization may violate OpenAI's terms of service.
The developers of this library are not responsible for any misuse or violations of terms of service that may result from using this code. Users are solely responsible for ensuring their use of this library complies with all applicable terms of service and laws.
Project Structure
openai-tts/
├── LICENCE
├── README.md
├── requirements.txt
├── setup.py
├── example.py
└── openai_tts/
    ├── __init__.py
    ├── config.py
    ├── utils.py
    ├── exceptions.py
    └── providers/
        ├── __init__.py
        ├── base.py
        └── openai.py
Installation
Using PyPI (Recommended)
pip install openai-tts
Clone Locally
git clone https://github.com/sujalrajpoot/openai-tts.git
cd openai-tts
pip install -r requirements.txt
Dependencies
- Python 3.8+
- curl-cffi
Quick Start
from openai_tts import OpenaiTTS
from openai_tts.config import VoiceType

# Initialize the TTS engine
tts = OpenaiTTS()

# Generate speech with default settings
text = "Hello world! This is a demonstration of the OpenAI TTS library."
tts.speak(text)  # Saves to default "output.mp3"

# Try different voices
tts.speak(text, voice=VoiceType.ECHO, output_path="echo_voice.mp3")
tts.speak(text, voice=VoiceType.NOVA, output_path="nova_voice.mp3")

# Control verbosity
tts.speak(text, verbose=False, output_path="quiet_output.mp3")
How It Works
The library turns text into an audio file in five steps:
- Text Preprocessing: The input text is divided into sentences using the library's custom SentenceTokenizer, so that the generated speech has natural pauses.
- Parallel Processing: Each sentence is processed concurrently using a thread pool, which shortens generation time for longer texts.
- API Interaction: The library communicates with OpenAI's TTS API, handling authentication, request formatting, and response processing.
- Error Handling: Retry mechanisms and error handling keep generation reliable when network issues occur.
- Output Generation: The audio chunks are assembled in the correct order and saved to the specified output file.
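The steps above can be illustrated with a minimal, self-contained sketch. It is not the library's actual implementation: split_sentences is a naive regex stand-in for SentenceTokenizer, and fake_synthesize is a hypothetical placeholder for the network call that returns bytes rather than audio. What it does show is the general pattern of splitting text, processing sentences in a thread pool, and joining the results in input order.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def synthesize_all(
    sentences: List[str],
    synthesize: Callable[[str], bytes],
    max_workers: int = 4,
) -> bytes:
    """Run `synthesize` on each sentence concurrently and join results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        chunks = list(pool.map(synthesize, sentences))
    return b"".join(chunks)


def fake_synthesize(sentence: str) -> bytes:
    """Hypothetical stand-in for a TTS request; returns placeholder bytes, not audio."""
    return sentence.encode("utf-8") + b"|"


sentences = split_sentences("Hello world! This is a test. Is it working?")
combined = synthesize_all(sentences, fake_synthesize)
print(sentences)
print(combined)
```

Using the executor's map keeps the ordering logic trivial: no index bookkeeping is needed to reassemble the chunks.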
Voice Options
Choose from a variety of voice options:
| ALLOY | ASH | BALLAD | CORAL | ECHO | FABLE | ONYX | NOVA | SAGE | SHIMMER | VERSE |
Advanced Usage
Custom Configuration
from openai_tts import OpenaiTTS, TTSConfig
from openai_tts.config import VoiceType

# Create custom configuration
config = TTSConfig(
    timeout=30,                # Increase timeout to 30 seconds
    verbose=True,              # Print detailed progress
    output_path="custom.mp3",  # Default output path
    voice=VoiceType.NOVA       # Default voice
)

# Initialize with custom config
tts = OpenaiTTS(config=config)

# Use the configured TTS
tts.speak("This text will be converted using the custom configuration.")

# Override specific settings for a single call
tts.speak(
    "This will use different settings just for this call.",
    voice=VoiceType.ECHO,
    output_path="override.mp3"
)
Error Handling
from openai_tts import OpenaiTTS
from openai_tts.exceptions import TTSException

tts = OpenaiTTS()

try:
    tts.speak("This is a test of error handling.")
except TTSException as e:
    print(f"An error occurred: {e}")
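If a call still fails after the library's own retries, an application may want an outer retry of its own. The helper below is a hypothetical sketch, not part of openai-tts: with_retries and its parameters are names introduced here for illustration. It retries any zero-argument callable with exponential backoff and re-raises the last error once the attempts run out.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on `retry_on` with exponential backoff.

    Hypothetical helper for illustration; not provided by openai-tts.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Assuming the tts object and TTSException from the example above, a call could look like with_retries(lambda: tts.speak(text), retry_on=(TTSException,)). The sleep parameter exists only so the delay can be replaced in tests.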
Contributing
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch:
git checkout -b feature/amazing-feature
- Make your changes
- Run the tests:
python -m unittest discover
- Commit your changes:
git commit -m 'Add amazing feature'
- Push to your branch:
git push origin feature/amazing-feature
- Open a Pull Request
Please ensure your code follows the project's style guide and includes appropriate tests.
Performance Considerations
The library is designed to handle large text inputs efficiently through parallel processing. However, very large texts may still take considerable time to process due to API rate limits and processing requirements.
For optimal performance:
- Split very large texts into reasonable chunks before processing
- Consider running resource-intensive operations in a background process
- Use the verbose=True option to monitor progress during long operations
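One way to follow the first recommendation is to group whole sentences into size-limited chunks before handing them to the library. The function below is a sketch under stated assumptions: chunk_text is a hypothetical helper, the 1000-character default is arbitrary, and the regex splitter is a simplification rather than the library's SentenceTokenizer.

```python
import re
from typing import List


def chunk_text(text: str, max_chars: int = 1000) -> List[str]:
    """Group whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept as its own oversized
    chunk rather than being cut mid-sentence.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first sentence of a new chunk.
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to tts.speak with its own output_path, for example one file per chunk, so that a failure partway through does not discard audio that was already generated.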
Security
This library communicates with external services. Always be mindful of:
- The content you're sending to the API
- Where you're storing the generated audio files
- Who has access to your implementation
Use Cases
- Content Creation: Generate voiceovers for videos, podcasts, or presentations
- Accessibility: Convert written content to audio for accessibility purposes
- Chatbots and Virtual Assistants: Give your applications a voice
- Gaming: Create dynamic dialogue for game characters
- Mobile Apps: Add speech capabilities to your applications
FAQ
Q: Is this an official OpenAI library?
A: No, this is an unofficial, community-developed library for educational purposes.
Q: Do I need an OpenAI account to use this?
A: This library uses OpenAI's public TTS interface and does not require an API key.
Q: Can I use this for commercial projects?
A: Please refer to OpenAI's terms of service regarding the usage of their TTS capabilities. This library is for educational purposes only.
Q: How can I improve the speech quality?
A: Try different voices, ensure proper punctuation in your text, and break long paragraphs into natural sentences.
License
This project is licensed under the MIT License - see the LICENSE file for details.
If you find this library helpful, please consider starring the repository on GitHub!
Questions or suggestions? Open an issue on GitHub or contact the maintainers.