A library for real-time text-to-speech processing using the OpenAI API.

OpenAI VoiceStream

OpenAI VoiceStream is a Python library that provides real-time text-to-speech functionality using the OpenAI API. It processes text and token streams and generates audio output on the fly, which makes it suitable for integration with language models that produce their responses in segments.

Features

Real-time text-to-speech conversion
Support for processing text and token streams
Multiple voice options (alloy, echo, fable, onyx, nova, shimmer)
Thread-safe processing for smooth operation
Audio fading to avoid abrupt stops
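
To illustrate what audio fading means in practice, here is a minimal sketch of a linear fade-out over raw integer samples. It is not taken from OpenAI VoiceStream; the function name `fade_out` and the list-of-integers sample format are assumptions made for the example.

```python
def fade_out(samples, fade_len):
    """Linearly ramp the last `fade_len` samples down to silence.

    Hypothetical helper for illustration only; it is not part of the library.
    """
    n = min(fade_len, len(samples))
    out = list(samples)
    start = len(out) - n
    for i in range(n):
        # Gain falls in equal steps and reaches 0.0 on the final sample.
        gain = (n - 1 - i) / n
        out[start + i] = int(out[start + i] * gain)
    return out
```

Ending on a zero-gain sample avoids the click that an abrupt cut in the waveform tends to produce.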

Installation

You can install OpenAI VoiceStream using pip:

pip install openai-voicestream

Usage

Prerequisites

Before using OpenAI VoiceStream, make sure you have an OpenAI API key. Set the API key as an environment variable:

export OPENAI_API_KEY='your_openai_api_key'
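
Because `os.getenv` returns `None` when a variable is unset, it can help to fail early with a clear message before constructing the processor. A minimal sketch; the helper name `require_api_key` is hypothetical and not part of OpenAI VoiceStream.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from `env`, or raise if it is missing or blank.

    Hypothetical helper for illustration; not provided by the library.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running.")
    return key
```

The returned value can then be passed as the `api_key` argument of `VoiceProcessor`.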

Example Code

Here are two examples of how to use OpenAI VoiceStream: the first passes in a block of text, and the second feeds in a token stream.

# Example 1: processing a block of text
import os
from openai_voicestream import VoiceProcessor

# Retrieve your OpenAI API key from environment variables
api_key = os.getenv("OPENAI_API_KEY")

# Initialize the VoiceProcessor with the API key and desired voice
processor = VoiceProcessor(api_key, voice="nova")  # Using the "nova" voice

# Example text with paragraphs to be processed
text = """This is an example using the nova voice.

The nova voice provides a different tone and style compared to the default voice.

You can experiment with different voices to find the one that suits your needs."""

# Add the text to the processing queue
processor.add_text_to_queue(text)

# Wait for all processing to complete before exiting
processor.wait_for_completion()

# Example 2: processing a token stream
import os
from openai_voicestream import VoiceProcessor

# Retrieve your OpenAI API key from environment variables
api_key = os.getenv("OPENAI_API_KEY")

# Initialize the VoiceProcessor with the API key and desired voice
processor = VoiceProcessor(api_key, voice="shimmer")  # Using the "shimmer" voice

# Example tokens being added to the processing queue
tokens = [
    "This is an example of processing a stream of tokens.",
    " The tokens are gradually added to the processor,",
    " simulating a real-time scenario where text is generated incrementally.",
    "\n\nThe processor will handle the tokens and generate audio on-the-fly,",
    " providing a seamless text-to-speech experience."
]

# Process the tokens in a streaming manner (the list can be replaced with a real token stream)
for token in tokens:
    processor.add_token(token)

# Finalize any remaining tokens in the buffer
processor.finalize_tokens()

# Wait for all processing to complete before exiting
processor.wait_for_completion()

API Reference

VoiceProcessor

The main class for processing text and generating audio.

`__init__(self, api_key, voice='alloy')`

Initializes the VoiceProcessor with the provided API key and voice.

api_key (str): The API key for accessing the OpenAI API.
voice (str or int): The voice to use for text-to-speech. Can be specified by name or index.
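
The name-or-index behaviour could look roughly like the sketch below. The function `resolve_voice` is hypothetical, and the assumption that indices are zero-based and follow the order in the Features list is not confirmed by the library.

```python
# Voice names as listed under Features; the index order is an assumption.
VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]


def resolve_voice(voice="alloy"):
    """Map a voice name or zero-based index to a voice name.

    Hypothetical helper for illustration; not the library's own code.
    """
    if isinstance(voice, int) and not isinstance(voice, bool):
        if 0 <= voice < len(VOICES):
            return VOICES[voice]
        raise ValueError(f"voice index out of range: {voice}")
    if isinstance(voice, str) and voice.lower() in VOICES:
        return voice.lower()
    raise ValueError(f"unknown voice: {voice!r}")
```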

`add_text_to_queue(self, text)`

Adds text to the processing queue.

text (str): The text to add to the queue.

`add_token(self, token)`

Adds a token to the buffer and processes it if needed.

token (str): The token to add to the buffer.
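
One plausible way to buffer tokens and release text only when a sentence is complete is sketched below. The class `TokenBuffer` and its boundary rule (sentence punctuation followed by whitespace, or a blank line) are assumptions for illustration; the library's actual buffering logic may differ.

```python
import re

# A boundary is ., ! or ? followed by whitespace, or a paragraph break.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+|\n\s*\n")


class TokenBuffer:
    """Hypothetical sentence buffer; not the library's implementation."""

    def __init__(self):
        self._buffer = ""

    def add_token(self, token):
        """Append a token and return any sentences completed so far."""
        self._buffer += token
        parts = _BOUNDARY.split(self._buffer)
        # The last part may be an unfinished sentence, so keep it buffered.
        self._buffer = parts.pop()
        return [p.strip() for p in parts if p.strip()]

    def finalize(self):
        """Flush whatever text is left in the buffer."""
        rest, self._buffer = self._buffer.strip(), ""
        return [rest] if rest else []
```

In this sketch a sentence is only released once the following token shows that it has ended, which is why a final flush step such as `finalize_tokens` is needed at the end of a stream.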

`finalize_tokens(self)`

Finalizes any remaining tokens in the buffer.

`wait_for_completion(self)`

Waits for all sentences to be processed.
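
Waiting for a background queue to drain is commonly done with `queue.Queue.join()`. The sketch below shows that pattern with a hypothetical `SentenceWorker` class; it stands in for the idea only and is not the library's code.

```python
import queue
import threading


class SentenceWorker:
    """Hypothetical stand-in that handles queued sentences on one thread."""

    def __init__(self, handler):
        self._queue = queue.Queue()
        self._handler = handler
        self.errors = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            sentence = self._queue.get()
            try:
                self._handler(sentence)
            except Exception as exc:
                # Record the failure so the worker keeps running.
                self.errors.append(exc)
            finally:
                # Always mark the item done so join() cannot hang.
                self._queue.task_done()

    def add_text_to_queue(self, text):
        self._queue.put(text)

    def wait_for_completion(self):
        # Blocks until task_done() has been called for every queued item.
        self._queue.join()
```

With a single worker thread, items are handled in the order they were queued, which keeps the spoken output in sequence.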

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

Acknowledgements

OpenAI VoiceStream is built using the OpenAI API and relies on the following libraries:

Troubleshooting

If you encounter any issues while using OpenAI VoiceStream, here are a few things you can try:

Make sure you have set the OPENAI_API_KEY environment variable correctly with your OpenAI API key.
Check that you have a stable internet connection to communicate with the OpenAI API.
If you encounter any errors or exceptions, please check the error message and consult the documentation or seek support.

FAQ

Can I use OpenAI VoiceStream for commercial purposes?

Yes, you can use OpenAI VoiceStream for commercial purposes, subject to the terms and conditions of the OpenAI API usage. Make sure to review and comply with OpenAI's usage policies.

How can I customize the voice output?

OpenAI VoiceStream provides multiple voice options that you can choose from. You can specify the desired voice by passing the voice name or index to the VoiceProcessor constructor. Available voices include: alloy, echo, fable, onyx, nova, and shimmer.

Can I control the speed or pitch of the generated audio?

Currently, OpenAI VoiceStream does not provide direct control over the speed or pitch of the generated audio. The audio is generated based on the selected voice and the input text. If you require more advanced audio customization, you may need to explore other text-to-speech libraries or APIs.

Is there a limit on the amount of text I can process?

The amount of text you can process depends on the limitations of the OpenAI API. OpenAI VoiceStream processes text in chunks, so it can handle larger text inputs by breaking them down into smaller segments. However, keep in mind that processing large amounts of text may result in longer processing times and higher API usage.
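
The chunking idea can be sketched as follows. The function `chunk_text` is hypothetical, and the default of 4096 characters reflects the input limit documented for the OpenAI speech endpoint at the time of writing; treat both as assumptions rather than as the library's behaviour.

```python
def chunk_text(text, max_chars=4096):
    """Split text into chunks of at most `max_chars` characters.

    Paragraph breaks are preferred as split points. Hypothetical helper
    for illustration; not the library's implementation.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # Hard-split a paragraph that is itself longer than max_chars.
        while len(paragraph) > max_chars:
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk corresponds to a separate API request, so longer inputs translate directly into more requests and higher usage.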

Support

If you have any questions, issues, or feature requests, please open an issue on the GitHub repository. I appreciate your feedback and will do my best to assist you.

Release history

0.1.2 (Jun 1, 2024)
0.1.1 (Jun 1, 2024), this version
0.1.0 (Jun 1, 2024)

Download files

Source Distribution

openai_voicestream-0.1.1.tar.gz (6.3 kB), uploaded Jun 1, 2024

Built Distribution

openai_voicestream-0.1.1-py3-none-any.whl (6.8 kB), uploaded Jun 1, 2024, Python 3

Hashes for openai_voicestream-0.1.1.tar.gz

Algorithm	Hash digest
SHA256	`788f655ad04afd4e1a1517a4f70cf13075b2304d82ae2810f4d4b29c9bba283b`
MD5	`9afb8d4375f7718f3a62b696bc4ebd8f`
BLAKE2b-256	`d7bd8a8e63b26f439a8e7ce6b15ea78f281e6389bf589e6fb38a9096b5fb32a7`

Hashes for openai_voicestream-0.1.1-py3-none-any.whl

Algorithm	Hash digest
SHA256	`f8027e6f921a642f67fa5a52b93f2dc2bd489d198b79713d335eb9d5f525bbd8`
MD5	`634bcd546ab0244f39ab12af5498d079`
BLAKE2b-256	`8965a72f9d4dcc0cef67678e716c64a6844be97ef538e31f1e55239d4a767d6e`
