LLM Voice
Library to reduce latency in voice generations from LLM chat completion streams.
This lets you generate voice responses using completely local LLMs (for example, through Ollama) and TTS clients such as Apple Say and Google Text-to-Speech, at the same speed as proprietary assistants such as OpenAI's.
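To illustrate where the latency saving can come from, here is a minimal sketch of splitting a streamed completion into sentences so that speech can start before the full response has arrived. It is not the library's implementation: `iter_sentences` and the boundary rule (`.`, `!` or `?` followed by whitespace) are hypothetical and chosen only for illustration.

```python
import re
from collections.abc import Iterable, Iterator

# Hypothetical boundary rule: '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def iter_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last boundary is complete; the tail may still grow.
        *complete, buffer = _SENTENCE_END.split(buffer)
        for sentence in complete:
            if sentence:
                yield sentence
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    stream = ["My name ", "is Alfred. How", " can I help", " you today?"]
    for sentence in iter_sentences(stream):
        print(sentence)
```

With this approach the first sentence can be handed to a TTS client while the LLM is still generating the rest. A simple rule like this one will split incorrectly on abbreviations such as "e.g. this", so a real splitter would likely need something more careful.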
Installation and Setup
- Install the package from PyPI with:
  pip install llm-voice
- Copy the .env.example file to .env and set the model name for the Ollama or OpenAI model you want to use. If you want to use OpenAI, also fill in your OpenAI API key.
- Take a look at one of the examples to start generating voice responses in real time.
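As a rough sketch of how the settings from the .env step might be consumed, the snippet below reads them from an environment mapping. The variable names `MODEL_NAME` and `OPENAI_API_KEY` and the helper `read_settings` are assumptions made for illustration; check .env.example for the names the project actually uses. Note that `os.environ` does not read a .env file by itself, so the variables would need to be exported in the shell or loaded with a tool such as python-dotenv first.

```python
import os
from collections.abc import Mapping


def read_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the model name and optional OpenAI API key from an environment mapping."""
    # MODEL_NAME and OPENAI_API_KEY are assumed variable names; see .env.example.
    model_name = env.get("MODEL_NAME", "").strip()
    if not model_name:
        raise ValueError("MODEL_NAME is not set")
    settings = {"model_name": model_name}
    # The API key is only needed when an OpenAI client is used.
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if api_key:
        settings["openai_api_key"] = api_key
    return settings


if __name__ == "__main__":
    # "llama3" is only a placeholder value for the demonstration.
    print(read_settings({"MODEL_NAME": "llama3"}))
```

Failing early on a missing model name keeps the error close to the configuration step instead of surfacing later inside an LLM client.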
Example Usage
The example below can be found in the examples directory.
from collections.abc import Iterator

# The llm_voice imports (AudioDevice, AudioDevices, OllamaClient, ...) are
# omitted in this excerpt; see the full script in the examples directory.
# MODEL_NAME is assumed to be defined elsewhere (for example, read from .env).

# Set up the output device, TTS client and LLM client.
devices: list[AudioDevice] = AudioDevices.get_list_of_devices(
    device_type=AudioDeviceType.OUTPUT
)
# Pick the first output device (usually the computer's built-in speakers if nothing else is connected).
output_device: AudioDevice = devices[0]
# Change to another TTS client depending on your needs.
tts_client: TextToSpeechClient = OpenAITextToSpeechClient()
# Change to another LLM client depending on your needs.
llm_client: LLMClient = OllamaClient(
    model_name=MODEL_NAME,
)
# Define the messages to send to the LLM.
messages: list[ChatMessage] = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content="You are a helpful assistant named Alfred.",
    ),
    ChatMessage(role=MessageRole.USER, content="Hey there what is your name?"),
]
# Use the LLM to generate a streamed response.
chat_stream: Iterator[str] = llm_client.generate_chat_completion_stream(
    messages=messages,
)
# Create the voice responder and speak the response.
voice_responder_fast = VoiceResponderFast(
    text_to_speech_client=tts_client,
    output_device=output_device,
)
# Speaks each sentence back to back as soon as it is available.
voice_responder_fast.respond(chat_stream)
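Speaking sentences "back to back as they become available" suggests overlapping synthesis and playback. The sketch below shows one way that could be arranged with a bounded queue and a background thread. It is not how `VoiceResponderFast` is implemented; `speak_pipelined`, `synthesize` and `play` are hypothetical stand-ins for a TTS call and an audio output call.

```python
import queue
import threading
from collections.abc import Callable, Iterable

# Sentinel that marks the end of the audio stream.
_DONE = object()


def speak_pipelined(
    sentences: Iterable[str],
    synthesize: Callable[[str], bytes],
    play: Callable[[bytes], None],
) -> None:
    """Synthesize upcoming sentences in the background while earlier ones play."""
    # A small bound keeps synthesis only slightly ahead of playback.
    audio_queue: queue.Queue = queue.Queue(maxsize=2)

    def producer() -> None:
        try:
            for sentence in sentences:
                audio_queue.put(synthesize(sentence))
        finally:
            # Always signal completion so the consumer loop can exit.
            audio_queue.put(_DONE)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()
    # Play clips in order as they become ready.
    while (clip := audio_queue.get()) is not _DONE:
        play(clip)
    worker.join()


if __name__ == "__main__":
    # Stand-in callables: "synthesis" just encodes text, "playback" just prints.
    speak_pipelined(
        ["Hello.", "My name is Alfred."],
        synthesize=lambda text: text.encode("utf-8"),
        play=lambda clip: print(f"playing {len(clip)} bytes"),
    )
```

Because the queue preserves insertion order, sentences are played in the order they were generated, and the gap between sentences shrinks to whatever synthesis time is not already hidden behind playback.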
Install From Source
pip install poetry
poetry install
License
This project is licensed under the MIT License. See the LICENSE file for details.
Hashes for llm_voice-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4748ffdd1c84bb91d3b8ce8b60100fe6f83f4a1a6bb17bf568fb51f9a76baf21
MD5 | 920252d06701ecc174c674e18628342a
BLAKE2b-256 | d98463d32108f85066fd807b67dd7bda8053b80e5e035c1c5b2cfd8cdc0c9476