LLM Voice

Library to reduce latency in voice generations from LLM chat completion streams.

This lets you generate voice responses with fully local LLMs (for example, through Ollama) and TTS clients such as Apple Say and Google Text-to-Speech, at the same speed as proprietary assistants such as OpenAI's.

Installation and Setup

  1. Install the package from PyPI with:

    pip install llm-voice
    
  2. Copy the .env.example file to .env and set the name of the Ollama or OpenAI model you want to use. If you want to use OpenAI, also fill in your OpenAI API key.

  3. Take a look at one of the examples to start generating voice responses in real time.
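To check that step 2 worked, the .env file can be read with a few lines of standard-library Python. This is only a sketch: the variable names MODEL_NAME and OPENAI_API_KEY are assumptions (the real names are whatever .env.example lists), the helper load_env_file is hypothetical, and the library may well load these values on its own.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


settings = load_env_file()

# Assumed variable names; check .env.example for the ones the project actually uses.
MODEL_NAME = settings.get("MODEL_NAME") or os.environ.get("MODEL_NAME", "")
OPENAI_API_KEY = settings.get("OPENAI_API_KEY") or os.environ.get("OPENAI_API_KEY")
```

If MODEL_NAME comes back empty, the .env file is probably missing or uses a different variable name.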

Example Usage

The example below can be found in the examples directory.

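# Imports and the MODEL_NAME definition are omitted from this excerpt; see the examples directory.

# Set up the output device, TTS client and LLM client.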
devices: list[AudioDevice] = AudioDevices.get_list_of_devices(
   device_type=AudioDeviceType.OUTPUT
)

# Pick the first output device (usually the computer's built-in speakers if nothing else is connected).
output_device: AudioDevice = devices[0]

# Change to another TTS client depending on your needs and desires.
tts_client: TextToSpeechClient = OpenAITextToSpeechClient()

# Change to another LLM client depending on your needs and desires.
llm_client: LLMClient = OllamaClient(
   model_name=MODEL_NAME,
)

# Define messages to send to the LLM.
messages: list[ChatMessage] = [
   ChatMessage(
      role=MessageRole.SYSTEM,
      content="You are a helpful assistant named Alfred.",
   ),
   ChatMessage(role=MessageRole.USER, content="Hey there what is your name?"),
]

# Use the LLM to generate a response.
chat_stream: Iterator[str] = llm_client.generate_chat_completion_stream(
   messages=messages,
)

# Create the voice responder and speak the response.
voice_responder_fast = VoiceResponderFast(
   text_to_speech_client=tts_client,
   output_device=output_device,
)

# Speaks each sentence back to back as soon as it is available.
voice_responder_fast.respond(chat_stream)
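The latency gain comes from speaking each sentence as soon as it is complete instead of waiting for the whole response. The sketch below illustrates that idea on a plain iterator of text chunks. It is not the library's implementation: iter_sentences and fake_stream are hypothetical names, and the punctuation-based boundary rule is a simplifying assumption.

```python
import re
from collections.abc import Iterable, Iterator

# Assumed boundary rule: sentence-ending punctuation followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def iter_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last boundary is complete; the tail may still grow.
        *complete, buffer = _SENTENCE_END.split(buffer)
        for sentence in complete:
            if sentence:
                yield sentence
    # Flush whatever remains once the stream ends.
    tail = buffer.strip()
    if tail:
        yield tail


# Hypothetical stream, standing in for the chunks an LLM client might produce.
fake_stream = ["My name", " is Alf", "red. How can", " I help", " you today?"]
sentences = list(iter_sentences(fake_stream))
```

Each yielded sentence could be handed to a TTS client right away, so audio for the first sentence can start while the LLM is still generating the rest.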

Install From Source

pip install poetry
poetry install

License

This project is licensed under the MIT License. See the LICENSE file for details.

