The all-in-one voice SDK

Project description

  vocode

Build voice-based LLM apps in minutes

Vocode is an open source library that makes it easy to build voice-based LLM apps. Using Vocode, you can build real-time streaming conversations with LLMs and deploy them to phone calls, Zoom meetings, and more. You can also build personal assistants or apps like voice-based chess. Vocode provides easy abstractions and integrations so that everything you need is in a single library.

We're actively looking for community maintainers, so please reach out if interested!

⭐️ Features

Check out our React SDK!

🫂 Contribution and Roadmap

We're an open-source project and welcome contributors adding new features, integrations, and documentation! Please don't hesitate to reach out and start building with us.

For more information on contributing, see our Contribution Guide.

And check out our Roadmap.

We'd love to talk to you on Discord about new ideas and contributing!

🚀 Quickstart

pip install vocode

import asyncio
import signal

from pydantic_settings import BaseSettings, SettingsConfigDict

from vocode.helpers import create_streaming_microphone_input_and_speaker_output
from vocode.logging import configure_pretty_logging
from vocode.streaming.agent.chat_gpt_agent import ChatGPTAgent
from vocode.streaming.models.agent import ChatGPTAgentConfig
from vocode.streaming.models.message import BaseMessage
from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
from vocode.streaming.models.transcriber import (
    DeepgramTranscriberConfig,
    PunctuationEndpointingConfig,
)
from vocode.streaming.streaming_conversation import StreamingConversation
from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
from vocode.streaming.transcriber.deepgram_transcriber import DeepgramTranscriber

configure_pretty_logging()


class Settings(BaseSettings):
    """
    Settings for the streaming conversation quickstart.
    These parameters can be configured with environment variables.
    """

    openai_api_key: str = "ENTER_YOUR_OPENAI_API_KEY_HERE"
    azure_speech_key: str = "ENTER_YOUR_AZURE_KEY_HERE"
    deepgram_api_key: str = "ENTER_YOUR_DEEPGRAM_API_KEY_HERE"

    azure_speech_region: str = "eastus"

    # This means a .env file can be used to overload these settings
    # ex: "OPENAI_API_KEY=my_key" will set openai_api_key over the default above
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )


settings = Settings()


async def main():
    (
        microphone_input,
        speaker_output,
    ) = create_streaming_microphone_input_and_speaker_output(
        use_default_devices=False,
    )

    conversation = StreamingConversation(
        output_device=speaker_output,
        transcriber=DeepgramTranscriber(
            DeepgramTranscriberConfig.from_input_device(
                microphone_input,
                endpointing_config=PunctuationEndpointingConfig(),
                api_key=settings.deepgram_api_key,
            ),
        ),
        agent=ChatGPTAgent(
            ChatGPTAgentConfig(
                openai_api_key=settings.openai_api_key,
                initial_message=BaseMessage(text="What up"),
                prompt_preamble="""The AI is having a pleasant conversation about life""",
            )
        ),
        synthesizer=AzureSynthesizer(
            AzureSynthesizerConfig.from_output_device(speaker_output),
            azure_speech_key=settings.azure_speech_key,
            azure_speech_region=settings.azure_speech_region,
        ),
    )
    await conversation.start()
    print("Conversation started, press Ctrl+C to end")
    signal.signal(signal.SIGINT, lambda _0, _1: asyncio.create_task(conversation.terminate()))
    while conversation.is_active():
        chunk = await microphone_input.get_audio()
        conversation.receive_audio(chunk)


if __name__ == "__main__":
    asyncio.run(main())
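The `Settings` class above loads overrides from a `.env` file in the working directory via `pydantic-settings`. A minimal sketch of that file — the key values are placeholders, not real credentials:

```shell
# .env — overrides the defaults in Settings (loaded by SettingsConfigDict(env_file=".env"))
OPENAI_API_KEY=sk-your-openai-key
AZURE_SPEECH_KEY=your-azure-speech-key
DEEPGRAM_API_KEY=your-deepgram-key
AZURE_SPEECH_REGION=eastus
```

Environment variables set in the shell take precedence over the `.env` file, which in turn overrides the in-code defaults.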

📞 Phone call quickstarts

🌱 Documentation

docs.vocode.dev


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vocode-0.1.114a0.tar.gz (4.7 MB)

Uploaded Source

Built Distribution

vocode-0.1.114a0-py3-none-any.whl (4.8 MB)

Uploaded Python 3

File details

Details for the file vocode-0.1.114a0.tar.gz.

File metadata

  • Download URL: vocode-0.1.114a0.tar.gz
  • Upload date:
  • Size: 4.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.11.9 Linux/6.5.0-1022-azure

File hashes

Hashes for vocode-0.1.114a0.tar.gz
Algorithm Hash digest
SHA256 5ad7205d3f10ee32e23bbce7b3a5a7ab75675ae7597b363bf68064bea6352f01
MD5 773e6b6446af574f83856ea0fa82a791
BLAKE2b-256 acf6055c13261e549d85780599d4bcd5b845955d2925c0456fe6a36265db4516

See more details on using hashes here.
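To check a downloaded file against the SHA256 digest listed above, you can compute the digest locally. A minimal sketch using Python's standard `hashlib` module (the file path is a placeholder):

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 8192) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Compare against the published digest, e.g.:
# expected = "5ad7205d3f10ee32e23bbce7b3a5a7ab75675ae7597b363bf68064bea6352f01"
# assert sha256_of_file("vocode-0.1.114a0.tar.gz") == expected
```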

File details

Details for the file vocode-0.1.114a0-py3-none-any.whl.

File metadata

  • Download URL: vocode-0.1.114a0-py3-none-any.whl
  • Upload date:
  • Size: 4.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.11.9 Linux/6.5.0-1022-azure

File hashes

Hashes for vocode-0.1.114a0-py3-none-any.whl
Algorithm Hash digest
SHA256 92a0bf8d311e44da8701e6bbd8f39508f74e3ea12bd651c43bc66f118cdf95b9
MD5 1a0506be1584d36c7f6d6d5a978d80bc
BLAKE2b-256 f103dd0044b07d04100309e0e38c8b195777ee952a9756889fe3f282d8aae6f2

See more details on using hashes here.
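The published hashes can also be enforced at install time with pip's hash-checking mode. A sketch of a requirements file pinning the digests from above:

```shell
# requirements.txt — install with: pip install --require-hashes -r requirements.txt
vocode==0.1.114a0 \
    --hash=sha256:5ad7205d3f10ee32e23bbce7b3a5a7ab75675ae7597b363bf68064bea6352f01 \
    --hash=sha256:92a0bf8d311e44da8701e6bbd8f39508f74e3ea12bd651c43bc66f118cdf95b9
```

Note that pip's `--require-hashes` mode requires hashes for every package in the requirements file, including transitive dependencies.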
