LlamaIndex x OpenAI Realtime integration

This package integrates LlamaIndex with the OpenAI Realtime Conversation service.

To install the package, run:

python3 -m pip install llama-index-voice-agents-openai

To run it, refer to the simple example below. In this example, audio input and output use the local device that runs the script:

import asyncio
import os
import signal
import time
import logging

from typing import List
from dotenv import load_dotenv
from llama_index.voice_agents.openai import OpenAIVoiceAgent
from llama_index.core.voice_agents import BaseVoiceAgentEvent
from llama_index.core.llms import ChatMessage, TextBlock
from llama_index.core.tools import FunctionTool

logging.basicConfig(
    level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s"
)

# Load environment variables from a .env file
load_dotenv()

quitFlag = False


# use filter functions to export messages and events without your terminal being swamped by base64-encoded audio bytes :)
def filter_events(
    events: List[BaseVoiceAgentEvent],
) -> List[BaseVoiceAgentEvent]:
    evs = []
    for event in events:
        if "audio" in event.type_t and "transcript" not in event.type_t:
            if "delta" in event.type_t:
                event.delta = ""
                evs.append(event)
        else:
            evs.append(event)
    return evs


def filter_messages(messages: List[ChatMessage]) -> List[ChatMessage]:
    msgs = []
    for message in messages:
        msg = ChatMessage(role=message.role, blocks=[])
        for b in message.blocks:
            if isinstance(b, TextBlock):
                msg.blocks.append(b)
        if len(msg.blocks) > 0:
            msgs.append(msg)
    return msgs


def signal_handler(sig, frame):
    """Handle Ctrl+C and initiate graceful shutdown."""
    global quitFlag
    quitFlag = True


async def main():
    # this is just a mock tool
    def get_weather(location: str):
        """
        Get the weather for a location
        """
        return f"The weather in {location} is sunny: the temperature is 32°C, the wind is 3 km/h toward the west, and the humidity is 34%."

    api_key = os.getenv("OPENAI_API_KEY")

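    if not api_key:
        # Fail loudly instead of returning silently
        logging.error("OPENAI_API_KEY is not set; cannot start the voice agent.")
        return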
    tools = [
        FunctionTool.from_defaults(
            fn=get_weather,
            name="get_weather",
            description="Get the current weather...",
        )
    ]
    # use custom models and tools!
    conversation = OpenAIVoiceAgent(
        api_key=api_key,
        tools=tools,
        model="gpt-4o-mini-realtime-preview-2024-12-17",
    )

    # Register the Ctrl+C handler directly; a wrapper lambda is not needed
    signal.signal(signal.SIGINT, signal_handler)

    try:
        # customize the model with instructions and other arguments you can find here: https://platform.openai.com/docs/api-reference/realtime-client-events/session/update
        await conversation.start(
            instructions="You are a very helpful assistant. You can use the 'get_weather' tool to get weather information about a location: you just have to input the location to the tool. Please execute the tool whenever you are asked about the weather of a location.",
        )
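        # Poll the shutdown flag without blocking the event loop:
        # time.sleep() would block the loop and every task scheduled on it
        while not quitFlag:
            await asyncio.sleep(0.1)
        await conversation.interrupt()
        await conversation.stop()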

    except Exception as e:
        logging.error(f"Error in main loop: {e}")
        await conversation.interrupt()
        await conversation.stop()

    finally:
        logging.info("Exiting main.")
        print("Messages:")
        print(conversation.export_messages(filter=filter_messages))
        print("Events:")
        print(conversation.export_events(filter=filter_events))
        await conversation.stop()  # Ensures cleanup if any error occurs


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
    # remember that YOU have to start the conversation!
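
The main loop in the example waits on a shutdown flag from inside a coroutine. The sketch below isolates that pattern using only the standard library, so it can be tried without an API key or audio device. The names `worker`, `run_until_flag`, and `demo` are hypothetical and are not part of this package; the timer that sets the flag stands in for Ctrl+C. It is a minimal illustration of polling with `asyncio.sleep`, not the package's own shutdown logic.

```python
import asyncio
import contextlib


async def worker(ticks: list) -> None:
    """Stand-in for background work (hypothetical; not part of the package)."""
    while True:
        ticks.append(1)
        await asyncio.sleep(0.01)


async def run_until_flag(stop: asyncio.Event, poll: float = 0.05) -> int:
    """Poll a stop flag without blocking the event loop; return the tick count."""
    ticks: list = []
    task = asyncio.create_task(worker(ticks))
    try:
        while not stop.is_set():
            # asyncio.sleep yields control, so `worker` keeps running.
            # time.sleep here would freeze every task on the loop.
            await asyncio.sleep(poll)
    finally:
        # Cancel the background task and wait for it to finish.
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task
    return len(ticks)


async def demo() -> int:
    stop = asyncio.Event()
    # Simulate Ctrl+C by setting the flag after 0.2 seconds.
    asyncio.get_running_loop().call_later(0.2, stop.set)
    return await run_until_flag(stop)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the loop yields on every iteration, the background task keeps making progress until the flag is set. An `asyncio.Event` is used here instead of a module-level boolean only to keep the sketch self-contained.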

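To see what the event filter in the example does to a list of events, the following sketch applies the same branching to stand-in objects. `FakeEvent` and `strip_audio_payloads` are hypothetical names, and the event type strings are illustrative assumptions rather than values confirmed for this package. Unlike `filter_events`, the sketch copies events instead of mutating them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeEvent:
    """Hypothetical stand-in for a voice agent event; not the library class."""

    type_t: str
    delta: str = ""


def strip_audio_payloads(events: List[FakeEvent]) -> List[FakeEvent]:
    """Apply the same branching as filter_events to stand-in events."""
    kept = []
    for event in events:
        if "audio" in event.type_t and "transcript" not in event.type_t:
            if "delta" in event.type_t:
                # Keep the event but blank its base64 payload.
                kept.append(FakeEvent(type_t=event.type_t, delta=""))
            # Other non-transcript audio events are dropped.
        else:
            kept.append(event)
    return kept


# Illustrative event types (assumed, not taken from the package).
sample = [
    FakeEvent("response.audio.delta", "UklGRiQAAABXQVZF"),
    FakeEvent("response.audio_transcript.delta", "Hello"),
    FakeEvent("input_audio_buffer.append", "AAAA"),
    FakeEvent("response.done"),
]
filtered = strip_audio_payloads(sample)

if __name__ == "__main__":
    for event in filtered:
        print(event)
```

With this branching, audio delta events survive with an empty payload, transcript events pass through unchanged, and audio events whose type contains neither "delta" nor "transcript" are left out of the export.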

