LlamaIndex x OpenAI Realtime integration

This package is an integration for the OpenAI Realtime Conversation service.

To install the package, run:

python3 -m pip install llama-index-voice-agents-openai
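
As a quick sanity check after installing, the standard library can report which version of the distribution is present. This is only a minimal sketch: the helper name `installed_version` is made up for illustration and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string if the install succeeded, otherwise None.
    print(installed_version("llama-index-voice-agents-openai"))
```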

To run it, you can start from the simple example below. Here the audio input and output are those of the local device running the script:

import os
import signal
import time
import logging

from typing import List
from dotenv import load_dotenv
from llama_index.voice_agents.openai import OpenAIVoiceAgent
from llama_index.core.voice_agents import BaseVoiceAgentEvent
from llama_index.core.llms import ChatMessage, TextBlock
from llama_index.core.tools import FunctionTool

logging.basicConfig(
    level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s"
)

# Load environment variables from a .env file
load_dotenv()

quitFlag = False


# use filter functions to export messages and events without your terminal being swamped by base64-encoded audio bytes :)
def filter_events(
    events: List[BaseVoiceAgentEvent],
) -> List[BaseVoiceAgentEvent]:
    evs = []
    for event in events:
        if "audio" in event.type_t and "transcript" not in event.type_t:
            if "delta" in event.type_t:
                event.delta = ""
                evs.append(event)
        else:
            evs.append(event)
    return evs


def filter_messages(messages: List[ChatMessage]) -> List[ChatMessage]:
    msgs = []
    for message in messages:
        msg = ChatMessage(role=message.role, blocks=[])
        for b in message.blocks:
            if isinstance(b, TextBlock):
                msg.blocks.append(b)
        if len(msg.blocks) > 0:
            msgs.append(msg)
    return msgs


def signal_handler(sig, frame):
    """Handle Ctrl+C and initiate graceful shutdown."""
    global quitFlag
    quitFlag = True


async def main():
    # this is just a mock tool
    def get_weather(location: str):
        """
        Get the weather for a location
        """
        return f"The weather in {location} is sunny: the temperature is 32°C, the wind is 3 km/h toward the west, and humidity is at 34%."

    api_key = os.getenv("OPENAI_API_KEY")

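    if not api_key:
        # fail loudly instead of exiting silently when the key is missing
        logging.error("OPENAI_API_KEY is not set; exiting.")
        return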
    tools = [
        FunctionTool.from_defaults(
            fn=get_weather,
            name="get_weather",
            description="Get the current weather...",
        )
    ]
    # use custom models and tools!
    conversation = OpenAIVoiceAgent(
        api_key=api_key,
        tools=tools,
        model="gpt-4o-mini-realtime-preview-2024-12-17",
    )

    # register the handler directly; no lambda wrapper is needed
    signal.signal(signal.SIGINT, signal_handler)

    try:
        # customize the model with instructions and other arguments you can find here: https://platform.openai.com/docs/api-reference/realtime-client-events/session/update
        await conversation.start(
            instructions="You are a very helpful assistant. You can use the 'get_weather' tool to get weather information about a location: you just have to input the location to the tool. Please execute the tool whenever you are asked about the weather of a location.",
        )
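        while not quitFlag:
            # yield to the event loop instead of blocking it with time.sleep
            await asyncio.sleep(0.1)
        # the loop only exits once quitFlag is set, so shut down here
        await conversation.interrupt()
        await conversation.stop()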

    except Exception as e:
        logging.error(f"Error in main loop: {e}")
        await conversation.interrupt()
        await conversation.stop()

    finally:
        logging.info("Exiting main.")
        print("Messages:")
        print(conversation.export_messages(filter=filter_messages))
        print("Events:")
        print(conversation.export_events(filter=filter_events))
        await conversation.stop()  # Ensures cleanup if any error occurs


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
    # remember that YOU have to start the conversation!
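
To see what `filter_events` does without opening a realtime session, the same logic can be run on stand-in objects. In the sketch below, `FakeEvent` is a hypothetical placeholder that only mimics the two attributes the filter reads (`type_t` and `delta`). The event type strings are illustrative assumptions, not a list of what the service actually emits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeEvent:
    # Hypothetical stand-in for BaseVoiceAgentEvent; not part of the library.
    type_t: str
    delta: Optional[str] = None


def filter_events(events: List[FakeEvent]) -> List[FakeEvent]:
    """Same logic as the example: blank audio deltas, drop other audio events."""
    kept = []
    for event in events:
        if "audio" in event.type_t and "transcript" not in event.type_t:
            if "delta" in event.type_t:
                event.delta = ""  # strip the base64 payload, keep the event
                kept.append(event)
            # audio events that carry no delta are dropped entirely
        else:
            kept.append(event)
    return kept


sample = [
    FakeEvent("response.audio.delta", delta="AAAA...base64..."),
    FakeEvent("response.audio.done"),
    FakeEvent("response.audio_transcript.delta", delta="Hello"),
    FakeEvent("session.created"),
]

for ev in filter_events(sample):
    print(ev.type_t, repr(ev.delta))
```

Transcript events pass through untouched because their type contains "transcript", so the exported log stays readable while still recording that audio was exchanged.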

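The example above detects Ctrl+C by polling a global flag. An alternative, sketched here and not taken from the package, is to wait on an `asyncio.Event` that a signal handler sets, so the coroutine suspends until shutdown is requested. `wait_for_shutdown` is a hypothetical helper name, the voice agent calls are left out so the sketch runs without an API key, and a timer stands in for the real interrupt.

```python
import asyncio
import signal


async def wait_for_shutdown(stop_event: asyncio.Event) -> None:
    """Suspend until stop_event is set, leaving the event loop free."""
    await stop_event.wait()


async def main() -> str:
    stop_event = asyncio.Event()
    loop = asyncio.get_running_loop()
    try:
        # Unix-like platforms: set the event when Ctrl+C (SIGINT) arrives.
        loop.add_signal_handler(signal.SIGINT, stop_event.set)
    except (NotImplementedError, RuntimeError, ValueError):
        # Not supported here (e.g. Windows loops or a non-main thread);
        # signal.signal with a flag, as in the example, is the fallback.
        pass

    # Stand-in for a real interrupt so the sketch terminates by itself.
    loop.call_later(0.2, stop_event.set)

    await wait_for_shutdown(stop_event)
    # This is the point where the agent would be interrupted and stopped.
    return "stopped"


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Whether this fits depends on how the agent schedules its own work, so treat it as one possible design rather than the recommended one.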