OpenAI Plugin for Vision Agents

OpenAI LLM integration for the Vision Agents framework, with support for both standard and realtime interactions.

It enables features such as:

  • Real-time transcription and language processing using OpenAI models
  • Easy integration with other Vision Agents plugins and services
  • Function calling capabilities for dynamic interactions

Installation

pip install "vision-agents[openai]"

Usage

Standard LLM

This example shows how to use the "gpt-4.1" model with TTS and STT services for audio communication via the openai.LLM() API.

The openai.LLM() class uses OpenAI's Responses API under the hood.

To work with models via the legacy Chat Completions API, see the Chat Completions models section below.

from vision_agents.core import User, Agent
from vision_agents.plugins import deepgram, getstream, cartesia, smart_turn, openai

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Friendly AI"),
    instructions="Be nice to the user",
    llm=openai.LLM("gpt-4.1"),
    tts=cartesia.TTS(),
    stt=deepgram.STT(),
    turn_detection=smart_turn.TurnDetection(),
)

Realtime LLM

Realtime audio and video communication is also supported via the Realtime class. In this mode, the model processes audio and video directly, so separate TTS and STT services are not needed.

from vision_agents.core import User, Agent
from vision_agents.plugins import getstream, openai

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Friendly AI"),
    instructions="Be nice to the user",
    llm=openai.Realtime(),
)

Chat Completions models

The openai.ChatCompletionsLLM and openai.ChatCompletionsVLM classes provide APIs for text and vision models that use the Chat Completions API.

They are compatible with popular inference backends such as vLLM, TGI, and Ollama.

For example, you can use them to interact with the Qwen 3 VL vision model hosted on Baseten:

from vision_agents.core import User, Agent
from vision_agents.plugins import deepgram, getstream, elevenlabs, vogent, openai

# Instantiate the visual model wrapper
llm = openai.ChatCompletionsVLM(model="qwen3vl")

# Create an agent with video understanding capabilities
agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Video Assistant", id="agent"),
    instructions="You're a helpful video AI assistant. Analyze the video frames and respond to user questions about what you see.",
    llm=llm,
    stt=deepgram.STT(),
    tts=elevenlabs.TTS(),
    turn_detection=vogent.TurnDetection(),
    processors=[],
)

For full code, see examples/qwen_vl_example.
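Regardless of which backend serves the model, a Chat Completions vision request carries the same JSON shape. The sketch below shows a minimal payload following the public Chat Completions specification; the "qwen3vl" model name and the image URL are placeholders, and the exact payload these wrapper classes construct is an implementation detail.

```python
# Minimal sketch of a vision-enabled Chat Completions request body.
# Field names follow the public Chat Completions specification; the
# model name and image URL here are placeholders.
payload = {
    "model": "qwen3vl",
    "messages": [
        {
            "role": "user",
            "content": [
                # A text part and an image part in a single user turn.
                {"type": "text", "text": "What is happening in this frame?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/frame.jpg"}},
            ],
        }
    ],
}

# Any OpenAI-compatible backend (vLLM, TGI, Ollama) accepts this shape
# at its /v1/chat/completions endpoint.
print(payload["messages"][0]["content"][1]["type"])
```

This shared request shape is what makes the classes interchangeable across hosted and self-hosted backends.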

Function Calling

The LLM and Realtime APIs support function calling, allowing the assistant to invoke custom functions you define.

This enables dynamic interactions like:

  • Database queries
  • API calls to external services
  • File operations
  • Custom business logic

from vision_agents.plugins import openai

llm = openai.LLM("gpt-4.1")
# Or use openai.Realtime() for a realtime model

@llm.register_function(
    name="get_weather",
    description="Get the current weather for a given city"
)
def get_weather(city: str) -> dict:
    """Get weather information for a city."""
    return {
        "city": city,
        "temperature": 72,
        "condition": "Sunny"
    }

# The function will be called automatically when the model decides to use it
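Because register_function wraps an ordinary Python function, the handler remains directly callable, which makes it easy to unit-test in isolation. The sketch below uses the same placeholder weather data as above, without the decorator:

```python
def get_weather(city: str) -> dict:
    """Get weather information for a city (static placeholder data)."""
    return {
        "city": city,
        "temperature": 72,
        "condition": "Sunny",
    }

# The handler can be exercised like any plain function, no LLM involved.
result = get_weather("Boston")
print(result["city"], result["condition"])  # Boston Sunny
```

Keeping tool handlers free of LLM-specific state like this lets you test the business logic separately from the function-calling plumbing.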

Requirements

  • Python 3.10+
  • GetStream account for video calls
  • OpenAI API key

License

MIT
