TwelveLabs Pegasus video understanding plugin for Vision Agents

TwelveLabs Plugin

This plugin brings TwelveLabs Pegasus video understanding to Vision Agents as a first-class VideoLLM.

Unlike frame-by-frame VLMs, Pegasus analyzes a short video clip, so it can reason about motion and events over time ("what just happened?") rather than a single still frame. Recent frames from the watched track are buffered, encoded into a short MP4 clip on demand, uploaded to the TwelveLabs Assets API, and analyzed with your prompt. The streamed answer is vocalized by the agent's TTS service.
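The buffering step described above can be pictured as a rolling window over timestamped frames. The following is a minimal sketch, not the plugin's actual implementation: `TimedFrame` and `RollingFrameBuffer` are hypothetical names introduced only for illustration, and the frame payload is a placeholder for real pixel data.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TimedFrame:
    """Hypothetical frame record: a capture time plus placeholder pixel data."""
    timestamp: float  # seconds
    data: bytes


class RollingFrameBuffer:
    """Hypothetical helper that keeps only frames from the last `clip_seconds`."""

    def __init__(self, clip_seconds: float) -> None:
        self.clip_seconds = clip_seconds
        self._frames: deque[TimedFrame] = deque()

    def add(self, frame: TimedFrame) -> None:
        self._frames.append(frame)
        # Drop frames that have fallen out of the clip window.
        cutoff = frame.timestamp - self.clip_seconds
        while self._frames and self._frames[0].timestamp < cutoff:
            self._frames.popleft()

    def snapshot(self) -> list[TimedFrame]:
        """Return the frames that would be encoded into the next clip."""
        return list(self._frames)


# Feed ten seconds of frames at 1 fps into a 5-second window.
buffer = RollingFrameBuffer(clip_seconds=5)
for second in range(10):
    buffer.add(TimedFrame(timestamp=float(second), data=b""))

clip_timestamps = [frame.timestamp for frame in buffer.snapshot()]
print(clip_timestamps)
```

Only the most recent window survives, so memory use stays bounded no matter how long the call runs. Encoding the snapshot to MP4 and uploading it are left out here because they depend on the plugin's internals.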

Installation

uv add "vision-agents[twelvelabs]"
# or directly
uv add vision-agents-plugins-twelvelabs

You can get a free API key at https://twelvelabs.io, which offers a free tier.

Quick Start

import asyncio
import os
from dotenv import load_dotenv
from vision_agents.core import User, Agent, Runner
from vision_agents.core.agents import AgentLauncher
from vision_agents.plugins import deepgram, getstream, elevenlabs, twelvelabs
from vision_agents.plugins.getstream import CallSessionParticipantJoinedEvent

load_dotenv()


async def create_agent(**kwargs) -> Agent:
    llm = twelvelabs.PegasusVLM(
        api_key=os.getenv("TWELVELABS_API_KEY"),  # optional: if omitted, the plugin reads TWELVELABS_API_KEY itself
    )

    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="My happy AI friend", id="agent"),
        llm=llm,
        tts=elevenlabs.TTS(),
        stt=deepgram.STT(),
    )
    return agent


async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    call = await agent.create_call(call_type, call_id)

    @agent.events.subscribe
    async def on_participant_joined(event: CallSessionParticipantJoinedEvent):
        if event.participant.user.id != "agent":
            await asyncio.sleep(5)  # let a few seconds of video buffer
            await agent.simple_response("Describe what just happened in the video")

    async with agent.join(call):
        await agent.finish()


if __name__ == "__main__":
    Runner(AgentLauncher(create_agent=create_agent, join_call=join_call)).cli()

Configuration

PegasusVLM Parameters

  • api_key: str - TwelveLabs API key. If omitted, it is read from the TWELVELABS_API_KEY environment variable.
  • model_name: str - Pegasus model identifier (default: "pegasus1.5").
  • fps: float - Frame sampling rate for the buffered clip, in frames per second (default: 1.0).
  • clip_seconds: int - Length, in seconds, of the clip analyzed per request. Pegasus requires at least 4 seconds of video (default: 5).
  • max_tokens: int - Maximum number of tokens in the response. Pegasus requires at least 512 (default: 512).
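The documented minimums can be checked before an agent is started. This is a hedged sketch: `PegasusSettings` and its `problems` method are hypothetical and not part of the plugin. They only mirror the parameter names, defaults, and limits listed above.

```python
from dataclasses import dataclass

MIN_CLIP_SECONDS = 4  # documented Pegasus minimum clip length
MIN_MAX_TOKENS = 512  # documented Pegasus minimum for max_tokens


@dataclass(frozen=True)
class PegasusSettings:
    """Hypothetical container mirroring the documented parameters and defaults."""
    model_name: str = "pegasus1.5"
    fps: float = 1.0
    clip_seconds: int = 5
    max_tokens: int = 512

    def problems(self) -> list[str]:
        """Return one human-readable message per violated documented limit."""
        issues = []
        if self.clip_seconds < MIN_CLIP_SECONDS:
            issues.append(f"clip_seconds must be >= {MIN_CLIP_SECONDS}")
        if self.max_tokens < MIN_MAX_TOKENS:
            issues.append(f"max_tokens must be >= {MIN_MAX_TOKENS}")
        return issues


default_problems = PegasusSettings().problems()
bad_problems = PegasusSettings(clip_seconds=2, max_tokens=128).problems()
print(default_problems)
print(bad_problems)
```

When the list comes back empty, the same values can be passed as keyword arguments to twelvelabs.PegasusVLM. Whether the plugin performs its own validation is not stated here, so treat this as an optional pre-check.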

Notes

  • Pegasus requires a minimum resolution of 360x360; lower-resolution frames are scaled up to that floor on encode.
  • Pegasus requires the analyzed clip to be at least 4 seconds long, so clip_seconds must be >= 4.
  • Each request uploads a clip and runs server-side analysis, so latency is higher than with single-frame VLMs. Tune fps and clip_seconds for your use case: fewer frames mean a smaller clip to encode and upload.
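To get a feel for these constraints, the sketch below estimates the encoded frame size and the number of frames per clip. It rests on two assumptions that the notes above do not confirm: that upscaling preserves the aspect ratio, and that a clip contains roughly fps × clip_seconds frames. Both function names are hypothetical.

```python
import math

MIN_SIDE = 360  # documented minimum resolution is 360x360


def scaled_size(width: int, height: int, min_side: int = MIN_SIDE) -> tuple[int, int]:
    """Scale up so both sides reach min_side, keeping the aspect ratio (assumed strategy)."""
    shortest = min(width, height)
    if shortest >= min_side:
        return width, height
    # Integer ceiling division avoids floating-point rounding surprises.
    new_width = (width * min_side + shortest - 1) // shortest
    new_height = (height * min_side + shortest - 1) // shortest
    return new_width, new_height


def frames_per_clip(fps: float, clip_seconds: int) -> int:
    """Approximate number of frames in one clip (assumed to be fps * clip_seconds)."""
    return math.ceil(fps * clip_seconds)


small = scaled_size(320, 240)
large = scaled_size(640, 480)
default_frames = frames_per_clip(1.0, 5)
print(small, large, default_frames)
```

Raising fps or clip_seconds increases the frame count linearly, which is one way to reason about the latency trade-off before experimenting on a live call.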

Testing

# Unit tests (no API key needed)
uv run pytest plugins/twelvelabs/tests -m "not integration"

# Integration tests (needs TWELVELABS_API_KEY)
export TWELVELABS_API_KEY="your-key-here"
uv run pytest plugins/twelvelabs/tests -m integration
