A model-agnostic, extensible package for AI and LLM interactions on a video stream

Project description

AI Stream Interact🧠🎞️ - LLM interaction capabilities on live USB camera video stream.

This package can easily be extended to accommodate different LLMs, but in this first version interactions are implemented only for Google's Gemini Pro and Gemini Pro Vision models.

Note: This is a basic alpha version that has been written and tested on Ubuntu Linux only, so it may behave unexpectedly on other operating systems.



Installation:

  • pip install ai-stream-interact

Note that pip install might take a while as it will also install coqui-ai for Text to Speech. Although TTS is partially implemented it is not turned on by default due to some glitchy behavior. (This will be fixed in future releases.)

Example Usage:

  1. You need a Gemini API key. (if you don't already have one you can get one here).
  2. Have a USB camera connected.
  3. Run aisi --llm gemini to enter the AI Stream Interact🧠🎞️ main menu. (You can always return to the main menu from the video stream by pressing "m" while the video stream window is focused.)
  4. Enter the API key or press enter if you've added it to .env.
  5. Enter your camera index when prompted. There is currently no straightforward way to map a camera's name to its index because of how OpenCV enumerates such indices, so if you have multiple cameras connected you will have to try a few values until you get the right one. If you have only one camera connected, try passing "-1", as in most cases it will just pick that one.
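
Because finding the index is trial and error, a small probe can narrow the candidates before you start aisi. The sketch below is not part of ai-stream-interact; find_working_cameras and its parameters are hypothetical names, and it assumes the opencv-python package is installed. It tries each index in turn and reports those that actually deliver a frame.

```python
def find_working_cameras(max_index=5, open_capture=None):
    """Return the indices in range(max_index) that deliver at least one frame.

    open_capture is a hypothetical hook: any callable that takes an index and
    returns an object with isOpened(), read() and release(). It defaults to
    cv2.VideoCapture.
    """
    if open_capture is None:
        import cv2  # imported lazily so the helper loads without OpenCV
        open_capture = cv2.VideoCapture

    working = []
    for index in range(max_index):
        capture = open_capture(index)
        try:
            if capture.isOpened():
                ok, _frame = capture.read()  # read() returns (success, image)
                if ok:
                    working.append(index)
        finally:
            capture.release()  # always free the device before the next probe
    return working
```

Calling print(find_working_cameras()) lists the candidate indices. With several cameras attached you still need to check which physical device each index refers to.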

Now you're in! Three types of interaction are available as of today.

Detect Default:

This opens a window with your camera stream. Whenever you press "d", the model identifies the object the camera is looking at. (Make sure to press "d" with the camera window focused, not your terminal.)
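
The focus requirement follows from how OpenCV delivers key presses: cv2.waitKey only sees keys typed into an OpenCV window. The following is a minimal sketch of such a key-driven loop, not the package's actual implementation; action_for_key, run_stream and on_detect are hypothetical names, and opencv-python is assumed to be installed.

```python
def action_for_key(key_code):
    """Map a cv2.waitKey() return value to an action name, or None."""
    if key_code < 0:  # waitKey returns -1 when no key was pressed
        return None
    key = chr(key_code & 0xFF).lower()  # keep the low byte, ignore case
    return {"d": "detect", "m": "menu"}.get(key)


def run_stream(camera_index, on_detect):
    """Show the stream; call on_detect(frame) on "d" and stop on "m"."""
    import cv2  # imported lazily so action_for_key works without OpenCV

    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # camera unplugged or no frame available
                break
            cv2.imshow("stream", frame)
            action = action_for_key(cv2.waitKey(1))
            if action == "detect":
                on_detect(frame)
            elif action == "menu":
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Keeping the key mapping in its own function makes it easy to add further keys later without touching the capture loop.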

Detect with Custom Prompt:

Use this to write a custom prompt before showing the model an object, for custom interactions beyond just identifying objects.
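
For readers curious what a custom-prompt request might look like underneath, here is a hedged sketch that calls the google-generativeai client directly rather than going through ai-stream-interact. The helper names are hypothetical, the model name "gemini-pro-vision" is an assumption that may not match what the package uses, and the Pillow and opencv-python packages are assumed to be installed.

```python
def ask_about_image(model, prompt, image):
    """Send a text prompt plus one image to a multimodal model; return the reply text."""
    response = model.generate_content([prompt, image])
    return response.text


def make_gemini_vision_model(api_key, model_name="gemini-pro-vision"):
    """Build a Gemini model object (model_name is an assumption, adjust as needed)."""
    import google.generativeai as genai  # lazy import: only needed for real calls

    genai.configure(api_key=api_key)
    return genai.GenerativeModel(model_name)


def frame_to_pil(frame):
    """Convert an OpenCV BGR frame to the RGB PIL image the client accepts."""
    import cv2
    from PIL import Image

    return Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
```

A call such as ask_about_image(model, "What material is this object made of?", frame_to_pil(frame)) would then return the model's text answer for the current frame.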

Interactions:

This allows back-and-forth chat with the model.
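
As an illustration only, a back-and-forth chat can be sketched with the chat session of the google-generativeai client. This is not the package's code: chat_loop and start_gemini_chat are hypothetical names, the "quit"/"exit" words are an assumption made for this example, and "gemini-pro" is an assumed model name.

```python
def chat_loop(chat, read_line=input, write_line=print):
    """Relay user lines to chat.send_message() until the user types quit or exit."""
    while True:
        text = read_line("you> ").strip()
        if text.lower() in {"quit", "exit"}:  # hypothetical exit words
            break
        if not text:  # skip empty input instead of sending it
            continue
        write_line(chat.send_message(text).text)


def start_gemini_chat(api_key, model_name="gemini-pro"):
    """Open a chat session that keeps the conversation history."""
    import google.generativeai as genai  # lazy import: only needed for real calls

    genai.configure(api_key=api_key)
    return genai.GenerativeModel(model_name).start_chat(history=[])
```

Running chat_loop(start_gemini_chat(api_key)) starts a terminal conversation in which the session object carries earlier turns forward as context.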

Troubleshooting:

Errors:

  • google.api_core.exceptions.FailedPrecondition: 400 User location is not supported for the API use.: This is specific to Gemini, which is not currently available in all regions, so make sure your region is supported here.
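
If you call Gemini from your own code, the region error can be turned into a clearer message. The sketch below is an assumption about how one might do that, not something ai-stream-interact provides: call_with_region_hint is a hypothetical helper that wraps any function which calls the API.

```python
def call_with_region_hint(call, precondition_error=None):
    """Run call(); re-raise the region error with a more readable message.

    precondition_error is a hypothetical hook that defaults to
    google.api_core.exceptions.FailedPrecondition.
    """
    if precondition_error is None:
        from google.api_core.exceptions import FailedPrecondition
        precondition_error = FailedPrecondition
    try:
        return call()
    except precondition_error as exc:
        if "location is not supported" in str(exc).lower():
            raise RuntimeError(
                "Gemini is not available in your region; "
                "check Google's list of supported regions."
            ) from exc
        raise  # any other failed precondition is passed through unchanged
```

For example, call_with_region_hint(lambda: model.generate_content("hello")) returns the response as usual and raises the friendlier error only when the location check fails.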

Download files

Source Distribution

ai_stream_interact-0.0.9.tar.gz (23.7 kB)

Uploaded Source

Built Distribution

ai_stream_interact-0.0.9-py3-none-any.whl (25.3 kB)

Uploaded Python 3
