
A model-agnostic, extensible package that allows for AI & LLM interactions on a video stream.


AI Stream Interact🧠🎞️ - LLM interaction capabilities on a live USB camera video stream.

This package can easily be extended to accommodate different LLMs, but in this first version interactions are implemented only for Google's Gemini Pro & Gemini Pro Vision models.

Note: This is a basic alpha version that has been written & tested on Ubuntu Linux only, so it may behave unexpectedly on other operating systems.



Installation:

  • pip install ai-stream-interact

Note that pip install might take a while, as it will also install coqui-ai for text-to-speech. TTS is partially implemented but is not turned on by default because of some glitchy behavior. (This will be fixed in future releases.)

Example Usage:

  1. You need a Gemini API key. (If you don't already have one, you can get one here.)
  2. Have a USB camera connected.
  3. Run aisi_gemini to enter the AI Stream Interact🧠🎞️ main menu. (Note that you can always go back to the main menu from the video stream by pressing "m" while the video stream window is focused.)
  4. Enter the API key, or press Enter if you've added it to .env.
  5. You will be asked to enter your camera index. There is currently no straightforward way to map a camera's name to its index because of how OpenCV enumerates devices, so if you have multiple cameras connected you'll have to try a few indices until you find the right one. If you have only one camera connected, you can try passing "-1", as in most cases it'll just pick that one.
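If guessing the camera index gets tedious, the trial and error from step 5 can be scripted. The sketch below is not part of this package: find_working_cameras and its open_capture parameter are hypothetical names, and it assumes opencv-python is installed. It simply tries each index with cv2.VideoCapture and keeps the ones that deliver a frame.

```python
from typing import Callable, List, Optional


def _default_opener(index: int):
    # Imported lazily so the helper can be exercised without OpenCV installed.
    import cv2
    return cv2.VideoCapture(index)


def find_working_cameras(max_index: int = 5,
                         open_capture: Optional[Callable] = None) -> List[int]:
    """Return the indices in [0, max_index) that open and deliver a frame.

    open_capture is a hypothetical hook (not a package option) that lets
    you swap in a different capture factory, e.g. for testing.
    """
    opener = open_capture or _default_opener
    working = []
    for index in range(max_index):
        cap = opener(index)
        try:
            # A device can report "opened" and still fail to deliver frames,
            # so both checks are made before the index is accepted.
            if cap.isOpened():
                ok, _frame = cap.read()
                if ok:
                    working.append(index)
        finally:
            cap.release()
    return working


# Example (needs a connected camera and opencv-python):
# print(find_working_cameras())
```

Any index it returns is a candidate to enter at the camera index prompt. On some systems OpenCV prints warnings for indices that do not exist; those can be ignored.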

Now you're in! You have access to 3 types of interactions as of today.

Detect Default:

This opens a window with your camera stream; whenever you press "d", the model identifies the object the camera is looking at. (Make sure to press "d" with the camera window focused, not your terminal.)
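Why the window focus matters: in OpenCV, key presses are read with cv2.waitKey, which only sees keys typed while an OpenCV window has focus. The sketch below shows how such a key loop could look. It is an illustration and not this package's actual code: KEY_ACTIONS, action_for_key, run_loop and the action names are made up for the example.

```python
from typing import Optional

# Hypothetical action names; the package's internal names may differ.
KEY_ACTIONS = {"d": "detect", "m": "main_menu"}


def action_for_key(key_code: int) -> Optional[str]:
    """Translate a cv2.waitKey() return value into an action name, if any."""
    if key_code < 0:  # waitKey returns -1 when no key was pressed
        return None
    char = chr(key_code & 0xFF)  # keep only the low byte of the key code
    return KEY_ACTIONS.get(char)


def run_loop(camera_index: int = 0) -> None:
    """Show the stream and react to "d" and "m" (requires opencv-python)."""
    import cv2  # imported lazily so action_for_key works without OpenCV

    cap = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            cv2.imshow("stream", frame)
            action = action_for_key(cv2.waitKey(1))
            if action == "detect":
                print("the current frame would be sent to the model here")
            elif action == "main_menu":
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Keeping the key-to-action mapping in a small pure function makes it easy to add more keys later without touching the capture loop.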

Detect with Custom Prompt:

Use this to write a custom prompt before showing the model an object, for interactions that go beyond identifying objects.
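As a rough idea of what a custom-prompt detection involves, the sketch below pairs a prompt with an image and passes both to a model object. It is not this package's implementation: build_parts and ask_about_image are hypothetical helpers, and the only assumption about the model is that it exposes generate_content(parts) returning an object with a .text attribute, as the google-generativeai client does.

```python
from typing import Any, List


def build_parts(prompt: str, image: Any) -> List[Any]:
    """Put the custom prompt first, followed by the captured image."""
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    return [cleaned, image]


def ask_about_image(model: Any, prompt: str, image: Any) -> str:
    """Send prompt + image to a model exposing generate_content(parts).text."""
    response = model.generate_content(build_parts(prompt, image))
    return response.text


# Example with the google-generativeai client (assumed setup, not run here;
# the model name and file name are placeholders):
# import google.generativeai as genai
# from PIL import Image
# genai.configure(api_key="YOUR_API_KEY")
# model = genai.GenerativeModel("gemini-pro-vision")
# print(ask_about_image(model, "What brand is this?", Image.open("frame.jpg")))
```

Because the helper only relies on generate_content, the same function can be tried with a stand-in model object before spending API calls.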

Interactions:

This allows for back-and-forth chat with the model.

Troubleshooting:

Errors:

  • google.api_core.exceptions.FailedPrecondition: 400 User location is not supported for the API use.: This is specific to Gemini, which is not currently available in all regions, so you need to make sure your region is supported here.
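If you call Gemini from your own code, you may want to turn this error into a friendlier message. The sketch below is one possible way and not something this package ships: region_hint and REGION_MARKER are hypothetical names. It matches on the exception class name and message text so that it runs without importing google.api_core.

```python
from typing import Optional

# Text fragment taken from the error message quoted above.
REGION_MARKER = "User location is not supported"


def region_hint(exc: BaseException) -> Optional[str]:
    """Return a readable hint if exc looks like the unsupported-region error."""
    is_precondition = type(exc).__name__ == "FailedPrecondition"
    if is_precondition and REGION_MARKER in str(exc):
        return ("Gemini is not available in your region; "
                "check Google's list of supported regions.")
    return None


# Example (assumed usage around your own Gemini call):
# try:
#     response = model.generate_content("hello")
# except Exception as exc:
#     hint = region_hint(exc)
#     if hint is None:
#         raise
#     print(hint)
```

In code that already imports google.api_core, catching google.api_core.exceptions.FailedPrecondition directly is more precise than comparing class names.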

