Percept Vision — store videos in object storage + their CV understanding in Redis, then search and ask questions about them. MCP plugin.

Percept Vision 🎥: store video understanding in Redis, then ask questions about your videos (MCP)

The MP4 bytes live in object storage. The understanding lives in Redis — and that's what you query.

Installing percept-vision-plugin (pip install percept-vision-plugin) lets any agent ingest a video, run computer vision over it (OpenCV frame sampling, YOLO object detection, CLIP embeddings), and store the result in Redis. You can then find the exact moment something happens, list the objects that appear, and ask natural-language questions about your video library. The tools are exposed over the Model Context Protocol (MCP).

It's the multimodal layer of Percept Context — video becomes first-class, searchable nodes alongside your context graph, all in one Redis.

Why this exists

Redis's vector tooling is text-only out of the box: every shipping RedisVL vectorizer is a …Text… class, and there is no image or video vectorizer, no frame-level search, and no media model. Redis's own guidance also advises against putting MP4 bytes in Redis. Percept Vision splits the work accordingly:

Concern	How
Video bytes	Supabase Storage → a public URL (Redis never holds the file)
Frame understanding	OpenCV samples frames; YOLO detects objects; CLIP embeds them
Searchable index	RedisVL vector index of CLIP frame vectors (`percept_frames`)
"Find the moment"	CLIP text→image search returns timestamped, deep-linkable moments
"What's in it"	YOLO detections aggregated per video
"Ask about it"	retrieved frames + detections → optional LLM answer with timestamps

Pipeline

  ingest_video(path|url)
        │
        ├─►  Supabase Storage  ──►  public MP4 URL ─────────────┐
        │                                                       │
        └─►  OpenCV sample frames                               │
                 ├─►  YOLO  ──►  objects per frame              ▼
                 └─►  CLIP  ──►  512-d vector ──►  R E D I S  (percept_frames index)
                                                   pv:video:{id}  +  pv:frame:{id}
        ┌───────────────────────────────────────────────────────────────────────┐
   search_moments("a person holding a phone")  → CLIP text vector → top frames
   ask_video("what happens at the end?")       → frames + detections → answer
   list_video_objects(id)                      → {person: 12, laptop: 4, ...}
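
The `list_video_objects` step above boils per-frame detections down to one count per label. A minimal sketch of that aggregation, assuming each frame contributes a plain list of label strings (the `aggregate_objects` helper and the input shape are hypothetical, and the plugin may count differently, for example per frame rather than per detection):

```python
from collections import Counter


def aggregate_objects(frame_detections):
    """Sum per-frame label lists into one {label: count} mapping for a video."""
    counts = Counter()
    for labels in frame_detections:
        counts.update(labels)  # every detection in every frame adds 1
    # most_common() orders labels from most to least frequent
    return dict(counts.most_common())


# Three sampled frames with made-up detections
detections = [["person", "laptop"], ["person"], ["person", "phone"]]
summary = aggregate_objects(detections)
print(summary)
```

Under this counting rule, a label's total reflects how many times it was detected across sampled frames, not how many distinct objects exist in the video.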

Install

Requires Python ≥ 3.10, a Redis instance with the Search module (Redis Stack, Redis 8, or Redis Cloud), and a Supabase project. The first run downloads the YOLOv8n (~6 MB) and CLIP ViT-B-32 (~600 MB) model weights.

pip install percept-vision-plugin

Configure (.env or env vars):

SUPABASE_URL=https://YOUR_REF.supabase.co
SUPABASE_SERVICE_KEY=your_service_role_key   # bypasses RLS for uploads
SUPABASE_BUCKET=percept-videos               # public bucket
REDIS_URL=redis://localhost:6379
REDIS_PROTOCOL=2
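
Before running anything, it can help to confirm that the required variables are actually set. The check below is a small sketch, not part of the plugin: the `missing_settings` helper is hypothetical, and it treats only `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` as required, matching the configuration reference.

```python
import os

# Variables the configuration reference marks as required
REQUIRED = ("SUPABASE_URL", "SUPABASE_SERVICE_KEY")


def missing_settings(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("Required settings are present.")
```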

Try it:

python examples/quickstart.py /path/to/video.mp4

Register with Claude Code

claude mcp add percept-vision --scope local \
  --env SUPABASE_URL=... --env SUPABASE_SERVICE_KEY=... --env SUPABASE_BUCKET=percept-videos \
  --env REDIS_URL=redis://localhost:6379 --env REDIS_PROTOCOL=2 \
  -- percept-vision
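
Once registered, the client talks to the server over MCP, where a tool is invoked with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `search_moments`. The `tool_call` helper is hypothetical, and the argument names follow the plugin's tool signatures; an MCP client such as Claude Code normally constructs this message for you.

```python
import json


def tool_call(request_id, name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call(1, "search_moments", {"query": "a person holding a phone", "k": 5})
print(json.dumps(request, indent=2))
```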

Tools

Tool	What it does
`ingest_video(source, video_id?)`	Upload + CV-analyze a video (path or URL) into Redis.
`search_moments(query, k?, video_id?)`	Find moments matching a phrase → timestamps + deep links + objects.
`ask_video(question, video_id?, k?)`	Q&A over retrieved frames (returns an LLM-written answer if `ANTHROPIC_API_KEY` is set).
`list_video_objects(video_id)`	YOLO object counts for a video.
`list_videos()`	All ingested videos.
`vision_stats()`	Redis health + video count.
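
To illustrate what `search_moments` does conceptually, here is a toy ranking of frames by cosine similarity to a query vector. It is a sketch under stated assumptions, not the plugin's code: the real search runs against the RedisVL `percept_frames` index, while the `top_moments` helper, the record fields, the 3-d vectors (standing in for 512-d CLIP embeddings), and the `#t=` media-fragment link format are all invented for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_moments(query_vec, frames, k=3):
    """Rank frames by similarity to the query vector and return the best k."""
    scored = [
        {
            "video_id": f["video_id"],
            "timestamp": f["timestamp"],
            "objects": f["objects"],
            "score": cosine(query_vec, f["vector"]),
            # Assumed link format: a media fragment that seeks to the timestamp
            "link": f"{f['url']}#t={f['timestamp']:.1f}",
        }
        for f in frames
    ]
    return sorted(scored, key=lambda m: m["score"], reverse=True)[:k]


URL = "https://example.com/demo.mp4"  # placeholder, not a real video
FRAMES = [
    {"video_id": "demo", "timestamp": 0.0, "url": URL, "objects": ["laptop"], "vector": [1.0, 0.0, 0.0]},
    {"video_id": "demo", "timestamp": 1.0, "url": URL, "objects": ["person", "phone"], "vector": [0.0, 1.0, 0.0]},
    {"video_id": "demo", "timestamp": 2.0, "url": URL, "objects": ["person"], "vector": [0.6, 0.8, 0.0]},
]

# In the real pipeline the query vector would come from CLIP's text encoder
best = top_moments([0.0, 1.0, 0.0], FRAMES, k=2)
for moment in best:
    print(moment["timestamp"], round(moment["score"], 3), moment["link"])
```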

Configuration reference

Env var	Default	Purpose
`SUPABASE_URL` / `SUPABASE_SERVICE_KEY`	– (required)	Object storage for MP4s.
`SUPABASE_BUCKET`	`percept-videos`	Public bucket name.
`REDIS_URL` / `REDIS_PROTOCOL`	`redis://localhost:6379` / `2`	Redis with Search.
`PERCEPT_YOLO_MODEL`	`yolov8n.pt`	Ultralytics model.
`PERCEPT_CLIP_MODEL`	`clip-ViT-B-32`	CLIP model (image+text).
`PERCEPT_FRAME_INTERVAL`	`1.0`	Seconds between sampled frames.
`PERCEPT_MAX_FRAMES`	`60`	Cap on frames per video.
`ANTHROPIC_API_KEY`	–	Enables written answers in `ask_video`.
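
`PERCEPT_FRAME_INTERVAL` and `PERCEPT_MAX_FRAMES` together decide which moments of a video get indexed. The sketch below shows one plausible reading of the two settings: sample at a fixed interval and stop once the cap is reached. This is an assumption; the plugin might instead spread the capped frames across the whole video. The `sample_timestamps` helper is hypothetical.

```python
def sample_timestamps(duration_s, interval_s=1.0, max_frames=60):
    """Timestamps (in seconds) at a fixed interval, stopping at the frame cap."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    stamps = []
    i = 0
    # Stop at the end of the video or at the cap, whichever comes first
    while i * interval_s < duration_s and len(stamps) < max_frames:
        stamps.append(round(i * interval_s, 3))
        i += 1
    return stamps


short_clip = sample_timestamps(5.0)    # defaults from the table above
long_clip = sample_timestamps(600.0)   # a 10-minute video hits the cap
print(short_clip, len(long_clip), long_clip[-1])
```

If the plugin does truncate this way, the defaults would cover only the first 60 seconds of a long video, so raising `PERCEPT_FRAME_INTERVAL` or `PERCEPT_MAX_FRAMES` may be worth considering for longer footage.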

License

MIT

Release history

0.1.0 (this version), released Jun 21, 2026

Download files

Source Distribution

percept_vision_plugin-0.1.0.tar.gz (11.7 kB)

Uploaded Jun 21, 2026 Source

Built Distribution

percept_vision_plugin-0.1.0-py3-none-any.whl (14.6 kB)

Uploaded Jun 21, 2026 Python 3

File details

Details for the file percept_vision_plugin-0.1.0.tar.gz.

File metadata

Download URL: percept_vision_plugin-0.1.0.tar.gz
Upload date: Jun 21, 2026
Size: 11.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for percept_vision_plugin-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`8c2a11f7943dd8d529f98b41421096bb415fbdff5d4845f163c8c1f5af6da477`
MD5	`068ee0f28a40d11e17d650ac3093f568`
BLAKE2b-256	`93e7622ecd17ed5f729a0f38bf8836d9a9c08568d2b96806255adbdeaa7920f6`

File details

Details for the file percept_vision_plugin-0.1.0-py3-none-any.whl.

File metadata

Download URL: percept_vision_plugin-0.1.0-py3-none-any.whl
Upload date: Jun 21, 2026
Size: 14.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for percept_vision_plugin-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`230661018add57f1f086c53b0fc98b273ecf2e368749c60482515c02eab89a73`
MD5	`7a5a14c019a2fa6c5ed53eebe960a2c4`
BLAKE2b-256	`3c6594c768644bc82776bf9722d9de44f1c239eaf5fb91a2283f8d47e09ac3c5`
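
A downloaded file can be checked against the SHA256 digests listed above with Python's standard `hashlib`. The sketch below is generic rather than specific to this package; the `sha256_of` and `matches` helpers and the file path in the usage note are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA256 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 equals the expected digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("percept_vision_plugin-0.1.0.tar.gz", "<SHA256 from the table>")` should return True for an intact download of the source distribution.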
