# av — Agentic Video Intelligence

Index once, search many — an agentic video memory CLI by Pixel ML.

Index. Search. Detect. A video intelligence toolkit for AI agents.

```bash
pip install pixelml-av
```
## What av Does

- **Video Memory** — ingest videos, search them with natural language, and ask questions with RAG citations.
- **Surveillance Intelligence** — detect falls, long queues, crowd gathering, and wheelchair compliance in CCTV footage using temporal reasoning.
## Quick Start

### Video Search

```bash
# 1. Set up your provider
av config setup

# 2. Ingest a video
av ingest video.mp4

# 3. Search
av search "person with red bag"

# 4. Ask questions
av ask "what happened at 2:30?"
```
### Surveillance Detection

```bash
# Cloud (quick start — Gemini free tier)
export AV_API_KEY=your-gemini-key
av sentinel video.mp4

# Local (free, private — runs on your Mac/GPU)
ollama pull mistral-small3.2
av sentinel video.mp4 --provider ollama

# Specific alerts
av sentinel video.mp4 --alerts FALL,LONG_QUEUE

# Batch a directory
av sentinel videos/ --camera cam_lobby
```
## All Commands

```bash
# Video memory
av ingest video.mp4              # Index video content
av search "what was discussed"   # Semantic search
av ask "key decisions?"          # RAG Q&A with citations
av list                          # List indexed videos
av transcript <id> --format vtt  # Get transcript
av export --format jsonl         # Export all data

# Surveillance intelligence
av sentinel video.mp4                # Detect events (all 4 alert types)
av sentinel video.mp4 --alerts FALL  # Fall detection only
av sentinel video.mp4 -p ollama      # Self-hosted (free)
av sentinel videos/ -c cam_lobby     # Batch with camera tracking
```
## Sentinel — Surveillance Event Detection

Detects four event types using temporal reasoning over VLM observations:

| Alert | Detection | How It Works |
|---|---|---|
| FALL | Position tracking | Standing→lying transition across frames (F1 = 0.944) |
| LONG_QUEUE | Temporal persistence | Queue detected in 3+ consecutive chunks (90 s) |
| CROWD_GATHERING | Density + growth | Sustained crowd or rapid person-count increase |
| WHEELCHAIR_COMPLIANCE | Service timing | Wheelchair user unattended beyond a threshold |
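Two of the temporal rules above can be sketched in a few lines of Python. This is an illustrative reconstruction from the table, not av's actual implementation — the observation format (`"standing"`/`"lying"` labels, per-chunk boolean queue flags) and function names are assumptions:

```python
def detect_fall(positions_per_chunk):
    """FALL rule sketch: a person transitions from standing to lying
    between consecutive observations."""
    for prev, curr in zip(positions_per_chunk, positions_per_chunk[1:]):
        if prev == "standing" and curr == "lying":
            return True
    return False


def detect_long_queue(queue_flags, min_chunks=3):
    """LONG_QUEUE rule sketch: queue observed in 3+ consecutive
    30 s chunks (~90 s of persistence)."""
    streak = 0
    for seen in queue_flags:
        streak = streak + 1 if seen else 0
        if streak >= min_chunks:
            return True
    return False


print(detect_fall(["standing", "standing", "lying"]))            # True
print(detect_long_queue([True, True, False, True, True, True]))  # True
```

The point of rules like these (rather than a single "detect anomalies" prompt) is that the VLM only has to report simple per-chunk facts; the temporal logic stays deterministic and testable.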
### Providers for Sentinel

| Provider | Setup | Cost | Speed |
|---|---|---|---|
| Gemini (cloud) | `export AV_API_KEY=key` | Free tier available | ~5 s/chunk |
| OpenRouter | `export OPENROUTER_API_KEY=key` | $0.04–0.14 per 1M tokens | ~10 s/chunk |
| Ollama (local) | `ollama pull mistral-small3.2` | Free | ~25 s/chunk |
| OpenAI | `export AV_API_KEY=key` | $$$ | ~5 s/chunk |
Auto-detection: if no provider is specified, av tries Gemini → OpenRouter → Ollama → OpenAI.
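That fallback order can be sketched as a small resolution function. This is a hypothetical illustration, not av's real logic — in particular, `AV_API_KEY` is shared by the Gemini and OpenAI rows above, so the sketch simply assumes it means Gemini (as in the cloud quick start) and takes Ollama availability as a flag:

```python
def resolve_provider(env, ollama_available=False):
    """Hypothetical sketch of av's provider fallback:
    Gemini -> OpenRouter -> Ollama -> OpenAI."""
    if env.get("AV_API_KEY"):
        return "gemini"
    if env.get("OPENROUTER_API_KEY"):
        return "openrouter"
    if ollama_available:
        return "ollama"
    return "openai"


print(resolve_provider({"OPENROUTER_API_KEY": "sk-or-..."}))  # openrouter
print(resolve_provider({}, ollama_available=True))            # ollama
```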
## How It Works

```text
Video → 30 s chunks (5 s overlap)
      → 8 frames per chunk
      → VLM perception (positions, queue, crowd, wheelchair)
      → temporal agent (state across chunks)
      → alert rules (transition detection, persistence, growth)
      → JSON output
```
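The chunking arithmetic implies chunk starts advance by 25 s (30 s length minus 5 s overlap). A quick sketch — the tail-boundary handling here is an assumption, not av's documented behavior:

```python
def chunk_spans(duration, chunk=30.0, overlap=5.0):
    """Return (start, end) spans: `chunk`-second windows whose starts
    advance by chunk - overlap seconds (25 s for the defaults above)."""
    step = chunk - overlap
    spans = []
    start = 0.0
    while start < duration:
        spans.append((start, min(start + chunk, duration)))
        if start + chunk >= duration:
            break  # last window already reaches the end of the video
        start += step
    return spans


print(chunk_spans(75.0))  # [(0.0, 30.0), (25.0, 55.0), (50.0, 75.0)]
```

With 8 frames sampled per 30 s chunk, that works out to roughly one frame every 4 s of footage.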
Built on 107 experiments across 21 vision models. Key insight: structured extraction plus temporal rules beats generic "detect anomalies" prompting.
## Configuration

### Interactive Setup (Recommended)

```bash
av config setup
```

Choose from four providers:

| # | Provider | Auth | Transcription | Embeddings |
|---|---|---|---|---|
| 1 | OpenAI (Codex OAuth) | Auto-detected | Whisper | text-embedding-3-small |
| 2 | OpenAI (API key) | `sk-...` key | Whisper | text-embedding-3-small |
| 3 | Anthropic (Claude) | API key | Not supported | Not supported |
| 4 | Google (Gemini) | API key | Not supported | text-embedding-004 |

Config is saved to `~/.config/av/config.json` and persists across sessions.

Note: Anthropic and Gemini don't support Whisper transcription. With these providers, use `av ingest --captions` for frame-based captioning, or set `AV_OPENAI_API_KEY` as a transcription fallback.
### Environment Variables

Env vars always override `config.json`:

```bash
export AV_API_KEY="sk-..."
export AV_API_BASE_URL="https://api.openai.com/v1"  # or any OpenAI-compatible endpoint
export AV_TRANSCRIBE_MODEL="whisper"
export AV_VISION_MODEL="gpt-4-1"
export AV_EMBED_MODEL="text-embedding-3-small"
export AV_CHAT_MODEL="gpt-4-1"
```
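The precedence rule (env vars beat the config file) can be sketched as follows. The env-var names come from the list above; the config-file key names are hypothetical, since the README doesn't document the `config.json` schema:

```python
import os


def effective_setting(name, config, env=None):
    """Resolve a setting: the AV_* env var wins; otherwise fall back to
    the config.json value (key names here are assumed, e.g. 'chat_model')."""
    env = os.environ if env is None else env
    return env.get(name) or config.get(name.lower().removeprefix("av_"))


config = {"chat_model": "gpt-4-1"}  # as loaded from ~/.config/av/config.json
print(effective_setting("AV_CHAT_MODEL", config, env={}))                      # gpt-4-1
print(effective_setting("AV_CHAT_MODEL", config, env={"AV_CHAT_MODEL": "o3"})) # o3
```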
## Requirements

- Python 3.11+
- FFmpeg (`brew install ffmpeg`)
- An API key from OpenAI, Anthropic, or Google — or Codex CLI OAuth
## Commands

| Command | Description |
|---|---|
| `av config setup` | Interactive provider setup wizard |
| `av config show` | Show current configuration |
| `av ingest <path>` | Ingest video file(s) into the index |
| `av search <query>` | Full-text + semantic search |
| `av ask <question>` | RAG Q&A with citations |
| `av list` | List all indexed videos |
| `av info <video_id>` | Detailed video metadata |
| `av transcript <id>` | Output transcript (VTT/SRT/text) |
| `av export` | Export as JSONL/VTT/SRT |
| `av open <id> --at <sec>` | Open video at timestamp |
| `av version` | Print version JSON |
## License

Apache License 2.0 — see LICENSE for details.
## File details

Details for the file `pixelml_av-0.1.0.tar.gz`.

- Size: 156.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

| Algorithm | Hash digest |
|---|---|
| SHA256 | `235b55a79b0a3494f3532ec673c2aa8e004ae4f52ef0c686af2696f8f41b0862` |
| MD5 | `d61bd8b2c53396cbc32514b12b74052c` |
| BLAKE2b-256 | `67e9f95a0d76889df076058fa383205493ccbd889043255b3f5e511392e80425` |
### Provenance

The following attestation bundle was made for `pixelml_av-0.1.0.tar.gz`:

- Publisher: `publish.yml` on PixelML/av
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pixelml_av-0.1.0.tar.gz
- Subject digest: 235b55a79b0a3494f3532ec673c2aa8e004ae4f52ef0c686af2696f8f41b0862
- Sigstore transparency entry: 1340659634
- Permalink: PixelML/av@f0b9ef4bcdf192615341a309c0b07c9a2c7450ed
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/PixelML
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f0b9ef4bcdf192615341a309c0b07c9a2c7450ed
- Trigger Event: release
## File details

Details for the file `pixelml_av-0.1.0-py3-none-any.whl`.

- Size: 60.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

| Algorithm | Hash digest |
|---|---|
| SHA256 | `ee1e5dff4c99c5122803517071e557806583af1b82ca8bf59587d9e89d3f8c99` |
| MD5 | `fb6fd60a0414212161508f67ec1ee827` |
| BLAKE2b-256 | `b87616a9d17f7454814b28402146e0aef9cd1214bdf3bcee2f1f18e65a9b622f` |
### Provenance

The following attestation bundle was made for `pixelml_av-0.1.0-py3-none-any.whl`:

- Publisher: `publish.yml` on PixelML/av
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pixelml_av-0.1.0-py3-none-any.whl
- Subject digest: ee1e5dff4c99c5122803517071e557806583af1b82ca8bf59587d9e89d3f8c99
- Sigstore transparency entry: 1340659637
- Permalink: PixelML/av@f0b9ef4bcdf192615341a309c0b07c9a2c7450ed
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/PixelML
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f0b9ef4bcdf192615341a309c0b07c9a2c7450ed
- Trigger Event: release