# VidChain

**High-Fidelity Multimodal RAG Framework for Forensic Video Intelligence**
VidChain is a local-first multimodal RAG framework powered by the IRIS Engine (Intelligent Retrieval & Insight System). It parses video through a modular sensory matrix — fusing visual, auditory, OCR, and temporal signals into a queryable intelligence layer — designed for forensic analysis, security auditing, and automated video summarization with strict on-device privacy.
## Features
- 4-Route Agentic Router — Classifies queries into Narrative Summarization, Local Forensic Search, Global Master Intelligence, and Conversational Dialogue.
- Global Master Intelligence — Cross-video entity tracking via a macro-graph, enabling pattern recognition across isolated sessions.
- Temporal Persistence — Chronological reasoning that bridges frame gaps and maintains state continuity between sensor logs.
- Recursive Map-Reduce Summarizer — Collapses hours of video into coherent reports without hitting LLM context limits.
- Neural Concurrency Locking — Prevents state corruption during simultaneous ingestion and query operations.
- 100% Local Execution — All inference runs on host hardware; no data leaves the machine.
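The recursive map-reduce idea behind the summarizer can be illustrated with a minimal stand-alone sketch. This is not VidChain's actual implementation: `summarize_chunk` is a placeholder for a real LLM call (here it just keeps the first sentence), but the batching-and-recursion shape is what keeps any single call under the context limit:

```python
def summarize_chunk(text: str) -> str:
    # Stand-in for an LLM summarization call: keep the first sentence.
    return text.split(". ")[0].rstrip(".") + "."

def map_reduce_summarize(chunks: list[str], max_batch: int = 3) -> str:
    # Base case: few enough chunks to combine in one pass.
    if len(chunks) <= max_batch:
        return summarize_chunk(" ".join(chunks))
    # Map: summarize each fixed-size batch. Reduce: recurse on the partials.
    batches = [chunks[i:i + max_batch] for i in range(0, len(chunks), max_batch)]
    partial = [summarize_chunk(" ".join(b)) for b in batches]
    return map_reduce_summarize(partial, max_batch)

chunks = [f"Segment {i} shows activity. More detail follows." for i in range(10)]
print(map_reduce_summarize(chunks))  # → Segment 0 shows activity.
```

Each level of recursion shrinks the input by roughly a factor of `max_batch`, so even hours of transcript converge to a single summary in a handful of passes.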
## Installation

### Prerequisites
| Requirement | Version |
|---|---|
| Python | 3.11+ |
| CUDA | 12.1+ |
| Ollama | Latest (running) |
| Node.js | v18+ (for web portal) |
### Steps

1. Install PyTorch with CUDA support:

   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
   ```

2. Clone and install VidChain:

   ```bash
   git clone https://github.com/rahulsiiitm/videochain-python
   cd videochain-python
   pip install -e .
   ```

3. Pull model weights:

   ```bash
   ollama pull moondream   # Vision Language Model
   ollama pull llama3      # Language Model for reasoning & routing
   ```

**CPU Fallback:** If no CUDA device is detected, VidChain automatically degrades to CPU mode — no code changes required.
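The fallback amounts to a capability probe at startup. This stand-alone sketch (not VidChain's actual code) uses the presence of `nvidia-smi` as a stdlib-only proxy; a real implementation would typically call `torch.cuda.is_available()` instead:

```python
import shutil

def pick_device() -> str:
    # Proxy check: a visible `nvidia-smi` binary suggests a CUDA-capable host.
    # In practice torch.cuda.is_available() is the authoritative test.
    return "cuda" if shutil.which("nvidia-smi") else "cpu"

print(f"Running inference on: {pick_device()}")
```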
## Quick Start

```python
from vidchain import VidChain

vc = VidChain(db_path="./forensic_vault")

# Ingest video (runs full default pipeline)
video_id = vc.ingest(video_source="interview_01.mp4")

# Query
response = vc.ask("What is the main topic of discussion?", video_id=video_id)
print(response["text"])

# Summarize
summary = vc.summarize_video(video_id=video_id, mode="concise")
print(summary)
```
## CLI Reference

### vidchain-serve

Launches the FastAPI backend and Next.js dashboard.

```bash
vidchain-serve
```

- API available at http://localhost:8000
- Dashboard opens at http://localhost:3000
- Includes a 7-second neural warmup before accepting requests
### vidchain-analyze

Headless video ingestion from the terminal.

```bash
vidchain-analyze path/to/video.mp4 --vlm moondream
```

| Flag | Description |
|---|---|
| `--vlm <model>` | Vision model to use (default: `moondream`) |
| `--llm <model>` | Reasoning model to use (default: `gemini/gemini-2.5-flash`) |
| `--fast` | Replaces the VLM with YOLO for high-speed detection (ideal for long CCTV footage) |
| `--emotion` | Injects a DeepFace emotion analysis node |
| `--action` | Injects a MobileNetV3 action classification node |
**Swapping models** — VidChain uses LiteLLM, so any compatible model can be hot-swapped:

```bash
# Local
vidchain-analyze video.mp4 --llm "ollama/llama3"

# Cloud (requires API key export)
export GEMINI_API_KEY="your_api_key"
vidchain-analyze video.mp4 --llm "gemini/gemini-2.5-flash"

# Custom VLM
vidchain-analyze video.mp4 --vlm "llava:7b"
```
## SDK: Modular Sensor Matrix
VidChain uses a LangChain-inspired composable pipeline. Each Node handles one sensing modality; chains are assembled per use case.
### Available Nodes

| Node | Modality | Description |
|---|---|---|
| `AdaptiveKeyframeNode` | Logic | Gaussian-differential sampling — drops redundant frames to reduce compute load |
| `LlavaNode` | Visual | Scene semantics, descriptive captions, and situational context |
| `YoloNode` | Visual | High-speed discrete object detection (lightweight fallback for `LlavaNode`) |
| `WhisperNode` | Audio | Speech transcription and acoustic anomaly detection (e.g., shouts) |
| `OcrNode` | Text | Digital trace extraction — license plates, screens, documents |
| `TrackerNode` | Motion | Persistent object tracking (IoU) and camera motion estimation (optical flow) |
| `EmotionNode` | Behavioral | Facial sentiment analysis |
| `ActionNode` | Behavioral | Human activity classification via MobileNetV3 |
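The adaptive keyframe idea can be approximated with a plain frame-difference filter. This is a simplified sketch, not `AdaptiveKeyframeNode`'s actual Gaussian-differential algorithm: frames here are flat lists of pixel intensities, and the threshold plays the role of `change_threshold`:

```python
def select_keyframes(frames: list[list[float]], change_threshold: float = 1.5) -> list[int]:
    """Keep a frame only when it differs enough from the last kept frame."""
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        prev = frames[kept[-1]]
        # Mean absolute pixel difference as a crude change score.
        score = sum(abs(a - b) for a, b in zip(frames[i], prev)) / len(prev)
        if score >= change_threshold:
            kept.append(i)
    return kept

static = [[10.0] * 4] * 5        # five identical frames
burst = static + [[20.0] * 4]    # sudden change in the last frame
print(select_keyframes(burst))   # → [0, 5]
```

Comparing against the last *kept* frame, rather than the immediately previous one, is what lets slow drifts accumulate until they eventually cross the threshold.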
### Custom Pipeline Example

```python
from vidchain import VidChain
from vidchain.pipeline import VideoChain
from vidchain.nodes import AdaptiveKeyframeNode, LlavaNode, OcrNode, TrackerNode

vc = VidChain(db_path="./forensic_vault")

surveillance_chain = VideoChain(nodes=[
    AdaptiveKeyframeNode(change_threshold=1.5),  # High sensitivity
    LlavaNode(model="moondream"),
    OcrNode(),
    TrackerNode()
])

video_id = vc.ingest(
    video_source="gate_camera_04.mp4",
    chain=surveillance_chain
)

response = vc.ask(
    "Were there any vehicles with visible license plates after 14:00?",
    video_id=video_id
)
print(response)
```
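The IoU matching that `TrackerNode` relies on reduces to a box-overlap score. A minimal sketch (standard formula, not VidChain-specific code) with boxes as `(x1, y1, x2, y2)`:

```python
def iou(a: tuple, b: tuple) -> float:
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # → 1.0 (identical boxes)
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # → 0.0 (disjoint boxes)
```

A tracker typically matches a detection to an existing track when their IoU exceeds a fixed threshold, otherwise it spawns a new track ID.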
## REST API

Exposed when running `vidchain-serve`.

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/health` | System status and list of ingested video IDs |
| POST | `/api/sessions` | Create a new isolated neural session |
| POST | `/api/ingest` | Submit a video file path for background processing |
| POST | `/api/query` | Run a natural language query through the Agentic Router |
| GET | `/api/media-stream` | Serve local video securely for frontend playback |
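A `/api/query` call can be assembled with the standard library alone. The payload field names (`question`, `video_id`) are illustrative assumptions, not a documented schema — check the running server for the actual contract:

```python
import json
import urllib.request

def build_query_request(question: str, video_id: str) -> urllib.request.Request:
    # Assumed JSON body; field names are hypothetical, not VidChain's spec.
    body = json.dumps({"question": question, "video_id": video_id}).encode()
    return urllib.request.Request(
        "http://localhost:8000/api/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("Who entered after 14:00?", "vid_001")
print(req.full_url, req.method)  # → http://localhost:8000/api/query POST
# urllib.request.urlopen(req) would send it to a running vidchain-serve.
```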
## Architecture

### Isolated GraphRAG

Each ingested video generates a dedicated Temporal Knowledge Graph (`.pkl`). The RAG engine retrieves semantically relevant chunks from ChromaDB and fuses them with structured graph data (co-occurrences, tracking IDs, timestamps). Memory boundaries are strictly enforced — no cross-video context bleed.
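The memory boundary can be pictured as a store keyed strictly by video ID. This toy sketch (not VidChain's storage layer) rejects lookups against unknown sessions rather than silently borrowing context from another video:

```python
class IsolatedGraphStore:
    """Toy per-video store: each video ID owns a disjoint context dict."""

    def __init__(self):
        self._graphs: dict[str, dict] = {}

    def put(self, video_id: str, key: str, value) -> None:
        self._graphs.setdefault(video_id, {})[key] = value

    def get(self, video_id: str, key: str):
        if video_id not in self._graphs:
            raise KeyError(f"unknown video session: {video_id}")
        # Lookups never fall through to another video's graph.
        return self._graphs[video_id].get(key)

store = IsolatedGraphStore()
store.put("vid_a", "entity:car", {"first_seen": 12.5})
print(store.get("vid_a", "entity:car"))      # → {'first_seen': 12.5}
print(store.get("vid_a", "entity:person"))   # → None (absent, never borrowed)
```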
### The Neural Lens

Every query response is paired with a Base64-encoded visual snapshot extracted directly from the referenced timestamp, providing visual proof for AI-generated claims.
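Decoding such a snapshot needs only the standard library. The `snapshot_b64` field name below is an illustrative assumption, not a documented key; any Base64-encoded image payload decodes the same way:

```python
import base64

# Fabricated stand-in for a query response; real snapshots would carry
# an actual frame from the referenced timestamp.
fake_response = {
    "text": "A red car passes the gate.",
    "snapshot_b64": base64.b64encode(b"\xff\xd8fake-jpeg-bytes").decode(),
}

image_bytes = base64.b64decode(fake_response["snapshot_b64"])
# image_bytes could now be written to a .jpg file or rendered in a UI.
print(image_bytes[:2])  # → b'\xff\xd8' (JPEG magic bytes)
```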
## License

MIT — See LICENSE for details.

**Author:** Rahul Sharma — IIIT Manipur
## File details

### vidchain-1.0.1.tar.gz (source distribution)

- Size: 3.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8

| Algorithm | Hash digest |
|---|---|
| SHA256 | `ea8409de26eddef03efc61fb853575d956c103dcd83f70a953cc335e56c4a936` |
| MD5 | `e79ef690f09a9acc6c8726afdf10697f` |
| BLAKE2b-256 | `e2e12591a9c92f33e05d382dd9b0f61119091c9ea675c349b922f53833892a2d` |

### vidchain-1.0.1-py3-none-any.whl (built distribution)

- Size: 3.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8

| Algorithm | Hash digest |
|---|---|
| SHA256 | `b03e2f9c30a482cacac7b7ef04bffeb9dc563f069ab8cc1696f48487818398cf` |
| MD5 | `92e10c5b985ca272a9f9b4b84426eb45` |
| BLAKE2b-256 | `f659f0ab2850fdc4428182a38bccabf33dca4b3d0821f16f15691dcb06ea826d` |