
Edge-optimized multimodal RAG framework for video understanding


VidChain: The "LangChain for Videos"

v0.8.3-Stable — Edge-optimized, local-first multimodal RAG framework for forensic video intelligence. Compose modular sensory nodes into custom pipelines, deploy as a microservice, or query via the Spider-Net Intelligence Portal.




Advanced Forensic Architecture

VidChain v0.8.3-Stable is powered by the B.A.B.U.R.A.O. Engine (Behavioral Analysis & Broadcasting Unit for Real-time Artificial Observation). It uses a modular "Nodes & Chains" framework to transform raw pixels into serialized forensic intelligence.

graph TD
    %% --- Ingestion Stage ---
    subgraph "1. Ingestion & Optimization Layer"
        VS[Video Source] --> AK[Adaptive Gaussian Filter]
        AK -- "Delta > Threshold" --> PK[Promote to Keyframe]
        AK -- "Redundant" --> DROP{{GPU Compute Firewall}}
    end

    %% --- Inference Stage ---
    subgraph "2. Sensory Node Matrix (Late Fusion)"
        PK --> VLM[LlavaNode: Scene Semantics]
        PK --> ASR[WhisperNode: Audio Trace]
        PK --> OCR[OcrNode: Digital Trace]
        PK --> TRK[TrackerNode: Motion Flow]
        
        %% Optional Sensors
        PK -.-> ACT[ActionNode: Situational Verbs]
        PK -.-> EMT[EmotionNode: Sentiment]
    end

    %% --- Intelligence Logic ---
    subgraph "3. B.A.B.U.R.A.O. Cognitive Engine"
        VLM & ASR & OCR & TRK & ACT & EMT --> FUSE[Semantic Fusion Pipeline]
        FUSE --> RDN[Recursive Map-Reduce Summarizer]
    end

    %% --- Persistence ---
    subgraph "4. Forensic Memory Vault"
        FUSE --> KV[(ChromaDB Vector Store)]
        FUSE --> KG[[Temporal Knowledge Graph]]
    end

    %% --- Interaction Stage ---
    subgraph "5. Spider-Net Intelligence Portal"
        USER[User Query] --> IR{Intent Router}
        IR -- "Forensic Search" --> RAG[RAG Retrieval Loop]
        IR -- "Executive Overview" --> RDN
        RAG <--> KV
        RAG <--> KG
        RDN --> REPORT([Intelligence Report])
        RAG --> DISCOVERY([Discovery Hub])
    end

    %% --- Hardware Loop ---
    HM[NVML Hardware Monitor] -.-> AK
    HM -.-> VLM
    HM -.-> DISCOVERY

    style VS fill:#1e1e2e,stroke:#74c7ec,stroke-width:2px;
    style DISCOVERY fill:#11111b,stroke:#a6e3a1,stroke-width:3px;
    style REPORT fill:#11111b,stroke:#a6e3a1,stroke-width:3px;
    style DROP fill:#313244,stroke-dasharray: 5 5;
    style AK fill:#1e1e2e,stroke:#fab387;
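
The "Nodes & Chains" flow in the diagram can be illustrated with a small, library-independent sketch. The names below (`Node`, `CaptionStubNode`, `OcrStubNode`, `run_chain`) are hypothetical stand-ins and not VidChain's actual classes; real nodes would run models rather than return canned values.

```python
from typing import Protocol


class Node(Protocol):
    """Hypothetical node interface: read a frame record, return new fields."""
    name: str

    def process(self, frame: dict) -> dict: ...


class CaptionStubNode:
    name = "caption"

    def process(self, frame: dict) -> dict:
        # A real VLM node would describe the pixels; this stub echoes a label.
        return {"caption": f"scene at {frame['timestamp']:.1f}s"}


class OcrStubNode:
    name = "ocr"

    def process(self, frame: dict) -> dict:
        # A real OCR node would read on-screen text from the image.
        return {"ocr_text": frame.get("overlay_text", "")}


def run_chain(nodes: list[Node], frames: list[dict]) -> list[dict]:
    """Late fusion: every node sees the same keyframe; outputs are merged per frame."""
    fused = []
    for frame in frames:
        record = {"timestamp": frame["timestamp"]}
        for node in nodes:
            record.update(node.process(frame))
        fused.append(record)
    return fused


records = run_chain(
    [CaptionStubNode(), OcrStubNode()],
    [{"timestamp": 0.0, "overlay_text": "CAM 02"}, {"timestamp": 4.5}],
)
print(records)
```

Keeping each node independent of the others is what makes it cheap to add or drop a sensor without touching the rest of the chain.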

Key Features (v0.8.3 Evolution)

Relative Forensic Uplink [NEW]

The Spider-Net Portal now uses relative API paths, so the suite works out of the box whether it is accessed via localhost, an industrial IP, or a local VPN. This resolves the "broken fetch" errors that occurred in production.
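
To see why relative paths help, the sketch below resolves one relative endpoint against several page URLs using the standard library. The endpoint name `api/query` and the host names are assumptions made for illustration; the portal's real routes are not listed here.

```python
from urllib.parse import urljoin

# Hypothetical endpoint name, used only for illustration.
RELATIVE_ENDPOINT = "api/query"

page_urls = [
    "http://localhost:8000/",
    "http://192.168.1.50:8000/",
    "https://vpn-host.internal/portal/",
]

for page_url in page_urls:
    # A relative path is resolved against whatever URL the page was loaded from.
    print(urljoin(page_url, RELATIVE_ENDPOINT))
```

Because the host is never hard-coded, the same front-end build works behind any address that serves the page.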

Composable Sensory Chains

Snap together modular nodes to build custom forensic pipelines. Optimized for Hardware Awareness, the system scales its inference depth based on live GPU/VRAM telemetry.

  • Adaptive Keyframe Firewall: Gaussian-blur differential filtering drops near-duplicate frames before they reach the GPU, saving a reported 70% of GPU compute in static scenes.
  • VLM-First Captions: Replaces blind tags with dense semantic descriptions ("Subject is hiding a silver object in their left pocket").
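
The keyframe firewall idea can be sketched in pure Python. This is a minimal illustration and not the code behind `AdaptiveKeyframeNode`: it assumes grayscale frames as nested lists, approximates the Gaussian blur with a 3x3 `[1, 2, 1]` kernel, and uses mean absolute difference as the delta. All function names are hypothetical.

```python
def blur3(img):
    """Approximate Gaussian blur with a separable [1, 2, 1] / 4 kernel (edges clamped)."""
    h, w = len(img), len(img[0])

    def px(rows, y, x):
        return rows[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    horiz = [[(px(img, y, x - 1) + 2 * px(img, y, x) + px(img, y, x + 1)) / 4
              for x in range(w)] for y in range(h)]
    return [[(px(horiz, y - 1, x) + 2 * px(horiz, y, x) + px(horiz, y + 1, x)) / 4
             for x in range(w)] for y in range(h)]


def mean_abs_diff(a, b):
    """Average per-pixel absolute difference between two same-sized images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def select_keyframes(frames, change_threshold=5.0):
    """Return indices of frames whose blurred delta from the last keyframe exceeds the threshold."""
    keyframes, last = [], None
    for i, frame in enumerate(frames):
        blurred = blur3(frame)
        if last is None or mean_abs_diff(blurred, last) > change_threshold:
            keyframes.append(i)
            last = blurred
    return keyframes


static = [[10] * 4 for _ in range(4)]
changed = [[200] * 4 for _ in range(4)]
print(select_keyframes([static, static, changed, changed]))  # [0, 2]
```

Blurring before differencing spreads isolated pixel noise over its neighbours, so sensor flicker is less likely to promote a frame than a genuine scene change.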

Spider-Net Intelligence Portal

A professional-grade forensic command center served natively via vidchain-serve.

  • Evidence Vault: Frame-by-frame seeking with 33 ms precision (one frame at 30 FPS).
  • Neural HUD: Real-time visualization of sensor activity and hardware stress.
  • Semantic Heatmap: Intelligence density mapping across the video timeline.
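
The arithmetic behind frame-accurate seeking and a timeline heatmap is simple enough to sketch. The helpers below are hypothetical and assume a constant frame rate and a flat list of event timestamps; they do not reflect how the portal is implemented.

```python
def timestamp_to_frame(seconds: float, fps: float = 30.0) -> int:
    """Map a timestamp to the nearest frame index."""
    return round(seconds * fps)


def frame_duration_ms(fps: float = 30.0) -> float:
    """Duration of one frame; about 33.3 ms at 30 FPS."""
    return 1000.0 / fps


def density_heatmap(event_times, video_length, buckets=10):
    """Count events per equal-width time bucket across the timeline."""
    counts = [0] * buckets
    for t in event_times:
        if 0 <= t <= video_length:
            # Clamp so an event exactly at the end lands in the last bucket.
            idx = min(int(t / video_length * buckets), buckets - 1)
            counts[idx] += 1
    return counts


print(timestamp_to_frame(12.5))           # 375
print(round(frame_duration_ms(), 1))      # 33.3
print(density_heatmap([1.0, 2.0, 2.5, 55.0, 59.9], video_length=60.0, buckets=6))
```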

Automated Intelligence Reporting

The built-in Recursive Map-Reduce engine automatically iterates over forensic logs to generate high-fidelity executive summaries, complete with verified timestamps and entity relationship discovery.
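
A recursive map-reduce summarizer can be sketched as follows. This is an assumption-laden illustration: `recursive_summarize` and `toy_summarizer` are hypothetical names, and the toy summarizer stands in for the LLM call that a real pipeline would make.

```python
from typing import Callable


def recursive_summarize(
    entries: list[str],
    summarize: Callable[[list[str]], str],
    chunk_size: int = 4,
) -> str:
    """Summarize chunks (map), then summarize the summaries (reduce) until one remains."""
    if chunk_size < 2:
        raise ValueError("chunk_size must be at least 2 so each pass shrinks the input")
    if not entries:
        return ""
    while len(entries) > 1:
        entries = [
            summarize(entries[i:i + chunk_size])
            for i in range(0, len(entries), chunk_size)
        ]
    return entries[0]


def toy_summarizer(chunk: list[str]) -> str:
    # Stand-in for an LLM call: keep the first and last items of each chunk.
    return chunk[0] if len(chunk) == 1 else f"{chunk[0]} .. {chunk[-1]}"


logs = [f"[{t:02d}s] event {t}" for t in range(10)]
result = recursive_summarize(logs, toy_summarizer, chunk_size=4)
print(result)
```

Chunking keeps every summarizer call within a fixed input size, which matters when the log is longer than the model's context window.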


Installation

# Core installation
pip install VidChain

# Setup local AI backends (Ollama)
ollama pull moondream   # Optimized Edge VLM (1.7GB)
ollama pull llama3      # Local Reasoning Hub (4.7GB)

# Verify Hardware Readiness (Bundled utility)
python -m vidchain.scripts.check_gpu
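
The internals of the bundled `check_gpu` utility are not shown here. As an independent sketch of a hardware readiness check, the snippet below reads free VRAM through the `pynvml` module (shipped by `nvidia-ml-py`) and falls back gracefully when no GPU or driver is present. The `pick_vlm` policy, the `llava` fallback name, and the 6000 MiB threshold are assumptions for illustration only.

```python
def free_vram_mb(device_index: int = 0):
    """Return free VRAM in MiB, or None when NVML or a GPU is unavailable."""
    try:
        import pynvml  # provided by the nvidia-ml-py package
    except ImportError:
        return None
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return None
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        return pynvml.nvmlDeviceGetMemoryInfo(handle).free // (1024 * 1024)
    except pynvml.NVMLError:
        return None
    finally:
        pynvml.nvmlShutdown()


def pick_vlm(free_mb, small="moondream", large="llava", large_needs_mb=6000):
    """Hypothetical policy: use the larger model only when enough VRAM is free."""
    if free_mb is not None and free_mb >= large_needs_mb:
        return large
    return small


print(pick_vlm(free_vram_mb()))
```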

Developer API Recipes (Python)

VidChain is designed to be deeply extensible. Here are the core "Intelligence Recipes" for v0.8.3-Stable.

1. High-Fidelity Forensic Scan (Default)

Best for evidence reconstruction where detail matters more than speed.

from vidchain import VidChain, VideoChain
from vidchain.nodes import AdaptiveKeyframeNode, LlavaNode, WhisperNode, OcrNode

# Build the chain
chain = VideoChain(nodes=[
    AdaptiveKeyframeNode(change_threshold=5.0),
    LlavaNode(model_name="moondream"), 
    WhisperNode(),
    OcrNode()
])

vc = VidChain()
vid = vc.ingest("evidence.mp4", chain=chain)
print(vc.summarize_video(vid))

2. "CCTV Ultra-Fast" Scan (Low Latency)

Prioritize object detection speed over descriptive captioning.

from vidchain import VidChain, VideoChain
from vidchain.nodes import YoloNode, TrackerNode

# Swap the VLM for a fast YOLOv8 detector plus tracker
fast_chain = VideoChain(nodes=[
    YoloNode(confidence=0.5), # Ultra-fast detection
    TrackerNode()             # Subject persistence
], frame_skip=30)             # Process every 30th frame (~1 FPS for 30 FPS footage)

vc = VidChain()
vc.ingest("cctv_feed.mp4", chain=fast_chain)
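
How much work `frame_skip` saves depends on the source frame rate. The sketch below assumes `frame_skip=N` means every Nth frame is analysed, which is an assumption about VidChain's semantics; the helper names are hypothetical.

```python
def effective_fps(source_fps: float, frame_skip: int) -> float:
    """Frames analysed per second of video when every `frame_skip`-th frame is kept."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be >= 1")
    return source_fps / frame_skip


def frames_to_process(total_frames: int, frame_skip: int) -> int:
    """Number of frames kept when sampling indices 0, frame_skip, 2*frame_skip, ..."""
    return len(range(0, total_frames, frame_skip))


for fps in (15.0, 30.0, 60.0):
    print(fps, effective_fps(fps, 30))

# A 10-minute clip at 30 FPS has 18000 frames; skipping by 30 leaves 600.
print(frames_to_process(18000, 30))
```

For 15 FPS CCTV footage the same setting samples only one frame every two seconds, so the skip value is worth tuning per camera.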

Research Position & Uniqueness

VidChain treats video as Serialized Sensor Logs, performing retrieval over structured multimodal narratives rather than raw pixel tokens. This significantly reduces hallucinations and enables multi-video GraphRAG reasoning.
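
One way to picture "serialized sensor logs" is a per-keyframe record flattened into a line of text that a retriever can index. The record layout, field names, and keyword search below are illustrative assumptions; VidChain itself stores fused output in ChromaDB and a knowledge graph, whose schemas are not described here.

```python
from dataclasses import dataclass


@dataclass
class SensorRecord:
    """Hypothetical fused record for one keyframe."""
    timestamp: float
    caption: str = ""
    speech: str = ""
    ocr_text: str = ""

    def serialize(self) -> str:
        """Flatten the non-empty fields into one timestamped log line."""
        parts = [f"[{self.timestamp:07.2f}s]"]
        if self.caption:
            parts.append(f"VISION: {self.caption}")
        if self.speech:
            parts.append(f"AUDIO: {self.speech}")
        if self.ocr_text:
            parts.append(f"TEXT: {self.ocr_text}")
        return " | ".join(parts)


def keyword_search(lines, query):
    """Toy retrieval: rank lines by how many query words they contain."""
    words = set(query.lower().split())
    scored = [(sum(w in line.lower() for w in words), line) for line in lines]
    return [line for score, line in sorted(scored, key=lambda s: -s[0]) if score > 0]


log = [
    SensorRecord(3.0, caption="a person enters the lobby").serialize(),
    SensorRecord(9.5, caption="person places a bag on the desk", ocr_text="CAM 02").serialize(),
]
print(keyword_search(log, "bag desk"))
```

A production system would replace the keyword scorer with embedding similarity, but the retrieval unit stays the same: a short, timestamped text record rather than raw pixels.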

See RESEARCH_COMPARISON.md for detailed SOTA benchmarks.


📜 Changelog (The v0.8.0 Milestone)

  • v0.8.3: Relative Path Migration. Fixed broken production fetches and asset routing.
  • v0.8.2: Migrated to official NVIDIA nvidia-ml-py bindings.
  • v0.8.1: Implemented Auto-Launch browser integration for vidchain-serve.
  • v0.8.0: The Modular Revolution. Deprecated monolithic processors in favor of the Node framework.

Author

Rahul Sharma — IIIT Manipur
SEM Project Version 0.8.3-Stable

