A structured pipeline for transforming video content into searchable, metadata-rich, and SEO-optimized assets, combining ingestion, transcription, OCR, NLP enrichment, and persistent storage.

Part of the Abstract Media Intelligence Platform

This module handles video ingestion and multimodal extraction within a unified media pipeline.

abstract_videos processes:

  • video download + metadata registry
  • transcription (Whisper) + frame OCR
  • NLP enrichment and structured storage

Full system: https://github.com/AbstractEndeavors/abstract-media-intelligence

abstract_videos — Video Processing & Media Intelligence Pipeline

Designed for:

  • large-scale video ingestion
  • transcription and content extraction
  • media indexing and search
  • automated metadata generation and SEO

🔹 What This System Is

abstract_videos is more than a downloader or transcription tool. It is a multi-stage media processing system that:

  • ingests video from URLs or local sources
  • extracts audio, frames, and text
  • performs transcription (Whisper)
  • applies OCR to extracted frames
  • enriches content via NLP (keywords, summaries, titles)
  • persists structured results to database or filesystem

The system produces fully structured video representations usable for:

  • search
  • indexing
  • content generation
  • analytics

🔹 Pipeline Overview

Video Input (URL / File)
        ↓
Download + Registry (yt-dlp + metadata)
        ↓
Video Processing
    ├─ Conversion / normalization
    ├─ Audio extraction
    └─ Frame extraction
        ↓
Content Extraction
    ├─ Transcription (Whisper)
    └─ OCR on frames
        ↓
NLP Enrichment
    ├─ Summarization
    ├─ Keyword extraction
    └─ Title generation
        ↓
Metadata Assembly
        ↓
Persistence Layer
    ├─ Database (JSONB structured storage)
    └─ Filesystem (artifacts + media)

Pipeline

flowchart TD
    A[Video URL / Local Video]
    B[VideoDownloader + Registry]
    C[Normalization / Conversion]
    D[Audio Extraction]
    E[Frame Extraction]
    F[Whisper Transcription]
    G[Frame OCR]
    H[NLP Enrichment\nSummary + Keywords + Title]
    I[Metadata Assembly\nCategory + Thumbnail + SEO]
    J[Persistence Layer\nFilesystem + DB / JSONB]
    K[Searchable / Structured Video Record]

    A --> B --> C
    C --> D --> F
    C --> E --> G
    F --> H
    G --> H
    H --> I --> J --> K

🔹 Core Capabilities

Video Ingestion & Registry

  • URL normalization and ID generation
  • Metadata extraction via yt-dlp
  • Persistent registry with atomic updates and locking
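
As an illustration of URL normalization and ID generation, here is a minimal sketch using only the standard library. The function names, the list of tracking parameters, and the choice of a truncated SHA-256 digest are assumptions made for this example, not the package's actual scheme.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of tracking parameters to drop; not taken from abstract_videos.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "si", "feature"}


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment and tracking params, sort the query."""
    parts = urlsplit(url.strip())
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def video_id(url: str) -> str:
    """Derive a stable 16-character ID from the normalized URL (assumed scheme)."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()[:16]
```

Two URLs that differ only in host casing, trailing slash, tracking parameters, or fragment map to the same ID, which is what lets a registry recognize a video it has already seen.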

Processing Pipeline

  • Video normalization and format handling
  • Audio extraction for transcription
  • Frame extraction for visual analysis
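
The audio and frame extraction steps can be sketched as ffmpeg invocations. This README does not state which tool abstract_videos uses internally, so ffmpeg, the helper names, and the default values here are assumptions. The helpers only build argument lists and do not run anything.

```python
from pathlib import Path


def audio_extract_cmd(video: Path, out_wav: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes mono audio at the given sample rate."""
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-i", str(video),
        "-vn",                     # drop the video stream
        "-ac", "1",                # downmix to a single channel
        "-ar", str(sample_rate),   # resample (Whisper models operate on 16 kHz audio)
        str(out_wav),
    ]


def frame_extract_cmd(video: Path, out_dir: Path, fps: float = 1.0) -> list[str]:
    """Build an ffmpeg command that samples frames at a fixed rate into out_dir."""
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-vf", f"fps={fps}",       # output this many frames per second of video
        str(out_dir / "frame_%06d.jpg"),
    ]
```

Either list can be passed to `subprocess.run(cmd, check=True)` once ffmpeg is installed and the output directory exists.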

Transcription & OCR

  • Whisper-based transcription pipeline
  • Frame-level OCR for embedded text
  • Combined multimodal text extraction
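
One way to picture the combined multimodal output is a single time-ordered stream of speech and on-screen text. The sketch below assumes transcript segments shaped like dicts with "start" and "text" keys (the shape Whisper returns for its segments) and OCR results as (timestamp, text) pairs. `TextEvent` and `merge_text` are hypothetical names, not part of abstract_videos.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextEvent:
    start: float   # seconds from the start of the video
    source: str    # "speech" or "ocr"
    text: str


def merge_text(segments, ocr_frames):
    """Interleave speech segments and OCR frame text in time order.

    segments:   iterable of dicts with "start" and "text" keys
    ocr_frames: iterable of (timestamp_seconds, text) pairs
    Consecutive OCR frames that show the same text collapse into one event.
    """
    events = [
        TextEvent(float(seg["start"]), "speech", seg["text"].strip())
        for seg in segments
    ]
    last_ocr = None
    for timestamp, text in sorted(ocr_frames):
        text = text.strip()
        if text and text != last_ocr:
            events.append(TextEvent(float(timestamp), "ocr", text))
        last_ocr = text
    return sorted(events, key=lambda event: (event.start, event.source))
```

Collapsing repeated OCR text matters because a slide that stays on screen for a minute would otherwise appear once per sampled frame.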

NLP & Metadata Enrichment

  • Keyword extraction and refinement
  • Title generation from summaries
  • Category inference based on content
  • Thumbnail selection via frame sharpness analysis
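
Frame sharpness is commonly scored with the variance of the Laplacian: a blurry frame has few strong edges, so the Laplacian response varies little. The README does not say which measure abstract_videos uses, so treat this as one plausible approach. The sketch is pure Python over a 2D grid of grayscale values, and both function names are hypothetical.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over a 2D grid of grayscale values.

    Higher values mean more edge detail, which usually indicates a sharper frame.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0  # grid too small to have interior pixels
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def pick_thumbnail(frames):
    """Return the key of the sharpest frame in a {name: grayscale grid} mapping."""
    return max(frames, key=lambda name: laplacian_variance(frames[name]))
```

On real frames the same score is usually computed with an image library, for example `cv2.Laplacian(image, cv2.CV_64F).var()` in OpenCV, which is far faster than nested Python loops.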

Structured Persistence

  • PostgreSQL storage with JSONB fields for:

    • raw info
    • metadata
    • transcripts
    • captions
    • thumbnails
    • aggregated outputs
  • Upsert-based lifecycle management for idempotent processing
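
The upsert pattern can be sketched with the standard library's sqlite3 so the example runs anywhere. The table and column names are hypothetical, and SQLite stores the JSON as text where PostgreSQL would use JSONB columns. PostgreSQL accepts the same `INSERT ... ON CONFLICT ... DO UPDATE` form, with driver-specific placeholders instead of `?`.

```python
import json
import sqlite3

# Hypothetical schema; the real abstract_videos tables may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS videos (
    video_id   TEXT PRIMARY KEY,
    metadata   TEXT NOT NULL,
    transcript TEXT
)
"""

UPSERT = """
INSERT INTO videos (video_id, metadata, transcript)
VALUES (?, ?, ?)
ON CONFLICT (video_id) DO UPDATE SET
    metadata   = excluded.metadata,
    transcript = COALESCE(excluded.transcript, videos.transcript)
"""


def upsert_video(conn, video_id, metadata, transcript=None):
    """Insert a video row, or update it in place if the ID already exists.

    A missing transcript never overwrites one stored by an earlier run.
    """
    payload = None if transcript is None else json.dumps(transcript)
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (video_id, json.dumps(metadata), payload))
```

Because the second write for the same `video_id` updates the existing row instead of failing or duplicating it, rerunning a stage is safe.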


🔹 Dual Pipeline Model (Key Concept)

The system supports two execution modes:

1. Local / Read-Write Pipeline

  • full processing on local machine
  • filesystem-based outputs
  • direct artifact generation

2. Database-Centric Pipeline

  • persistent storage as primary interface
  • JSONB-backed structured data
  • incremental updates and enrichment
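
One way to think about the two modes is as interchangeable storage backends behind a shared interface. The sketch below is an assumption about how such a split could look, not the package's actual classes. `RecordStore`, `FileStore`, `MemoryDbStore`, and `enrich` are hypothetical names, and the in-memory class only stands in for a real database.

```python
import json
import tempfile
from pathlib import Path
from typing import Protocol


class RecordStore(Protocol):
    def load(self, video_id: str) -> dict: ...
    def save(self, video_id: str, record: dict) -> None: ...


class FileStore:
    """Local mode: one JSON file per video under a root directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def load(self, video_id):
        path = self.root / f"{video_id}.json"
        return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}

    def save(self, video_id, record):
        path = self.root / f"{video_id}.json"
        path.write_text(json.dumps(record, indent=2), encoding="utf-8")


class MemoryDbStore:
    """Stand-in for database mode; a real backend would issue SQL upserts."""

    def __init__(self):
        self.rows = {}

    def load(self, video_id):
        return dict(self.rows.get(video_id, {}))

    def save(self, video_id, record):
        self.rows[video_id] = dict(record)


def enrich(store: RecordStore, video_id: str, **fields) -> dict:
    """Merge new fields into a record, whichever backend holds it."""
    record = store.load(video_id)
    record.update(fields)
    store.save(video_id, record)
    return record


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for store in (FileStore(tmp), MemoryDbStore()):
            enrich(store, "abc", title="Demo")
            print(type(store).__name__, enrich(store, "abc", keywords=["demo"]))
```

The enrichment code never needs to know which mode is active, which is the practical benefit of keeping both pipelines behind one interface.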

🔹 Design Intent

  • Database + local modules as primary
  • External ML (HuggingFace / APIs) as secondary

This enables:

  • offline-first operation
  • reproducibility
  • plug-and-play ML upgrades

🔹 Architecture

Core Components

  • VideoDownloader

    • ingestion + metadata acquisition
  • infoRegistry

    • centralized state + persistence
  • VideoTextPipeline

    • orchestrates processing stages
  • Metadata Console

    • post-processing and optimization (summaries, SEO)
  • Database Layer

    • structured storage with upsert semantics

🔹 Key Design Decisions

Idempotent Processing

  • all steps tracked via processed_steps
  • pipeline resumes without duplication
  • safe reprocessing of partial runs
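
The resumable behaviour can be sketched as a small step runner. The README names `processed_steps` but not its shape, so storing it as a list of step names inside the record is an assumption, and `run_pipeline` is a hypothetical helper.

```python
def run_pipeline(record, steps):
    """Run each (name, func) step at most once per record.

    Completed step names accumulate in record["processed_steps"]. Each func
    receives the record dict and mutates it in place. A step that raises is
    not marked as done, so the next call retries it and everything after it.
    """
    done = record.setdefault("processed_steps", [])
    for name, func in steps:
        if name in done:
            continue  # completed in an earlier run; skip without redoing work
        func(record)
        done.append(name)
    return record
```

In a real pipeline the record would be persisted after every step, so that a crash between steps loses at most the step that was in flight.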

Structured Over Raw

Everything is stored as structured JSON:

  • transcripts
  • keywords
  • metadata
  • derived content

Multimodal Extraction

Combines:

  • audio → text (transcription)
  • image → text (OCR)
  • text → meaning (NLP)

Registry as Source of Truth

  • central video registry
  • thread-safe and process-safe updates
  • ensures consistency across runs
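
A file-backed registry with these properties can be sketched with a thread lock, a lock file for other processes, and an atomic rename. `JsonRegistry` is a hypothetical class written for illustration, not the package's infoRegistry. It also does not clean up a stale lock file left behind by a crashed process.

```python
import json
import os
import tempfile
import threading
import time
from contextlib import contextmanager
from pathlib import Path


class JsonRegistry:
    """Hypothetical JSON registry with locked, atomic updates."""

    def __init__(self, path, timeout=10.0):
        self.path = Path(path)
        self.lock_path = self.path.with_name(self.path.name + ".lock")
        self.timeout = timeout
        self._thread_lock = threading.Lock()

    @contextmanager
    def _locked(self):
        deadline = time.monotonic() + self.timeout
        with self._thread_lock:  # serialize threads inside this process
            while True:
                try:
                    # O_EXCL makes creation fail while another process holds the lock
                    fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                    break
                except FileExistsError:
                    if time.monotonic() > deadline:
                        raise TimeoutError(f"could not lock {self.lock_path}")
                    time.sleep(0.05)
            try:
                yield
            finally:
                os.close(fd)
                os.unlink(self.lock_path)

    def read(self):
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def update(self, video_id, **fields):
        """Read-modify-write one entry under the lock, then swap the file in."""
        with self._locked():
            data = self.read()
            data.setdefault(video_id, {}).update(fields)
            fd, tmp = tempfile.mkstemp(dir=self.path.parent, suffix=".tmp")
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(data, handle)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp, self.path)  # atomic when source and target share a filesystem
            return data[video_id]
```

Writing to a temporary file and renaming it means a reader sees either the old registry or the new one, never a half-written file.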

🔹 Why This Exists

Most video pipelines:

  • stop at transcription
  • lack structure
  • are not searchable
  • are not reusable

abstract_videos transforms video into:

  • structured data
  • searchable content
  • SEO-ready metadata
  • indexable media assets

🔹 Example Use Cases

  • video → searchable content pipelines
  • media indexing platforms
  • transcription + analytics systems
  • SEO content generation
  • LLM-ready dataset creation

🔹 Integration Context

This system integrates directly with:

  • abstract_hugpy → NLP / summarization / keyword extraction
  • abstract_ocr → image/frame OCR
  • abstract_pdfs → document pipeline

🔹 Design Philosophy

  • Media is data, not just content
  • Structure enables reuse
  • Pipelines should be resumable and deterministic
  • Local-first, cloud-optional
