Intelligent video keyframe extraction for VLMs

These details have not been verified by PyPI

Project links

Project description

KeyFrame Scout

[Python Version](https://www.python.org/downloads/) [License](LICENSE) [Version](https://github.com/yourusername/keyframe-scout)

An intelligent video keyframe extraction tool optimized for Vision Language Models (VLMs) and video analysis. Extract meaningful frames from videos using adaptive algorithms, with direct support for Azure OpenAI GPT and other VLMs.

✨ Key Features

🎯 Intelligent Frame Selection: Three extraction modes (adaptive, interval, fixed) to suit different use cases
🤖 VLM-Ready: Direct integration with Azure OpenAI GPT and other vision language models
📦 Base64 Support: Return frames as base64 strings for immediate API usage
⚡ Batch Processing: Process multiple videos efficiently with parallel execution
🎨 Flexible Output: Save as files, return as base64, or both
📊 Smart Analysis: Automatically identifies scene changes and important moments
🔧 Easy Integration: Simple Python API and command-line interface

🚀 What's New in v0.2.2

Base64 Encoding: Direct base64 output for VLM integration
Azure OpenAI Support: Built-in integration for GPT
VLM Utilities: Helper functions for preparing frames for various VLMs
Batch Processing: Process entire directories of videos
Enhanced API: More flexible configuration options

📦 Installation

Using pip

pip install keyframe-scout

From source

git clone https://github.com/yourusername/keyframe-scout.git
cd keyframe-scout
pip install -e .

Dependencies

Python 3.7+
OpenCV (cv2)
NumPy
Pillow
FFmpeg (system dependency)

Install FFmpeg:

# Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg

# macOS
brew install ffmpeg

# Windows
# Download from https://ffmpeg.org/download.html

🎯 Quick Start

Basic Usage

import keyframe_scout as ks

# Extract keyframes from a video
result = ks.extract_video_keyframes({
    'video': 'path/to/video.mp4',
    'output_dir': 'output/frames',
    'nframes': 10
})

print(f"Extracted {result['extracted_frames']} frames")

VLM Integration (New!)

import keyframe_scout as ks
from openai import AzureOpenAI

# Extract frames for GPT
frames = ks.extract_frames_for_vlm(
    'video.mp4',
    max_frames=8,
    max_size=1024
)

# Prepare messages for Azure OpenAI
messages = ks.create_video_messages(
    'video.mp4',
    prompt="What's happening in this video?",
    max_frames=8
)

# Use with Azure OpenAI
client = AzureOpenAI(
    azure_endpoint="your-endpoint",
    api_key="your-key",
    api_version="2024-02-15-preview"
)

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=messages,
    max_tokens=500
)

print(response.choices[0].message.content)

Base64 Output (New!)

# Get frames as base64 strings (no files saved)
result = ks.extract_video_keyframes({
    'video': 'video.mp4',
    'nframes': 5,
    'return_base64': True,
    'max_size': 1024
})

# Access base64 data
for frame in result['frames']:
    print(f"Frame at {frame['timestamp']}s")
    base64_data = frame['base64']
    # Use base64_data with your VLM API

📖 Detailed Usage

Extraction Modes

1. Adaptive Mode (Default)

Intelligently selects the most representative frames based on content analysis.

result = ks.extract_video_keyframes({
    'video': 'video.mp4',
    'output_dir': 'output',
    'mode': 'adaptive',
    'nframes': 10
})

2. Interval Mode

Extracts frames at fixed time intervals.

result = ks.extract_video_keyframes({
    'video': 'video.mp4',
    'output_dir': 'output',
    'mode': 'interval',
    'interval': 5.0,  # Every 5 seconds
    'frames_per_interval': 1
})

3. Fixed Mode

Extracts a fixed number of evenly distributed frames.

result = ks.extract_video_keyframes({
    'video': 'video.mp4',
    'output_dir': 'output',
    'mode': 'fixed',
    'frames_per_interval': 20  # Total 20 frames
})

VLM Integration Examples

Using the VideoAnalyzer Class

# Initialize analyzer
analyzer = ks.VideoAnalyzer(
    azure_endpoint="your-endpoint",
    api_key="your-key"
)

# Analyze video
result = analyzer.analyze_video(
    'video.mp4',
    prompt="Describe the main events in this video",
    max_frames=10
)

print(result)

Batch Video Analysis

# Analyze multiple videos
videos = ['video1.mp4', 'video2.mp4', 'video3.mp4']
prompts = ['What happens?', 'Who appears?', 'Where is this?']

results = analyzer.batch_analyze(videos, prompts, max_frames=8)

Custom VLM Integration

# Get frames for any VLM
frames = ks.extract_frames_for_vlm('video.mp4', max_frames=6)

# Prepare for your VLM API
for i, frame in enumerate(frames):
    image_data = {
        'base64': frame['base64'],
        'timestamp': frame['timestamp'],
        'description': f'Frame {i+1}'
    }
    # Send to your VLM API

Batch Processing

# Process all videos in a directory
results = ks.process_video_directory(
    directory='videos/',
    output_dir='output/',
    extensions=['.mp4', '.avi'],
    recursive=True,
    config_template={
        'mode': 'adaptive',
        'nframes': 10,
        'return_base64': True
    }
)

# Or process a list of videos
video_list = ['video1.mp4', 'video2.mp4', 'video3.mp4']
results = ks.extract_keyframes_batch(
    video_list,
    output_base_dir='batch_output/',
    max_workers=4
)

Advanced Configuration

config = {
    'video': 'video.mp4',
    'output_dir': 'output',
    'mode': 'adaptive',
    'nframes': 10,
    
    # Resolution options
    'resolution': '720p',  # '360p', '480p', '720p', '1080p', 'original'
    
    # Image options
    'image_format': 'jpg',  # 'jpg' or 'png'
    'image_quality': 95,    # 1-100 for JPEG
    
    # Base64 options (new)
    'return_base64': True,
    'include_files': False,  # Don't save files when using base64
    'max_size': 1024,       # Max dimension for base64 images
    
    # Analysis parameters
    'sample_rate': 30,      # Analyze every Nth frame
    'min_frames': 5,        # Minimum frames to extract
    'max_frames': 20        # Maximum frames to extract
}

result = ks.extract_video_keyframes(config)

🔧 Command Line Interface

Basic usage

# Extract 10 keyframes
keyframe-scout video.mp4 -o output_frames --nframes 10

# Use specific mode
keyframe-scout video.mp4 -o output_frames --mode interval --interval 5

# Set resolution and quality
keyframe-scout video.mp4 -o output_frames --resolution 720p --quality 90

Batch processing

# Process directory
keyframe-scout-batch videos/ -o batch_output/ --recursive

# With custom settings
keyframe-scout-batch videos/ -o batch_output/ --nframes 8 --resolution 480p

📊 API Reference

Core Functions

`extract_video_keyframes(config)`

Main extraction function with full configuration options.

`extract_frames_for_vlm(video_path, max_frames, max_size, mode)`

Extract frames optimized for VLM usage, returns base64 encoded frames.

`create_video_messages(video_path, prompt, max_frames, system_prompt)`

Create messages formatted for Azure OpenAI GPT.

`get_video_info(video_path)`

Get video metadata (duration, resolution, fps, etc).

VLM Utilities

`prepare_for_azure_openai(video_path, max_frames, detail)`

Prepare frames in Azure OpenAI format with detail level control.

`estimate_token_usage(frames, detail)`

Estimate token usage for GPT API calls.

`save_base64_frames(frames, output_dir, prefix)`

Save base64 encoded frames to files.

🎨 Examples

Video Summary for Blog

import keyframe_scout as ks

# Extract key moments from a video
frames = ks.extract_frames_for_vlm('tutorial.mp4', max_frames=6)

# Generate descriptions using GPT
analyzer = ks.VideoAnalyzer()
for i, frame in enumerate(frames):
    description = analyzer.analyze_video(
        'tutorial.mp4',
        f"Describe what's shown at {frame['timestamp']} seconds",
        max_frames=1
    )
    print(f"Time {frame['timestamp']}s: {description}")

Video Content Moderation

# Check video content
messages = ks.create_video_messages(
    'uploaded_video.mp4',
    prompt="Does this video contain any inappropriate content? List any concerns.",
    max_frames=10,
    system_prompt="You are a content moderation assistant."
)

# Send to your moderation API

Creating Video Thumbnails

# Extract best frames for thumbnails
result = ks.extract_video_keyframes({
    'video': 'video.mp4',
    'output_dir': 'thumbnails',
    'mode': 'adaptive',
    'nframes': 5,
    'resolution': '720p',
    'image_quality': 95
})

# The frames are automatically selected for maximum visual interest

🐛 Troubleshooting

FFmpeg not found

# Install FFmpeg
sudo apt install ffmpeg  # Ubuntu/Debian
brew install ffmpeg      # macOS

Import errors

# Install all dependencies
pip install keyframe-scout[all]

GPU acceleration

# OpenCV will automatically use GPU if available
# Check GPU availability
import cv2
print(cv2.cuda.getCudaEnabledDeviceCount())

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

# Development setup
git clone https://github.com/yourusername/keyframe-scout.git
cd keyframe-scout
pip install -e ".[dev]"

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

OpenCV community for the excellent computer vision library
FFmpeg project for video processing capabilities
Inspired by video analysis needs in the VLM era

📮 Contact

GitHub Issues: https://github.com/yourusername/keyframe-scout/issues
Email: your.email@example.com

Made with ❤️ for the VLM community

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.2.4

Jun 11, 2025

This version

0.2.3

Jun 11, 2025

0.2.2

Jun 11, 2025

0.2.1

Jun 11, 2025

0.2.0

Jun 10, 2025

0.1.0

Jun 10, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

keyframe_scout-0.2.3.tar.gz (24.7 kB view details)

Uploaded Jun 11, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

keyframe_scout-0.2.3-py3-none-any.whl (24.7 kB view details)

Uploaded Jun 11, 2025 Python 3

File details

Details for the file keyframe_scout-0.2.3.tar.gz.

File metadata

Download URL: keyframe_scout-0.2.3.tar.gz
Upload date: Jun 11, 2025
Size: 24.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for keyframe_scout-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`f7020afdbf6eb1fdb7a0046a72df570a300bb981e0764a4f5652b93a9987d706`
MD5	`cb52914d175af52bcd6c645ef7ee85a4`
BLAKE2b-256	`c1085455de59a6a9b203df8e38ed8c60062d999a83e94b18e09eb107a9e23bbc`

See more details on using hashes here.

File details

Details for the file keyframe_scout-0.2.3-py3-none-any.whl.

File metadata

Download URL: keyframe_scout-0.2.3-py3-none-any.whl
Upload date: Jun 11, 2025
Size: 24.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for keyframe_scout-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`900eb47a1135807e0eb4c481b2505d06ac842acd6793e029ac1fd71f7af43a03`
MD5	`32feaed1e97f45056f909ea7cbab2ea3`
BLAKE2b-256	`c38019d2de63447d9fd7e435afd0f035492574d827cc03b445b53c1532ddde3d`

See more details on using hashes here.

keyframe-scout 0.2.3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

KeyFrame Scout

✨ Key Features

🚀 What's New in v0.2.2

📦 Installation

Using pip

From source

Dependencies

🎯 Quick Start

Basic Usage

VLM Integration (New!)

Base64 Output (New!)

📖 Detailed Usage

Extraction Modes

1. Adaptive Mode (Default)

2. Interval Mode

3. Fixed Mode

VLM Integration Examples

Using the VideoAnalyzer Class

Batch Video Analysis

Custom VLM Integration

Batch Processing

Advanced Configuration

🔧 Command Line Interface

Basic usage

Batch processing

📊 API Reference

Core Functions

extract_video_keyframes(config)

extract_frames_for_vlm(video_path, max_frames, max_size, mode)

create_video_messages(video_path, prompt, max_frames, system_prompt)

get_video_info(video_path)

VLM Utilities

prepare_for_azure_openai(video_path, max_frames, detail)

estimate_token_usage(frames, detail)

save_base64_frames(frames, output_dir, prefix)

🎨 Examples

Video Summary for Blog

Video Content Moderation

Creating Video Thumbnails

🐛 Troubleshooting

FFmpeg not found

Import errors

GPU acceleration

🤝 Contributing

📄 License

🙏 Acknowledgments

📮 Contact

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

`extract_video_keyframes(config)`

`extract_frames_for_vlm(video_path, max_frames, max_size, mode)`

`create_video_messages(video_path, prompt, max_frames, system_prompt)`

`get_video_info(video_path)`

`prepare_for_azure_openai(video_path, max_frames, detail)`

`estimate_token_usage(frames, detail)`

`save_base64_frames(frames, output_dir, prefix)`