Extract and categorize high-quality frames containing people in specific poses from video files

Person From Vid

AI-powered video frame extraction and pose categorization tool that analyzes video files to identify and extract high-quality frames containing people in specific poses and head orientations.

Features

  • Video Analysis: Supports multiple video formats (MP4, AVI, MOV, MKV, WebM, etc.).
  • AI-Powered Detection: Uses state-of-the-art models for face detection (yolov8s-face), pose estimation (yolov8s-pose), and head pose analysis (sixdrepnet).
  • Smart Frame Selection:
    • Keyframe Detection: Prioritizes information-rich I-frames.
    • Temporal Sampling: Extracts frames at regular intervals to ensure coverage.
    • Deduplication: Avoids saving visually similar frames.
  • Pose & Shot Classification:
    • Automatically categorizes poses into standing, sitting, and squatting.
    • Classifies shot types like closeup, medium shot, and full body.
  • Head Orientation: Classifies head directions into 9 cardinal orientations (front, profile, looking up/down, etc.).
  • Advanced Quality Assessment: Uses multiple metrics like blur, brightness, and contrast to select the sharpest, best-lit frames.
  • GPU Acceleration: Optional CUDA/MPS support for significantly faster processing.
  • Rich Progress Tracking: Modern console interface with real-time progress displays and detailed status.
  • Resumable Processing: Automatically saves progress and allows resuming interrupted sessions.
  • Highly Configurable: Extensive configuration options via CLI, YAML files, or environment variables.
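
To make the quality assessment feature more concrete, here is a minimal sketch of how blur, brightness, and contrast scores could be computed and checked. It is not the package's implementation: the function names are hypothetical, the metric definitions (variance of the Laplacian for blur, mean intensity for brightness, standard deviation for contrast) are assumptions, and the thresholds simply reuse the default values shown in the quality section of the sample configuration file.

```python
from statistics import fmean, pstdev, pvariance


def quality_metrics(gray):
    """Score a grayscale image given as a list of rows of 0-255 intensities.

    Hypothetical helper: the metric definitions are assumptions, not the
    package's actual code.
    """
    height, width = len(gray), len(gray[0])
    # 4-neighbour Laplacian on interior pixels; strong responses mark sharp edges.
    laplacian = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    pixels = [value for row in gray for value in row]
    return {
        "blur": pvariance(laplacian),   # higher variance means a sharper image
        "brightness": fmean(pixels),    # mean intensity
        "contrast": pstdev(pixels),     # spread of intensities
    }


def passes_quality(metrics, blur_threshold=100.0, brightness_min=30.0,
                   brightness_max=225.0, contrast_min=20.0):
    """Apply thresholds named after the keys in the sample config file."""
    return (
        metrics["blur"] >= blur_threshold
        and brightness_min <= metrics["brightness"] <= brightness_max
        and metrics["contrast"] >= contrast_min
    )


# A flat grey image has no edges; a checkerboard has maximal edge response.
flat = [[128] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
```

A flat image scores zero on blur and contrast and is rejected, while the checkerboard passes all three checks.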

Installation

Prerequisites

  • Python 3.10 or higher
  • FFmpeg (for video processing)

Installing FFmpeg

macOS:

brew install ffmpeg

Ubuntu/Debian:

sudo apt update
sudo apt install ffmpeg

Windows: Download from the official FFmpeg website, or install with Chocolatey:

choco install ffmpeg  # Using Chocolatey

Install Person From Vid

From PyPI

The recommended way to install is via pip:

pip install personfromvid

From Source

Alternatively, to install from source:

git clone https://github.com/personfromvid/personfromvid.git
cd personfromvid
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e .

Quick Start

Basic Usage

# Process a video file, saving results to the same directory
personfromvid video.mp4

# Specify a different output directory
personfromvid video.mp4 --output-dir ./extracted_frames

# Enable verbose logging for detailed information
personfromvid video.mp4 --verbose

# Use GPU for faster processing (if available)
personfromvid video.mp4 --device gpu

Advanced Usage

# High-quality processing with custom settings
personfromvid video.mp4 \
    --output-dir ./custom_output \
    --output-jpeg-quality 98 \
    --confidence 0.5 \
    --batch-size 16 \
    --max-frames 1000

# Resize output images proportionally so the largest dimension is at most 1024 pixels
personfromvid video.mp4 --resize 1024

# Resume an interrupted process
personfromvid video.mp4 --resume

Command-line Options

personfromvid offers many options to customize its behavior. Here are some of the most common ones:

| Option | Alias | Description | Default |
|---|---|---|---|
| --config | -c | Path to a YAML configuration file. | None |
| --output-dir | -o | Directory to save output files. | Video's directory |
| --device | | Device to use for AI models (auto, cpu, gpu). | auto |
| --log-level | -l | Set logging level (DEBUG, INFO, WARNING, ERROR). | INFO |
| --verbose | -v | Enable verbose output (sets log level to DEBUG). | False |
| --quiet | -q | Suppress non-essential output. | False |
| --resume | | Resume from the last saved state. | True |
| --batch-size | | Batch size for AI model inference. | 8 |
| --confidence | | Confidence threshold for detections. | 0.3 |
| --max-frames | | Maximum frames to process from the video. | None |
| --output-format | | Output image format (jpeg or png). | jpeg |
| --output-jpeg-quality | | Quality for JPEG output (70-100). | 95 |
| --resize | | Maximum dimension for proportional image resizing (256-4096 pixels). | None |
| --min-frames-per-category | | Minimum frames to output per pose/angle category (1-10). | 3 |
| --no-output-face-crop-enabled | | Disable generation of cropped face images. | False |
| --no-output-full-frame-enabled | | Disable saving of full-frame images. | False |
| --force | | Force cleanup of existing temp directory before starting. | False |
| --keep-temp | | Keep temporary files after processing for debugging. | False |
| --version | | Show version information and exit. | False |

For a full list of options, run personfromvid --help.
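
When driving the tool from a script, it can help to assemble the command line programmatically. The sketch below is one way to do that, using only flags listed in the table above. The build_command helper is hypothetical and not part of the package; the value ranges it checks are the ones documented in the table.

```python
import shlex

VALID_DEVICES = {"auto", "cpu", "gpu"}


def build_command(video, *, output_dir=None, device="auto", batch_size=8,
                  confidence=0.3, max_frames=None, resize=None):
    """Return the argv list for one personfromvid run (hypothetical helper)."""
    if device not in VALID_DEVICES:
        raise ValueError(f"device must be one of {sorted(VALID_DEVICES)}")
    if resize is not None and not 256 <= resize <= 4096:
        raise ValueError("resize must be between 256 and 4096 pixels")
    argv = [
        "personfromvid", str(video),
        "--device", device,
        "--batch-size", str(batch_size),
        "--confidence", str(confidence),
    ]
    # Optional flags are only added when a value was given.
    if output_dir is not None:
        argv += ["--output-dir", str(output_dir)]
    if max_frames is not None:
        argv += ["--max-frames", str(max_frames)]
    if resize is not None:
        argv += ["--resize", str(resize)]
    return argv


# Print the command in copy-pasteable shell form.
print(shlex.join(build_command("video.mp4", output_dir="./extracted_frames", resize=1024)))
```

The returned list can be passed directly to subprocess.run(argv, check=True), which avoids shell quoting problems with file names that contain spaces.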

Output Structure

By default, Person From Vid saves all output files into the same directory as the input video. You can specify a different location with the --output-dir option. All files are prefixed with the base name of the video file.

Here is an example of the output for a video named interview.mp4:

interview_info.json                     # Detailed processing metadata and results
interview_standing_front_closeup_001.jpg  # Full frame: {video}_{pose}_{head}_{shot}_{rank}.jpg
interview_sitting_profile-left_medium-shot_002.jpg
interview_face_front_001.jpg              # Face crop: {video}_face_{head-angle}_{rank}.jpg
interview_face_profile-right_002.jpg

  • {video_base_name}_info.json: A detailed JSON file containing the configuration used, video metadata, and data for every selected frame.
  • Full Frame Images: Saved if output.image.full_frame_enabled is true (default). The filename captures the detected pose, head orientation, and shot type.
  • Face Crop Images: Saved if output.image.face_crop_enabled is true (default). These files contain only the cropped face for easier analysis.

All images are saved in a single flat directory.
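
Because everything lands in one flat directory, downstream scripts usually need to recover the categories from the file names. The following sketch parses the two naming patterns shown above. The parse_output_name function is hypothetical, and it assumes that multi-word values are hyphenated (as in profile-left and medium-shot) so that underscores only separate fields; it splits from the right so that underscores in the video name are tolerated.

```python
from pathlib import Path


def parse_output_name(filename):
    """Split an output image name into its fields (hypothetical helper)."""
    stem = Path(filename).stem

    # Face crop: {video}_face_{head-angle}_{rank}
    parts = stem.rsplit("_", 3)
    if len(parts) == 4 and parts[1] == "face":
        video, _, head, rank = parts
        return {"kind": "face_crop", "video": video, "head": head, "rank": int(rank)}

    # Full frame: {video}_{pose}_{head}_{shot}_{rank}
    parts = stem.rsplit("_", 4)
    if len(parts) == 5:
        video, pose, head, shot, rank = parts
        return {
            "kind": "full_frame",
            "video": video,
            "pose": pose,
            "head": head,
            "shot": shot,
            "rank": int(rank),
        }

    raise ValueError(f"not a recognised output image name: {filename}")


full = parse_output_name("interview_sitting_profile-left_medium-shot_002.jpg")
face = parse_output_name("interview_face_profile-right_002.jpg")
```

Names that match neither pattern, such as interview_info.json, raise ValueError, so a directory scan can skip them explicitly.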

Configuration

Person From Vid can be configured via a YAML file, environment variables, or command-line arguments.

Configuration File

Create a YAML file (e.g., config.yaml) to manage settings. CLI arguments will override file settings.

# config.yaml

# Models and device settings
models:
  device: "auto"  # "cpu", "gpu", or "auto"
  batch_size: 8
  confidence_threshold: 0.3
  face_detection_model: "yolov8s-face"
  pose_estimation_model: "yolov8s-pose"
  head_pose_model: "sixdrepnet"

# Frame extraction strategy
frame_extraction:
  temporal_sampling_interval: 0.25 # Seconds between samples
  enable_keyframe_detection: true
  max_frames_per_video: null # No limit

# Quality assessment thresholds
quality:
  blur_threshold: 100.0
  brightness_min: 30.0
  brightness_max: 225.0
  contrast_min: 20.0

# Output settings
output:
  min_frames_per_category: 3
  image:
    format: "jpeg" # 'jpeg' or 'png'
    jpeg:
      quality: 95
    png:
      optimize: true
    face_crop_enabled: true
    full_frame_enabled: true
    face_crop_padding: 0.2 # 20% padding

# Processing and storage behavior
processing:
  enable_resume: true
storage:
  cache_directory: "~/.cache/personfromvid"  # Override default cache location
  keep_temp: false                           # Keep temporary files after processing
  force_temp_cleanup: false                  # Force cleanup before starting
  cleanup_temp_on_success: true              # Clean up temp files on success
  cleanup_temp_on_failure: false             # Keep temp files if processing fails

# Logging configuration
logging:
  level: "INFO" # DEBUG, INFO, WARNING, ERROR
  enable_structured_output: true

Use with:

personfromvid video.mp4 --config config.yaml
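
The override behaviour (CLI arguments win over file settings) can be pictured as a layered merge of nested dictionaries. This is only an illustrative sketch: deep_merge is a hypothetical helper, the dictionaries are trimmed-down examples based on the keys above, and where environment variables sit in the precedence order is not stated here, so they are left out.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a copy of base with override applied key by key (hypothetical helper)."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so sibling keys in nested sections are preserved.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


defaults = {
    "models": {"device": "auto", "batch_size": 8},
    "output": {"image": {"format": "jpeg"}},
}
from_file = {"models": {"batch_size": 16}}   # e.g. loaded from config.yaml
from_cli = {"models": {"device": "cpu"}}     # e.g. --device cpu

# Later layers win: defaults, then the file, then CLI arguments.
effective = deep_merge(deep_merge(defaults, from_file), from_cli)
```

Here effective keeps batch_size 16 from the file, takes device cpu from the CLI layer, and leaves the untouched output section at its default.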

Development

Setting Up Development Environment

# Clone repository
git clone https://github.com/personfromvid/personfromvid.git
cd personfromvid

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Project Structure

personfromvid/
├── personfromvid/          # Main package
│   ├── cli.py              # Command-line interface
│   ├── core/               # Core processing modules
│   ├── models/             # AI model management
│   ├── analysis/           # Image analysis and classification
│   ├── output/             # Output generation
│   ├── utils/              # Utility modules
│   └── data/               # Data models and configuration
├── tests/                  # Test suite
├── docs/                   # Documentation
└── scripts/                # Development scripts

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=personfromvid

# Run specific test modules
pytest tests/unit/test_config.py

Code Quality

# Format code
black personfromvid/

# Check linting
flake8 personfromvid/

# Type checking
mypy personfromvid/

Cleaning Up

To remove temporary files, build artifacts, and caches, run the cleaning script:

python scripts/clean.py

System Requirements

Minimum Requirements

  • Python 3.10+
  • 4GB RAM
  • 1GB disk space for dependencies and cache
  • FFmpeg

Recommended Requirements

  • Python 3.11+
  • 8GB+ RAM
  • 5GB+ disk space for cache
  • NVIDIA GPU with CUDA support for acceleration
  • FFmpeg with hardware acceleration support

Supported Formats

Video Formats

  • MP4, AVI, MOV, MKV, WMV, FLV, WebM, M4V, 3GP, OGV

Output Formats

  • PNG images (optional optimization)
  • JPEG images (configurable quality)
  • JSON metadata files

Cache and Temporary Files

Person From Vid uses a centralized cache directory to store both AI models and temporary files during video processing. This keeps your video directories clean and makes cache management easier.

Cache Directory Locations

The cache directory is automatically determined based on your operating system:

  • Linux: ~/.cache/personfromvid/
  • macOS: ~/Library/Caches/personfromvid/
  • Windows: C:\Users\{username}\AppData\Local\codeprimate\personfromvid\Cache\
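
For scripts that need to locate the cache (for example to report its size), the per-platform defaults listed above can be reproduced as follows. This is a sketch under stated assumptions: default_cache_dir is a hypothetical helper, it hardcodes the three locations from the list, and it ignores any override such as storage.cache_directory or platform-specific environment variables.

```python
from pathlib import PurePosixPath, PureWindowsPath


def default_cache_dir(platform, home):
    """Return the documented default cache path (hypothetical helper).

    platform follows sys.platform naming: "win32", "darwin", "linux", ...
    """
    if platform.startswith("win"):
        return (PureWindowsPath(home) / "AppData" / "Local"
                / "codeprimate" / "personfromvid" / "Cache")
    if platform == "darwin":
        return PurePosixPath(home) / "Library" / "Caches" / "personfromvid"
    # Everything else is treated like Linux.
    return PurePosixPath(home) / ".cache" / "personfromvid"


linux_dir = default_cache_dir("linux", "/home/ana")
mac_dir = default_cache_dir("darwin", "/Users/ana")
windows_dir = default_cache_dir("win32", r"C:\Users\ana")
```

On the current machine, the call would be default_cache_dir(sys.platform, str(Path.home())), with sys and pathlib.Path imported.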

Cache Structure

personfromvid/                 # Base cache directory
├── models/                    # AI model files
│   ├── yolov8s-face/          # Face detection model
│   ├── yolov8s-pose/          # Pose estimation model
│   └── sixdrepnet/            # Head pose model
└── temp/                      # Temporary processing files
    └── temp_{video_name}/     # Per-video temporary directory
        └── frames/            # Extracted frames during processing

Temporary Files

During video processing, temporary files (extracted frames, intermediate data) are stored in the cache directory under temp/temp_{video_name}/. These files are:

  • Automatically cleaned up after successful processing (configurable)
  • Kept for debugging if processing fails or if --keep-temp is used
  • Isolated per video to allow concurrent processing of multiple videos
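
The rules above amount to a small decision: where the per-video directory lives, and whether it is removed once processing ends. The sketch below spells that out. Both functions are hypothetical, the parameter names mirror the storage keys in the sample configuration, and treating {video_name} as the file name without its extension is an assumption.

```python
from pathlib import Path


def temp_dir_for(cache_dir, video_path):
    """Per-video temp directory under the cache (hypothetical helper).

    Assumes {video_name} is the file name without its extension.
    """
    return Path(cache_dir) / "temp" / f"temp_{Path(video_path).stem}"


def should_cleanup(succeeded, keep_temp=False,
                   cleanup_on_success=True, cleanup_on_failure=False):
    """Decide whether temp files are removed when a run ends."""
    if keep_temp:
        # --keep-temp always preserves the files for debugging.
        return False
    return cleanup_on_success if succeeded else cleanup_on_failure
```

With the defaults, a successful run is cleaned up and a failed run keeps its files for inspection.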

Cache Management

# Keep temporary files after processing (for debugging)
personfromvid video.mp4 --keep-temp

# Force cleanup of existing temp files before starting
personfromvid video.mp4 --force

# Configure cache location via config file
personfromvid video.mp4 --config custom_config.yaml

You can manually clean the cache directory to free up disk space, or configure automatic cleanup in your configuration file.

AI Models

Person From Vid uses the following default AI models, which are automatically downloaded and cached on first use in the cache directory described above.

  • Face Detection: yolov8s-face - A YOLOv8 model trained for face detection.
  • Pose Estimation: yolov8s-pose - A YOLOv8 model for human pose estimation.
  • Head Pose: sixdrepnet - A head pose estimation model based on a 6D rotation representation.

Alternative models can be configured through the models section of the configuration file.

Performance Tips

  1. Use a GPU: The single most effective way to speed up processing is to use an NVIDIA GPU with --device gpu.
  2. Adjust Batch Size: Increase --batch-size to improve GPU utilization. A size of 8 or 16 is a good starting point.
  3. Limit Frame Extraction: Use --max-frames on very long videos to get results faster.
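
To judge whether --max-frames is worth setting, it helps to estimate how many frames temporal sampling alone would produce. The following back-of-the-envelope sketch assumes one sample every temporal_sampling_interval seconds starting at time zero; estimate_sample_count is a hypothetical helper, keyframes are not counted, and how the tool chooses which frames to keep when the cap applies is not modelled.

```python
import math


def estimate_sample_count(duration_s, interval_s=0.25, max_frames=None):
    """Estimate frames taken by fixed-interval sampling (hypothetical helper)."""
    if duration_s < 0 or interval_s <= 0:
        raise ValueError("duration must be >= 0 and interval must be > 0")
    # One sample at t=0, then one per full interval that fits in the video.
    count = math.floor(duration_s / interval_s) + 1
    if max_frames is not None:
        count = min(count, max_frames)
    return count


two_hours = estimate_sample_count(2 * 60 * 60)
capped = estimate_sample_count(2 * 60 * 60, max_frames=1000)
```

At the default 0.25-second interval, a two-hour video yields tens of thousands of samples, which is why a cap can shorten runs on long footage considerably.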

Troubleshooting

Common Issues

FFmpeg not found:

# Check if FFmpeg is installed
ffmpeg -version
# Install if missing (see Prerequisites section)

CUDA/GPU issues:

# Check GPU availability
python -c "import torch; print(torch.cuda.is_available())"
# Fall back to CPU processing
personfromvid video.mp4 --device cpu

Memory issues:

# Reduce batch size
personfromvid video.mp4 --batch-size 1

Permission errors:

# Check output directory permissions
ls -la /path/to/output/directory
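
The FFmpeg and permission checks above can also be run from Python before starting a long job. This is a minimal sketch using only the standard library; the preflight function and its result keys are hypothetical and not part of personfromvid.

```python
import os
import shutil
from pathlib import Path


def preflight(output_dir):
    """Check for FFmpeg on PATH and a writable output directory (hypothetical helper)."""
    out = Path(output_dir)
    exists = out.is_dir()
    return {
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        "output_dir_exists": exists,
        # os.access reports whether the current user may write to the directory.
        "output_dir_writable": exists and os.access(out, os.W_OK),
    }


if __name__ == "__main__":
    for check, ok in preflight(".").items():
        print(f"{check}: {'ok' if ok else 'FAILED'}")
```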

Contributing

We welcome contributions! Please see our Contributing Guide for details.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

This project is licensed under GPL-3.0-or-later; see the LICENSE file for details.

Person From Vid - Extracting moments, categorizing poses, powered by AI.
