YOLO-based human pose detection and visualization tool

YOLO Poser

A Python package for human pose detection and visualization using YOLO.

Installation

pip install yolo-poser

Usage

Pose Detection

Process videos to detect and visualize human poses.

Command Line:

yolo-poser input_video.mp4 --output output.mp4 --output-format h264

Python API:

from yolo_poser import process_video
process_video(
    input_path="input.mp4",
    output_path="output.mp4",
    output_format="h264",
    debug=True
)

Video Cropping

Automatically crop videos to focus on detected people using YOLO person detection.

Command Line:

yolo-crop input_video.mp4 --padding 0.3 --output cropped.mp4

Options:

  • --padding: Padding around detected area (default: 0.5)
  • --no-keep-proportions: Don't maintain original video proportions
  • --no-preview: Skip preview and confirmation
  • --debug: Save debug frames showing detections
  • --output: Output video path (default: input_cropped.mp4)

Python API:

from yolo_poser import crop_video, calculate_crop_params

# Calculate crop parameters
bbox = (100, 100, 500, 700)  # min_x, min_y, max_x, max_y
crop_params = calculate_crop_params(
    video_path="input.mp4",
    bbox=bbox,
    padding=0.3,
    keep_proportions=True
)

# Apply crop
crop_video(
    input_path="input.mp4",
    output_path="cropped.mp4",
    crop_params=crop_params
)
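To make the padding and keep_proportions options more concrete, here is a minimal sketch of one way a padded crop box could be derived from a bounding box. It is not the package's implementation: pad_bbox is a hypothetical helper. It assumes that padding is a fraction of the box's width and height added on each side, and that keeping proportions means growing the box until it matches the frame's aspect ratio.

```python
def pad_bbox(bbox, frame_w, frame_h, padding=0.5, keep_proportions=True):
    """Hypothetical helper: expand a bbox by a padding fraction and clamp it.

    bbox is (min_x, min_y, max_x, max_y) in pixels, as in the example above.
    """
    min_x, min_y, max_x, max_y = bbox
    w, h = max_x - min_x, max_y - min_y
    if w <= 0 or h <= 0:
        raise ValueError("bbox must have positive width and height")

    # Assumption: padding is a fraction of the box size, added on every side.
    min_x -= w * padding
    max_x += w * padding
    min_y -= h * padding
    max_y += h * padding

    if keep_proportions:
        # Assumption: grow the shorter dimension to match the frame's ratio.
        w, h = max_x - min_x, max_y - min_y
        target = frame_w / frame_h
        if w / h < target:
            extra = h * target - w  # box is too narrow, so widen it
            min_x -= extra / 2
            max_x += extra / 2
        else:
            extra = w / target - h  # box is too wide, so make it taller
            min_y -= extra / 2
            max_y += extra / 2

    # Clamp to the frame. Near an edge this can break the exact aspect ratio.
    min_x, min_y = max(0, min_x), max(0, min_y)
    max_x, max_y = min(frame_w, max_x), min(frame_h, max_y)
    return int(min_x), int(min_y), int(max_x), int(max_y)
```

With a 1920x1080 frame, pad_bbox((100, 100, 500, 700), 1920, 1080, padding=0.5, keep_proportions=False) pads the box by 200 pixels horizontally and 300 pixels vertically on each side, then clamps the negative coordinates to 0.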

Video Splitting

Split videos into shots based on audio peaks. Useful for creating training data or breaking down performances.

Command Line:

yolo-split input.mp4 --output-dir shots/ --peak-sensitivity 0.5 --shot-duration 2.5

Options:

  • --output-dir: Directory for output files (default: same as input)
  • --peak-sensitivity: Peak detection sensitivity, 0.1-1.0 (default: 0.8). Lower values create more shots
  • --shot-duration: Duration of each shot in seconds (default: 2.0)
  • --debug: Keep temporary audio files
  • --json: Output JSON list of generated files to stdout

Python API:

from yolo_poser.split import split_video

result = split_video(
    input_path="input.mp4",
    output_dir="shots/",
    peak_sensitivity=0.8,
    shot_duration=2.0
)
# Returns: {'chunks': ['shots/chunk_001.mp4', 'shots/chunk_002.mp4', ...]}
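The effect of peak sensitivity can be illustrated with a small, pure-Python sketch. This is not the package's algorithm (which lists SciPy as a requirement for splitting): find_peaks_simple is a hypothetical function, and it assumes the sensitivity acts as a threshold expressed as a fraction of the loudest sample. Under that assumption a lower value lets more peaks through, which matches the documented behaviour that lower values create more shots.

```python
def find_peaks_simple(envelope, sensitivity=0.8):
    """Hypothetical sketch: indices of local maxima above a relative threshold.

    envelope is a list of non-negative amplitude values, one per time step.
    """
    if not envelope:
        return []
    # Assumption: sensitivity is a fraction of the maximum amplitude.
    threshold = sensitivity * max(envelope)
    peaks = []
    for i in range(1, len(envelope) - 1):
        is_local_max = envelope[i] > envelope[i - 1] and envelope[i] >= envelope[i + 1]
        if is_local_max and envelope[i] >= threshold:
            peaks.append(i)
    return peaks


envelope = [0.0, 0.2, 1.0, 0.3, 0.1, 0.6, 0.2, 0.05, 0.9, 0.1]
strict = find_peaks_simple(envelope, sensitivity=0.8)  # only the loudest peaks
loose = find_peaks_simple(envelope, sensitivity=0.5)   # also the quieter one
```

Here the quieter peak at index 5 is only picked up with the lower sensitivity, so the lower setting would yield one more shot.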

Audio Syncing

Sync audio from one video to another, with automatic duration adjustment.

Python API:

from yolo_poser import sync_audio

output_path = sync_audio(
    source_video="original.mp4",    # Video with audio
    destination_video="processed.mp4",  # Video without audio
    output_path="final.mp4"  # Optional
)

Web API

To use the HTTP API, first install with API dependencies:

pip install "yolo-poser[api]"

Start the API server:

yolo-poser-api [--host HOST] [--port PORT]

For example:

yolo-poser-api --host 127.0.0.1 --port 9000

Or programmatically:

from yolo_poser.api import app
import uvicorn

uvicorn.run(app, host="127.0.0.1", port=9000)

The API provides endpoints for:

  • Processing videos from URLs: POST /detect/url
  • Processing uploaded video files: POST /detect/file
  • Health check: GET /health

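See the interactive API documentation at /docs on the running server, for example http://localhost:8000/docs if it is listening on port 8000, or http://127.0.0.1:9000/docs with the example settings above.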
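As a quick way to confirm the server is reachable, the health endpoint can be called with the standard library alone. This is a minimal sketch: endpoint_url and check_health are hypothetical helper names, the host and port defaults simply mirror the example command above, and the response body format is not documented here, so it is returned as raw text.

```python
import urllib.request


def endpoint_url(host, port, path):
    """Build a URL for an endpoint on the running API server."""
    return f"http://{host}:{port}{path}"


def check_health(host="127.0.0.1", port=9000, timeout=5.0):
    """Call GET /health and return (status_code, body_text).

    Raises urllib.error.URLError if the server cannot be reached.
    """
    url = endpoint_url(host, port, "/health")
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


# Usage, with the server started as in the example above:
#   status, body = check_health()
```

The request formats for POST /detect/url and POST /detect/file are best taken from the server's /docs page rather than guessed.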

Features

  • Pose Detection: Human pose detection and visualization using YOLO
  • Video Cropping: Automatically crop videos to focus on detected people
  • Video Splitting: Split videos into shots based on audio peaks
  • Audio Syncing: Sync audio from one video to another with automatic duration adjustment
  • Multiple Output Formats: Support for MJPEG, H264, and WebM
  • Smooth Tracking: Exponential smoothing for stable keypoint tracking
  • HTTP API: FastAPI-based REST API for video processing
  • Debug Mode: Performance metrics and visualization of detection results
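For readers unfamiliar with the smooth tracking feature, exponential smoothing blends each new detection with the previous smoothed value so that keypoints jitter less between frames. The following is a generic sketch of the technique, not the package's code: smooth_keypoints and the alpha parameter are hypothetical names, and keypoints are assumed to be (x, y) pairs in a fixed order.

```python
def smooth_keypoints(previous, current, alpha=0.5):
    """Exponentially smooth a list of (x, y) keypoints.

    previous is the last smoothed result (or None on the first frame).
    alpha close to 1 follows new detections closely; close to 0 smooths more.
    """
    if previous is None:
        return list(current)
    return [
        (alpha * cx + (1 - alpha) * px, alpha * cy + (1 - alpha) * py)
        for (px, py), (cx, cy) in zip(previous, current)
    ]


# Feed detections frame by frame, carrying the smoothed state forward.
state = None
for frame_keypoints in ([(0.0, 0.0)], [(10.0, 20.0)], [(10.0, 20.0)]):
    state = smooth_keypoints(state, frame_keypoints, alpha=0.5)
```

After the three frames above, the smoothed point has moved three quarters of the way from (0, 0) to (10, 20), showing how the estimate converges rather than jumping.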

Requirements

  • Python 3.8 or newer, below 3.13
  • PyTorch
  • Ultralytics YOLO
  • OpenCV
  • NumPy
  • SciPy (for video splitting)
  • FFmpeg (for video processing and audio syncing)

Development

Continuous Integration

This project uses GitHub Actions for continuous integration and deployment:

  • Every push to the main branch triggers a test build that publishes to TestPyPI
  • Tagged releases (e.g. v0.1.0) trigger a build that publishes to PyPI

To release a new version:

  1. Update the version in src/yolo_poser/__init__.py
  2. Commit the changes
  3. Create and push a tag:
git tag v0.1.0
git push origin v0.1.0

The GitHub Action will automatically build and publish the new version to PyPI.
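Because the release steps rely on the version in the package and the Git tag agreeing, a small pre-tag check can catch mismatches. This is a sketch under assumptions: it supposes the version is stored as a __version__ = "..." assignment in src/yolo_poser/__init__.py (the variable name is not confirmed by this document), and read_version and tag_matches_version are hypothetical helpers.

```python
import re


def read_version(init_text):
    """Extract the version string from the text of an __init__.py file.

    Assumption: the file contains a line like __version__ = "0.1.0".
    """
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', init_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def tag_matches_version(tag, init_text):
    """Return True if a tag such as v0.1.0 matches the packaged version."""
    return tag == f"v{read_version(init_text)}"


# Usage: read the file and compare before running git tag.
#   text = open("src/yolo_poser/__init__.py", encoding="utf-8").read()
#   assert tag_matches_version("v0.1.0", text)
```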

Local Development

  1. Clone the repository:
git clone https://github.com/tomdyson/yolo-poser.git
cd yolo-poser
  2. Install in development mode:
pip install -e .

License

MIT License
