clip-video-encode

Easily compute clip embeddings from video frames.

Install

Using pip:

pip install clip-video-encode

Or build from source:

python setup.py install

Usage

NAME
    clip-video-encode - Encode frames using CLIP image encoder

SYNOPSIS
    clip-video-encode SRC <flags>

DESCRIPTION
    Input:
      src:
        str: path to mp4 file
        str: youtube link
        str: path to txt file with multiple mp4's or youtube links
        list: list with multiple mp4's or youtube links
      dest:
        str: directory where to save embeddings to
        None: dest = src + .npy
      output_format:
        str: "files" or "webdataset"
      take_every_nth:
        int: only take every nth frame
      frame_workers:
        int: number of Processes to distribute video reading to.
      frame_memory_size:
        int: GB of memory for FrameReader.
      metadata_columns:
        str: a comma separated list of metadata column names to look for in src
      use_dst_name:
        bool: use the save name suggested by video2numpy
      distribute:
        str: distribution strategy, currently either slurm or none
      oc_model_name:
        str: open_clip model name, used for selecting CLIP architecture
      pretrained:
        str: open_clip pretrained weights name

POSITIONAL ARGUMENTS
    SRC

FLAGS
    --dest=DEST
        Default: ''
    --output_format=OUTPUT_FORMAT
        Default: 'files'
    --take_every_nth=TAKE_EVERY_NTH
        Default: 1
    --frame_workers=FRAME_WORKERS
        Default: 1
    --frame_memory_size=FRAME_MEMORY_SIZE
        Default: 4
    --metadata_columns=METADATA_COLUMNS
        Default: ''
    --use_dst_name=USE_DST_NAME
        Default: False
    --distribute=DISTRIBUTE
        Default: 'none'
    --oc_model_name=OC_MODEL_NAME
        Default: 'ViT-B-32'
    --pretrained=PRETRAINED
        Default: 'laion2b_s34b_b79k'
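
As a rough illustration of how the flags above combine, the following sketch builds a command line for the tool from keyword arguments. The helper name build_command and the example paths are made up for this illustration; only the flag names and the two model defaults come from the help text above.

```python
import shlex


def build_command(src, **flags):
    """Build an argv list for the clip-video-encode CLI (hypothetical helper)."""
    argv = ["clip-video-encode", str(src)]
    for name, value in flags.items():
        # Each flag is written in the --name=value form shown in the help text.
        argv.append(f"--{name}={value}")
    return argv


cmd = build_command(
    "some/path/my_videos/video.mp4",   # placeholder input path
    dest="some/path/my_embeddings",    # placeholder output directory
    take_every_nth=25,
    oc_model_name="ViT-B-32",
    pretrained="laion2b_s34b_b79k",
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))

# To actually run it (requires clip-video-encode to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list first makes it easy to log or review the exact invocation before spending compute on it.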

API

This module exposes a single function, clip_video_encode, which takes the same arguments as the command line tool:

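import glob
from clip_video_encode import clip_video_encode

# Collect every mp4 file in the input directory
VIDS = glob.glob("some/path/my_videos/*.mp4")
# Directory where the embeddings are saved
EMBEDDING_DIR = "some/path/my_embeddings"
# Keep only every 5th frame
take_every_5 = 5

# Pass options by keyword: per the flag order listed above, the argument
# after dest is output_format, not take_every_nth
clip_video_encode(VIDS, dest=EMBEDDING_DIR, take_every_nth=take_every_5)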
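
Once embeddings have been written, they can be read back with NumPy. The sketch below assumes the "files" output format leaves one .npy file per video directly inside the destination directory, each holding an array of shape (frames, embedding_dim); that layout is an assumption, not something the help text spells out. The helper name load_embeddings is hypothetical, and the demo uses a temporary directory with stand-in data rather than real output.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_embeddings(embedding_dir):
    """Return {file stem: array} for every .npy file directly in embedding_dir."""
    paths = sorted(Path(embedding_dir).glob("*.npy"))
    return {p.stem: np.load(p) for p in paths}


if __name__ == "__main__":
    # Stand-in data: a fake "video" with 4 frames and 512-dimensional embeddings.
    with tempfile.TemporaryDirectory() as tmp:
        np.save(Path(tmp) / "video_0.npy", np.zeros((4, 512), dtype=np.float32))
        for name, emb in load_embeddings(tmp).items():
            print(name, emb.shape, emb.dtype)
```

With real output, point load_embeddings at the directory passed as dest.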

Who is using clip-video-encode?

  • CLIP-Kinetics700 - The Kinetics700 dataset (700 GB) can be compressed to ~8 GB using clip-video-encode at 1 FPS.
  • CLIP-WebVid - The WebVid dataset (10M videos) encoded as CLIP ViT-B/32 embeddings at 1 FPS.
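
Both datasets above are described in frames per second, while the tool takes a frame stride (take_every_nth). A minimal sketch of the conversion, assuming take_every_nth=n keeps one frame out of every n and that the source frame rate is known; the function name is hypothetical:

```python
def take_every_nth_for_fps(source_fps, target_fps):
    """Approximate the frame stride that samples a video at about target_fps."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # A stride below 1 is meaningless, so never go lower than every frame.
    return max(1, round(source_fps / target_fps))


# A 25 FPS video sampled at roughly 1 FPS keeps every 25th frame.
print(take_every_nth_for_fps(25, 1))
```

For non-integer ratios such as 29.97 FPS the stride is rounded, so the effective sampling rate is only approximately the target.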

Examples

Check out some cool clip-video-encode examples:

  • Thing detector - Look for things in videos using clip-video-encode generated embeddings.
  • Large dataset processing - If you want to process a large dataset (like WebVid) into CLIP embeddings see the example at the bottom of the linked README.md.
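
To give a flavor of what "looking for things" with frame embeddings can mean, here is a minimal sketch that ranks frames by cosine similarity to a query embedding. This is not the code of the linked Thing detector example; the function name, the toy 2-dimensional vectors, and the idea that the query comes from a CLIP text encoder are assumptions for illustration.

```python
import numpy as np


def best_matching_frames(frame_embs, query_emb, top_k=3):
    """Return (frame index, cosine similarity) pairs for the top_k closest frames."""
    frame_embs = np.asarray(frame_embs, dtype=np.float32)
    query_emb = np.asarray(query_emb, dtype=np.float32)
    # Normalize so that the dot product equals cosine similarity.
    frames = frame_embs / np.linalg.norm(frame_embs, axis=1, keepdims=True)
    query = query_emb / np.linalg.norm(query_emb)
    sims = frames @ query
    order = np.argsort(-sims)[:top_k]
    return [(int(i), float(sims[i])) for i in order]


# Toy stand-ins for real embeddings: three "frames" and one "query".
frames = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([1.0, 0.0])
print(best_matching_frames(frames, query, top_k=2))
```

With real data, frame_embs would be one video's array of frame embeddings and query_emb an embedding of the thing being searched for, produced by the same CLIP model.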

Set up a virtualenv:

python3 -m venv .env
source .env/bin/activate
pip install -e .

To run tests, first install the test requirements:

pip install -r requirements-test.txt

Then run:

make lint
make test

You can use make black to reformat the code.

To run a specific test, use python -m pytest -x -s -v tests -k "dummy".
