Video Sampler -- sample frames from a video file
video-sampler
Video sampler allows you to efficiently sample video frames. Currently, it uses keyframe decoding, frame interval gating and perceptual hashing to reduce duplicated samples.
Use case: sampling video frames for later annotation in machine learning datasets.
Features
- Direct sampling methods:
  - `hash` - uses perceptual hashing to reduce duplicated samples
  - `entropy` - uses entropy to reduce duplicated samples (work in progress)
  - `gzip` - uses gzip compressed size to reduce duplicated samples (work in progress)
  - `buffer` - uses sliding buffer to reduce duplicated samples
  - `grid` - uses grid sampling to reduce duplicated samples
- Gating methods (modifications on top of direct sampling methods):
  - `clip` - uses CLIP to filter out frames that do not contain the specified objects
  - `blur` - uses blur detection to filter out frames that are too blurry
- Integrations
  - YTDLP integration -- streams directly from yt-dlp queries, playlists or single videos
Installation and Usage
pip install -U video_sampler
then you can run
python3 -m video_sampler --help
or simply
video_sampler --help
Basic usage
python3 -m video_sampler hash FatCat.mp4 ./dataset-frames/ --hash-size 3 --buffer-size 20
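To give a feel for what `--hash-size` and `--buffer-size` control, here is a minimal sketch of perceptual-hash deduplication. It is not the project's implementation: it assumes a simple average hash over a grayscale frame and a fixed-length buffer of recently seen hashes, and the names `average_hash` and `sample_unique` are hypothetical.

```python
from collections import deque


def average_hash(gray, hash_size):
    """Average hash of a 2D grayscale image (list of rows).

    The image is reduced to hash_size x hash_size cells by block averaging,
    then each cell becomes 1 if it is brighter than the mean cell, else 0.
    Assumes the image is at least hash_size pixels in both dimensions.
    """
    h, w = len(gray), len(gray[0])
    cells = []
    for i in range(hash_size):
        r0, r1 = i * h // hash_size, (i + 1) * h // hash_size
        for j in range(hash_size):
            c0, c1 = j * w // hash_size, (j + 1) * w // hash_size
            block = [gray[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return tuple(int(v > mean) for v in cells)


def sample_unique(frames, hash_size=3, buffer_size=20):
    """Return indices of frames whose hash is not among the recent hashes."""
    recent = deque(maxlen=buffer_size)  # oldest hashes fall out automatically
    kept = []
    for idx, frame in enumerate(frames):
        key = average_hash(frame, hash_size)
        if key not in recent:
            kept.append(idx)
            recent.append(key)
    return kept


# Tiny synthetic frames: left-bright, a near-copy of it, and top-bright.
left = [[255] * 3 + [0] * 3 for _ in range(6)]
left_dim = [[250] * 3 + [0] * 3 for _ in range(6)]
top = [[255] * 6 for _ in range(3)] + [[0] * 6 for _ in range(3)]
print(sample_unique([left, left_dim, top, left]))
```

In this sketch a smaller hash size makes more frames collide (fewer samples kept), and a larger buffer remembers duplicates for longer.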
YT-DLP integration plugin
Before using please consult the ToS of the website you are scraping from -- use responsibly and for research purposes.
To use the YT-DLP integration, you need to install yt-dlp first (see yt-dlp).
Then, you simply add --yt-dlp to the command. The video_path argument is then interpreted as a yt-dlp search query, a video URL or a playlist URL.
- to search
video_sampler hash "ytsearch:cute cats" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --yt-dlp
- to sample a single video
video_sampler hash "https://www.youtube.com/watch?v=W86cTIoMv2U" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --yt-dlp
- to sample a playlist
video_sampler hash "https://www.youtube.com/watch?v=GbpP3Sxp-1U&list=PLFezMcAw96RGvTTTbdKrqew9seO2ZGRmk" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --yt-dlp
The videos are never directly downloaded, only streamed, so you can use it to sample videos from the internet without downloading them first.
Extra YT-DLP options
You can pass extra options to yt-dlp by using the --yt-extra-args flag. For example:
this will only sample videos uploaded before 2019-01-01:
... --ytdlp --yt-extra-args '--datebefore 20190101'
or this will only sample videos uploaded after 2019-01-01:
... --ytdlp --yt-extra-args '--dateafter 20190101'
or this will skip all shorts:
... --ytdlp --yt-extra-args '--match-filter "original_url!*=/shorts/ & url!*=/shorts/"'
API examples
See examples in ./scripts.
Advanced usage
There are 3 sampling methods available:
- `hash` - uses perceptual hashing to reduce duplicated samples
- `entropy` - uses entropy to reduce duplicated samples (work in progress)
- `gzip` - uses gzip compressed size to reduce duplicated samples (work in progress)
To launch any of them, run the following, substituting method-name with one of the above:
video_sampler buffer `method-name` ...other options
e.g.
video_sampler buffer entropy --buffer-size 20 ...
where buffer-size for entropy and gzip means the top-k sliding buffer size. The sliding buffer also uses hashing to reduce duplicated samples.
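The entropy and gzip scores can be illustrated with a short sketch. It assumes that each frame is available as raw bytes and that "top-k" means keeping the k highest-scoring frames from a window; the project's actual buffer logic may differ. The names `shannon_entropy`, `gzip_size` and `top_k` are hypothetical.

```python
import gzip
import heapq
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte."""
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gzip_size(data: bytes) -> int:
    """Length of the gzip-compressed payload; detailed frames compress less."""
    return len(gzip.compress(data))


def top_k(frames, score, k):
    """Indices of the k highest-scoring frames, returned in original order."""
    best = heapq.nlargest(k, range(len(frames)), key=lambda i: score(frames[i]))
    return sorted(best)


flat = bytes(64)            # a uniform frame: no information
noisy = bytes(range(64))    # every byte value differs
half = bytes([0, 1] * 32)   # a two-level pattern

window = [flat, noisy, half]
print(top_k(window, shannon_entropy, k=2))
print(top_k(window, gzip_size, k=1))
```

Both scores act as cheap proxies for visual detail: uniform frames score low, so they are the first to be dropped when the buffer is full.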
Gating
Aside from basic sampling rules, you can also apply gating rules to the sampled frames, further reducing the number of frames. There are 3 gating methods available:
- `pass` - pass all frames
- `clip` - use CLIP to filter out frames that do not contain the specified objects
- `blur` - use blur detection to filter out frames that are too blurry
Here's a quick example of how to use clip:
python3 -m video_sampler clip ./videos ./scratch/clip --pos-samples "a cat" --neg-samples "empty background, a lemur" --hash-size 4
CLIP-based gating comparison
Here's a brief comparison of the frames sampled with and without CLIP-based gating with the following config:
gate_def = dict(
    type="clip",
    pos_samples=["a cat"],
    neg_samples=[
        "an empty background",
        "text on screen",
        "a forest with no animals",
    ],
    model_name="ViT-B-32",
    batch_size=32,
    pos_margin=0.2,
    neg_margin=0.3,
)
CLIP-based gating filters out frames that do not contain a cat and, as a consequence, reduces the number of frames with a plain background. It also thinks that a lemur is a cat, which is not entirely wrong as fluffy creatures go.
Pass gate (no gating) | CLIP gate | Grid |
---|---|---|
The table below shows the effects of gating in numbers for this particular set of examples (see the produced vs gated columns). produced is the number of frames sampled without gating, here after the perceptual hashing, while gated is the number of frames remaining after gating.
video | buffer | gate | decoded | produced | gated |
---|---|---|---|---|---|
FatCat.mp4 | grid | pass | 179 | 31 | 31 |
SmolCat.mp4 | grid | pass | 118 | 24 | 24 |
HighLemurs.mp4 | grid | pass | 161 | 35 | 35 |
FatCat.mp4 | hash | pass | 179 | 101 | 101 |
SmolCat.mp4 | hash | pass | 118 | 61 | 61 |
HighLemurs.mp4 | hash | pass | 161 | 126 | 126 |
FatCat.mp4 | hash | clip | 179 | 101 | 73 |
SmolCat.mp4 | hash | clip | 118 | 61 | 31 |
HighLemurs.mp4 | hash | clip | 161 | 126 | 66 |
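
To read the produced and gated columns as a retention rate, the short calculation below divides gated by produced for the three hash + clip rows of the table above. The helper name `kept_fraction` is only illustrative.

```python
# (video, decoded, produced, gated) for the hash buffer with the clip gate,
# copied from the table above.
rows = [
    ("FatCat.mp4", 179, 101, 73),
    ("SmolCat.mp4", 118, 61, 31),
    ("HighLemurs.mp4", 161, 126, 66),
]


def kept_fraction(produced: int, gated: int) -> float:
    """Share of produced frames that survive the gate."""
    return gated / produced


for video, decoded, produced, gated in rows:
    share = kept_fraction(produced, gated)
    print(f"{video}: {share:.1%} of produced frames pass the gate")
```

For these examples the clip gate keeps roughly half to three quarters of the frames that the hash buffer produced.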
Blur gating
Helps a little with blurry videos. Adjust threshold and method (laplacian or fft) for best results.
Some results from fft at threshold=20:
video | buffer | gate | decoded | produced | gated |
---|---|---|---|---|---|
MadLad.mp4 | grid | pass | 120 | 31 | 31 |
MadLad.mp4 | hash | pass | 120 | 110 | 110 |
MadLad.mp4 | hash | blur | 120 | 110 | 85 |
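
For intuition about the laplacian method, here is a minimal sketch of the common variance-of-Laplacian sharpness measure in plain Python. It is an assumption that the blur gate works along these lines; the function names, the 4-neighbour kernel and the rule "keep the frame when the score is at or above the threshold" are illustrative, and the threshold scale here is not comparable to the fft results above.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    gray is a 2D list of intensities, at least 3x3. Sharp edges produce
    large positive and negative responses, so the variance is high;
    blurry or flat images give a low variance.
    """
    h, w = len(gray), len(gray[0])
    responses = []
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            lap = (
                gray[r - 1][c] + gray[r + 1][c]
                + gray[r][c - 1] + gray[r][c + 1]
                - 4 * gray[r][c]
            )
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


def is_sharp(gray, threshold):
    """Hypothetical gate rule: keep frames scoring at or above threshold."""
    return laplacian_variance(gray) >= threshold


flat = [[100] * 5 for _ in range(5)]
checker = [[255 if (r + c) % 2 else 0 for c in range(5)] for r in range(5)]
print(is_sharp(flat, 100.0), is_sharp(checker, 100.0))
```

A higher threshold rejects more frames, so it is worth checking a few gated outputs by eye before committing to a value.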
Benchmarks
Configuration for this benchmark:
SamplerConfig(min_frame_interval_sec=1.0, keyframes_only=True, buffer_size=30, hash_size=X, queue_wait=0.1, debug=True)
Video | Total frames | Hash size | Decoded | Saved |
---|---|---|---|---|
SmolCat | 2936 | 8 | 118 | 106 |
SmolCat | - | 4 | - | 61 |
Fat Cat | 4462 | 8 | 179 | 163 |
Fat Cat | - | 4 | - | 101 |
HighLemurs | 4020 | 8 | 161 | 154 |
HighLemurs | - | 4 | - | 126 |
SamplerConfig(
    min_frame_interval_sec=1.0,
    keyframes_only=True,
    queue_wait=0.1,
    debug=False,
    print_stats=True,
    # 'type' is either 'entropy' or 'gzip', depending on the benchmark row
    buffer_config={'type': 'entropy', 'size': 30, 'debug': False, 'hash_size': 8, 'expiry': 50}
)
Video | Total frames | Type | Decoded | Saved |
---|---|---|---|---|
SmolCat | 2936 | entropy | 118 | 39 |
SmolCat | - | gzip | - | 39 |
Fat Cat | 4462 | entropy | 179 | 64 |
Fat Cat | - | gzip | - | 73 |
HighLemurs | 4020 | entropy | 161 | 59 |
HighLemurs | - | gzip | - | 63 |
Benchmark videos
Flit commands
Build
flit build
Install
flit install
Publish
Remember to bump the version in pyproject.toml before publishing.
flit publish
🛡 License
This project is licensed under the terms of the MIT license. See LICENSE for more details.
📃 Citation
@misc{video-sampler,
author = {video-sampler},
title = {Video sampler allows you to efficiently sample video frames},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/LemurPwned/video-sampler}}
}