Video Representations Extractor (VRE) for computing algorithmic or neural representations of each frame.
Video Representation Extractor
1. Description
The purpose of this repository is to export various representations starting from RGB videos only. Representations are defined as ways of 'looking at the world'. One can look at various levels of information:
- low level: colors, edges
- mid level: depth, orientation of planes (normals)
- high level: semantics and actions
For GitHub users: this is a mirror of the gitlab repository.
Supported representations
- See here for a comprehensive list, since it updates faster than this README.
The weights repository for supported pretrained neural-network-based representations is here.
2. Usage
Installation is as easy as:

```bash
pip install video-representations-extractor
```
Alternatively, you can clone this repository and add it to your paths:

```bash
git clone https://gitlab.com/meehai/video-representations-extractor [/some/dir]
# in .bashrc
export PYTHONPATH="$PYTHONPATH:/some/dir"
export PATH="$PATH:/some/dir/bin"
```
After either option, you should be able to run:

```bash
vre <path/to/video.mp4> --cfg_path <path/to/cfg> -o <path/to/export_dir>
```
The magic happens inside the config file, where we define what representations to extract and what parameters are used to instantiate said representations.
2.1 Single image usage
You can get the representations for a single image (or a directory of images) by placing your image in a standalone directory.
```bash
vre <path/to/dir_of_images> --cfg_path <path/to/cfg> -o <path/to/export_dir>
```
Note: use `--cfg_path resources/cfgs/testCfg_ootb.yaml` for 'out of the box' working representations.

Note2: prefix the command with `VRE_DEVICE=cuda` (i.e. `VRE_DEVICE=cuda vre ...`) to use cuda. For some representations, this speeds up the process by a lot.
3. CFG file
The config file will have the hyperparameters required to instantiate each supported method, as well as global hyperparameters for the output. This means that if a depth method is pre-trained for 0-300m, this information will be encoded in the CFG file and passed to the constructor of that particular depth method. There are also export-level parameters, such as the output resolution of the representations.
High level format:

```yaml
name of representation:
  type: some high level type (such as depth, semantic, edges, etc.)
  name: the implemented method's name (i.e. dexined, dpt, odoflow etc.)
  dependencies: [a list of dependencies given by their names]
  parameters: # as defined in the constructor of the implementation
    param1: value1
    param2: value2
name of representation 2:
  type: some other type
  name: some other method
  dependencies: [name of representation]
  parameters: []
```
Example cfg file: See out of the box supported representations and the CFG defined in the CI process for an actual export that is done at every commit on a real video.
Note: if the topological sort fails (because of cyclic dependencies), an error will be thrown.
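The dependency ordering above can be sketched with a simple depth-first topological sort. This is only an illustration of the idea, not VRE's actual implementation; the representation names in the example cfg dict are hypothetical:

```python
def topo_sort(cfg: dict) -> list:
    """Order representations so each one comes after all of its dependencies."""
    order, state = [], {}  # state: missing=unvisited, "visiting", "done"

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            # we re-entered a node before finishing it => a cycle exists
            raise ValueError(f"cyclic dependency involving '{name}'")
        state[name] = "visiting"
        for dep in cfg[name].get("dependencies", []):
            visit(dep)
        state[name] = "done"
        order.append(name)

    for name in cfg:
        visit(name)
    return order

# hypothetical cfg mirroring the format above
cfg = {
    "edges": {"dependencies": []},
    "normals": {"dependencies": ["depth"]},
    "depth": {"dependencies": []},
}
order = topo_sort(cfg)
assert order.index("depth") < order.index("normals")
```

A cfg with `"depth": {"dependencies": ["depth"]}` (or any longer cycle) makes `visit` re-enter a node marked "visiting" and raises, which is the error condition the note above refers to.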
4. Output format
All the outputs are stored as [0-1] float32 npz files, one for each frame, in the directory specified by `--output_dir`/`-o`. A subdirectory will be created for each representation.
For the above CFG file, 2 subdirectories will be created:
```
/path/to/output_dir/
  name of representation/
    npy/ # if export_npy is set
      1.npz, ..., N.npz
    png/ # if export_png is set
      1.png, ..., N.png
  name of representation 2/
    npy/
      1.npz, ..., N.npz
```
A `cfg.yaml` file is also created for each representation, so that we know what parameters were used for that representation.
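A downstream consumer can read the exported frames back with numpy. The sketch below simulates one exported frame and reads it back; the exact key inside the npz archive may differ between VRE versions, so it grabs the first array, and the 0-300m rescaling at the end assumes a depth method with that training range (as discussed in the CFG section):

```python
import numpy as np
import os
import tempfile

# simulate one exported frame: a [0-1] float32 map, as VRE writes per frame
frame = np.random.rand(64, 128).astype(np.float32)
out_dir = tempfile.mkdtemp()
np.savez(os.path.join(out_dir, "1.npz"), frame)

# read it back, as a downstream consumer would
npz = np.load(os.path.join(out_dir, "1.npz"))
arr = npz[npz.files[0]]  # first (usually only) array in the archive
assert arr.dtype == np.float32
assert 0.0 <= float(arr.min()) and float(arr.max()) <= 1.0

# if this representation is a depth method pre-trained for 0-300m (per its
# cfg.yaml), metric depth can be recovered by rescaling the normalized values
depth_m = arr * 300.0
```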
4.1 Collages
In `bin/` we provide a secondary tool, `vre_collage`, that takes all the png files from an output_dir as above and stacks them together in a single image. This is useful if we want to create a single image of all representations, which can later be turned into a video as well.
Usage:

```bash
vre_collage /path/to/output_dir -o /path/to/collage_dir [--overwrite] [--video] [--fps] [--output_resolution H W]
```
Note: this requires `media-processing-lib` to be installed (via pip).
Note: you can also get video from a collage dir like this (in case you forgot to set --video or want more control):
```bash
old_path=`pwd`
cd /path/to/collage_dir
ffmpeg -start_number 1 -framerate 30 -i %d.png -c:v libx264 -pix_fmt yuv420p $old_path/collage.mp4
cd -
```
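The stacking itself amounts to tiling same-sized frames into a grid. A rough numpy sketch of the idea (not `vre_collage`'s actual code; `make_collage` is a hypothetical helper):

```python
import numpy as np

def make_collage(frames: list, cols: int) -> np.ndarray:
    """Tile same-sized HxWxC frames into a grid with `cols` columns.

    Empty grid cells (when len(frames) is not a multiple of cols) stay black.
    """
    h, w, c = frames[0].shape
    rows = -(-len(frames) // cols)  # ceil division
    grid = np.zeros((rows * h, cols * w, c), dtype=frames[0].dtype)
    for i, frame in enumerate(frames):
        r, col = divmod(i, cols)
        grid[r * h:(r + 1) * h, col * w:(col + 1) * w] = frame
    return grid

# four 2x3 RGB frames tiled into a 2x2 grid -> a 4x6 RGB image
frames = [np.full((2, 3, 3), i, dtype=np.uint8) for i in range(4)]
collage = make_collage(frames, cols=2)
assert collage.shape == (4, 6, 3)
```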
5. Run in docker
- Use `meehai/vre:latest` from Docker Hub:
```bash
mkdir example
# move the cfg and the video in some local dir
gdown https://drive.google.com/uc?id=158U-W-Gal6eXxYtS1ca1DAAxHvknqwAk -O example/vid.mp4
wget https://gitlab.com/meehai/video-representations-extractor/-/raw/df15af177edf5c101bbb241428c43faac333cea4/test/end_to_end/imgur/cfg.yaml -O example/cfg.yaml
docker run \
  -v `pwd`/example:/app/resources \
  meehai/vre \
  /app/resources/vid.mp4 --cfg_path /app/resources/cfg.yaml -o /app/resources/result --start_frame 5 --end_frame 6
```