Python package for Perceptual Video Hashing (Near Duplicate Video Detection) - Get a 64-bit comparable hash-value for any video.
VideoHash
Python package for Perceptual Video Hashing
Introduction
Videohash is a Python package for perceptual video hashing (near-duplicate video detection). It generates a 64-bit comparable hash value for any input video. Identical and near-duplicate videos produce the same or similar hash values, so the hash value should stay unchanged, or change only slightly, when a video is resized (upscaled or downscaled), transcoded, slightly cropped, or has black bars added or removed.
How are the hash values calculated?
- One frame is extracted from the input video every second. Each frame is resized to a 144x144 pixel square, and all the resized square frames are embedded in a single collage. The wavelet hash of that collage is the video hash value for the original input video.
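The collage step above can be sketched in plain Python. This is only an illustration of how 144x144 tiles might be laid out: the near-square, row-by-row grid and the name `collage_layout` are assumptions of this sketch, not necessarily the arrangement the package uses internally.

```python
import math

FRAME_SIZE = 144  # side of each resized square frame, in pixels


def collage_layout(frame_count, frame_size=FRAME_SIZE):
    """Return (collage_width, collage_height, positions) for a near-square grid.

    positions[i] is the (x, y) top-left corner where frame i would be pasted.
    The grid shape is an assumption made for this sketch.
    """
    if frame_count < 1:
        raise ValueError("at least one frame is required")
    # Integer equivalent of ceil(sqrt(frame_count)), without float rounding.
    columns = math.isqrt(frame_count - 1) + 1
    rows = math.ceil(frame_count / columns)
    positions = [
        ((i % columns) * frame_size, (i // columns) * frame_size)
        for i in range(frame_count)
    ]
    return columns * frame_size, rows * frame_size, positions


# A 10-second video sampled once per second yields about 10 frames.
width, height, positions = collage_layout(10)
print(width, height)  # 576 432
print(positions[4])   # (0, 144)
```

Because every frame is reduced to the same square size before it is placed, resizing or letterboxing the source video changes the collage very little, which is what keeps the final hash stable.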
When not to use Videohash?
- Videohash cannot be used to verify whether one video is part of another video (video fingerprinting). It also doesn't produce the same or a similar hash value if the video is reversed or rotated by a significant angle (more than 10 degrees), but you can always reverse the video yourself and generate the hash value for the reversed video.
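One way to reverse a video yourself is FFmpeg's `reverse` video filter. The sketch below only builds the command line; the helper name `reverse_command` and the file names are hypothetical. Audio is dropped with `-an` on the assumption that only the frames matter for hashing. Note that the `reverse` filter buffers the whole clip in memory, so it is best suited to short videos.

```python
def reverse_command(source_path, reversed_path):
    """Build an FFmpeg command that writes a reversed copy of source_path."""
    return [
        "ffmpeg",
        "-i", source_path,   # input video
        "-vf", "reverse",    # play the video frames in reverse order
        "-an",               # drop audio; only frames matter for hashing
        reversed_path,       # output video
    ]


# Hypothetical file names, for illustration only.
command = reverse_command("input.mp4", "input_reversed.mp4")
print(" ".join(command))
# ffmpeg -i input.mp4 -vf reverse -an input_reversed.mp4
```

The command list can be passed to `subprocess.run(command, check=True)`, and the resulting file can then be hashed through the `path` argument like any other local video.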
Installation
You must have FFmpeg installed to use this software. If you don't know how to install FFmpeg, please read how to install FFmpeg.
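A quick way to confirm that FFmpeg is reachable before installing videohash is to look it up on the `PATH`. This is a minimal sketch using only the standard library; `ffmpeg_available` is a hypothetical helper, not part of the package.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the executable can be found on the PATH."""
    return shutil.which(executable) is not None


if ffmpeg_available():
    print("FFmpeg found at", shutil.which("ffmpeg"))
else:
    print("FFmpeg not found; install it before using videohash")
```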
Install videohash
- Using pip:
pip install videohash
- Install directly from GitHub:
pip install git+https://github.com/akamhy/videohash.git
Features
- Generate videohash of a video directly from its URL or its path.
- Can be used to implement scalable Near Duplicate Video Retrieval.
- Image representation of the video is accessible by the end-user.
- An instance of videohash can be compared with a stored hash (64-bit), its hex representation, and other instances of videohash.
- Faster than the primitive process of comparing all the frames one by one. Because the videohash package produces a single 64-bit hash, a lot of database space is saved, and the number of comparisons required drops significantly.
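To make the comparison concrete: two 64-bit hashes can be compared by counting the bits in which they differ (the Hamming distance). The sketch below works on stored hash strings without the package itself. The helper names, the labels in `stored`, and the `max_distance` threshold of 4 are assumptions chosen for illustration; the hash values are the ones shown in the Usage section.

```python
def to_int(value):
    """Parse a hash given as a '0b...' or '0x...' string, or pass an int through."""
    if isinstance(value, int):
        return value
    return int(value, 0)  # base 0 honours the 0b / 0x prefix


def hamming_distance(a, b):
    """Count the bit positions in which the two hashes differ."""
    return bin(to_int(a) ^ to_int(b)).count("1")


def near_duplicates(query, stored, max_distance=4):
    """Return (label, distance) pairs within max_distance, closest first.

    max_distance is a hypothetical threshold; tune it for your own data.
    """
    matches = [
        (label, hamming_distance(query, value)) for label, value in stored.items()
    ]
    return sorted(
        (match for match in matches if match[1] <= max_distance),
        key=lambda match: match[1],
    )


# Stored hex hashes; the labels are made up for this example.
stored = {
    "rocket": "0x341fefff8f780000",
    "other_video": "0x7cffff000000eff0",
}
query = "0b0011010000011111111011111111111110001111011110000000000000000000"

print(hamming_distance(stored["rocket"], stored["other_video"]))  # 34
print(near_duplicates(query, stored))  # [('rocket', 0)]
```

Each comparison is a single XOR and bit count on one 64-bit integer per video, rather than a frame-by-frame comparison.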
Usage
>>> from videohash import VideoHash
>>> hash1 = VideoHash(url="https://www.youtube.com/watch?v=PapBjpzRhnA", download_worst=False)
>>> str(hash1)
'0b0011010000011111111011111111111110001111011110000000000000000000'
>>> hash1.hash
'0b0011010000011111111011111111111110001111011110000000000000000000'
>>> hash1.hash_hex
'0x341fefff8f780000'
>>> repr(hash1)
'VideoHash(hash=0b0011010000011111111011111111111110001111011110000000000000000000, hash_hex=0x341fefff8f780000, collage_path=/tmp/tmpe07d_b1g/temp_storage_dir/acn6zsdcb40q/collage/collage.jpg, bits_in_hash=64)'
>>> hash1.collage_path
'/tmp/tmpe07d_b1g/temp_storage_dir/acn6zsdcb40q/collage/collage.jpg'
>>> hash1.bits_in_hash
64
>>> len(hash1)
66
>>> hash2 = VideoHash(url="https://raw.githubusercontent.com/akamhy/videohash/main/assets/rocket.mkv")
>>> hash2.hash
'0b0011010000011111111011111111111110001111011110000000000000000000'
>>> hash2.hash_hex
'0x341fefff8f780000'
>>> hash1.hash_hex
'0x341fefff8f780000'
>>> hash1 - hash2
0
>>> hash2 - "0x341fefff8f780000"
0
>>> hash1 - "0b0011010000011111111011111111111110001111011110000000000000000000"
0
>>> hash1 == hash2
True
>>> hash1 != hash2
False
>>> hash3 = VideoHash(path="/home/akamhy/Downloads/rocket.mkv")
>>> hash3.hash_hex
'0x341fefff8f780000'
>>> hash3.hash
'0b0011010000011111111011111111111110001111011110000000000000000000'
>>> hash3 - hash2
0
>>> hash3 == hash1
True
>>> hash3 == hash2
True
>>> hash4 = VideoHash(url="https://www.youtube.com/watch?v=_T8cn2J13-4")
>>> hash4.hash_hex
'0x7cffff000000eff0'
>>> hash4 - "0x7cffff000000eff0"
0
>>> hash4.hash
'0b0111110011111111111111110000000000000000000000001110111111110000'
>>> hash4 - "0b0111110011111111111111110000000000000000000000001110111111110000"
0
>>> hash4 == hash3
False
>>> hash4 - hash2
34
>>> hash4 != hash2
True
>>> hash4 - "0b0011010000011111111011111111111110001111011110000000000000000000"
34
>>>
Run the above code @ https://replit.com/@akamhy/videohash-usage-2xx-example-code-for-video-hashing#main.py
Wiki/Extended Usage/Docs : https://github.com/akamhy/videohash/wiki
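The binary and hex strings in the session above are two spellings of the same 64-bit value. The following sketch converts between them with the standard library only; `bin_to_hex` and `hex_to_bin` are hypothetical helpers, not functions provided by the package.

```python
def bin_to_hex(bit_string):
    """Convert a '0b...' hash string to a zero-padded '0x...' string (16 hex digits)."""
    return format(int(bit_string, 2), "#018x")


def hex_to_bin(hex_string):
    """Convert a '0x...' hash string to a zero-padded '0b...' string (64 bits)."""
    return format(int(hex_string, 16), "#066b")


bits = "0b0011010000011111111011111111111110001111011110000000000000000000"
print(bin_to_hex(bits))                         # 0x341fefff8f780000
print(hex_to_bin("0x341fefff8f780000") == bits)  # True
print(len(bits))                                 # 66
```

The length of 66 is the 64 hash bits plus the two-character `0b` prefix, which is consistent with `len(hash1)` returning 66 while `bits_in_hash` is 64.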
License
Released under the MIT License. See license for details.
Videos are from NASA and are in the public domain. NASA copyright policy states that "NASA material is not protected by copyright unless noted".
Hashes for videohash-2.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | bb77e0dccd3b8f916665acdc628d9f4ab8266617ba69b907b65834a2edf04169
MD5 | 8440b17b1fbe95aa708e680b52b16078
BLAKE2b-256 | 764b6ee736b8264edd5537ba74c74f5d17657862ebd0d693d385f346e1002800