Video deduplicator utility for Hydrus Network

Reason this release was yanked:

bad build config

Project description

Hydrus Video Deduplicator

Hydrus Video Deduplicator finds potential duplicate videos through the Hydrus API


How It Works:

The deduplicator works by comparing the similarity of videos using their perceptual hashes.
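
As a rough illustration of what comparing perceptual hashes means, here is a minimal sketch. It is not the vpdq algorithm credited under Attribution; it swaps in a much simpler average hash over a tiny grayscale frame and compares hashes by Hamming distance. The names average_hash, hamming_distance, looks_similar and the max_distance threshold are hypothetical and chosen only for this example.

```python
from typing import List

Frame = List[List[int]]  # rows of grayscale pixel values, 0-255


def average_hash(frame: Frame) -> int:
    """Pack one bit per pixel into an int: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def looks_similar(a: Frame, b: Frame, max_distance: int = 2) -> bool:
    """Treat two frames as potential duplicates if their hashes are close (assumed threshold)."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]   # same picture, uniformly brighter
different = [[200, 10], [30, 220]]  # inverted pattern

print(looks_similar(original, brighter))
print(looks_similar(original, different))
```

The point of a perceptual hash is visible here: a uniform brightness change leaves the hash unchanged, while a visually different frame produces a distant hash. A byte-for-byte file hash would treat all three frames as unrelated.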

Potential duplicates can be processed through the Hydrus duplicates processing page just like images.

To process only a subset of videos, pass Hydrus tags with --query. For example, --query="character:edward" processes only videos with the tag character:edward.
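
If you launch the deduplicator from a script, the invocation with a tag filter could be assembled as sketched below. Only the --api-key and --query options shown in this description are used; the helper name build_command is hypothetical, and the key is a placeholder you must replace.

```python
import sys
from typing import List, Optional


def build_command(api_key: str, query: Optional[str] = None) -> List[str]:
    """Build the argument list for running the deduplicator as a module."""
    cmd = [sys.executable, "-m", "hydrusvideodeduplicator", f"--api-key={api_key}"]
    if query is not None:
        # Restrict processing to videos matching this Hydrus tag.
        cmd.append(f"--query={query}")
    return cmd


cmd = build_command("<your key>", query="character:edward")
print(cmd[1:])

# To actually run it (requires the package to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with tags that contain colons or spaces.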

For more information, check out the wiki and the FAQ.


Installation:

Dependencies:

  • Python >=3.10
  • FFmpeg

python3 -m pip install hydrusvideodeduplicator
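
Before installing, the two dependencies listed above can be checked with a short script. This is a sketch under the assumption that the FFmpeg executable is named ffmpeg and is expected on your PATH; the helper names check_python and find_ffmpeg are hypothetical.

```python
import shutil
import sys
from typing import Optional, Tuple


def check_python(version: Tuple[int, ...]) -> bool:
    """Return True if the given (major, minor) version satisfies Python >=3.10."""
    return version >= (3, 10)


def find_ffmpeg() -> Optional[str]:
    """Return the path to an `ffmpeg` executable on PATH, or None if not found."""
    return shutil.which("ffmpeg")


python_ok = check_python(sys.version_info[:2])
ffmpeg_path = find_ffmpeg()

print("Python >=3.10:", "ok" if python_ok else "too old")
print("FFmpeg:", ffmpeg_path or "not found on PATH")
```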

Usage:

python3 -m hydrusvideodeduplicator --api-key="<your key>"

For the full list of options, see --help or the usage page.


TODO:

  • Option to rollback and remove potential duplicates
  • OR predicates for --query
  • Parallelize hashing and duplicate search
  • Automatically generate access key with Hydrus API
  • Docker container
  • Upload Docker container to Docker Hub (GitHub Action)
  • Pure Python port of vpdq
  • Windows compatibility without WSL or Docker

Contact:

Create an issue on GitHub for any problems/concerns. Provide as much detail as possible in your issue.

Message @applenanner on the Hydrus Discord for other general questions/concerns.


Attribution:

Hydrus Network (DWTFYWTPL)

Hydrus API Library (GNU AGPLv3) by cryzed

pdq (BSD) by Meta

vpdq (BSD) by Meta

Big Buck Bunny, Sintel (CC BY 3.0) clips by Blender Foundation


Download files

Source Distribution

hydrusvideodeduplicator-0.2.4.tar.gz (47.0 kB)

Uploaded Source

Built Distribution

hydrusvideodeduplicator-0.2.4-py3-none-any.whl (53.1 kB)

Uploaded Python 3

File details

Details for the file hydrusvideodeduplicator-0.2.4.tar.gz.

File hashes

Hashes for hydrusvideodeduplicator-0.2.4.tar.gz
Algorithm    Hash digest
SHA256       bdfe065d07d749a5d24efb96a2663ef88fdca60e05313e04c41cac588b7583d3
MD5          637cdf0e456f93d69a03067786ac4909
BLAKE2b-256  ae27d7376a5681086a40eb2f32c69146d76b359479fa01f62c2c9701f9c7eb7b
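
A downloaded file can be checked against the SHA256 digest listed above with a few lines of Python. This is a minimal sketch: the helper name sha256_of is hypothetical, and the script assumes the archive was saved under its original file name in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest copied from the table above for the source distribution.
EXPECTED = "bdfe065d07d749a5d24efb96a2663ef88fdca60e05313e04c41cac588b7583d3"
archive = Path("hydrusvideodeduplicator-0.2.4.tar.gz")  # assumed download location

if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```

The same function works for the wheel; substitute its file name and the SHA256 digest listed for it.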

File details

Details for the file hydrusvideodeduplicator-0.2.4-py3-none-any.whl.

File hashes

Hashes for hydrusvideodeduplicator-0.2.4-py3-none-any.whl
Algorithm    Hash digest
SHA256       31250e5eee497f42fdf29b8d0fd55db914261b4f0de03b9fc367a6106435d72d
MD5          8ad5a4ef401e350488a03b2a9522a578
BLAKE2b-256  f09c0532bdaf531ee99728ed135d8e0743e7d951c4a986ac0a4a4c64dd8b7e91
