Efficiently download new HIBP pwned password data by hash-prefix to maintain a local copy

hibp-downloader

This is a CLI tool that efficiently downloads a local copy of the pwned password hash data from the very awesome HIBP Pwned Passwords API endpoint. It uses multiprocessing, async processes, local caching, content ETags and HTTP/2 connection pooling to make things as fast as seems Pythonly possible.
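
For context, the Pwned Passwords range API is keyed by the first five hexadecimal characters of a password's SHA-1 hash, and each response lists the remaining 35-character suffixes with an occurrence count. The sketch below illustrates that prefix/suffix split only; the function names and the sample response body are illustrative assumptions, not part of hibp-downloader.

```python
import hashlib


def split_hash(password: str) -> tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def lookup_count(suffix: str, range_body: str) -> int:
    """Find a suffix in a body of 'SUFFIX:COUNT' lines; return 0 if it is absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_hash("password")
# Hypothetical response body for this prefix; both counts are made up for illustration.
sample_body = f"0018A45C4D1DEF81644B54AB7F969B88D65:3\n{suffix}:42\n"
print(prefix, lookup_count(suffix, sample_body))
```

With a five-character hexadecimal prefix there are 16^5 = 1,048,576 possible prefix blocks, which is why the download is naturally split into many small, independently cacheable requests.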

Features

  • Download a hash-prefix content block only when its content has changed, as determined by content ETag values.
  • Start, stop and re-start the data-collection process without loss of data already collected.
  • Query clear-text values and return results from the pwned password data set.
  • Generate a single text file with pwned password hash values in-order, similar to PwnedPasswordsDownloader from the HIBP team.
  • Per-prefix file metadata in JSON format for easy data reuse.
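
The ETag-based skip in the first feature can be pictured as a conditional HTTP request: the client sends the stored ETag in an `If-None-Match` header and treats a `304 Not Modified` reply as "the cached block is current". The sketch below shows that decision logic without any network access. The helper names and metadata fields are hypothetical and do not describe hibp-downloader's actual metadata layout or internals.

```python
from typing import Optional


def conditional_headers(cached_etag: Optional[str]) -> dict:
    """Send the stored ETag, if any, so the server can answer 304 Not Modified."""
    return {"If-None-Match": cached_etag} if cached_etag else {}


def next_metadata(meta: dict, status: int, etag: Optional[str] = None) -> dict:
    """Decide what to record for a prefix after a response (field names are hypothetical)."""
    if status == 304:
        # Content unchanged: keep the cached block and its ETag.
        return {**meta, "changed": False}
    if status == 200:
        # New or changed content: the caller would rewrite the block and store the new ETag.
        return {**meta, "etag": etag, "changed": True}
    raise ValueError(f"unexpected HTTP status {status}")


meta = {"prefix": "00000", "etag": 'W/"abc123"'}  # made-up metadata for illustration
print(conditional_headers(meta["etag"]))
print(next_metadata(meta, 304))
print(next_metadata(meta, 200, etag='W/"def456"'))
```

Keeping the ETag alongside each prefix block is also what makes stop-and-restart cheap in this model: a restarted run can re-validate a block with a small conditional request instead of downloading it again.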

Install

pip install --upgrade hibp-downloader

Usage

[Screenshot: command help output (screenshot-help.png)]

Performance

Sample download activity log from a host with 12 cores on a 45Mbit/s DSL connection.

2023-07-31T03:22:45+1000 | INFO | hibp-downloader | prefix=e585f source=[lc:265201 et:0 rc:722148 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~71005H/s] runtime=2.33hr download=11748.0MB
2023-07-31T03:22:48+1000 | INFO | hibp-downloader | prefix=e5877 source=[lc:265201 et:0 rc:722268 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~70998H/s] runtime=2.33hr download=11750.0MB
2023-07-31T03:22:50+1000 | INFO | hibp-downloader | prefix=f5837 source=[lc:265201 et:0 rc:722388 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~70992H/s] runtime=2.33hr download=11751.9MB
  • 86 requests per second to api.pwnedpasswords.com
  • 265,201 prefix files from (lc) local-cache; 722,388 from (rc) remote-cache; 3 from (ro) remote-origin; 0 failed (xx) download
  • estimated ~70k hash values downloaded per second
  • 11.5GB (11,751MB) downloaded in 2.3 hours (the full dataset takes ~3.5 hours)
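
As a cross-check, the rates in the last log line above can be recomputed from its own counters. This sketch assumes that the request rate counts only remote fetches (`rc` plus `ro`) and that `MB` means 10^6 bytes; both are interpretations of the log format, not documented behaviour.

```python
import re

LOG_LINE = (
    "2023-07-31T03:22:50+1000 | INFO | hibp-downloader | prefix=f5837 "
    "source=[lc:265201 et:0 rc:722388 ro:3 xx:0] "
    "runtime_rate=[11.2MBit/s 86req/s ~70992H/s] runtime=2.33hr download=11751.9MB"
)

# Parse the key:value counters inside source=[...].
source = re.search(r"source=\[([^\]]*)\]", LOG_LINE).group(1)
counts = {key: int(value) for key, value in (item.split(":") for item in source.split())}

# Convert the reported runtime to seconds and read the downloaded volume.
runtime_s = float(re.search(r"runtime=([\d.]+)hr", LOG_LINE).group(1)) * 3600
download_mb = float(re.search(r"download=([\d.]+)MB", LOG_LINE).group(1))

# Assumption: only remote-cache and remote-origin fetches count as requests.
req_per_s = (counts["rc"] + counts["ro"]) / runtime_s
mbit_per_s = download_mb * 8 / runtime_s

print(f"{req_per_s:.1f} req/s, {mbit_per_s:.1f} Mbit/s")  # 86.1 req/s, 11.2 Mbit/s
```

Under those assumptions the recomputed values agree with the logged `86req/s` and `11.2MBit/s`.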

Project

Copyright

All rights reserved.

License

  • BSD-3-Clause - see LICENSE file for details.
