Efficiently download new HIBP pwned password data by hash-prefix to maintain a local copy

Project description

hibp-downloader

This is a CLI tool that efficiently downloads a local copy of the pwned password hash data from the very awesome HIBP pwned passwords API endpoint. It uses multiprocessing, async processes, local caching, content ETags and HTTP/2 connection pooling to make things as fast as (seemingly) Pythonly possible.
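
As a rough illustration of the ETag idea, the sketch below builds a conditional request for one hash-prefix block and decides what to keep from the response. It is not taken from the hibp-downloader source: `CachedPrefix`, `build_request` and `handle_response` are hypothetical names, and the only external facts assumed are the public range URL and the standard HTTP `If-None-Match` / `304 Not Modified` behaviour.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Public HIBP range endpoint; the prefix is the first 5 hex characters of a SHA-1 hash.
RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


@dataclass
class CachedPrefix:
    """Hypothetical per-prefix metadata remembered from an earlier run."""
    prefix: str
    etag: Optional[str] = None


def build_request(cached: CachedPrefix) -> Tuple[str, dict]:
    """Return (url, headers) for a conditional GET of one prefix block."""
    headers = {}
    if cached.etag:
        # Ask the server to send the body only if the content has changed.
        headers["If-None-Match"] = cached.etag
    return RANGE_URL.format(prefix=cached.prefix), headers


def handle_response(cached: CachedPrefix, status: int,
                    etag: Optional[str], body: bytes):
    """Return (metadata, new_body); new_body is None when nothing changed."""
    if status == 304:
        return cached, None  # unchanged: keep the local copy as-is
    if status == 200:
        return CachedPrefix(cached.prefix, etag), body  # changed: store new data
    raise RuntimeError(f"unexpected status {status} for prefix {cached.prefix}")
```

Any HTTP client can be plugged in between the two functions; the point is only that an unchanged block costs a small 304 response rather than a full download.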

Features

  • Only download a hash-prefix content block when its content has changed (detected via content ETag values).
  • Start, stop and restart the data-collection process without losing data already collected.
  • Query clear-text values against the pwned password data set and return the results.
  • Generate a single text file with pwned password hash values in-order, similar to PwnedPasswordsDownloader from the HIBP team.
  • Per-prefix file metadata in JSON format for easy data reuse.
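
To make the clear-text query feature concrete, here is a minimal sketch of how a lookup by hash-prefix can work. It assumes the block text uses the `SUFFIX:COUNT` line format returned by the HIBP range endpoint; how hibp-downloader actually stores and queries its local files is not shown here, and `split_hash` and `lookup_count` are hypothetical helper names.

```python
import hashlib


def split_hash(clear_text: str):
    """Return (prefix, suffix): the first 5 and remaining 35 hex chars of the SHA-1."""
    digest = hashlib.sha1(clear_text.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def lookup_count(clear_text: str, range_text: str) -> int:
    """Return how often clear_text appears, given the block text for its prefix.

    range_text is assumed to hold one "SUFFIX:COUNT" entry per line.
    """
    _, suffix = split_hash(clear_text)
    for line in range_text.splitlines():
        entry_suffix, _, count = line.strip().partition(":")
        if entry_suffix.upper() == suffix:
            return int(count)
    return 0  # not present in this prefix block
```

Only the block for the value's own 5-character prefix needs to be read, which is what makes a per-prefix local copy convenient to query.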

Install

pip install --upgrade hibp-downloader

Usage

screenshot-help.png

Performance

Sample download activity log from a 12-core host on a 45Mbit/s DSL connection.

2023-07-31T03:22:45+1000 | INFO | hibp-downloader | prefix=e585f source=[lc:265201 et:0 rc:722148 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~71005H/s] runtime=2.33hr download=11748.0MB
2023-07-31T03:22:48+1000 | INFO | hibp-downloader | prefix=e5877 source=[lc:265201 et:0 rc:722268 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~70998H/s] runtime=2.33hr download=11750.0MB
2023-07-31T03:22:50+1000 | INFO | hibp-downloader | prefix=f5837 source=[lc:265201 et:0 rc:722388 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~70992H/s] runtime=2.33hr download=11751.9MB
  • 86 requests per second to api.pwnedpasswords.com
  • 265,201 prefix files from local-cache (lc); 722,388 from remote-cache (rc); 3 from remote-origin (ro); 0 failed downloads (xx)
  • estimated ~70k hash values downloaded per second
  • 11.5GB (11,751MB) downloaded in 2.3 hours (a full dataset download takes ~3.5 hours)
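
The summary figures can be cross-checked against the last log line with a little arithmetic. The sketch below parses that line and recomputes the rates; the regular expressions and the function names `parse_sources` and `summarize` are illustrative only, and it assumes that request rate is derived from the remote counters (`rc` plus `ro`) and that `et` counts prefixes skipped on an ETag match, which the log itself does not state.

```python
import re

LOG_LINE = (
    "2023-07-31T03:22:50+1000 | INFO | hibp-downloader | prefix=f5837 "
    "source=[lc:265201 et:0 rc:722388 ro:3 xx:0] "
    "runtime_rate=[11.2MBit/s 86req/s ~70992H/s] runtime=2.33hr download=11751.9MB"
)


def parse_sources(line: str) -> dict:
    """Return the source counters (lc, et, rc, ro, xx) as a dict of ints."""
    block = re.search(r"source=\[([^\]]*)\]", line).group(1)
    return {key: int(value)
            for key, value in (item.split(":") for item in block.split())}


def summarize(line: str) -> dict:
    """Recompute throughput figures from the raw counters in one log line."""
    sources = parse_sources(line)
    runtime_s = float(re.search(r"runtime=([\d.]+)hr", line).group(1)) * 3600
    download_mb = float(re.search(r"download=([\d.]+)MB", line).group(1))
    remote_requests = sources["rc"] + sources["ro"]  # assumption: only these hit the API
    return {
        "mbit_per_s": download_mb * 8 / runtime_s,
        "req_per_s": remote_requests / runtime_s,
        # assumption: every counter except xx is a successfully handled prefix
        "prefixes_handled": sum(v for k, v in sources.items() if k != "xx"),
    }


summary = summarize(LOG_LINE)
```

Under these assumptions the recomputed values round to the 11.2MBit/s and 86req/s shown in the log, with 987,592 prefix files handled at that point in the run.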

Project

Copyright

All rights reserved.

License

  • BSD-3-Clause - see LICENSE file for details.

Download files

Source Distribution

hibp_downloader-0.1.5.tar.gz (18.9 kB)

Built Distribution

hibp_downloader-0.1.5-py3-none-any.whl (24.0 kB)
