d2-hasher

Multilayer SHA-512 hashing utility for PII columns in CSV/TXT files.

Installation

pip install d2-hasher

Quick Start

Python API

from d2_hasher import hash_columns

# Hash columns in a CSV file
hash_columns(
    input_file="data.csv",
    columns=["national_id", "name"],
    secret_salts=["your_secret_salt_1", "your_secret_salt_2"],
)
# Output written to data_hashed.csv

# Hash a DataFrame in-memory
import pandas as pd
df = pd.read_csv("data.csv")
result = hash_columns(
    df=df,
    columns=["national_id", "name"],
    secret_salts=["your_secret_salt_1", "your_secret_salt_2"],
)

Command Line

d2-hasher \
  --input-file data.csv \
  --columns national_id name \
  --secret-salts "your_secret_salt_1" "your_secret_salt_2" \
  --output data_hashed.csv \
  --chunksize 50000

hash_columns() Parameters

Parameter     Type       Required  Default            Description
columns       list[str]  Yes       -                  Column names to hash
secret_salts  list[str]  Yes       -                  Secret salts applied in order
input_file    str        No*       -                  Path to CSV/TXT file
df            DataFrame  No*       -                  In-memory DataFrame
output        str        No        <stem>_hashed.csv  Output file path
chunksize     int        No        10000              Rows per chunk
delimiter     str        No        auto-detect        Column delimiter

* At least one of input_file or df must be provided.
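
The two rules above (a default output name of <stem>_hashed.csv, and at least one of input_file or df) can be sketched with the standard library. This is only an illustration: default_output_path and check_source are hypothetical helper names, not part of d2_hasher, and keeping the output next to the input file is an assumption.

```python
from pathlib import Path


def default_output_path(input_file: str) -> str:
    """Derive '<stem>_hashed.csv' from the input path (hypothetical helper).

    Assumption: the output is placed in the same directory as the input.
    """
    path = Path(input_file)
    return str(path.with_name(f"{path.stem}_hashed.csv"))


def check_source(input_file=None, df=None) -> None:
    """Reject a call that supplies neither input_file nor df (hypothetical helper)."""
    if input_file is None and df is None:
        raise ValueError("At least one of input_file or df must be provided")


print(default_output_path("data.csv"))  # data_hashed.csv
```

A .txt input maps to a .csv output under this rule, which matches the fixed _hashed.csv suffix in the table.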

Algorithm

For each value:

  1. Normalize: a CID must be exactly 13 consecutive Arabic digits (0-9), with no hyphens (-), spaces, or other special characters.
  2. For each secret salt, compute SHA-512(value + salt), which yields a 128-character hex string.
  3. Feed the output of each round into the next round as its value, applying the salts in the order given.

Null/NaN values are returned as None without hashing.
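
The steps above can be sketched with hashlib. This is a minimal illustration, not the package's own code: the names is_valid_cid and multilayer_sha512 are hypothetical, and UTF-8 encoding, plain string concatenation of value and salt, and str() conversion of non-string values are all assumptions.

```python
import hashlib
import re

# Exactly 13 ASCII digits; [0-9] is used because \d also matches other Unicode digits.
CID_PATTERN = re.compile(r"[0-9]{13}")


def is_valid_cid(value: str) -> bool:
    """Return True if value is 13 consecutive digits with nothing else."""
    return CID_PATTERN.fullmatch(value) is not None


def multilayer_sha512(value, salts):
    """Chain one SHA-512 round per salt; each round hashes the previous hex digest.

    Assumptions: UTF-8 encoding and value + salt string concatenation.
    """
    # Null and NaN inputs pass through as None without hashing.
    if value is None or (isinstance(value, float) and value != value):
        return None
    if not salts:
        raise ValueError("at least one salt is required")
    digest = str(value)
    for salt in salts:
        digest = hashlib.sha512((digest + salt).encode("utf-8")).hexdigest()
    return digest


hashed = multilayer_sha512("1234567890123", ["your_secret_salt_1", "your_secret_salt_2"])
print(len(hashed))  # 128
```

Because each round consumes the previous digest, the same salts in a different order give a different final hash.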

CLI Reference

d2-hasher --input-file FILE --columns COL [COL ...] --secret-salts SALT [SALT ...]
          [--output FILE] [--chunksize N] [--delimiter CHAR]
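
For readers who want to see how this synopsis maps to parsed values, here is an argparse sketch that accepts the same flags. It is an assumption about the shape of the interface, not the package's actual parser; build_parser is a hypothetical name, and the chunksize default is taken from the parameter table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="d2-hasher")
    parser.add_argument("--input-file", required=True, metavar="FILE")
    # nargs="+" accepts one or more space-separated values per flag.
    parser.add_argument("--columns", required=True, nargs="+", metavar="COL")
    parser.add_argument("--secret-salts", required=True, nargs="+", metavar="SALT")
    parser.add_argument("--output", default=None, metavar="FILE")
    parser.add_argument("--chunksize", type=int, default=10000, metavar="N")
    parser.add_argument("--delimiter", default=None, metavar="CHAR")
    return parser


args = build_parser().parse_args([
    "--input-file", "data.csv",
    "--columns", "national_id", "name",
    "--secret-salts", "your_secret_salt_1", "your_secret_salt_2",
    "--chunksize", "50000",
])
print(args.columns)    # ['national_id', 'name']
print(args.chunksize)  # 50000
```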

License

MIT

Download files

Source Distribution

d2_hasher-1.0.4.tar.gz (7.5 kB)

Uploaded Source

Built Distribution

d2_hasher-1.0.4-py3-none-any.whl (6.3 kB)

Uploaded Python 3

File details

Details for the file d2_hasher-1.0.4.tar.gz.

File metadata

  • Download URL: d2_hasher-1.0.4.tar.gz
  • Upload date:
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for d2_hasher-1.0.4.tar.gz
Algorithm Hash digest
SHA256 56b43d7889f8d5ab5160582bcc7bbd2e9bfaef3d6b5e26607a75d4ccfacd83ca
MD5 a4b620e0f9bd17e28e26175c0aac9b9e
BLAKE2b-256 0708d2de389551e0b637b800e02c8f557391abaf3769bc2574c5dd6896b15b66

File details

Details for the file d2_hasher-1.0.4-py3-none-any.whl.

File metadata

  • Download URL: d2_hasher-1.0.4-py3-none-any.whl
  • Upload date:
  • Size: 6.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for d2_hasher-1.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 2f00ca2e480208396ddbbf4bdd5cf7a5cf5fa558e64a5564be66d3f0ada05149
MD5 1d6b01eace5824b2f468ec5587e2dc3a
BLAKE2b-256 11e77bcb211b0ffc47d04a3f44bc1c1a6f9198449e908dbecb6c1c82c4bc727c
