Securely hash CSV data with HMAC keyed authentication.
hmac4csv: conveniently apply HMAC to hash CSV files
This package uses Python's built-in hmac module (HMAC: hash-based message authentication code) to turn plain-text CSV data into hashes, combining a secret key with the SHA-256 hashing algorithm.
The hmac4csv utility provided is a one-line executable to securely hash entire data files.
    $ cat ./sample.csv
    ID,SSN,FNAME,MNAME,LNAME
    555555,123456789,John,Deer,Doe
    10101010,111111111,Jane,Cat,Coe
    346712,999999999,Cynthia,Mouse,Moe
    987654,444444444,David,Fish,Foe
    23232323,888888888,Susan,Duck,Hoe

    $ hmac4csv sample.csv --key="a-key-shared-only-with-those-who-need-it-for-hashing" --exclude=ID
    1 files to hash:
    sample.csv --> hashed/sample.csv
    1 hashed files written.

    $ cat ./hashed/sample.csv
    ID,SSN,FNAME,MNAME,LNAME
    555555,4714786a9f0ab7ef3ac6bc6faa3ab41b7ab306c386a92bd6adb4d1c24618ef0a,71049a572a1e480fbaea8e1995da39aad40ad6b81f8c0745067da4f19f30b1b1,4192cf14e31eedc899dfad95f37ac5ebbbc4e39cd3927a543bca0ea3cb162010,a59c25e6feac6e8e43ca240c39d9d0ad26c09b4167dadac17c93948df83d5709
    10101010,58288e804cbf742f6c3d1fdd2df8e86e1434c9eb86f94961692c6dc7fcf589a5,9ff5bf6df33b322fb13b3029918b0d6ab837ff383ca55eb26761e96f8500a90b,d5faef47cf8ac6d4e8adb78e615501ef7d03f21f87935a42b245eda3197e43b5,be6bf5f565e94095d6094595cc96fdce80ab865966fa231ad5e708ad83f21abe
    346712,71237681882d923c87358e16caafe1a98842e8aab1c9d7c30aa275afad2391a9,4f4e1ec36d273876665ac75003c153a3e010db9f3761fed7d0f04314b55ab61f,2daaab4a7ae793fd1c99a3636adc1d5d529e65a6021baeec9a2d1cff7e409550,aa6a549b8dd83563be4ef2b5de4088b2e5f6ca28648dd9b58a425eec290054ff
    987654,02578a90ca1c30d130658582c2cb9c0c22817539d382ac21a35f6f3e8b2cb98e,ddaa6b44cf6d157c2a18438e50f7b9673d00073ad993553c28c3cf4b6c39dce6,a6be494f3e918808893e25323fecf765be085d5c98e414fdcd49e925ffc7f96b,91454db4e8550d1dc33113d99464d903d7ab47cde10ee82500f5d7528756b232
    23232323,dcebe57e6ca62f10cbdd80957df0fcc2e16b7ee28c5938cfa8ffbd5864aac119,14c7b04b45c5cbe49f9bdbfc56580a1bf5f4c43ada2a3cdce5b66cdaf33d16d2,c73fda97afc3738bf000925ca18b81144db34c9c0a5eb6da4ba411fafdddb100,0562a66686e6df6cfd08732c49ac080a327eaade6b36b38ca318fdb844f9f41b
Running hmac4csv later (or any HMAC-SHA-256 implementation) using the same secret key will yield identical hashes.
Without knowing the secret key, it is functionally impossible to directly encode or decode these values.
0. Get some data in CSV
Presumably, if you didn't have data, you wouldn't be looking at this package.
1. Create the secret key
hmac4csv is designed to allow the user to pass a text passphrase as a key.
hmac4csv will automatically convert any ordinary string into the byte-based key
for HMAC. (Specifically, it will first encode the string as UTF-8
and then hash it with SHA-256.)
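The passphrase handling described above can be sketched with the standard library alone. This is a minimal illustration, not the package's own code: the function names key_from_passphrase and hash_value are hypothetical, and encoding each cell as UTF-8 before hashing is an assumption.

```python
import hashlib
import hmac


def key_from_passphrase(passphrase: str) -> bytes:
    """Encode the passphrase as UTF-8, then hash it with SHA-256 to get 32 key bytes."""
    return hashlib.sha256(passphrase.encode("utf-8")).digest()


def hash_value(key: bytes, value: str) -> str:
    """Return the HMAC-SHA-256 of one value as a 64-character hex string."""
    # Assumption: the value is encoded as UTF-8 before hashing.
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


key = key_from_passphrase("how to recognise different trees from quite a long way away")
digest = hash_value(key, "123456789")
print(len(key), len(digest))  # 32 64
```

The same passphrase always produces the same 32-byte key, which is what lets two parties produce matching hashes independently.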
If the key is being shared across parties, a lengthy passphrase can be useful:

    $ cat secret.ini
    key = how to recognise different trees from quite a long way away
If you'd prefer to separately create the key, pass the hexadecimal representation.
hmac4csv will decode the hex into bytes to use directly as the HMAC key.
For example, you might prefer a different hash function, or you might want to
generate a completely random key separately.
    $ cat secret.ini
    key = 6083717701d662b94314ea9de278224c2800e854deeef1091a9b99734f46d7f7
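One way to generate a completely random key separately is Python's secrets module. The sketch below is only a suggestion: the 32-byte length is an assumption chosen to match the 64 hex characters in the example above.

```python
import secrets

# 32 random bytes from the operating system's secure source, shown as 64 hex characters.
key_hex = secrets.token_hex(32)

# Decoding the hex gives back the raw bytes that would serve as the HMAC key.
key_bytes = bytes.fromhex(key_hex)

print(f"key = {key_hex}")
```

The printed line has the same shape as the key = line shown in secret.ini above.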
2. Store the secret key
With the hmac4csv command line utility, the key may be specified in three ways.
hmac4csv will search for the key in this order:
- The --key option on the command line.
- An environment variable.
- secret.ini in the current directory. hmac4csv will look for a key= line in the first section.
- A different file may be specified using the
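The search order above could be implemented roughly as follows. This is a sketch under stated assumptions, not the utility's actual logic: the function find_key and the environment variable name EXAMPLE_HMAC_KEY are hypothetical placeholders, and the ini file is assumed to begin with a section header so that configparser can read it.

```python
import configparser
import os
from typing import Optional

# Hypothetical name for illustration only; it is not the variable hmac4csv reads.
ENV_VAR = "EXAMPLE_HMAC_KEY"


def find_key(cli_key: Optional[str] = None, ini_path: str = "secret.ini") -> Optional[str]:
    """Return the first key found: command line, then environment, then ini file."""
    # 1. A value given on the command line wins.
    if cli_key:
        return cli_key
    # 2. Next, the environment variable.
    env_key = os.environ.get(ENV_VAR)
    if env_key:
        return env_key
    # 3. Finally, the key= line in the first section of the ini file.
    #    Assumption: the file starts with a section header such as [hmac].
    parser = configparser.ConfigParser(interpolation=None)
    if parser.read(ini_path) and parser.sections():
        return parser[parser.sections()[0]].get("key")
    return None
```

Returning None when nothing is found leaves it to the caller to decide whether to stop with an error.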
3. Consider excluding columns
It's often useful to preserve one or a few columns. In the first example above, we preserved the ID column in its original form. That would let us link back to any other information from our data source.
Using the --exclude option, pass a comma-separated list of columns that will
not be altered in the hashed CSV.
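To make the effect of excluding columns concrete, here is a minimal sketch of hashing every cell except those in named columns. The function hash_csv is hypothetical and only mirrors the behaviour visible in the sample output (header row kept, excluded columns passed through); encoding cells as UTF-8 is an assumption.

```python
import csv
import hashlib
import hmac
import io


def hash_csv(text: str, key: bytes, exclude=()) -> str:
    """Hash every cell except the header row and the excluded columns."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    header = next(reader)
    writer.writerow(header)  # column names stay readable, as in the sample output

    # Positions of the columns to pass through unchanged.
    keep = {i for i, name in enumerate(header) if name in exclude}

    for row in reader:
        writer.writerow([
            cell if i in keep
            else hmac.new(key, cell.encode("utf-8"), hashlib.sha256).hexdigest()
            for i, cell in enumerate(row)
        ])
    return out.getvalue()


sample = "ID,SSN\n555555,123456789\n"
key = hashlib.sha256("example passphrase".encode("utf-8")).digest()
print(hash_csv(sample, key, exclude=["ID"]))
```

The ID value comes through as 555555, while the SSN is replaced by a 64-character hex digest.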
4. Consider a dry run
With the --dry-run (or simply -n) option, the list of CSVs to be hashed
and their output filenames will be listed, but no files will be written.
Files that would be overwritten on a full hmac4csv run are noted.
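The idea behind a dry run can be sketched as a planning step that computes output paths and checks for existing files without writing anything. The function plan_outputs is hypothetical, and placing results in a hashed subdirectory beside each input is an assumption based on the example at the top.

```python
from pathlib import Path


def plan_outputs(inputs, out_dir="hashed"):
    """Return (source, target, would_overwrite) triples without writing anything."""
    plan = []
    for src in map(Path, inputs):
        # Assumption: each output goes into a subdirectory next to its input.
        target = src.parent / out_dir / src.name
        plan.append((src, target, target.exists()))
    return plan


for src, target, clobbers in plan_outputs(["sample.csv"]):
    note = " (would overwrite)" if clobbers else ""
    print(f"{src} --> {target}{note}")
```

Because the plan is built before any file is opened for writing, the same list can drive both the dry run and the full run.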
You've created your key and considered whether to keep any columns as they are.
You've done a dry run to confirm that it's collecting and translating the files you expect.
You're ready for a full run of
hmac4csv to get that information securely hashed.
Bear in mind that this is a hashing algorithm. It is useful because we can compare exact values without observing any raw values.
Without the key, you can learn nothing from the hashed data and you cannot (reasonably) create comparable hashes.
Even with the key, the original content cannot (reasonably) be recovered directly from the hashed value. This is hashing, not encryption.
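A small sketch of why exact comparison still works on hashed values: two data holders who share a key can find overlapping records, while hashes made with a different key do not line up. The dataset names and passphrases below are made up for illustration, and the hashed helper is hypothetical.

```python
import hashlib
import hmac


def hashed(key: bytes, value: str) -> str:
    """HMAC-SHA-256 of a value, as hex."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


shared_key = hashlib.sha256("a shared passphrase".encode("utf-8")).digest()
other_key = hashlib.sha256("a different passphrase".encode("utf-8")).digest()

# Two parties hash their own identifiers with the shared key.
clinic = {hashed(shared_key, ssn) for ssn in ["123456789", "111111111"]}
registry = {hashed(shared_key, ssn) for ssn in ["111111111", "999999999"]}
print(len(clinic & registry))  # 1: one identifier appears in both sets

# Someone without the shared key cannot produce comparable hashes.
outsider = {hashed(other_key, ssn) for ssn in ["111111111"]}
print(len(clinic & outsider))  # 0
```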
If you'd like to understand more about HMAC, head over to Wikipedia:
In cryptography, an HMAC (sometimes expanded as either keyed-hash message authentication code or hash-based message authentication code) is a specific type of message authentication code (MAC) involving a cryptographic hash function and a secret cryptographic key. It may be used to simultaneously verify both the data integrity and the authenticity of a message, as with any MAC. Any cryptographic hash function, such as SHA-256 or SHA-3, may be used in the calculation of an HMAC; the resulting MAC algorithm is termed HMAC-X, where X is the hash function used (e.g. HMAC-SHA256 or HMAC-SHA3). The cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, the size of its hash output, and the size and quality of the key.
I also found the following helpful in learning about HMAC: