Implementation of ff3 based on python-fpe using cryptography instead of pycryptodome.
python-fpe-cryptography
Provides format-preserving encryption (FPE) using cryptography instead of pycryptodome, so that it runs cleanly in Databricks SQL as Unity Catalog (UC) functions. It is based on https://github.com/mysto/python-fpe and ported to use AES in ECB mode from cryptography (https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#ECB). To learn more about FPE, see https://github.com/mysto/python-fpe?tab=readme-ov-file#the-ff3-algorithm.
Example here by colleague Andrew Weaver: https://github.com/andyweaves/databricks-notebooks/blob/main/notebooks/privacy/format_preserving_encryption.py
This package is a port of that example into a Python UDF. You can find the ported code in fpe.py.
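To make "format preserving" concrete, here is a toy sketch using only the standard library. It is NOT FF3, it is not the code in fpe.py, and it is not secure; the function names toy_fpe and _round_value are made up for illustration. It only shows the general idea: a keyed Feistel permutation over digit strings whose output has the same length and character set as the input.

```python
import hashlib
import hmac


def _round_value(key: bytes, round_no: int, half: str, modulus: int) -> int:
    # Toy round function (an assumption, not FF3): HMAC-SHA256 over the
    # round number and one half, reduced into the range of the other half.
    digest = hmac.new(key, f"{round_no}:{half}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def toy_fpe(digits: str, key: bytes, decrypt: bool = False, rounds: int = 8) -> str:
    """Toy Feistel permutation over decimal strings. NOT FF3 and NOT secure."""
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected at least two ASCII digits")
    u = len(digits) // 2          # length of the left half
    v = len(digits) - u           # length of the right half
    a, b = digits[:u], digits[u:]
    order = range(rounds - 1, -1, -1) if decrypt else range(rounds)
    for i in order:
        m = u if i % 2 == 0 else v    # width of the half rewritten this round
        modulus = 10 ** m
        if not decrypt:
            c = (int(a) + _round_value(key, i, b, modulus)) % modulus
            a, b = b, str(c).zfill(m)
        else:
            c = (int(b) - _round_value(key, i, a, modulus)) % modulus
            a, b = str(c).zfill(m), a
    return a + b
```

The result of toy_fpe("1234567890", key) is again a 10-digit string, and calling it with decrypt=True and the same key returns the original value. The real FF3 algorithm adds a tweak, AES-based round functions, and a defined radix handling on top of this basic structure.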
Install in notebook or python
The library is pinned to cryptography>=43.0.1,<44.0.0. Databricks may ship a different version of cryptography, but the library is very likely to work with it. Please let us know if you see any issues.
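If you want to check whether the cryptography version in your environment falls inside the pinned range, a minimal sketch like the following may help. The helper names parse_release, in_pinned_range and installed_cryptography_version are hypothetical, and the parsing is deliberately simple (it only looks at the first three numeric components).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_text: str) -> tuple:
    # Keep only the leading digits of the first three dot-separated parts.
    numbers = []
    for piece in version_text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def in_pinned_range(version_text: str) -> bool:
    # Mirrors the pin stated above: >=43.0.1,<44.0.0
    return (43, 0, 1) <= parse_release(version_text) < (44, 0, 0)


def installed_cryptography_version():
    # Returns None when cryptography is not installed at all.
    try:
        return version("cryptography")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_cryptography_version()
    if found is None:
        print("cryptography is not installed")
    else:
        print(found, "inside pinned range:", in_pinned_range(found))
```

A version outside the range is not necessarily a problem, as noted above; the check only tells you whether you are on the combination the package declares.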
pip install ff3-cryptography
Using FPE as a python library with cryptography
Using the FPE method
import secrets
from ff3_cryptography.fpe import crypto_fpe_encrypt, crypto_fpe_decrypt
key = secrets.token_bytes(32).hex()
tweak = secrets.token_bytes(7).hex()
plaintext = '1234567890'
# These functions choose the radix for you, use reasonable character sets, and handle special characters.
ciphertext = crypto_fpe_encrypt(key=key, tweak=tweak, input_text=plaintext)
decrypted = crypto_fpe_decrypt(key=key, tweak=tweak, input_text=ciphertext)
assert ciphertext != plaintext, "Encryption failed"
assert plaintext == decrypted, "Decryption failed"
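The comment in the example above mentions that special characters are handled for you. One common way to do that, sketched here as an assumption and not as a description of what ff3_cryptography does internally, is to pull out only the characters that belong to the alphabet, transform them, and put the result back into the original layout. The function name transform_preserving_layout is hypothetical, and the transform argument stands in for an encrypt or decrypt call.

```python
import string
from typing import Callable


def transform_preserving_layout(
    text: str, alphabet: str, transform: Callable[[str], str]
) -> str:
    """Apply transform to the in-alphabet characters and keep all others in place."""
    members = set(alphabet)
    payload = "".join(ch for ch in text if ch in members)
    converted = transform(payload)
    if len(converted) != len(payload):
        raise ValueError("transform must preserve length")
    replacement = iter(converted)
    # Walk the original text: swap in transformed characters, copy the rest.
    return "".join(next(replacement) if ch in members else ch for ch in text)


if __name__ == "__main__":
    # String reversal is used as a stand-in for a length-preserving cipher.
    print(transform_preserving_layout("123-45-6789", string.digits, lambda s: s[::-1]))
```

With string reversal as the stand-in transform, "123-45-6789" becomes "987-65-4321": the dashes stay where they were and only the digits change.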
Using the raw cipher object (not recommended unless you know what you are doing)
Keep in mind that you need to set the radix to match the data you are encrypting; different character sets need different radix values. I recommend using the solution provided by Andrew Weaver in the example linked above.
import secrets
from ff3_cryptography.algo import FF3Cipher
key = secrets.token_bytes(32).hex()
tweak = secrets.token_bytes(7).hex()
plaintext = '1234567890'
ff3 = FF3Cipher(key, tweak, radix=10)
ciphertext = ff3.encrypt(plaintext)
decrypted = ff3.decrypt(ciphertext)
assert ciphertext != plaintext, "Encryption failed"
assert plaintext == decrypted, "Decryption failed"
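When picking a radix for the raw cipher, it can help to know which input lengths are usable for it. The sketch below is based on the FF3-1 limits as I understand them from the upstream python-fpe project (a minimum domain size of 1,000,000 and a maximum length of 2 * floor(log_radix(2^96))); treat both constants and the helper names as assumptions and check them against the library before relying on them.

```python
DOMAIN_MIN = 1_000_000   # assumed minimum domain size (radix ** min_len)
MAX_HALF_BITS = 96       # assumed bound on each half of the input


def radix_for_alphabet(alphabet: str) -> int:
    # The radix is simply the number of distinct symbols in the alphabet.
    if len(set(alphabet)) != len(alphabet):
        raise ValueError("alphabet contains duplicate characters")
    return len(alphabet)


def ff3_length_bounds(radix: int) -> tuple:
    """Return (min_len, max_len) for a radix, using integer arithmetic only."""
    if radix < 2:
        raise ValueError("radix must be at least 2")
    min_len = 1
    while radix ** min_len < DOMAIN_MIN:
        min_len += 1
    half = 0
    while radix ** (half + 1) <= 2 ** MAX_HALF_BITS:
        half += 1
    return min_len, 2 * half


if __name__ == "__main__":
    for name, alphabet in [
        ("digits", "0123456789"),
        ("base36", "0123456789abcdefghijklmnopqrstuvwxyz"),
    ]:
        radix = radix_for_alphabet(alphabet)
        print(name, radix, ff3_length_bounds(radix))
```

Under these assumptions, radix 10 accepts inputs of 6 to 56 characters and radix 36 accepts 4 to 36 characters, which is one reason short values such as 4-digit PINs cannot be encrypted with radix 10.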
Using FPE in Databricks as UC Functions
Run this to create the function, modifying the catalog and schema as needed. The best practice for using this function in UC is to split it into three or more functions. The first is a private Python UDF, designated by a name starting with _, that is meant to be called only by SQL functions. Those SQL functions call the Python function and fill in a fixed encryption key and tweak fetched from Databricks secrets using the secret SQL function.
Python UDF Functions (encrypt/decrypt private method)
- For a reference implementation, see 01_python_udf.sql.
SQL UDF Functions (encrypt/decrypt public functions with secrets injected)
- For a reference encrypt function, see 02_encrypt_sql_udf.sql.
- For a reference decrypt function, see 03_decrypt_sql_udf.sql.
The SQL UDF Functions will look something like this:
CREATE OR REPLACE FUNCTION encrypt_fpe(text STRING, operation STRING)
RETURNS STRING
DETERMINISTIC
LANGUAGE SQL
-- you may choose to specify functions from another schema
RETURN SELECT _encrypt_decrypt_fpe(
key => secret("my_scope", "my_encryption_key_hex"),
tweak => secret("my_scope", "my_tweak_hex"),
text => text,
operation => "ENCRYPT"
);
Then you can use the encrypt_fpe function in your SQL queries, and likewise for decrypt.
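Since the decrypt wrapper differs from the encrypt wrapper only in its name and the operation value, one way to keep the two in sync is to render both from a single template. This is a sketch only: render_wrapper_sql and the name decrypt_fpe are hypothetical, the scope and secret names are the placeholders used above, and the signature is simplified to a single text argument. The reference files 02_encrypt_sql_udf.sql and 03_decrypt_sql_udf.sql remain the authoritative versions.

```python
def render_wrapper_sql(
    function_name: str,
    operation: str,
    scope: str = "my_scope",
    key_name: str = "my_encryption_key_hex",
    tweak_name: str = "my_tweak_hex",
) -> str:
    """Render a public SQL wrapper around the private _encrypt_decrypt_fpe UDF."""
    if operation not in ("ENCRYPT", "DECRYPT"):
        raise ValueError("operation must be ENCRYPT or DECRYPT")
    return (
        f"CREATE OR REPLACE FUNCTION {function_name}(text STRING)\n"
        "RETURNS STRING\n"
        "DETERMINISTIC\n"
        "LANGUAGE SQL\n"
        "RETURN SELECT _encrypt_decrypt_fpe(\n"
        f'  key => secret("{scope}", "{key_name}"),\n'
        f'  tweak => secret("{scope}", "{tweak_name}"),\n'
        "  text => text,\n"
        f'  operation => "{operation}"\n'
        ");"
    )


if __name__ == "__main__":
    print(render_wrapper_sql("encrypt_fpe", "ENCRYPT"))
    print(render_wrapper_sql("decrypt_fpe", "DECRYPT"))
```

The printed statements can be reviewed and then run in a SQL editor or notebook; nothing here executes SQL by itself.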
In more advanced settings, you may use different strategies or different tweaks for different columns or rows, specified in the SQL function or in another table, so that two users with the same data get different ciphertext.
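One possible way to get a distinct tweak per column or per tenant, without storing a separate secret for each, is to derive it from a base tweak and a context label. The sketch below uses HMAC-SHA256 truncated to 7 bytes; this is an illustrative strategy and an assumption on my part, not something the package provides, and derive_tweak_hex is a hypothetical name.

```python
import hashlib
import hmac


def derive_tweak_hex(base_tweak_hex: str, context: str, size: int = 7) -> str:
    """Derive a context-specific tweak (hex) from a base tweak and a label."""
    base = bytes.fromhex(base_tweak_hex)
    digest = hmac.new(base, context.encode("utf-8"), hashlib.sha256).digest()
    # Truncate to the tweak size expected by the cipher (7 bytes in the examples here).
    return digest[:size].hex()


if __name__ == "__main__":
    base = "e333ac1b0ae092"
    print(derive_tweak_hex(base, "customers.ssn"))
    print(derive_tweak_hex(base, "customers.phone"))
```

The derivation is deterministic, so the same column label always yields the same tweak and decryption keeps working as long as the label used at encryption time is known.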
Experimenting with the private Python function
Declare variables
You can pass keys in using sql secret commands.
You can generate the key and tweak as hex using the following commands
import secrets
# If needed generate a 256 bit key, store as a secret...
key = secrets.token_bytes(32).hex()
# If needed generate a 7 byte tweak, store as a secret...
tweak = secrets.token_bytes(7).hex()
print(key, tweak)
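Before storing a key and tweak as secrets, it may be worth checking that they are valid hex of the expected size. The following sketch assumes AES key sizes of 16, 24 or 32 bytes and a tweak of 7 bytes (as generated above) or 8 bytes; whether the library accepts every one of these sizes is an assumption, and the helper name check_hex_secret is hypothetical.

```python
KEY_BYTE_LENGTHS = (16, 24, 32)   # AES-128/192/256 key sizes
TWEAK_BYTE_LENGTHS = (7, 8)       # assumed tweak sizes; the examples here use 7


def check_hex_secret(value: str, allowed_byte_lengths: tuple) -> bytes:
    """Decode a hex string and verify its length in bytes."""
    try:
        raw = bytes.fromhex(value)
    except ValueError as exc:
        raise ValueError("value is not valid hexadecimal") from exc
    if len(raw) not in allowed_byte_lengths:
        raise ValueError(
            f"expected one of {allowed_byte_lengths} bytes, got {len(raw)}"
        )
    return raw


if __name__ == "__main__":
    import secrets

    key = secrets.token_bytes(32).hex()
    tweak = secrets.token_bytes(7).hex()
    check_hex_secret(key, KEY_BYTE_LENGTHS)
    check_hex_secret(tweak, TWEAK_BYTE_LENGTHS)
    print("key and tweak look well formed")
```

A check like this catches truncated or mis-pasted secrets early, before they surface as confusing errors inside a SQL query.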
You can declare them this way or use databricks secrets to manage them.
DECLARE encryption_key="55bd9c16d82731fb15057fcb4bd10dddd385d679927355cec976dc1f956f0559";
DECLARE fpe_tweak="e333ac1b0ae092";
DECLARE plain_text="Hello world";
Encrypt
SELECT main.default.encrypt_decrypt_fpe(
key => encryption_key,
tweak => fpe_tweak,
text => plain_text,
operation => "ENCRYPT"
);
Create cipher text variable
DECLARE cipher_text STRING;
SET VAR cipher_text=(SELECT main.default.encrypt_decrypt_fpe(
key => encryption_key,
tweak => fpe_tweak,
text => plain_text,
operation => "ENCRYPT"
));
Decrypt
SELECT main.default.encrypt_decrypt_fpe(
key => encryption_key,
tweak => fpe_tweak,
text => cipher_text,
operation => "DECRYPT"
);
Disclaimer
python-fpe-cryptography is not developed, endorsed, or supported by Databricks. It is provided as-is; no warranty is derived from using this package. For more details, please refer to the license.