Implementation of ff3 based on python-fpe using cryptography instead of pycryptodome.

python-fpe-cryptography

Provides format-preserving encryption (FPE) using cryptography instead of pycryptodome, so that it runs cleanly in Databricks SQL as Unity Catalog (UC) functions. It is based on https://github.com/mysto/python-fpe and ported to use the AES ECB mode from cryptography (https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#ECB). To learn more about FPE, see https://github.com/mysto/python-fpe?tab=readme-ov-file#the-ff3-algorithm.

Example here by colleague Andrew Weaver: https://github.com/andyweaves/databricks-notebooks/blob/main/notebooks/privacy/format_preserving_encryption.py

This package ports that example into a Python UDF. You can find the ported code in fpe.py.

Using FPE as a python library with cryptography

Using the FPE method

import secrets
from ff3_cryptography.fpe import crypto_fpe_encrypt, crypto_fpe_decrypt

key = secrets.token_bytes(32).hex()
tweak = secrets.token_bytes(7).hex()

plaintext = '1234567890'
# These functions choose the radix for you, use reasonable default charsets, and handle special characters.
ciphertext = crypto_fpe_encrypt(key=key, tweak=tweak, input_text=plaintext)
decrypted = crypto_fpe_decrypt(key=key, tweak=tweak, input_text=ciphertext)

assert ciphertext != plaintext, "Encryption failed"
assert plaintext == decrypted, "Decryption failed"
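
The comment in the example above says the helper functions handle special characters. One common way to do that (an assumption here, not necessarily how ff3_cryptography implements it) is to transform only the characters that belong to the alphabet and put every other character back where it was. The names `transform_preserving_format` and `placeholder_transform` in this sketch are hypothetical, and the placeholder merely reverses the string: it is not encryption and only stands in for a length-preserving cipher call.

```python
from typing import Callable

DIGITS = "0123456789"


def transform_preserving_format(
    text: str, alphabet: str, transform: Callable[[str], str]
) -> str:
    """Apply `transform` to the alphabet characters only; keep the rest in place."""
    # Collect, in order, the characters that belong to the alphabet.
    payload = "".join(ch for ch in text if ch in alphabet)
    transformed = transform(payload)
    if len(transformed) != len(payload):
        raise ValueError("transform must preserve the payload length")
    # Walk the original text and swap in transformed characters one by one.
    replacements = iter(transformed)
    return "".join(next(replacements) if ch in alphabet else ch for ch in text)


def placeholder_transform(payload: str) -> str:
    """Stand-in for a cipher call. NOT encryption: it only reverses the string."""
    return payload[::-1]


masked = transform_preserving_format("555-12-3456", DIGITS, placeholder_transform)
print(masked)  # 654-32-1555 (dashes stay where they were)
```

In real use, the placeholder would be replaced by a call such as the `encrypt` method of the raw cipher object, with a radix that matches the alphabet.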

Using the raw cipher object (not recommended unless you know what you are doing)

Keep in mind that you must set the radix to match the data you are encrypting: different character sets need different radix values. I recommend using the solution provided by Andrew Weaver in the example linked above.

import secrets 
from ff3_cryptography.algo import FF3Cipher

key = secrets.token_bytes(32).hex()
tweak = secrets.token_bytes(7).hex()

plaintext = '1234567890'

ff3 = FF3Cipher(key, tweak, radix=10)
ciphertext = ff3.encrypt(plaintext)
decrypted = ff3.decrypt(ciphertext)

assert ciphertext != plaintext, "Encryption failed"
assert plaintext == decrypted, "Decryption failed"
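
To make the radix requirement concrete, here is a minimal sketch of how the radix follows from the character set and how it bounds the input length. It assumes the limits described for FF3 in NIST SP 800-38G and in the upstream mysto/python-fpe README (a domain of at least 1,000,000 values, and a maximum length of 2 * floor(log_radix(2^96))). The helper names `radix_for` and `ff3_length_bounds` are hypothetical and are not part of ff3_cryptography.

```python
import string
from typing import Tuple


def radix_for(alphabet: str) -> int:
    """The radix is the number of distinct symbols in the alphabet."""
    if len(set(alphabet)) != len(alphabet):
        raise ValueError("alphabet must not contain duplicate characters")
    return len(alphabet)


def ff3_length_bounds(radix: int) -> Tuple[int, int]:
    """Return (min_len, max_len) for a radix, using exact integer arithmetic."""
    # Smallest length whose domain radix**n holds at least one million values.
    min_len = 1
    while radix ** min_len < 1_000_000:
        min_len += 1
    # Largest m with radix**m <= 2**96; the maximum input length is twice that.
    half = 0
    while radix ** (half + 1) <= 2 ** 96:
        half += 1
    return min_len, 2 * half


digits = string.digits
alphanumeric = string.digits + string.ascii_lowercase

print(radix_for(digits), ff3_length_bounds(radix_for(digits)))              # 10 (6, 56)
print(radix_for(alphanumeric), ff3_length_bounds(radix_for(alphanumeric)))  # 36 (4, 36)
```

Under these assumptions, a 10-digit string such as '1234567890' sits comfortably inside the radix-10 bounds, while a 4-digit PIN would be too short.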

Using FPE in Databricks as UC Functions

Run the following to create the function, modifying the catalog and schema as needed. The best practice for using this function in UC is to split it into three or more functions. The Python UDF is private, which is indicated by a name starting with _, and is meant to be called only by SQL functions. Each public SQL function calls the Python UDF and fills in a fixed encryption key and tweak fetched from Databricks secrets using the secret SQL function. Something like this:

CREATE OR REPLACE FUNCTION encrypt_fpe(text STRING, operation STRING)
RETURNS STRING
DETERMINISTIC
LANGUAGE SQL
-- you may choose to specify functions from another schema
RETURN SELECT _encrypt_decrypt_fpe(
    key => secret("my_scope", "my_encryption_key_hex"),
    tweak => secret("my_scope", "my_tweak_hex"),
    text => text,
    operation => "ENCRYPT"
);

Then you can use the encrypt_fpe function in your SQL queries, and likewise for decrypt.

In more advanced settings, you may use different strategies or different tweaks for different columns or rows, specified either in the SQL function or in another table, so that two users with the same data get different ciphertext.
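
One possible way to get a different tweak per column without storing each one separately is to derive it from a single secret and a context label. The sketch below is an illustration under that assumption, not something this package provides: `derive_tweak_hex`, the all-zero demo secret, and the "table.column" labels are all hypothetical. It uses HMAC-SHA256 from the standard library and truncates the digest to 7 bytes, matching the tweak length used elsewhere in this README.

```python
import hashlib
import hmac


def derive_tweak_hex(master_secret_hex: str, context: str, n_bytes: int = 7) -> str:
    """Derive a deterministic hex tweak for a context such as 'table.column'."""
    digest = hmac.new(
        bytes.fromhex(master_secret_hex),  # secret shared by all contexts
        context.encode("utf-8"),           # label that separates the contexts
        hashlib.sha256,
    ).digest()
    # Keep only the first n_bytes so the result has the expected tweak length.
    return digest[:n_bytes].hex()


# Demo secret only; in practice this would come from a secret store.
demo_secret_hex = "00" * 32

ssn_tweak = derive_tweak_hex(demo_secret_hex, "customers.ssn")
phone_tweak = derive_tweak_hex(demo_secret_hex, "customers.phone")
print(len(ssn_tweak), ssn_tweak != phone_tweak)
```

The same label always yields the same tweak, so decryption can recompute it instead of looking it up. Whether derived tweaks fit your threat model is a decision this sketch does not make for you.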

Python UDF Functions (encrypt/decrypt private method)

SQL UDF Functions (encrypt/decrypt public functions with secrets injected)

Experimenting with the private Python function

Declare variables

You can pass keys in using the secret SQL function.

You can generate the key and tweak as hex using the following commands:

import secrets

# If needed generate a 256 bit key, store as a secret...
key = secrets.token_bytes(32).hex()

# If needed generate a 7 byte tweak, store as a secret...
tweak = secrets.token_bytes(7).hex()

print(key, tweak)
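
Before storing the values as secrets, it can help to check that they decode to the expected sizes. This is a minimal sketch with a hypothetical helper, `validate_hex_material`. It assumes AES key sizes of 16, 24, or 32 bytes and a tweak of 7 or 8 bytes, which is my understanding of what the upstream FF3 implementation accepts; adjust the accepted sizes if your version differs.

```python
import secrets


def validate_hex_material(key_hex: str, tweak_hex: str) -> None:
    """Raise ValueError if the key or tweak is not valid hex of an expected size."""
    key = bytes.fromhex(key_hex)      # raises ValueError on non-hex input
    tweak = bytes.fromhex(tweak_hex)  # raises ValueError on non-hex input
    if len(key) not in (16, 24, 32):
        raise ValueError(f"key must be 16, 24, or 32 bytes, got {len(key)}")
    if len(tweak) not in (7, 8):
        raise ValueError(f"tweak must be 7 or 8 bytes, got {len(tweak)}")


# Freshly generated values, as in the snippet above, pass the check.
validate_hex_material(secrets.token_bytes(32).hex(), secrets.token_bytes(7).hex())


def is_valid(key_hex: str, tweak_hex: str) -> bool:
    """Boolean convenience wrapper around validate_hex_material."""
    try:
        validate_hex_material(key_hex, tweak_hex)
    except ValueError:
        return False
    return True
```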

You can declare them this way or use Databricks secrets to manage them.

DECLARE encryption_key="55bd9c16d82731fb15057fcb4bd10dddd385d679927355cec976dc1f956f0559";
DECLARE fpe_tweak="e333ac1b0ae092";
DECLARE plain_text="Hello world";

Encrypt

SELECT main.default.encrypt_decrypt_fpe(
    key => encryption_key,
    tweak => fpe_tweak,
    text => plain_text,
    operation => "ENCRYPT"
);

Create cipher text variable

DECLARE cipher_text STRING;
SET VAR cipher_text=(SELECT main.default.encrypt_decrypt_fpe(
    key => encryption_key,
    tweak => fpe_tweak,
    text => plain_text,
    operation => "ENCRYPT"
));

Decrypt

SELECT main.default.encrypt_decrypt_fpe(
    key => encryption_key,
    tweak => fpe_tweak,
    text => cipher_text,
    operation => "DECRYPT"
);

Disclaimer

python-fpe-cryptography is not developed, endorsed, or supported by Databricks. It is provided as-is; no warranty is derived from using this package. For more details, please refer to the license.
