Effortlessly mask and encrypt your data frames for safe travel from computer to computer

DataHowdah ✨🔒🔢✨

DataHowdah: Effortlessly mask and 🔐 encrypt your data frames, ensuring their safety like a camel's secure howdah 🐪

DataHowdah is a utility Python class for masking or encrypting pandas data frames before they are shared between machines. It extends pandas DataFrame with encryption and noise-addition methods for sensitive columns. Keys can be supplied as a simple password or fetched from Microsoft Azure Key Vault; AWS KMS support is planned. Encrypting and decrypting DataFrame columns takes a single method call each, which protects the data as it travels between systems.

from data_howdah import DataHowdah

df = DataHowdah('sample.csv')
df.encrypt(['secret_column_1', 'secret_column_2'])
df.to_csv('encrypted_data.csv', index=False)

🌱 How to Begin

  • 💻 Install :
pip install data-howdah
  • 🔑 Set an environment variable containing either your simple encryption key:
DATA_HOWDAH_ENCRYPTION_KEY=

or your Azure Key Vault secret URL:

DATA_HOWDAH_AZURE_KEY_VAULT_SECRET_URL=https://{key-vault-name}.vault.azure.net/{secret_name}

If you use Azure Key Vault, don't forget to install the Azure CLI and log in:

az login

AWS KMS (coming soon!)
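
To make the two configuration options concrete, here is a minimal sketch of how a script could check which key source is configured before calling DataHowdah. The helper name `resolve_key_source` is hypothetical and not part of the library, and the precedence (plain key first, then the Azure URL) is an assumption; the README does not say which variable wins when both are set.

```python
import os


def resolve_key_source(env=None):
    """Return ("password", key) or ("azure", url) from environment variables.

    Hypothetical helper for illustration only; not part of data_howdah.
    Assumption: the plain key takes precedence over the Azure URL.
    """
    env = os.environ if env is None else env
    key = env.get("DATA_HOWDAH_ENCRYPTION_KEY", "").strip()
    url = env.get("DATA_HOWDAH_AZURE_KEY_VAULT_SECRET_URL", "").strip()
    if key:
        return ("password", key)
    if url:
        return ("azure", url)
    raise RuntimeError(
        "Set DATA_HOWDAH_ENCRYPTION_KEY or "
        "DATA_HOWDAH_AZURE_KEY_VAULT_SECRET_URL"
    )
```

Calling `resolve_key_source()` with no arguments reads the real process environment; passing a dict makes the check easy to exercise without touching it.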

  • 📈 Create or load your dataframe and pass it to DataHowdah:
import pandas as pd
import numpy as np
from data_howdah import DataHowdah

# load directly from a file path; popular file formats are supported
df = DataHowdah('sample.csv')
# or pass it an existing DataFrame object
df = DataHowdah(pd.DataFrame(np.random.rand(100, 3), columns=['A', 'B', 'C']))
ℹ️ DataHowdah is a class derived from pd.DataFrame, so you can work with it as a regular data frame
  • 🎭 Mask :

The mask method adds statistical noise to numeric columns in a DataFrame, effectively disguising the original data while preserving its overall structure and distribution.

from data_howdah import DataHowdah

df = DataHowdah('sample.csv')
df.mask([0], scale=0.5, plot=True)

# Params :
# scale: Adjusts the intensity of noise added to numeric data, with higher values increasing data obfuscation.
# plot: When set to True, generates comparative visualizations of original and masked data distributions.

df.to_csv('masked_data.csv', index=False) # Save it in the format you like !
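
To give a feel for what noise-based masking does, the sketch below adds zero-mean Gaussian noise whose spread is `scale` times the column's standard deviation. This is an illustration under assumptions, not DataHowdah's actual implementation: the function name `add_noise`, the Gaussian noise model, and the `seed` parameter are all hypothetical, and the sketch works on a plain list of numbers to stay dependency-free.

```python
import random
import statistics


def add_noise(values, scale=0.5, seed=None):
    """Return a copy of `values` with zero-mean Gaussian noise added.

    Hypothetical illustration; the library's real noise model may differ.
    The noise standard deviation is scale * (population std of values),
    so a larger `scale` obscures the original values more.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    sigma = scale * statistics.pstdev(values)
    return [v + rng.gauss(0.0, sigma) for v in values]


salaries = [10.0, 12.0, 9.0, 11.0, 13.0]
masked = add_noise(salaries, scale=0.5, seed=0)
```

With `scale=0.0` the values come back unchanged. Because the noise has zero mean, the column average stays roughly the same while individual values are disguised.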
  • 🔒 Encrypt :

The encrypt method in DataHowdah securely transforms specified columns in a DataFrame into unreadable formats, safeguarding sensitive information.

from data_howdah import DataHowdah

df = DataHowdah('sample.csv')
df.encrypt(['secret_column', 1, slice(5, 8)])

# Params :
# 1. columns_to_encrypt: Specifies which DataFrame columns to secure, accepting column names, indices, or slices.
# 2. key: The encryption key used for data protection.

df.to_csv('encrypted_data.csv', index=False) # Save it in the format you like !
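
Since encrypt accepts a mix of column names, indices, and slices, it can help to see how such a mixed list might map to concrete column names. The helper below is a hypothetical sketch, not the library's code; it assumes integer selectors are zero-based positions and that slices follow normal Python semantics (end exclusive).

```python
def resolve_columns(all_columns, selectors):
    """Map a mix of names, positions, and slices to a list of column names.

    Hypothetical helper for illustration; assumes ints are zero-based
    positions and slices exclude their end, as in normal Python indexing.
    """
    resolved = []
    for sel in selectors:
        if isinstance(sel, slice):
            picked = all_columns[sel]
        elif isinstance(sel, int):
            picked = [all_columns[sel]]
        elif sel in all_columns:
            picked = [sel]
        else:
            raise KeyError(f"Unknown column selector: {sel!r}")
        for name in picked:
            if name not in resolved:  # keep first occurrence, drop repeats
                resolved.append(name)
    return resolved


columns = ["id", "name", "secret_column", "c3", "c4", "c5", "c6", "c7", "c8"]
selected = resolve_columns(columns, ["secret_column", 1, slice(5, 8)])
# selected == ["secret_column", "name", "c5", "c6", "c7"]
```

Note that a data frame whose column labels are themselves integers would make integer selectors ambiguous; this sketch always treats them as positions.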
  • 🔓 Decrypt :

The decrypt method in DataHowdah reverses the encryption of previously encrypted columns, which it detects automatically.

from data_howdah import DataHowdah

df = DataHowdah('encrypted_data.csv')
df.decrypt()

df.to_csv('decrypted_data.csv', index=False) # Save it in the format you like !

Source distribution: data-howdah-1.0.3.tar.gz (7.6 kB)