Effortlessly mask and encrypt your data frames for safe travel from computer to computer

DataHowdah ✨🔒🔢✨

DataHowdah: Effortlessly mask and 🔐 encrypt your data frames, ensuring their safety like a camel's secure howdah 🐪

DataHowdah is a utility Python class for masking or encrypting pandas DataFrames before they are shared between machines. It extends pandas DataFrames with encryption and noise-addition features for sensitive data columns. Encryption keys can be supplied as a plain password or retrieved from Microsoft Azure Key Vault (AWS KMS support is planned). Columns are encrypted and decrypted through a small set of DataFrame methods, safeguarding data as it travels between systems.

from data_howdah import DataHowdah

df = DataHowdah('sample.csv')
df.encrypt(['secret_column_1', 'secret_column_2'])
df.to_csv('encrypted_data.csv', index=False)

🌱 How to Begin

  • 💻 Install :
pip install data-howdah
  • 🔑 Set an environment variable containing either your plain encryption key :
DATA_HOWDAH_ENCRYPTION_KEY=

or your Azure Key Vault secret URL:

DATA_HOWDAH_AZURE_KEY_VAULT_SECRET_URL=https://{key-vault-name}.vault.azure.net/{secret_name}

Don't forget to install the Azure CLI and log in:

az login

AWS KMS (coming soon!)
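
As a rough illustration of the configuration step above, the sketch below sets the plain key from Python for the current process and reports which key source is configured. The helper `configured_key_source` is hypothetical and not part of DataHowdah. Checking the plain key first is an assumption, since this README does not say which variable takes precedence when both are set.

```python
import os

# Variable names as given in this README.
PLAIN_KEY_VAR = "DATA_HOWDAH_ENCRYPTION_KEY"
AZURE_URL_VAR = "DATA_HOWDAH_AZURE_KEY_VAULT_SECRET_URL"


def configured_key_source(env=None):
    """Report which key source is configured (hypothetical helper).

    Returns "plain", "azure", or None. Checking the plain key first is an
    assumption made for this sketch, not documented library behaviour.
    """
    env = os.environ if env is None else env
    if env.get(PLAIN_KEY_VAR):
        return "plain"
    if env.get(AZURE_URL_VAR):
        return "azure"
    return None


# Set the plain key for the current process only (placeholder value).
os.environ[PLAIN_KEY_VAR] = "replace-with-your-own-key"
print(configured_key_source())
```

Running a check like this before calling encrypt can catch a missing variable early, rather than partway through processing a file.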

  • 📈 Create or load your dataframe and pass it to DataHowdah:
import pandas as pd
import numpy as np
from data_howdah import DataHowdah

# load directly from a file (popular file formats are supported)
df = DataHowdah('sample.csv')
# or pass to it a dataframe object
df = DataHowdah(pd.DataFrame(np.random.rand(100, 3), columns=['A', 'B', 'C']))
ℹ️ DataHowdah is a class derived from pd.DataFrame, so you can work with it as a regular DataFrame
  • 🎭 Mask :

The mask method adds statistical noise to numeric columns in a DataFrame, effectively disguising the original data while preserving its overall structure and distribution.

from data_howdah import DataHowdah

df = DataHowdah('sample.csv')
df.mask([0], scale=0.5, plot=True)

# Params :
# scale: Adjusts the intensity of noise added to numeric data, with higher values increasing data obfuscation.
# plot: When set to True, generates comparative visualizations of original and masked data distributions.

df.to_csv('masked_data.csv', index=False) # Save it in the format you like !
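
To give a feel for what a noise scale does, here is a minimal standard-library sketch of noise-based masking on a single numeric column. It is not DataHowdah's implementation: the function `add_noise` is hypothetical, and it assumes Gaussian noise whose standard deviation is `scale` times the column's own standard deviation. The library's actual noise model is not described in this README.

```python
import random
import statistics


def add_noise(values, scale=0.5, seed=None):
    """Return a copy of `values` with Gaussian noise added (illustrative only).

    Assumption: the noise standard deviation is `scale` times the standard
    deviation of the column; DataHowdah's actual noise model may differ.
    """
    rng = random.Random(seed)
    sigma = scale * statistics.pstdev(values)
    return [v + rng.gauss(0.0, sigma) for v in values]


original = [10.0, 12.5, 9.8, 11.2, 10.7, 13.1]
masked = add_noise(original, scale=0.5, seed=42)

# Print each value next to its masked counterpart.
for before, after in zip(original, masked):
    print(f"{before:6.2f} -> {after:6.2f}")
```

In this sketch a larger `scale` widens the noise distribution, so individual values drift further from the originals, while a `scale` of 0 leaves the data unchanged.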
  • 🔒 Encrypt :

The encrypt method in DataHowdah securely transforms specified columns in a DataFrame into unreadable formats, safeguarding sensitive information.

from data_howdah import DataHowdah

df = DataHowdah('sample.csv')
df.encrypt(['secret_column', 1, slice(5, 8)])

# Params :
# 1. columns_to_encrypt: Specifies which DataFrame columns to secure, accepting column names, indices, or slices.
# 2. key: The encryption key used for data protection.

df.to_csv('encrypted_data.csv', index=False) # Save it in the format you like !
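
The mixed selector list passed to encrypt above (a name, an index, and a slice) could be resolved to concrete column names along the following lines. This is a sketch under assumptions, not the library's code: `resolve_columns` is a hypothetical helper, the sample column names are invented, and it assumes all column labels are strings so that integers can be read unambiguously as positions.

```python
def resolve_columns(all_columns, selectors):
    """Map names, integer positions, and slices to column names.

    Hypothetical helper for illustration; not DataHowdah's own code.
    Assumes every column label is a string.
    """
    resolved = []
    for selector in selectors:
        if isinstance(selector, slice):
            picked = all_columns[selector]
        elif isinstance(selector, int):
            picked = [all_columns[selector]]
        elif selector in all_columns:
            picked = [selector]
        else:
            raise KeyError(f"unknown column: {selector!r}")
        for name in picked:
            if name not in resolved:  # keep first occurrence, preserve order
                resolved.append(name)
    return resolved


columns = ["id", "email", "age", "city", "plan", "card", "cvv", "iban", "notes"]
selected = resolve_columns(columns, ["iban", 1, slice(5, 8)])
print(selected)
```

Here `"iban"` is selected both by name and by the slice, and the helper keeps it only once so that a column would not be processed twice.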
  • 🔓 Decrypt :

The decrypt method in DataHowdah reverses the encryption of previously encrypted columns, which it detects automatically.

from data_howdah import DataHowdah

df = DataHowdah('encrypted_data.csv')
df.decrypt()

df.to_csv('decrypted_data.csv', index=False) # Save it in the format you like !
