A Python package for data anonymization

DataFog

DataFog (formerly known as Codexify) is a Python package that simplifies and automates data anonymization tasks. With DataFog, you can quickly scan your datasets for Personally Identifiable Information (PII), swap PII with synthetic data, save and look up original and synthetic data pairs, and more.

Note: In some cases pip install will not work out of the box. Installing these pinned dependencies first has resolved the problem so far:

pip install setuptools==65.5.0 "wheel<0.40.0"

Libraries Used

  • faker for Synthetic Data Generation
  • Boilerplate PII detection code (to be replaced with a custom solution soon)

Coming Soon:

  • product demo
  • documentation
  • examples/ directory for more detailed examples and usage information

Please see www.datafog.dev for more information, or contact me at sidmohan001@gmail.com.

Installation

DataFog can be installed via pip. Use the following command to install:

pip install datafog

Getting Started

Once you've installed DataFog, you can import it into your Python scripts as follows:

from datafog import DataFog

Now, you're ready to use DataFog to handle your PII anonymization needs. Here are some basic examples:

Scan a dataset for PII:

datafog = DataFog()

# Scan a csv file for PII
contains_pii, pii_fields = datafog.scan("path_to_your_file.csv")

# Print the result
print(f"Contains PII: {contains_pii}")
print(f"PII Fields: {pii_fields}")
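To give a feel for what a column-level scan returning a (contains_pii, pii_fields) pair could look like, here is a minimal standard-library sketch. It is not DataFog's detection code: the function name scan_csv_text, the PII_PATTERNS table, and the two regular expressions are assumptions made up for illustration, and it reads CSV content from a string rather than a file path.

```python
import csv
import io
import re

# Hypothetical patterns for illustration only; not DataFog's detection rules.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def scan_csv_text(csv_text):
    """Return (contains_pii, pii_fields) for CSV content given as a string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    pii_fields = []
    for row in reader:
        for field, value in row.items():
            # Skip columns already flagged and malformed (non-string) cells.
            if field in pii_fields or not isinstance(value, str):
                continue
            if any(pattern.search(value) for pattern in PII_PATTERNS.values()):
                pii_fields.append(field)
    return bool(pii_fields), pii_fields


sample = "name,contact,city\nAda,ada@example.com,London\nBob,555-123-4567,Paris\n"
contains_pii, pii_fields = scan_csv_text(sample)
print(f"Contains PII: {contains_pii}")
print(f"PII Fields: {pii_fields}")
```

A regex table like this only catches values with a recognizable shape, such as emails and phone numbers. Names and addresses would need a different approach.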

Swap PII with synthetic data:

# Define the output path
output_path = "path_to_output_directory/"

# Swap PII in a csv file with synthetic data
datafog.swap("path_to_your_file.csv", output_path)
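The idea of swapping PII while keeping original and synthetic pairs for later lookup can be sketched with the standard library alone. This is an illustration under stated assumptions, not DataFog's swap implementation: swap_column and its parameters are hypothetical names, and sequential placeholder tokens stand in for the faker-generated values the package uses.

```python
import csv
import io


def swap_column(csv_text, pii_field, mapping=None):
    """Replace values in pii_field with placeholder tokens.

    Returns (new_csv_text, mapping), where mapping is original -> synthetic.
    Hypothetical helper for illustration; placeholder tokens stand in for
    realistic synthetic values.
    """
    mapping = {} if mapping is None else mapping
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        original = row[pii_field]
        # Reuse the same synthetic value for repeated originals so that
        # relationships between rows survive the swap.
        if original not in mapping:
            mapping[original] = f"{pii_field}_{len(mapping) + 1:04d}"
        row[pii_field] = mapping[original]
        writer.writerow(row)
    return out.getvalue(), mapping


sample = "name,contact\nAda,ada@example.com\nBob,bob@example.com\nAda,ada@example.com\n"
swapped, pairs = swap_column(sample, "contact")

# Invert the mapping to look up an original value from its synthetic one.
reverse = {synthetic: original for original, synthetic in pairs.items()}
print(swapped)
print(reverse["contact_0001"])
```

The mapping is itself sensitive, since it reverses the anonymization, so in practice it should be stored separately from the swapped output.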
