Local differential privacy mechanisms

Project description

TrasgoDP implements several mechanisms for ε-differential privacy and (ε, δ)-differential privacy. The mechanisms are designed to be used under a local approach, adding noise directly to the raw data. Two types of mechanisms are implemented:

  • For numerical records: Laplace and Gaussian mechanisms. The implementation includes a final clipping step applied to the privatized data.
  • For categorical records: Exponential mechanism and Randomized Response (both for binary attributes and the k-ary version).
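To give an intuition for the numerical case, the following is a minimal sketch of a Laplace mechanism with final clipping in the local model. It is not trasgoDP's implementation: the function name `laplace_clip_sketch`, its parameters, and the choice of sensitivity (the full range `upper - lower`) are assumptions made for illustration only.

```python
import random


def laplace_clip_sketch(values, epsilon, lower, upper, rng=None):
    """Hypothetical helper: epsilon-DP Laplace noise per record, then clipping."""
    rng = rng or random.Random()
    # Assumption: in the local model a single record may take any value in
    # [lower, upper], so the sensitivity is the full range.
    scale = (upper - lower) / epsilon
    private = []
    for v in values:
        v = min(max(v, lower), upper)  # clip the raw value to the bounds
        # The difference of two i.i.d. exponentials with rate 1/scale
        # follows a Laplace(0, scale) distribution.
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        # Final clipping is post-processing, so it does not weaken the guarantee.
        private.append(min(max(v + noise, lower), upper))
    return private


ages = [23, 45, 31, 67]
private_ages = laplace_clip_sketch(ages, epsilon=1.0, lower=17, upper=90)
```

Smaller values of ε give a larger noise scale, so more of the privatized values end up at the clipping bounds.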

This library provides dedicated functions designed to be applied to both pandas dataframes and lists/numpy arrays.

Installation

You can install trasgoDP using pip. We recommend using Python 3 with virtualenv:

virtualenv .venv -p python3
source .venv/bin/activate
pip install trasgoDP

Mechanisms implemented

| Mechanism                 | Type of the attribute | Function in trasgoDP                         |
|---------------------------|-----------------------|----------------------------------------------|
| Laplace                   | Numerical             | numerical.dp_clip_laplace()                  |
| Gaussian                  | Numerical             | numerical.dp_clip_gaussian()                 |
| Exponential               | Categorical           | categorical.dp_exponential()                 |
| Randomized response       | Categorical (binary)  | categorical.dp_randomized_response_binary()  |
| k-ary randomized response | Categorical           | categorical.dp_randomized_response_kary()    |
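As an illustration of the randomized response rows in the table, here is a minimal sketch of k-ary randomized response. It is not the library's code: `rr_kary_sketch` and its parameters are hypothetical names, and it uses the standard scheme in which the true value is kept with probability e^ε / (e^ε + k − 1). The binary version is the special case k = 2.

```python
import math
import random


def rr_kary_sketch(value, domain, epsilon, rng=None):
    """Hypothetical helper: k-ary randomized response for one record."""
    rng = rng or random.Random()
    k = len(domain)
    # Probability of reporting the true value.
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return value
    # Otherwise report one of the other k - 1 values uniformly at random.
    others = [v for v in domain if v != value]
    return rng.choice(others)


domain = ["Private", "Self-emp", "Gov"]
reported = rr_kary_sketch("Private", domain, epsilon=1.0)
```

With k = 2 and ε = ln 3, the true value is kept with probability 3/4, which is the classic coin-flip form of randomized response.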

Getting started

To apply DP mechanisms to a column of a dataframe, you need to provide:

  • The pandas dataframe with the data.
  • The column in the dataframe to be privatized.
  • The privacy budget (ε).
  • The probability of exceeding the privacy budget (δ), required only for numerical attributes with the Gaussian mechanism.
  • The upper and lower bounds for numerical attributes (optional).
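To show how ε, δ and the bounds interact for the Gaussian mechanism, the sketch below uses the classical calibration σ = Δ · sqrt(2 ln(1.25/δ)) / ε, which is valid for 0 < ε < 1. This is an assumption for illustration: the function names are hypothetical, and trasgoDP may calibrate its noise differently.

```python
import math
import random


def gaussian_sigma_sketch(epsilon, delta, sensitivity):
    """Classical (epsilon, delta)-DP noise scale, valid for 0 < epsilon < 1."""
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon


def gaussian_clip_sketch(values, epsilon, delta, lower, upper, rng=None):
    """Hypothetical helper: Gaussian noise per record, then clipping."""
    rng = rng or random.Random()
    # Assumption: the sensitivity is the full range of the attribute.
    sigma = gaussian_sigma_sketch(epsilon, delta, upper - lower)
    private = []
    for v in values:
        v = min(max(v, lower), upper)       # clip the raw value
        noisy = v + rng.gauss(0.0, sigma)   # add Gaussian noise
        private.append(min(max(noisy, lower), upper))  # final clipping
    return private


private_ages = gaussian_clip_sketch([23, 45, 31, 67], epsilon=0.5, delta=1e-5,
                                    lower=17, upper=90)
```

Under this calibration, a smaller δ or a smaller ε both increase σ, and wider bounds increase the sensitivity and therefore the noise.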

Example: apply DP to the adult dataset with the Laplace mechanism for the column age and the Exponential mechanism for the column workclass:

import pandas as pd
from trasgodp.numerical import dp_clip_laplace
from trasgodp.categorical import dp_exponential

# Read and process the data
data = pd.read_csv("examples/adult.csv")
data.columns = data.columns.str.strip()
cols = [
    "workclass",
    "education",
    "marital-status",
    "occupation",
    "sex",
    "native-country",
]
for col in cols:
    data[col] = data[col].str.strip()

# Apply DP for the attribute age:
column_num = "age"
epsilon1 = 10
df = dp_clip_laplace(data, column_num, epsilon1, new_column=True)

# Apply DP for the attribute workclass, starting from df so that
# both privatized columns end up in the same dataframe:
column_cat = "workclass"
epsilon2 = 5
df = dp_exponential(df, column_cat, epsilon2, new_column=True)
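For intuition about what an Exponential mechanism does with a categorical value, here is a minimal sketch. It is not trasgoDP's implementation: the name `exponential_sketch`, the `utility` callable, and the indicator utility used in the example (score 1 for the true category, 0 otherwise, so sensitivity 1) are all assumptions for illustration.

```python
import math
import random


def exponential_sketch(value, domain, epsilon, utility, sensitivity=1.0, rng=None):
    """Hypothetical helper: sample a category with probability
    proportional to exp(epsilon * utility / (2 * sensitivity))."""
    rng = rng or random.Random()
    scores = [utility(value, candidate) for candidate in domain]
    top = max(scores)
    # Subtracting the maximum score keeps the exponentials numerically stable
    # and does not change the resulting probabilities.
    weights = [math.exp(epsilon * (s - top) / (2 * sensitivity)) for s in scores]
    return rng.choices(domain, weights=weights, k=1)[0]


def indicator_utility(true_value, candidate):
    # Assumed utility: 1 for the true category, 0 for any other.
    return 1.0 if candidate == true_value else 0.0


domain = ["Private", "Self-emp", "Gov"]
reported = exponential_sketch("Private", domain, epsilon=5, utility=indicator_utility)
```

A larger ε concentrates the sampling probability on the highest-utility category, while ε close to 0 approaches a uniform choice over the domain.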

Warning

This project is under active development.

License

This project is licensed under the Apache 2.0 license.

Related work

If you are using trasgoDP, you may also be interested in:

  • pyCANON: a Python library for checking the level of anonymity of a dataset.
  • anjana: a Python library for anonymizing tabular datasets.

Funding and acknowledgments

This work is funded by the European Union through the SIESTA project (Horizon Europe) under grant number 101131957.
