Skip to main content

This library contains a collection of privacy preservation techniques.

Project description

PETINA: Privacy prEservaTIoN Algorithms

Python PyPI version License: MIT Downloads

PETINA is a general-purpose Python library for Differential Privacy (DP), designed for flexibility, modularity, and extensibility across a wide range of ML and data processing pipelines. It supports both numerical and categorical data, with tools for supervised and unsupervised tasks.

TODO:

  • Add 3 MA from Opals
  • Noise multiplier will need to get the epsilon value Opacus

Features

PETINA includes state-of-the-art tools for:

Differential Privacy (DP) Mechanisms

  • Laplace Mechanism
  • Gaussian Mechanism
  • Renyi-Gaussian Mechanism
  • Exponential Mechanism
  • Flip Coin Mechanism
  • Pruning
  • Pruning Adaptive
  • Pruning DP
  • Count Sketch

Clipping

  • Adaptive Clipping
  • Clipping
  • Clipping DP

Encoding and Pertubation

  • Pertubation
  • Aggregation & Estimation
  • Parameter Utilities
  • Encoding

Data Conversion Helper

  • Flatten NumPy array to list and get shape
  • Reshape list to NumPy array with shape
  • Flatten PyTorch tensor to list and get shape
  • Reshape list to PyTorch tensor with shape
  • Detect input type and flatten to list with shape
  • Convert list back to original data type and shape

Quick Start

Below is a real world example when adding noise to age of various person

from PETINA import DP_Mechanisms, Encoding_Pertubation, Clipping, Pruning
import numpy as np
import random

# --- Real-world data: Users' ages from a survey ---
user_ages = [23, 35, 45, 27, 31, 50, 29, 42, 38, 33]
print("Original ages:", user_ages)

# --- DP parameters ---
sensitivity = 1  # Age changes by 1 at most for neighboring datasets
epsilon = 0.5    # Moderate privacy budget
delta = 1e-5
gamma = 0.001

# --- Add Laplace noise to ages ---
noisy_ages = DP_Mechanisms.applyDPLaplace(user_ages, sensitivity, epsilon)
print("\nNoisy ages with Laplace Mechanism:")
print(noisy_ages)

# --- Encode noisy ages using Unary Encoding ---
p = Encoding_Pertubation.get_p(epsilon)
q = Encoding_Pertubation.get_q(p, epsilon)
encoded_ages = Encoding_Pertubation.unaryEncoding(noisy_ages, p=p, q=q)
print("\nUnary encoded noisy ages:")
print(encoded_ages)

# --- Summary ---
print("\nSummary:")
print(f"Original ages: {user_ages}")
print(f"Noisy ages: {np.round(noisy_ages, 2)}")
#------OUTPUT------
# Original ages: [23, 35, 45, 27, 31, 50, 29, 42, 38, 33]

# Noisy ages with Laplace Mechanism:
# [21.46703958 34.93585449 47.36478841 25.68077936 30.11460444 49.3448666
#  28.8128474  36.54981691 37.6103979  33.32033856]

# Unary encoded noisy ages:
# [(33.320338556461415, np.float64(14.023220368761203)), (34.935854491045006, np.float64(5.97677963123879)), (36.54981690878978, np.float64(22.06966110628362)), (37.61039790139999, np.float64(-10.116101843806039)), (47.36478841495265, np.float64(-18.162542581328452)), (49.34486659855414, np.float64(14.023220368761203)), (21.467039579955127, np.float64(-18.162542581328452)), (25.6807793619914, np.float64(-2.069661106283625)), (28.812847396103876, np.float64(5.97677963123879)), (30.114604444236978, np.float64(-10.116101843806039))]

# Summary:
# Original ages: [23, 35, 45, 27, 31, 50, 29, 42, 38, 33]
# Noisy ages: [21.47 34.94 47.36 25.68 30.11 49.34 28.81 36.55 37.61 33.32]

We also provide hands-on examples in the examples folder.

  • Example 1: Basic PETINA Usage.
  • Example 2: This example demonstrates how to perform frequency estimation on synthetic categorical data using various pure Local Differential Privacy (LDP) algorithms. It also compares PETINA's Count Mean Sketch (CMS) and CSVec-based sketching with baseline LDP methods and a centralized CMS variant.
  • Example 3: This script demonstrates differentially private training in PyTorch using PETINA with budget tracking, clipping, and noise injection.
  • Example 4: This script demonstrates how to apply PETINA's Count Sketch mechanism to lists, NumPy arrays, and PyTorch tensors for efficient data approximation.
  • Example 5: This script runs a federated learning simulation with optional Laplace, Gaussian, or Count Sketch-based privacy mechanisms, integrated with PETINA's budget accountant to track and gracefully handle differential privacy budget consumption.

Installation

  • Install from PyPI
pip install PETINA
  • Install from Source
git clone https://github.com/ORNL/PETINA.git
cd PETINA
pip install -e .

Citing PETINA

If you use PETINA in your research, please cite the official DOE OSTI release:

https://www.osti.gov/doecode/biblio/149859

@misc{ doecode_149859,
  title = {ORNL/PETINA},
  author = {Kotevska, Ole and Nguyen, Duc},
  abstractNote = {This is a library that has implementation of privacy preservation algorithms.},
}

Contributors

We welcome community contributions to PETINA.

For major changes, please open an issue first. For small fixes or enhancements, submit a pull request. Include/update tests where applicable.

Contact: KOTEVSKAO@ORNL.GOV

License

This project is licensed under the MIT License.

Acknowledgements

This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under Contract No. DE-AC05-00OR22725. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

petina-0.0.17-py3-none-any.whl (123.0 kB view details)

Uploaded Python 3

File details

Details for the file petina-0.0.17-py3-none-any.whl.

File metadata

  • Download URL: petina-0.0.17-py3-none-any.whl
  • Upload date:
  • Size: 123.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for petina-0.0.17-py3-none-any.whl
Algorithm Hash digest
SHA256 b49e134ac3cefeca5d74dc0f6f20e099085e471da668e79405d461b8c766024d
MD5 0dab879761b6b3648560ca81958d3fdc
BLAKE2b-256 67d6135a9c5c0c617576ed6f0057d6fbcf84458dc1ba8e55245fcf2a010c55a7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page