Skip to main content

Local Differential Privacy for Range Queries

Project description

Hierarchical Mechanism for LDP

it is an implementation of the Hierarchical Mechanism in

Kulkarni, Tejas, Graham Cormode, and Divesh Srivastava. "Answering range queries under local differential privacy." arXiv preprint arXiv:1812.10942 (2018).

The LDP frequency protocol are implemented from the library https://github.com/Samuel-Maddock/pure-LDP

The library is intended for testing the LDP hierarchical mechanism locally.

Install

pip install hierarchical-mechanism-LDP

Usage

It is based on the class Private_Tree that implements the hierarchical mechanism for local differential privacy. The class has the following methods:

Initialization

# Initialization

from hierarchical_mechanism_LDP import Private_TreeBary
import numpy as np

B = 4000  # bound of the data, i.e., the data is in [0, B]
b = 4  # branching factor of the tree
eps = 1  # privacy budget
q = 0.4  # quantile to estimate
protocol = 'unary_encoding'  # protocol to use for LDP frequency estimation

tree = Private_TreeBary(B, b)

data = np.random.randint(0, B, 100000)  # generate random data

tree.update_tree(data, eps, protocol, post_process=True)  # update the tree with the data

The post_process boolean parameter is used to apply the post-processing step of the hierarchical mechanism. The post-processing step is used to improve the accuracy of the mechanism. The post-processing step is applied by default.

Post-processing applies the Hierarchial Mechanism proposed by Hay et al. in

M. Hay, V. Rastogi, G. Miklau, and D. Suciu. Boosting the accuracy of differentially private histograms through consistency. PVLDB, 3(1):1021–1032, 2010.

Quantile estimation

You can estimate the quantile of the data with Private_Tree.get_quantile(q), where q is the quantile to estimate.

# get quantile of the data
true_quantile = np.quantile(data, q)
private_quantile = tree.get_quantile(q)  # get the quantile
print(f"DP quantile: {private_quantile}")
print(f"True quantile: {true_quantile}")

Result

Private quantile: 1591
True quantile: 1598.0

Range Queries

You can estimate the range queries of the data with Private_Tree.get_range_query(a, b), where a and b are the bounds of the range query. Additionally, you can return a normalized range query.

left = 1000
right = 2000
true_range_query = np.sum(data >= left) - np.sum(data >= right)
private_range_query = tree.get_range_query(left, right, normalized=False)
print(f"True range query: {true_range_query}")
print(f"Private range query: {private_range_query}")

Result

True range query: 24980
Private range query: 25514.970123615636

Binning

Given a list of quantiles and an error alpha, you can create bins of the form [q_i - alpha, q_i + alpha] with Private_Tree.get_bins(quantiles, alpha). The bins are returned as a list of tuples.

# test binning
quantiles = [0.25, 0.50, 0.75]
alpha = 0.1
bins = tree.get_bins(quantiles, alpha)
print(bins)

Result

[(np.int64(664), np.int64(1441)), (np.int64(1591), np.int64(2562)), (np.int64(2669), np.int64(3365))]

Shuffling Amplification

We implemented the algorithm provided in:

Feldman, Vitaly, Audra McMillan, and Kunal Talwar. "Hiding among the clones: A simple and nearly optimal analysis of privacy amplification by shuffling." 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS). IEEE, 2022.

For analyzing the privacy amplification by shuffling of the mechanism. The implementation is taken from the repository https://github.com/apple/ml-shuffling-amplification

# Test amplification by shuffling
delta = 1e-6
shuffle_numerical = tree.get_privacy(shuffle=True, delta=delta, numerical=True)
shuffle_theoretical = tree.get_privacy(shuffle=True, delta=delta, numerical=False)
print(f"For an initial {tree.eps}-DP mechanism, after shuffling {tree.N} users and considering delta = {delta} we obtain"
      f"a numerical upper bound of eps = : {shuffle_numerical} and a theoretical upper bound of eps = {shuffle_theoretical}")

Result

For an initial 1-DP mechanism, after shuffling 100000 users and considering delta = 1e-06 
we obtain a numerical upper bound of eps = : 0.015513881255146383 
and a theoretical upper bound of eps = 0.07529011566478624

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ldp_hierarchical_mechanism-0.1.2.tar.gz (14.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ldp_hierarchical_mechanism-0.1.2-py3-none-any.whl (16.4 kB view details)

Uploaded Python 3

File details

Details for the file ldp_hierarchical_mechanism-0.1.2.tar.gz.

File metadata

File hashes

Hashes for ldp_hierarchical_mechanism-0.1.2.tar.gz
Algorithm Hash digest
SHA256 28e6086927e249ae3ecc717e82b4c2236823ee6475eb40f59d26d5ffd3ae7820
MD5 e8b1ca4958cbf48b06eb0efa028aee51
BLAKE2b-256 bce4b2397a8a55fb33baf99acf1bef60ccedb47684bc5c8e199a4d3216794933

See more details on using hashes here.

File details

Details for the file ldp_hierarchical_mechanism-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for ldp_hierarchical_mechanism-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 0cc4fb4e8f5fa9835ae283694cb82e070c464528f32e3c97dcf9f31fe59d2615
MD5 4f795e8e6df667ea45f293c3d6fe4a14
BLAKE2b-256 8ea941ed9d3569e6fe0d081a8351c7544e36d7d4626299093254fdab1c4def88

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page