Local Differential Privacy for Range Queries
Project description
Hierarchical Mechanism for LDP
it is an implementation of the Hierarchical Mechanism in
Kulkarni, Tejas, Graham Cormode, and Divesh Srivastava. "Answering range queries under local differential privacy." arXiv preprint arXiv:1812.10942 (2018).
The LDP frequency protocol are implemented from the library https://github.com/Samuel-Maddock/pure-LDP
The library is intended for testing the LDP hierarchical mechanism locally.
Install
pip install hierarchical-mechanism-LDP
Usage
It is based on the class Private_Tree that implements the hierarchical mechanism for local differential privacy.
Initialization
# Initialization
from hierarchical_mechanism_LDP import Private_TreeBary
import numpy as np
N = 100_000
B = 1e6 # bound of the data, i.e., the data is in [0, B]
b = 5 # branching factor of the tree
eps = 5. # privacy budget
q = 0.4 # quantile to estimate
data = np.random.randint(0, B, N) # generate random data
tree = Private_TreeBary(B, b, eps, on_all_levels=True)
tree.update_tree(data) # update the tree with the data
With the Private_Tree class, you can create a hierarchical mechanism with the following parameters:
B: the bound of the data, i.e., the data is in [0, B]b: the branching factor of the treeeps: the privacy budget used on the frequency protocolon_all_levels: boolean parameter that decides if the users reports to all levels or they sample a random level to report. The default value isTrue.
Other parameters of Private_Tree are:
protocol: It set to default value "unary encoding" for unary encoding protocol for small dimesions. At large dimension >10_000 the protocol is set to be "Hadamard Randomized Response".
Additional Features
tree.compute_attributes() # compute the attributes of the tree and delete the frequency protocol
tree.post_process() # It post-process the tree attributes and compute the entire cdf.
With the compute_attributes method, all the frequency oracles are used to estimate the tree attributes. The tree attributes are the number of users in each node of the tree. The frequency protocol is deleted after the computation of the attributes.
The post_process method is used to apply the post-processing step of the hierarchical mechanism.
The post-processing step is used to improve the accuracy of the mechanism.
Post-processing applies the Hierarchial Mechanism proposed by Hay et al. in
M. Hay, V. Rastogi, G. Miklau, and D. Suciu. Boosting the accuracy of differentially private histograms through consistency. PVLDB, 3(1):1021–1032, 2010.
Quantile estimation
You can estimate the quantile of the data with Private_Tree.get_quantile(q), where q is the quantile to estimate.
# get quantile of the data
true_quantile = np.quantile(data, q)
private_quantile = tree.get_quantile(q) # get the quantile
print(f"DP quantile: {private_quantile}")
print(f"True quantile: {true_quantile}")
Result
DP quantile: 401115
True quantile: 400061
If the tree has been post-processed, the quantile estimation is more accurate and is computed in the entire cdf. This might be however space inefficient for large dimensions.
If the tree has not been post-processed, the quantile is estimated using a bary search starting from the root. At each node a frequency (absolute if on all levels is True or relative if on all levels is False) is estimated
using the frequency servers stored in the data structure.
Range Queries
You can estimate the range queries of the data with Private_Tree.get_range_query(a, b), where a and b are the bounds of the range query.
Additionally, you can return a normalized range query.
left = 1000
right = 2000
true_range_query = np.sum(data >= left) - np.sum(data >= right)
private_range_query = tree.get_range_query(left, right, normalized=False)
print(f"True range query: {true_range_query}")
print(f"Private range query: {private_range_query}")
Result
True range query: 105
Private range query: 144.79785114986325
Binning
Given a list of quantiles and an error alpha, you can create bins of the form
[q_i - alpha, q_i + alpha] with Private_Tree.get_bins(quantiles, alpha). The bins
are returned as a list of tuples.
# test binning
quantiles = [0.25, 0.50, 0.75]
alpha = 0.1
bins = tree.get_bins(quantiles, alpha)
print(bins)
Result
[(147500, 352125), (401115, 606250), (651260, 854155)]
Shuffling Amplification
We implemented the algorithm provided in:
Feldman, Vitaly, Audra McMillan, and Kunal Talwar. "Hiding among the clones: A simple and nearly optimal analysis of privacy amplification by shuffling." 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS). IEEE, 2022.
For analyzing the privacy amplification by shuffling of the mechanism. The implementation is taken from the repository https://github.com/apple/ml-shuffling-amplification
# Test amplification by shuffling
delta = 1e-6
shuffle_numerical, method_numerical = tree.get_privacy(shuffle=True, delta=delta, numerical=True)
shuffle_theoretical, method_numerical = tree.get_privacy(shuffle=True, delta=delta, numerical=False)
print(
f"For an initial {tree.eps * (tree.depth -1)}-DP mechanism, after shuffling {tree.N} users and considering delta = {delta} we obtain\n"
f"a numerical upper bound of eps = : {shuffle_numerical} using {method_numerical} \na theoretical upper bound "
f"of eps = {shuffle_theoretical} using {method_numerical}"
)
Result
For an initial 45.0-DP mechanism, after shuffling 100000 users and considering delta = 1e-06 we obtain
a numerical upper bound of eps = : 3.2028258434821195 using pure composition
a theoretical upper bound of eps = 7.436947083278209 using pure composition
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ldp_hierarchical_mechanism-0.1.3.tar.gz.
File metadata
- Download URL: ldp_hierarchical_mechanism-0.1.3.tar.gz
- Upload date:
- Size: 15.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f433ca7942de21ec3a443eea664672f121ecf679d9e4aea32aa30e7e5eb3ffd1
|
|
| MD5 |
d87a7b9dc945db3fb33dc22135735af6
|
|
| BLAKE2b-256 |
b25f85e8dcb04b34920ace2e8633ee8417d1259c30ff39a3efa2c226d88a9d93
|
File details
Details for the file ldp_hierarchical_mechanism-0.1.3-py3-none-any.whl.
File metadata
- Download URL: ldp_hierarchical_mechanism-0.1.3-py3-none-any.whl
- Upload date:
- Size: 17.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ff02434eb2ebad792c6570717cffae76d7f5957aac75adccf3a0aeaf04f0600b
|
|
| MD5 |
ecaf1a7c8ca26621fb29cc6b14ef8031
|
|
| BLAKE2b-256 |
e48297cf941463f4ede00b05d42011b9cc2a009384eec4faf2fc412fc543102a
|