Simple pure LDP frequency oracle implementations

Pure-LDP

pure-LDP is a Python package that provides simple implementations of various state-of-the-art local differential privacy (LDP) algorithms (both frequency oracles and heavy hitters), with the main goal of providing a single, simple interface to them.

pure-LDP started as a package for the pure LDP frequency oracles detailed in the paper "Locally Differentially Private Protocols for Frequency Estimation" by Wang et al.

The package has implementations of all three main frequency oracles detailed in that paper:

  1. (Optimal) Unary Encoding - Under pure_ldp.frequency_oracles.unary_encoding
  2. (Summation/Thresholding) Histogram encoding - Under pure_ldp.frequency_oracles.histogram_encoding
  3. (Optimal) Local Hashing - Under pure_ldp.frequency_oracles.local_hashing
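To give a feel for what a pure LDP frequency oracle does, here is a minimal standalone sketch of Optimal Unary Encoding written directly in numpy. It does not use the pure_ldp classes; the function names oue_perturb and oue_estimate are made up for illustration, and items are 0-based here. It assumes the standard OUE probabilities from Wang et al. (a 1 bit is kept with probability 1/2, a 0 bit is flipped with probability 1/(e^epsilon + 1)).

```python
import numpy as np

def oue_perturb(item, d, epsilon, rng):
    """Client side (illustrative): one-hot encode a 0-based item, then perturb each bit."""
    p = 0.5                            # probability that the 1 bit is reported as 1
    q = 1.0 / (np.exp(epsilon) + 1)    # probability that a 0 bit is reported as 1
    one_hot = np.zeros(d, dtype=bool)
    one_hot[item] = True
    report_prob = np.where(one_hot, p, q)
    return rng.random(d) < report_prob

def oue_estimate(reports, epsilon):
    """Server side (illustrative): unbiased count estimates from the stacked reports."""
    n = reports.shape[0]
    p = 0.5
    q = 1.0 / (np.exp(epsilon) + 1)
    counts = reports.sum(axis=0)
    return (counts - n * q) / (p - q)

rng = np.random.default_rng(0)
epsilon, d = 3.0, 4
# 10,000 simulated users holding items 0..3
data = np.concatenate(([0] * 4000, [1] * 3000, [2] * 2000, [3] * 1000))
reports = np.array([oue_perturb(x, d, epsilon, rng) for x in data])
estimates = oue_estimate(reports, epsilon)
print(estimates)  # noisy estimates of the true counts 4000, 3000, 2000, 1000
```

The estimates are noisy by design, so individual runs will differ from the true counts; the package's own implementation under pure_ldp.frequency_oracles.unary_encoding may differ in its details.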

The package also includes an implementation of the heavy hitter algorithm Prefix Extending Method (PEM).

  • This is under pure_ldp.heavy_hitters.prefix_extending
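The idea behind prefix extension can be sketched without any privacy machinery. In the sketch below, exact counting stands in for the LDP frequency oracle, so it is not private and is not the package's implementation; the function name pem_sketch and its parameters (start_len, step, top_k) are hypothetical. Users are split into groups, each group reports a prefix of increasing length, and only the top-k prefixes are extended in the next round.

```python
from collections import Counter
from itertools import product

def pem_sketch(bit_strings, start_len, step, top_k):
    """Prefix extension with exact counts standing in for an LDP frequency oracle."""
    total_len = len(bit_strings[0])
    # Number of rounds needed to grow prefixes from start_len to total_len
    n_rounds = 1 + -(-(total_len - start_len) // step)
    # Each round uses a disjoint group of users
    groups = [bit_strings[i::n_rounds] for i in range(n_rounds)]

    candidates = ["".join(bits) for bits in product("01", repeat=start_len)]
    length = start_len
    survivors = []
    for group in groups:
        # A real deployment would query a private frequency oracle here
        counts = Counter(s[:length] for s in group)
        ranked = sorted(candidates, key=lambda c: counts[c], reverse=True)
        survivors = ranked[:top_k]
        if length == total_len:
            break
        ext = min(step, total_len - length)
        candidates = [s + "".join(bits)
                      for s in survivors
                      for bits in product("01", repeat=ext)]
        length += ext
    return survivors

# Two frequent 8-bit strings plus a tail of rare ones
users = (["10110010"] * 400 + ["01100111"] * 300
         + [format(i, "08b") for i in range(100)])
heavy_hitters = pem_sketch(users, start_len=2, step=2, top_k=2)
print(heavy_hitters)
```

Only top_k times 2^step candidates are scored per round, which is what keeps the search tractable when the full domain is too large to enumerate.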

The package also contains other LDP implementations:

  1. Apple's Count Mean Sketch (CMS / HCMS) Algorithm - This is under pure_ldp.frequency_oracles.apple_cms
  2. Hadamard Response (HR) - This is under pure_ldp.frequency_oracles.hadamard_response. The code implemented for this is simply a pure-LDP wrapper of the repo hadamard_response

Installation

Use the package manager pip to install.

pip install pure-ldp

To upgrade to the latest version:

pip install pure-ldp --upgrade

Requires numpy, scipy, xxhash, bitarray and bitstring.
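If an import fails after installation, a quick way to see which of the listed requirements is missing is a check like the following. This is only a convenience sketch using the standard library; the helper name missing_dependencies is made up here, and it assumes each requirement is imported under the same name as it is listed.

```python
from importlib.util import find_spec

# Requirements as listed above (assumed to match their import names)
REQUIRED = ["numpy", "scipy", "xxhash", "bitarray", "bitstring"]

def missing_dependencies(names=REQUIRED):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]

missing = missing_dependencies()
print("Missing:", missing if missing else "none")
```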

Usage

import numpy as np
from pure_ldp.frequency_oracles.local_hashing import LHClient, LHServer

# Using Optimal Local Hashing (OLH)

epsilon = 3 # Privacy budget of 3
d = 4 # For simplicity, we use a dataset with 4 possible data items

client_olh = LHClient(epsilon=epsilon, d=d, use_olh=True)
server_olh = LHServer(epsilon=epsilon, d=d, use_olh=True)

# Test dataset, every user has a number between 1-4, 10,000 users total
data = np.concatenate(([1]*4000, [2]*3000, [3]*2000, [4]*1000))

for item in data:
    # Simulate client-side privatisation
    priv_data = client_olh.privatise(item)

    # Simulate server-side aggregation
    server_olh.aggregate(priv_data)

# Simulate server-side estimation
print(server_olh.estimate(1)) # Should be approximately 4000 +- 200
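The "+- 200" in the comment above can be sanity-checked against the approximate variance of OLH given by Wang et al., n * 4e^epsilon / (e^epsilon - 1)^2. The sketch below just evaluates that formula; the function name olh_approx_std is made up for illustration, and the formula is an approximation that ignores the item's own frequency.

```python
import math

def olh_approx_std(n, epsilon):
    """Approximate standard deviation of an OLH count estimate for n users."""
    variance = n * 4 * math.exp(epsilon) / (math.exp(epsilon) - 1) ** 2
    return math.sqrt(variance)

std = olh_approx_std(n=10_000, epsilon=3)
print(round(std, 1))
```

With n = 10,000 and epsilon = 3 this comes out at roughly 47, so a tolerance of 200 is about four standard deviations under this approximation.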

See example.py for more examples.

TODO

  1. Implementation of Apple's SFP
  2. Implementation of Google's RAPPOR
  3. Implementation of frequency oracles/heavy hitter algorithms detailed in
  4. Better documentation!

Acknowledgements

  1. Some OLH code is based on the implementation by Tianhao Wang: repo
  2. The Hadamard Response code is just a wrapper of the k2khadamard.py code in the repo hadamard_response by Ziteng Sun

Contributing

If you feel like this package could be improved in any way, open an issue or make a pull request!

License

MIT

Download files

Source Distribution

pure-ldp-1.0.6.tar.gz (20.5 kB)

Uploaded Source

Built Distribution

pure_ldp-1.0.6-py3-none-any.whl (30.4 kB)

Uploaded Python 3
