Private Accurate Combination (PAC) Synthesizers

Library to generate synthetic data for privacy-preserving data sharing and analysis.

Python library exposing a set of synthesizers based on the Synthetic Data Showcase project.

These synthesizers aim to replicate the counts of attribute combinations in a sensitive dataset while maintaining differential privacy.

Available synthesizers

  • DpAggregateSeededSynthesizer: a differentially-private synthesizer that relies on DP Marginals to build synthetic data. It will compute DP Marginals (called aggregates) for your dataset up to and including a specified reporting length, and synthesize data based on the computed aggregated counts.

For more information about the DP approach, refer to the DP documentation of the Synthetic Data Showcase (SDS) project.
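To make the notion of aggregates concrete, the following minimal sketch counts every combination of attribute values up to a given reporting length, using only the Python standard library. It computes exact, non-private counts: the noise that makes the marginals differentially private is deliberately left out, and this is not the library's implementation. The function name `count_aggregates` and the toy records are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def count_aggregates(records, reporting_length):
    """Count combinations of (column, value) pairs of size 1..reporting_length.

    Illustrative only: exact counts, with no differential privacy applied.
    """
    counts = Counter()
    for record in records:
        # Ignore empty cells and sort so each combination has one canonical key.
        items = sorted((col, val) for col, val in record.items() if val != "")
        for size in range(1, reporting_length + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return counts


# Hypothetical toy records; replace with rows from your own data.
records = [
    {"H1": "1", "H2": "4", "H3": "6"},
    {"H1": "1", "H2": "4", "H3": "1"},
    {"H1": "2", "H2": "4", "H3": "6"},
]

aggregates = count_aggregates(records, reporting_length=2)
print(aggregates[(("H1", "1"),)])              # 2
print(aggregates[(("H1", "1"), ("H2", "4"))])  # 2
```

With a reporting length of 2, only single values and pairs are counted; no three-attribute combination appears in the result.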

Installation

pip install pac-synth

If no pre-built wheel is available for your system, the package is compiled locally during installation, which requires Rust tooling to be installed.

Using

Check our detailed and short notebook examples for more information.

from pacsynth import Dataset, DpAggregateSeededParametersBuilder, DpAggregateSeededSynthesizer
# gen_data_frame comes from a local helper module (utils), not from pacsynth
from utils import gen_data_frame

# this generates a random pandas data frame with 5000 records
# replace this with your own data
sensitive_df = gen_data_frame(5000)
dataset = Dataset.from_data_frame(sensitive_df)

# build synthesizer
synth = DpAggregateSeededSynthesizer(
    DpAggregateSeededParametersBuilder().epsilon(0.5).build()
)
synth.fit(dataset)

# sample 5000 records and build a data frame
synthetic_raw_data = synth.sample(5000)
synthetic_df = Dataset.raw_data_to_data_frame(synthetic_raw_data)

# show 10 example records
print(synthetic_df.sample(10))

# example output (the data is random, so values differ between runs)
#      H1 H2  H3 H4 H5 H6 H7 H8 H9 H10
# 1858  2  2   2  1  1  1  1  1  1   1
# 4218     4  10
# 2346  2  4   6  1  1  1  1  1  1   1
# 3594  1  6   1
# 4059  2  6   6
# 2042  2  3   1  1  1  1  1  1  1   1
# 4546        10
# 2443  2  4   8  1  1  1  1  1  1   1
# 831   1  4   6  1  1  1  1  1  1   1
# 20    1  1   1  1  1  1  1  1  1   1
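Since the synthesizers aim to replicate attribute counts, one simple way to check a result is to compare single-attribute counts between the sensitive and the synthetic records. The sketch below is an assumption-laden illustration using only the standard library: the function names and the toy records are hypothetical, and it is not an evaluation method provided by pacsynth.

```python
from collections import Counter


def single_attribute_counts(records):
    """Count how often each (column, value) pair occurs, skipping empty cells."""
    counts = Counter()
    for record in records:
        for col, val in record.items():
            if val != "":
                counts[(col, val)] += 1
    return counts


def mean_absolute_count_error(sensitive_records, synthetic_records):
    """Average absolute difference in counts over all observed (column, value) pairs."""
    sensitive = single_attribute_counts(sensitive_records)
    synthetic = single_attribute_counts(synthetic_records)
    keys = set(sensitive) | set(synthetic)
    if not keys:
        return 0.0
    # A Counter returns 0 for pairs that are missing from one side.
    return sum(abs(sensitive[k] - synthetic[k]) for k in keys) / len(keys)


# Hypothetical toy records standing in for the two data frames.
sensitive_records = [{"H1": "1", "H2": "4"}, {"H1": "2", "H2": "4"}]
synthetic_records = [{"H1": "1", "H2": "4"}, {"H1": "1", "H2": ""}]

error = mean_absolute_count_error(sensitive_records, synthetic_records)
print(error)  # 1.0
```

Pandas data frames can be converted to this list-of-dictionaries shape with `df.to_dict("records")`.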

License

MIT License

Copyright (c) Microsoft Corporation.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Contact

Feedback and suggestions are welcome via email to sds-team@microsoft.com.

Download files

Source Distribution

pac_synth-0.0.8.tar.gz (100.6 kB)

Uploaded Source

Built Distributions

pac_synth-0.0.8-cp37-abi3-win_amd64.whl (486.9 kB)

Uploaded CPython 3.7+ Windows x86-64

pac_synth-0.0.8-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.5 MB)

Uploaded CPython 3.7+ manylinux: glibc 2.17+ x86-64

pac_synth-0.0.8-cp37-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl (1.3 MB)

Uploaded CPython 3.7+ macOS 10.9+ universal2 (ARM64, x86-64) macOS 10.9+ x86-64 macOS 11.0+ ARM64

File details

Details for the file pac_synth-0.0.8.tar.gz.

File metadata

  • Download URL: pac_synth-0.0.8.tar.gz
  • Size: 100.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/0.14.15

File hashes

Hashes for pac_synth-0.0.8.tar.gz
Algorithm Hash digest
SHA256 82dbe42b2ab59464f6e672edd31336902f2571e8f37bca7b569e2b6eef22e027
MD5 c94c93d159d163f2c43e795b43b47ec5
BLAKE2b-256 8a2a0774b1efc51d2a612d63d9a390f5e5179c820caebcc49e17aae06af43e80

See more details on using hashes here.

File details

Details for the file pac_synth-0.0.8-cp37-abi3-win_amd64.whl.

File hashes

Hashes for pac_synth-0.0.8-cp37-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 fbe1fc338ac2eb4e80fa8db3fceb2aed1be24bf4fe47c5293561b3314a358f8e
MD5 896db2f71ef16704316be60b1ad8b1e7
BLAKE2b-256 fdec3be3e33276ac57e3eeb3efac0eafd81a0fd074286edbd415f7677072f1ed

File details

Details for the file pac_synth-0.0.8-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pac_synth-0.0.8-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8fc9c15e897ff4b2dc00bad9c37994c117327d741852781cae56e5f40fe7498a
MD5 1aa5fb24ce765af1a9e2b420a8aff9af
BLAKE2b-256 75e7a74743254689b1144c7b30d6920b1642f8d4605c127d0acf5d7231035e9a

File details

Details for the file pac_synth-0.0.8-cp37-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl.

File hashes

Hashes for pac_synth-0.0.8-cp37-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 96f22d653b13da289fe76086d938e124881c5923cd49bbabd8198bdd2a1cbfae
MD5 be3b6c591742ad8500ef060a2c23120f
BLAKE2b-256 a1d63b4a024e4409c76c48d85e610b804fff71e71d6283dd4c7161c717b72be2
