Evaluation of differentially private tabular data

SmartNoise Evaluator

The SmartNoise Evaluator is designed to help assess the privacy and accuracy of differentially private queries. It includes:

  • Analyze: Analyzes a single dataset and reports cardinality, data types, independencies, and other information that is useful for creating a privacy pipeline
  • Evaluate: Compares privatized results to the true results and reports accuracy and bias

These tools currently require PySpark.

Analyze

Analyze provides metrics about a single dataset.

  • Percent of all dimension combinations that are unique, or that have counts k < 5 or k < 10 (counted up to a configurable “reporting length”)
  • Report which columns are “most linkable”
  • Marginal histograms up to n-way, with defaults of reasonable size (e.g. 10 per marginal and up to 20 marginals) that can be overridden. Labels are trimmed and encoded.
  • Number of rows
  • Number of distinct rows
  • Count, mean, variance, min, max, median, and percentiles for each marginal
  • Classification AUC
  • Individual Cardinalities
  • Dimensionality, Sparsity
  • Independencies
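
A few of the metrics above can be illustrated with a minimal sketch. This is plain Python rather than PySpark, and it is not the SmartNoise Evaluator API: the function names `combination_counts` and `percent_below`, the sample rows, and the reading of "k" as the number of rows sharing a combination are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def combination_counts(rows, columns, width):
    """Count each observed value combination for every set of `width` columns.

    Hypothetical helper, not part of SmartNoise.
    """
    counts = {}
    for cols in combinations(columns, width):
        counts[cols] = Counter(tuple(row[c] for c in cols) for row in rows)
    return counts


def percent_below(rows, columns, reporting_length, k):
    """Percent of observed dimension combinations (1-way up to
    `reporting_length`-way) whose row count is below k.

    With k=2 this is the percent of combinations that are unique.
    """
    total = 0
    below = 0
    for width in range(1, reporting_length + 1):
        for counter in combination_counts(rows, columns, width).values():
            total += len(counter)
            below += sum(1 for n in counter.values() if n < k)
    return 100.0 * below / total if total else 0.0


# Made-up sample data for illustration only.
rows = [
    {"age": "30-39", "sex": "F", "city": "Oslo"},
    {"age": "30-39", "sex": "F", "city": "Oslo"},
    {"age": "30-39", "sex": "M", "city": "Bergen"},
    {"age": "40-49", "sex": "F", "city": "Oslo"},
]
columns = ["age", "sex", "city"]

n_rows = len(rows)
n_distinct = len({tuple(r[c] for c in columns) for r in rows})
cardinalities = {c: len({r[c] for r in rows}) for c in columns}
pct_unique = percent_below(rows, columns, reporting_length=2, k=2)
pct_below_5 = percent_below(rows, columns, reporting_length=2, k=5)
```

Because the number of column sets grows combinatorially with the reporting length, a sketch like this is only practical for small inputs; on larger data the same counts would typically come from distributed group-by aggregations.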

Evaluate

Evaluate compares an original data file with one or more comparison files. It can compare any of the single-file metrics computed in Analyze, as well as a number of metrics that involve two datasets. When more than one comparison dataset is provided, it can provide every two-way comparison with the original and let the consumer combine these measures (e.g. by averaging over all datasets).

  • How many dimension combinations are suppressed
  • How many dimension combinations are fabricated
  • How many rows are redacted (fully redacted vs. partly redacted)
  • Mean error in the count across categories by 1-way, 2-way, etc.
  • Mean absolute error by 1-way, 2-way, etc. up to reporting length
    • Also computed for user-specified dimension combinations
    • Reported by bin size (e.g., < 1000, >= 1000)
  • Mean proportional error by 1-way, 2-way, etc.
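
The two-dataset metrics above can be sketched in the same spirit. The code below is a plain-Python illustration, not the SmartNoise Evaluator API: `count_combinations` and `compare` are hypothetical names, the data is made up, and two choices are assumptions: a combination is treated as "suppressed" if it appears only in the original and "fabricated" if it appears only in the comparison, and the error metrics are averaged over combinations present in both datasets.

```python
from collections import Counter
from itertools import combinations


def count_combinations(rows, columns, width):
    """Map (column tuple, value tuple) -> row count for every `width`-way
    combination. Hypothetical helper, not part of SmartNoise."""
    counts = Counter()
    for cols in combinations(columns, width):
        for row in rows:
            counts[(cols, tuple(row[c] for c in cols))] += 1
    return counts


def compare(original, comparison, columns, width):
    """Compare `width`-way combination counts between two datasets."""
    orig = count_combinations(original, columns, width)
    comp = count_combinations(comparison, columns, width)

    # Assumption: suppressed = only in original, fabricated = only in comparison.
    suppressed = sum(1 for key in orig if key not in comp)
    fabricated = sum(1 for key in comp if key not in orig)

    # Assumption: errors are computed over combinations present in both.
    shared = [key for key in orig if key in comp]
    errors = [comp[key] - orig[key] for key in shared]
    proportional = [abs(comp[key] - orig[key]) / orig[key] for key in shared]
    n = len(shared)
    return {
        "suppressed": suppressed,
        "fabricated": fabricated,
        "mean_error": sum(errors) / n if n else 0.0,
        "mean_absolute_error": sum(abs(e) for e in errors) / n if n else 0.0,
        "mean_proportional_error": sum(proportional) / n if n else 0.0,
    }


# Made-up sample data for illustration only.
columns = ["sex", "city"]
original = [
    {"sex": "F", "city": "Oslo"},
    {"sex": "F", "city": "Oslo"},
    {"sex": "M", "city": "Oslo"},
    {"sex": "M", "city": "Bergen"},
]
comparison = [
    {"sex": "F", "city": "Oslo"},
    {"sex": "F", "city": "Oslo"},
    {"sex": "F", "city": "Oslo"},
    {"sex": "M", "city": "Oslo"},
    {"sex": "F", "city": "Tromso"},
]

one_way = compare(original, comparison, columns, width=1)
two_way = compare(original, comparison, columns, width=2)
```

Mean error keeps its sign, so it indicates bias (over- or under-counting), while mean absolute error and mean proportional error measure accuracy regardless of direction. With several comparison datasets, calling `compare` once per dataset and averaging the resulting values is one way to combine the measures.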


