Evaluation of differentially private tabular data
Project description
SmartNoise Evaluator
The SmartNoise Evaluator is designed to help assess the privacy and accuracy of differentially private queries. It includes:
- Analyze: Analyzes a dataset and reports cardinality, data types, independencies, and other information that is useful for creating a privacy pipeline.
- Evaluate: Compares the privatized results to the true results and reports accuracy and bias.
These tools currently require PySpark.
Analyze
Analyze provides metrics about a single dataset.
- Percent of all dimension combinations that are unique, or that have a count k < 5 or k < 10 (counted up to a configurable “reporting length”)
- Report which columns are “most linkable”
- Marginal histograms up to n-way, with reasonably sized defaults (e.g. 10 per marginal and up to 20 marginals) that can be overridden. Labels are trimmed and encoded.
- Number of rows
- Number of distinct rows
- Count, Mean, Variance, Min, Max, Median, Percentiles for each marginal
- Classification AUC
- Individual Cardinalities
- Dimensionality, Sparsity
- Independencies
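As a rough illustration of the first few metrics above, the following sketch counts value combinations up to a reporting length and derives the percent that are rare. It uses plain Python rather than PySpark, and the names `combination_counts` and `percent_below` are hypothetical helpers made up for this example, not part of the SmartNoise Evaluator API. The toy rows are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def combination_counts(rows, columns, reporting_length):
    """Count each observed value combination for every column subset
    of size 1..reporting_length (hypothetical helper, not library API)."""
    counts = Counter()
    for n in range(1, reporting_length + 1):
        for cols in combinations(columns, n):
            for row in rows:
                # Key on (column, value) pairs so that equal values in
                # different columns are never confused with each other.
                counts[tuple((c, row[c]) for c in cols)] += 1
    return counts


def percent_below(counts, k):
    """Percent of observed combinations whose count is below k."""
    if not counts:
        return 0.0
    rare = sum(1 for c in counts.values() if c < k)
    return 100.0 * rare / len(counts)


# Toy dataset, invented for this example.
rows = [
    {"age": "30-39", "sex": "F", "city": "Oslo"},
    {"age": "30-39", "sex": "F", "city": "Oslo"},
    {"age": "40-49", "sex": "M", "city": "Bergen"},
    {"age": "30-39", "sex": "M", "city": "Oslo"},
]
columns = ["age", "sex", "city"]

counts = combination_counts(rows, columns, reporting_length=2)
percent_unique = percent_below(counts, k=2)   # combinations seen exactly once
percent_under_5 = percent_below(counts, k=5)  # combinations with count < 5

n_rows = len(rows)
n_distinct_rows = len({tuple(r[c] for c in columns) for r in rows})
```

Here a combination counts as "unique" when it occurs exactly once. On a real dataset the same counting would normally be expressed as grouped aggregations in PySpark rather than Python loops.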
Evaluate
Evaluate compares an original data file with one or more comparison files. It can compare any of the single-file metrics computed in Analyze, as well as a number of metrics that involve two datasets. When more than one comparison dataset is provided, Evaluate reports each two-way comparison with the original and lets the consumer combine these measures (e.g. by averaging over all datasets).
- How many dimension combinations are suppressed
- How many dimension combinations are fabricated
- How many redacted rows (fully redacted vs. partly redacted)
- Mean error in the count across categories by 1-way, 2-way, etc.
- Mean absolute error by 1-way, 2-way, etc. up to reporting length
- Also computed for user-specified dimension combinations
- Reported by bin size (e.g., < 1000, >= 1000)
- Mean proportional error by 1-way, 2-way, etc.
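The two-dataset metrics above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the SmartNoise Evaluator implementation: `count_combinations` and `compare` are hypothetical names, a combination is treated as suppressed when it appears in the original but not in the comparison data, as fabricated in the opposite case, and errors are averaged over the union of combinations seen in either dataset.

```python
from collections import Counter
from itertools import combinations


def count_combinations(rows, columns, n):
    """Count each observed n-way value combination (hypothetical helper)."""
    counts = Counter()
    for cols in combinations(columns, n):
        for row in rows:
            counts[tuple((c, row[c]) for c in cols)] += 1
    return counts


def compare(original, private, columns, n):
    """Compare n-way counts between an original and a comparison dataset."""
    orig = count_combinations(original, columns, n)
    priv = count_combinations(private, columns, n)
    # Assumption: average over every combination seen in either dataset.
    keys = set(orig) | set(priv)
    # A Counter returns 0 for a missing key, so absent combinations count as 0.
    errors = [priv[k] - orig[k] for k in keys]
    return {
        "suppressed": sum(1 for k in orig if k not in priv),
        "fabricated": sum(1 for k in priv if k not in orig),
        "mean_error": sum(errors) / len(errors),
        "mean_abs_error": sum(abs(e) for e in errors) / len(errors),
    }


# Toy data, invented for this example.
columns = ["sex", "city"]
original = [
    {"sex": "F", "city": "Oslo"},
    {"sex": "F", "city": "Oslo"},
    {"sex": "M", "city": "Bergen"},
]
private = [
    {"sex": "F", "city": "Oslo"},
    {"sex": "M", "city": "Oslo"},
]

one_way = compare(original, private, columns, n=1)
two_way = compare(original, private, columns, n=2)

# With several comparison datasets, the consumer can combine the pairwise
# results, for example by averaging one metric over all of them.
comparisons = [private, original]
avg_mae_two_way = sum(
    compare(original, p, columns, n=2)["mean_abs_error"] for p in comparisons
) / len(comparisons)
```

The mean error keeps its sign, so it indicates bias (systematic over- or under-counting), while the mean absolute error indicates overall accuracy.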
File details
Details for the file smartnoise_eval-0.3.1.tar.gz.
File metadata
- Download URL: smartnoise_eval-0.3.1.tar.gz
- Upload date:
- Size: 18.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f4a49cf7ecc50f1c8110bb259e606d5033d9198823cdb5f662dcd9c0b7233f8a |
| MD5 | cf0455d4ed772047bcd913841eee84a4 |
| BLAKE2b-256 | df5235d1c6a640e7ee4e663bd243a4a2183669eb023f59136a4af7493419fb88 |
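A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do this. The helper names `sha256_of_file` and `matches_published_digest` are made up for this example, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for smartnoise_eval-0.3.1.tar.gz.
PUBLISHED_SHA256 = "f4a49cf7ecc50f1c8110bb259e606d5033d9198823cdb5f662dcd9c0b7233f8a"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=PUBLISHED_SHA256):
    """True if the file at `path` hashes to the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_published_digest("smartnoise_eval-0.3.1.tar.gz")` would be called on the locally saved source distribution.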
File details
Details for the file smartnoise_eval-0.3.1-py3-none-any.whl.
File metadata
- Download URL: smartnoise_eval-0.3.1-py3-none-any.whl
- Upload date:
- Size: 22.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d48c8921f944cfa8571b0db0461335b273178561f2ffba4d6e6d91b911dfe643 |
| MD5 | dea231406c5bf123228938c16a0b5c0d |
| BLAKE2b-256 | dad4250ec4f1658401a8bf2595a2e7585b921304729dd72d507de1e3dface8c6 |