Differentially Private SQL Queries
SmartNoise SQL
Differentially private SQL queries. Tested with:
- PostgreSQL
- SQL Server
- Spark
- Pandas (SQLite)
- PrestoDB
SmartNoise is intended for scenarios where the analyst is trusted by the data owner. SmartNoise uses the OpenDP library of differential privacy algorithms.
Installation
pip install smartnoise-sql
Querying a Pandas DataFrame
Use the from_df method to create a private reader that can issue queries against a pandas DataFrame.
import snsql
from snsql import Privacy
import pandas as pd
privacy = Privacy(epsilon=1.0, delta=0.01)
csv_path = 'PUMS.csv'
meta_path = 'PUMS.yaml'
pums = pd.read_csv(csv_path)
reader = snsql.from_df(pums, privacy=privacy, metadata=meta_path)
result = reader.execute('SELECT sex, AVG(age) AS age FROM PUMS.PUMS GROUP BY sex')
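To give a feel for what a differentially private AVG with GROUP BY involves, here is a minimal standard-library sketch of one textbook approach: clamp each value to a public range, add Laplace noise to the per-group sum and count, and divide. This is not SmartNoise's or OpenDP's implementation. The function names, the even split of epsilon between sum and count, and the assumptions that each person contributes exactly one row and that the group keys are public are all illustrative. Floating-point Laplace sampling of this kind is also not considered safe for production use.

```python
import random
from collections import defaultdict


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_avg_by_group(rows, epsilon, lower, upper, seed=None):
    """Return {group: noisy average} for (group, value) rows.

    Illustrative assumptions: one row per person, public group keys,
    and values clamped to the public range [lower, upper].
    """
    rng = random.Random(seed)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, value in rows:
        clamped = min(max(value, lower), upper)  # bound each contribution
        sums[group] += clamped
        counts[group] += 1

    eps_each = epsilon / 2.0                     # half for SUM, half for COUNT
    sum_sensitivity = max(abs(lower), abs(upper))
    count_sensitivity = 1.0

    result = {}
    for group in sorted(sums):
        noisy_sum = sums[group] + laplace_noise(sum_sensitivity / eps_each, rng)
        noisy_count = counts[group] + laplace_noise(count_sensitivity / eps_each, rng)
        # Guard against a tiny or negative noisy count.
        result[group] = noisy_sum / max(noisy_count, 1.0)
    return result


# Hypothetical toy data standing in for (sex, age) pairs.
toy_rows = [("F", 30), ("F", 50), ("M", 40), ("M", 44)]
print(private_avg_by_group(toy_rows, epsilon=1.0, lower=0, upper=100, seed=7))
```

With only a handful of rows the noise dominates the answer; the estimates become useful only when each group contains many records relative to the noise scale.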
Querying a SQL Database
Use from_connection to wrap an existing database connection.
import snsql
from snsql import Privacy
import psycopg2
privacy = Privacy(epsilon=1.0, delta=0.01)
meta_path = 'PUMS.yaml'
pumsdb = psycopg2.connect(user='postgres', host='localhost', database='PUMS')
reader = snsql.from_connection(pumsdb, privacy=privacy, metadata=meta_path)
result = reader.execute('SELECT sex, AVG(age) AS age FROM PUMS.PUMS GROUP BY sex')
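Each query issued with a given epsilon and delta consumes privacy budget. Under basic sequential composition, k queries that are each (epsilon, delta)-differentially private are together (k * epsilon, k * delta)-differentially private. The sketch below tracks that arithmetic with a hypothetical BudgetTracker class. It is not part of the snsql API and says nothing about how SmartNoise accounts for spend internally; it only illustrates why repeated queries against the same data should be budgeted.

```python
from dataclasses import dataclass


@dataclass
class BudgetTracker:
    """Hypothetical helper: basic sequential composition of (epsilon, delta)."""
    total_epsilon: float
    total_delta: float
    spent_epsilon: float = 0.0
    spent_delta: float = 0.0

    def charge(self, epsilon, delta=0.0):
        """Record one query's cost, or raise if it would exceed the budget."""
        if epsilon <= 0 or delta < 0:
            raise ValueError("epsilon must be positive and delta non-negative")
        tolerance = 1e-12  # absorb floating-point rounding in the sums
        if (self.spent_epsilon + epsilon > self.total_epsilon + tolerance
                or self.spent_delta + delta > self.total_delta + tolerance):
            raise RuntimeError("privacy budget exhausted")
        self.spent_epsilon += epsilon
        self.spent_delta += delta

    @property
    def remaining_epsilon(self):
        return max(self.total_epsilon - self.spent_epsilon, 0.0)


# Allow three queries at epsilon=1.0, delta=0.01 each, then refuse further ones.
tracker = BudgetTracker(total_epsilon=3.0, total_delta=0.05)
for _ in range(3):
    tracker.charge(epsilon=1.0, delta=0.01)
print(tracker.spent_epsilon, tracker.remaining_epsilon)
```

In this sketch a caller would invoke charge before each reader.execute call and stop querying once it raises.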
Communication
- You are encouraged to join us on GitHub Discussions.
- Please use GitHub Issues for bug reports and feature requests.
- For other requests, including security issues, please contact us at smartnoise@opendp.org.
Releases and Contributing
Please let us know if you encounter a bug by creating an issue.
We appreciate all contributions. Please review the contributors guide. We welcome pull requests with bug fixes without prior discussion.
If you plan to contribute new features, utility functions or extensions to this system, please first open an issue and discuss the feature with us.