Anonymity library for Python

PYTHON LIBRARY FOR ANONYMIZATION

This library supports the application of three classical anonymization techniques for tabular data: k-anonymity, l-diversity and t-closeness.
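
To make the first two properties concrete, here is a minimal sketch that measures them with plain pandas. It does not use this library's API; the helper names `k_anonymity_level` and `l_diversity_level` and the toy table are assumptions made for illustration. A table is k-anonymous when every combination of quasi-identifier values is shared by at least k rows, and l-diverse when every such group holds at least l distinct sensitive values. t-closeness, which compares value distributions, is not covered here.

```python
import pandas as pd


def k_anonymity_level(df, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    return int(df.groupby(quasi_identifiers).size().min())


def l_diversity_level(df, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found in any such group."""
    return int(df.groupby(quasi_identifiers)[sensitive].nunique().min())


# Hypothetical, already generalized table used only for this illustration.
table = pd.DataFrame(
    {
        "age": ["20-25", "20-25", "25-30", "25-30"],
        "zip": ["3202*", "3202*", "3204*", "3204*"],
        "crime": ["Theft", "Traffic", "Murder", "Murder"],
    }
)

k_level = k_anonymity_level(table, ["age", "zip"])
l_level = l_diversity_level(table, ["age", "zip"], "crime")
print(k_level, l_level)
```

In this toy table each quasi-identifier group has two rows, but the second group contains a single crime value, so the table is 2-anonymous while only 1-diverse.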

Installation

We recommend using Python 3 with virtualenv:

> virtualenv .venv -p python3
> source .venv/bin/activate

Then run the following command to install the library and all its requirements:

pip install python-anonymity
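
To confirm that the installation succeeded, a small check like the following may help. It is a sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption for illustration, and the distribution name is taken from the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed in the active environment,
# otherwise None.
print(installed_version("python-anonymity"))
```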

Documentation

The python-anonymity documentation is hosted on Read the Docs.

Getting started

Example using a synthetic crime dataset:

> import pandas as pd
> import pycanon
> from anonymity import tools
> from anonymity.tools.utils_k_anon import utils_k_anonymity as utils
>
> # Synthetic table: one row per person.
> d = {
>     "name": ["Joe", "Jill", "Sue", "Abe", "Bob", "Amy"],
>     "marital stat": [
>         "Separated",
>         "Single",
>         "Widowed",
>         "Separated",
>         "Widowed",
>         "Single",
>     ],
>     "age": [29, 20, 24, 28, 25, 23],
>     "ZIP code": ["32042", "32021", "32024", "32046", "32045", "32027"],
>     "crime": ["Murder", "Theft", "Traffic", "Assault", "Piracy", "Indecency"],
> }
> data = pd.DataFrame(data=d)
>
> # Column roles: identifiers, quasi-identifiers and sensitive attributes.
> ID = ["name"]
> QI = ["marital stat", "age", "ZIP code"]
> SA = ["crime"]
>
> # Numeric attribute: parameters passed to utils.create_ranges below.
> age_hierarchy = {"age": [0, 2, 5, 10]}
> # Categorical attributes: each row lists a value followed by its generalizations.
> hierarchy = {
>     "marital stat": [
>         ["Single", "Not married", "*"],
>         ["Separated", "Not married", "*"],
>         ["Divorce", "Not married", "*"],
>         ["Widowed", "Not married", "*"],
>         ["Married", "Married", "*"],
>         ["Re-married", "Married", "*"],
>     ],
>     "ZIP code": [
>         ["32042", "3204*", "*"],
>         ["32021", "3202*", "*"],
>         ["32024", "3202*", "*"],
>         ["32046", "3204*", "*"],
>         ["32045", "3204*", "*"],
>         ["32027", "3202*", "*"],
>     ],
> }
>
> # Merge the categorical hierarchies with the ranges generated for age.
> mix_hierarchy = dict(hierarchy, **utils.create_ranges(data, age_hierarchy))
>
> k = 2  # target level of k-anonymity
> supp_threshold = 0  # suppression threshold
> new_data = tools.data_fly(data, ID, QI, k, supp_threshold, mix_hierarchy)
>
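
The hierarchy rows in the example read as "original value, first generalization, second generalization". The sketch below shows what applying one level of such a hierarchy to a column could look like with plain pandas. It is an illustration of the data format only, not the library's internal implementation; the helper name `generalize` is an assumption.

```python
import pandas as pd

# Same row format as the "ZIP code" hierarchy in the example:
# [original value, level 1, level 2].
zip_hierarchy = [
    ["32042", "3204*", "*"],
    ["32021", "3202*", "*"],
    ["32024", "3202*", "*"],
    ["32046", "3204*", "*"],
    ["32045", "3204*", "*"],
    ["32027", "3202*", "*"],
]


def generalize(values, rows, level):
    """Replace each value with its generalization at the given level (0 = unchanged)."""
    lookup = {row[0]: row[level] for row in rows}
    return values.map(lookup)


zips = pd.Series(["32042", "32021", "32024", "32046", "32045", "32027"])
level_1 = generalize(zips, zip_hierarchy, 1)
level_2 = generalize(zips, zip_hierarchy, 2)

print(level_1.tolist())
# Smallest number of rows sharing the same generalized ZIP code.
print(int(level_1.value_counts().min()))
```

After one level of generalization the six ZIP codes collapse into two groups of three rows each, which is the kind of coarsening an anonymization algorithm relies on to reach a target k.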

License: Apache 2.0.

Note: the library is under heavy development and is intended for testing purposes only.

Download files

Source Distribution

python_anonymity-0.0.1.post1.tar.gz (28.0 MB)

Uploaded Source

Built Distribution

python_anonymity-0.0.1.post1-py3-none-any.whl (8.4 MB)

Uploaded Python 3

File details

Details for the file python_anonymity-0.0.1.post1.tar.gz.

File hashes

Hashes for python_anonymity-0.0.1.post1.tar.gz

Algorithm     Hash digest
SHA256        d1364fd982cb1ac68aff8b3e90b5e482a975ef8808698fe7ba612b8e49e70ec2
MD5           1eb9ee92f0c47607b9ff3dc83f9d3bbb
BLAKE2b-256   f1c36a499159e5c8fae3939096d7bc1daf9fe4a305acbc7b7fb33d1bec1c1174

File details

Details for the file python_anonymity-0.0.1.post1-py3-none-any.whl.

File hashes

Hashes for python_anonymity-0.0.1.post1-py3-none-any.whl

Algorithm     Hash digest
SHA256        7e9ace7534a579b7310a3a10555924668b0e7303e7db113af7bded6c8cd1d522
MD5           dceedff447e2991724ff0fc1025f2a4e
BLAKE2b-256   961e2aad7e56d274d13e1eda947262b389dd220d33bd5379b05370bd75f2fee6
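
The SHA256 digests listed above can be compared against a downloaded file. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `digest_matches` are assumptions for illustration, and the wheel is assumed to sit in the current working directory.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex):
    """Case-insensitive comparison of a file's SHA256 digest with an expected value."""
    return sha256_of(path) == expected_hex.strip().lower()


# File name and digest copied from the listing above.
WHEEL = "python_anonymity-0.0.1.post1-py3-none-any.whl"
WHEEL_SHA256 = "7e9ace7534a579b7310a3a10555924668b0e7303e7db113af7bded6c8cd1d522"

# Assumption: the wheel was downloaded into the current directory.
if os.path.exists(WHEEL):
    print("digest matches:", digest_matches(WHEEL, WHEEL_SHA256))
else:
    print("wheel not found; download it first")
```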
