Python library that offers conventional anonymization techniques, utility metrics, and verification methods.

Installation

You can install the package via the command line:

pip install anonymity-api

Package Description

The package is divided into three modules:

  • anonymity
  • utility
  • verify

Anonymity

This module contains the functions to anonymize data.

The conventional anonymization functions available are:

  • k-anonymity
  • distinct l-diversity
  • entropy l-diversity
  • recursive (c,l)-diversity
  • t-closeness

The module also provides a suggestion function. Given a dataset and its characteristics (lists of quasi-identifiers and sensitive attributes), it selects an anonymization technique and returns the anonymized dataset, so the user does not have to choose a technique. This is helpful for users who are unfamiliar with data anonymization.

We also offer workload-aware anonymization techniques. These take the same parameters as the conventional anonymization techniques and, in addition, a query representing the work to be done on the dataset. This yields higher utility for the tasks to be performed.

Technique            Query
Simple query         (quasi-identifier (operation) value )
Keeping correlation  corr( quasi-identifier, sensitive-attribute )
Grouping             group( quasi-identifier, value )

*operation can be: >, >=, =, < or <=

Utility

The utility module offers utility metrics and a function that, given an anonymized dataset, replaces each interval in the quasi-identifiers with a value contained in that interval.
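To illustrate the interval replacement described above, here is a minimal sketch in plain pandas. It is not the library's function: the helper name `replace_intervals`, the "low - high" string format (taken from the example output further down), the integer bounds, and the choice of a uniform random value are all assumptions.

```python
import random

import pandas as pd


def replace_intervals(df, quasi_idents, seed=None):
    """Hypothetical helper: replace 'low - high' strings with a value inside the interval."""
    rng = random.Random(seed)
    out = df.copy()

    def pick(cell):
        # Cells that are not 'low - high' strings are returned unchanged.
        if not isinstance(cell, str) or " - " not in cell:
            return cell
        low, high = (int(part) for part in cell.split(" - "))
        return rng.randint(low, high)  # inclusive on both ends

    for col in quasi_idents:
        out[col] = out[col].map(pick)
    return out


anonymized = pd.DataFrame({"x": ["1 - 2", "4 - 6"], "y": ["4 - 5", "1 - 2"]})
restored = replace_intervals(anonymized, ["x", "y"], seed=0)
print(restored)
```

The input frame is copied, so the anonymized intervals remain available after the replacement.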

The utility metrics available are:

  • Discernibility Metric
  • Average Equivalence Class Size Metric
  • Normalized Certainty Penalty
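
As a rough illustration of what the first two metrics measure, the sketch below computes them with plain pandas using their usual textbook definitions. It does not call the library, and the function names are hypothetical. Normalized Certainty Penalty is left out because it also needs the full domain range of each attribute.

```python
import pandas as pd


def equivalence_class_sizes(df, quasi_idents):
    """Size of each group of rows sharing the same quasi-identifier values."""
    return df.groupby(quasi_idents).size()


def discernibility_metric(df, quasi_idents):
    # Each row is penalized by the size of its class, so the total is sum(|E|^2).
    sizes = equivalence_class_sizes(df, quasi_idents)
    return int((sizes ** 2).sum())


def average_class_size_metric(df, quasi_idents, k):
    # (rows / number of classes) / k; 1.0 means every class has exactly k rows.
    sizes = equivalence_class_sizes(df, quasi_idents)
    return len(df) / (len(sizes) * k)


anonymized = pd.DataFrame(
    {
        "x": ["1 - 2", "1 - 2", "4 - 6", "4 - 6"],
        "y": ["4 - 5", "4 - 5", "1 - 2", "1 - 2"],
    }
)
dm = discernibility_metric(anonymized, ["x", "y"])
c_avg = average_class_size_metric(anonymized, ["x", "y"], k=2)
print(dm, c_avg)  # 8 1.0
```

Lower values are better for both metrics: smaller classes mean that less information was lost to generalization.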

Verifier

Given an anonymized dataset, this module offers a function for each of the conventional techniques in the anonymity module.

These functions report which parameter value was used to anonymize the dataset. For instance, they return the k for k-anonymity, the l for distinct l-diversity, and so forth.
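
The idea behind this kind of verification can be sketched with plain pandas. The functions below are hypothetical and are not the library's verify API. They only apply the standard definitions: k is the size of the smallest equivalence class, distinct l is the smallest number of distinct sensitive values in any class, and entropy l is derived from the smallest class entropy.

```python
import math

import pandas as pd


def estimate_k(df, quasi_idents):
    # k is the size of the smallest equivalence class.
    return int(df.groupby(quasi_idents).size().min())


def estimate_distinct_l(df, quasi_idents, sensitive):
    # l is the smallest number of distinct sensitive values in any class.
    return int(df.groupby(quasi_idents)[sensitive].nunique().min())


def estimate_entropy_l(df, quasi_idents, sensitive):
    # Entropy l-diversity requires entropy >= log(l) in every class,
    # so the largest l satisfied is exp(minimum class entropy).
    def entropy(values):
        p = values.value_counts(normalize=True)
        return -(p * p.map(math.log)).sum()

    return math.exp(df.groupby(quasi_idents)[sensitive].apply(entropy).min())


# Made-up sample data, for illustration only.
table = pd.DataFrame(
    {
        "age": ["20 - 30", "20 - 30", "20 - 30", "30 - 40", "30 - 40"],
        "disease": ["flu", "cold", "flu", "flu", "asthma"],
    }
)
k = estimate_k(table, ["age"])
distinct_l = estimate_distinct_l(table, ["age"], "disease")
entropy_l = estimate_entropy_l(table, ["age"], "disease")
print(k, distinct_l, round(entropy_l, 2))
```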

Anonymization Example

x y z
1 4 7
2 5 3
6 1 6
4 2 2

To anonymize the dataset above through k-anonymity, the following code would be valid:

import pandas as pd  # necessary to create a dataframe
from anonymity_api import anonymity

# Read the CSV file with the data into a pandas dataframe
dataframe = pd.read_csv("data.csv")

# Anonymize with k-anonymity and k = 2, passing "x" and "y" as quasi-identifiers.
# "z" is passed through idents, so it is dropped from the result.
anonymized_data = anonymity.k_anonymity(data=dataframe, quasi_idents=["x", "y"], k=2, idents=["z"])

The resulting dataset from this anonymization would be:

x      y
1 - 2  4 - 5
1 - 2  4 - 5
4 - 6  1 - 2
4 - 6  1 - 2

For each tuple, there is at least one other tuple with the same quasi-identifier values, which satisfies our k of 2. The attribute z, passed through idents, was removed.
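
The claim above can be checked by hand with plain pandas. This sketch does not call the library: both tables are copied from this example, and it assumes that row order is preserved between input and output. The helper `covers` is hypothetical.

```python
import pandas as pd

# Input table from the example, built in memory instead of read from data.csv.
original = pd.DataFrame({"x": [1, 2, 6, 4], "y": [4, 5, 1, 2], "z": [7, 3, 6, 2]})

# Output table copied from the example above, not produced by calling the library.
published = pd.DataFrame(
    {
        "x": ["1 - 2", "1 - 2", "4 - 6", "4 - 6"],
        "y": ["4 - 5", "4 - 5", "1 - 2", "1 - 2"],
    }
)

# Every combination of quasi-identifier values must occur at least k = 2 times.
smallest_class = int(published.groupby(["x", "y"]).size().min())


def covers(interval, value):
    """Hypothetical helper: True if value lies inside a 'low - high' interval."""
    low, high = (int(part) for part in interval.split(" - "))
    return low <= value <= high


# Assumes row order is preserved between the two tables.
all_covered = all(
    covers(published.loc[i, col], original.loc[i, col])
    for i in original.index
    for col in ["x", "y"]
)
print(smallest_class, all_covered)  # 2 True
```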
