Privacy-preserving Event Log Publishing with contextual Information

PRIPEL

PRIPEL (Privacy-preserving event log publishing with contextual information) is a framework for publishing event logs that fulfill differential privacy. We provide an implementation of PRIPEL in Python 3. Our code is available under the MIT license. If you use it for academic purposes, please cite our paper:

@inproceedings{Fahrenkrog-Petersen20,
  author    = {Stephan A. Fahrenkrog{-}Petersen and
               Han van der Aa and
               Matthias Weidlich},
  title     = {PRIPEL:  Privacy-Preserving Event Log Publishing Including Contextual Information},
  year      = {2020},
  booktitle = {Submitted to the International Conference on Business Process Management}
}

Requirements

To run our algorithm you need the following Python packages:

We have only run our algorithm with Python 3, so we cannot guarantee that it works with Python 2.

How to run PRIPEL

You can run the framework using the following command:

python pripel.py <fileName> <epsilon> <n> <k> 

The parameters have the following meaning:

  • fileName: Name of the event log (XES file) that shall be anonymised
  • epsilon: Strength of the differential privacy guarantee. It must be a float
  • n: Maximum prefix length of the traces considered by the trace-variant-query. It must be an integer
  • k: Pruning parameter of the trace-variant-query. A variant's noisy count must cover at least k traces for the variant to be part of the query result. It must be an integer

The program produces an XES file that contains the anonymised event log.
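The parameter types listed above can be checked before a run is started. The following is a minimal sketch and not part of PRIPEL: the helper parse_pripel_args and the PripelArgs type are hypothetical, and the range checks (epsilon > 0, n >= 1, k >= 0) are assumptions based on the usual meaning of these parameters, not on the PRIPEL code.

```python
from typing import NamedTuple, Sequence


class PripelArgs(NamedTuple):
    """Hypothetical container for the four command-line arguments."""
    file_name: str
    epsilon: float
    n: int
    k: int


def parse_pripel_args(argv: Sequence[str]) -> PripelArgs:
    """Validate <fileName> <epsilon> <n> <k> (hypothetical helper)."""
    if len(argv) != 4:
        raise ValueError("expected: <fileName> <epsilon> <n> <k>")
    file_name, epsilon_raw, n_raw, k_raw = argv
    epsilon = float(epsilon_raw)  # raises ValueError if not a float
    n = int(n_raw)                # raises ValueError if not an integer
    k = int(k_raw)                # raises ValueError if not an integer
    # Assumed ranges, not taken from the PRIPEL source.
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if n < 1:
        raise ValueError("n must be at least 1")
    if k < 0:
        raise ValueError("k must not be negative")
    return PripelArgs(file_name, epsilon, n, k)


args = parse_pripel_args(["log.xes", "0.1", "10", "20"])
```

Failing early on a malformed argument avoids starting a potentially long anonymisation run with unusable parameters.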

Runtime

Please note that certain combinations of n, k and epsilon can lead to very long runtimes. If you experience this, try higher values for k. It might also help to use a greedy trace matching strategy by setting the parameter of the function matchQueryToLog of the class TraceMatcher to True.
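To illustrate why a greedy strategy is cheaper than an optimal assignment, here is a minimal sketch of greedy matching on a cost matrix. It is not the code in TraceMatcher: the function name greedy_match and the cost-matrix representation (rows as query traces, columns as log cases) are assumptions made for illustration.

```python
from typing import Dict, List


def greedy_match(cost: List[List[float]]) -> Dict[int, int]:
    """Assign each row to the cheapest column that is still unused.

    Rows are processed in order and never revisited, so the result can be
    worse than an optimal assignment, but it needs only O(rows * cols) work.
    """
    used = set()
    assignment: Dict[int, int] = {}
    for row, row_costs in enumerate(cost):
        candidates = [col for col in range(len(row_costs)) if col not in used]
        if not candidates:
            break  # more rows than columns: remaining rows stay unmatched
        best = min(candidates, key=lambda col: row_costs[col])
        assignment[row] = best
        used.add(best)
    return assignment


# Row 0 takes column 0 first, which forces row 1 onto the expensive column 1.
example = greedy_match([[1, 2], [1, 5]])
```

In this example the greedy result has a total cost of 6, while swapping the two assignments would cost 3. This is the trade-off: less runtime in exchange for a possibly worse matching.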

Customization

Additionally, please note that some event logs contain attributes that are equivalent to a case id. For privacy reasons such attributes must be deleted from the anonymised log. We handle such attributes with a blacklist. This blacklist is defined in the function __getBlacklistOfAttributes in tracematcher.py and attributeAnonymizier.py.
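The idea of a blacklist can be sketched as follows. This is a simplified illustration, not the code in __getBlacklistOfAttributes: events are modelled as plain dictionaries, and the helper drop_blacklisted as well as the attribute name customer_number are hypothetical.

```python
from typing import Any, Dict, Set


def drop_blacklisted(event: Dict[str, Any], blacklist: Set[str]) -> Dict[str, Any]:
    """Return a copy of the event without the blacklisted attributes."""
    return {key: value for key, value in event.items() if key not in blacklist}


# Hypothetical blacklist: an attribute that identifies a case like a case id.
EXAMPLE_BLACKLIST = {"customer_number"}

event = {
    "concept:name": "Register request",
    "customer_number": "C-17",
    "org:resource": "Ann",
}
cleaned = drop_blacklisted(event, EXAMPLE_BLACKLIST)
```

When you adapt the blacklist to your own log, add every attribute whose value is unique per case, since such an attribute would allow re-identification of cases in the published log.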

Components

pripel.py

This script runs the overall PRIPEL framework. It takes in the event log (XES file), performs the PRIPEL-based anonymisation and then saves the resulting anonymised log as an XES file.

trace_variant_query.py

Performs the trace-variant query on the input log. The query is based on the algorithm described in: https://link.springer.com/article/10.1007/s12599-019-00613-3
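The roles of epsilon and k can be illustrated with a strongly simplified sketch. It is not the algorithm from the article above and not the code in trace_variant_query.py: it ignores the prefix parameter n, assumes Laplace noise with scale 1/epsilon, and only perturbs variants that occur in the log, so on its own it does not give a full differential privacy guarantee. The function names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Iterable, Optional, Sequence, Tuple

Trace = Tuple[str, ...]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_variant_counts(
    traces: Iterable[Sequence[str]],
    epsilon: float,
    k: int,
    rng: Optional[random.Random] = None,
) -> Dict[Trace, int]:
    """Count trace variants, add noise, and prune variants below k."""
    rng = rng or random.Random()
    counts = Counter(tuple(trace) for trace in traces)
    result: Dict[Trace, int] = {}
    for variant, count in counts.items():
        # Assumption: noise scale 1/epsilon, so a smaller epsilon means more noise.
        noisy = round(count + laplace_noise(1.0 / epsilon, rng))
        if noisy >= k:  # pruning: keep only variants with a noisy count >= k
            result[variant] = noisy
    return result


log = [("register", "check", "pay")] * 50 + [("register", "reject")] * 2
published = noisy_variant_counts(log, epsilon=1.0, k=10, rng=random.Random(0))
```

A smaller epsilon gives a stronger guarantee but noisier counts, and a larger k removes more rare variants from the result.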

tracematcher.py

This script matches the cases from the input event log with the traces from the trace-variant-query. It uses a standard assignment algorithm implemented in NumPy.
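What an assignment algorithm computes can be shown on a tiny example. The sketch below finds the cost-minimal assignment by exhaustive search over all permutations. This is a stand-in chosen for readability, not the algorithm used in tracematcher.py, and it is only feasible for very small matrices. The function name optimal_assignment and the square cost matrix are assumptions.

```python
from itertools import permutations
from typing import List, Tuple


def optimal_assignment(cost: List[List[float]]) -> Tuple[Tuple[int, ...], float]:
    """Return (columns, total) with minimal summed cost for a square matrix.

    columns[i] is the column assigned to row i. Exhaustive search takes
    factorial time, so this is for illustration only.
    """
    size = len(cost)
    best_columns: Tuple[int, ...] = ()
    best_total = float("inf")
    for columns in permutations(range(size)):
        total = sum(cost[row][col] for row, col in enumerate(columns))
        if total < best_total:
            best_columns, best_total = columns, total
    return best_columns, best_total


# Rows could be traces from the query, columns cases from the log,
# and each entry the distance between the two.
columns, total = optimal_assignment([[1, 2], [1, 5]])
```

Here row 0 is matched with column 1 and row 1 with column 0, for a total cost of 3.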

attributeAnonymizier.py

In this script the contextual information of the matched log is anonymised.

levenshtein.py

Contains an implementation of the Levenshtein distance for traces. We use it in tracematcher.py.
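For traces, the Levenshtein distance counts the minimal number of activity insertions, deletions and substitutions needed to turn one trace into another. The following is a minimal sketch of the standard dynamic-programming formulation, not the code in levenshtein.py. The function name trace_levenshtein and the representation of traces as sequences of activity names are assumptions.

```python
from typing import Sequence


def trace_levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two traces, treating each activity as one symbol."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j activities of b.
    previous = list(range(len(b) + 1))
    for i, activity_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty trace
        for j, activity_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (activity_a != activity_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# One deletion ("check") turns the first trace into the second.
distance = trace_levenshtein(["register", "check", "pay"], ["register", "pay"])
```

Keeping only two rows of the table limits memory use to the length of the shorter dimension, which matters when many trace pairs are compared.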

How to contact us

PRIPEL was developed at the Process-driven Architecture group of Humboldt-Universität zu Berlin. If you want to contact us, send us an email at: fahrenks || hu-berlin.de
