Algorithms for process mining and data mining on event sequences
Prolothar Process Discovery
Algorithms to discover process behavior from data mining on sequential data such as process logs.
Based on the publication
Boris Wiegand, Dietrich Klakow, and Jilles Vreeken. Mining easily understandable models from complex event logs. In: Proceedings of the SIAM International Conference on Data Mining (SDM), Virtual Event. 2021, pp. 244-252.
Prerequisites
Python 3.11+
Usage
If you want to run the algorithms on your own data, follow the steps below.
Installing
pip install prolothar-process-discovery
Creating or reading an EventLog
Option 1: you can create an EventLog from a pandas dataframe
# 1) there must be a header line
# 2) each line belongs to one event
# 3) there is one column containing the case ID
# 4) there is one column containing the activity name of the event
# 5) there can be columns for trace and event attributes
import pandas as pd
from prolothar_common.models.eventlog import EventLog
eventlog = EventLog.create_from_pandas_df(
    pd.read_csv('path/to/eventlog.csv', delimiter=','),
    'CaseId', 'Activity',  # case ID column, activity column
    trace_attribute_columns=['Customer'],
    event_attribute_columns=['Duration']
)
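To make the expected CSV layout concrete, the following standard-library sketch builds a tiny log in that shape and groups its rows by case ID. The column names match the call above. The sample values and the helper `group_rows_by_case` are made up for illustration and are not part of the prolothar API.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: a header line, then one row per event.
SAMPLE_CSV = """CaseId,Activity,Customer,Duration
1,receive order,ACME,5
1,ship order,ACME,12
2,receive order,Initech,3
"""


def group_rows_by_case(csv_text, case_column="CaseId", activity_column="Activity"):
    """Return {case ID: [activity names in file order]} (illustrative helper)."""
    traces = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        traces[row[case_column]].append(row[activity_column])
    return dict(traces)


traces = group_rows_by_case(SAMPLE_CSV)
for case_id, activities in traces.items():
    print(case_id, activities)
```

In this sample, `Customer` has the same value on every row of a case, which is what one would expect of a trace attribute, while `Duration` varies per event.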
Option 2: you can create an EventLog from .xes with the help of the pm4py package
from pm4py.objects.log.importer.xes import importer as xes_import_factory
import prolothar_common.pm4py_utils as pm4py_utils
xes = xes_import_factory.apply('path/to/eventlog.xes.gz')
eventlog = pm4py_utils.convert_pm4py_log(xes)
Option 3: you can create an EventLog manually
from prolothar_common.models.eventlog import EventLog, Trace, Event
eventlog = EventLog()
# The case ID (0 in this example) can be any hashable type, e.g. int or str, and must be unique.
eventlog.add_trace(Trace(0, [
    Event('start computer', attributes={'user': 'alice'}),
    Event('drink coffee', attributes={'milk': 'yes', 'grams_of_sugar': 5}),
]))
Discovering a PatternGraph
from prolothar_process_discovery.discovery import Proseqo
from prolothar_process_discovery.discovery import ProSimple
# note: PatternGraph also needs to be imported from the package
directly_follows_graph = PatternGraph.create_from_event_log(eventlog)
# mine a pattern graph with Proseqo and display it
pattern_graph = Proseqo().mine_dfg(eventlog, directly_follows_graph, verbose=True)
pattern_graph.plot()
# alternatively, mine a pattern graph with ProSimple
pattern_graph = ProSimple().mine_dfg(eventlog, directly_follows_graph, verbose=True)
# we can also plot to a file
pattern_graph.plot(filepath='path/to/your/file', filetype='pdf', view=False)
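The discovery step starts from a directly-follows graph. As a rough illustration of that concept only (not the library's implementation), the sketch below counts how often one activity is immediately followed by another across traces. The function name `count_directly_follows` and the example traces are hypothetical.

```python
from collections import Counter


def count_directly_follows(traces):
    """Count (a, b) pairs where activity b immediately follows a in a trace."""
    edges = Counter()
    for activities in traces:
        # zip pairs each activity with its successor in the same trace
        for current, following in zip(activities, activities[1:]):
            edges[(current, following)] += 1
    return edges


# Hypothetical traces, each a list of activity names in order of occurrence.
example_traces = [
    ["start computer", "drink coffee", "read mail"],
    ["start computer", "read mail"],
    ["start computer", "drink coffee", "read mail"],
]
edges = count_directly_follows(example_traces)
for (source, target), count in sorted(edges.items()):
    print(f"{source} -> {target}: {count}")
```

Each key of `edges` corresponds to one edge of a directly-follows graph, and the count can serve as an edge weight.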
Development
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Additional Prerequisites
- make (optional)
Running the tests
make test
Deployment
make clean_package || make package && make publish
Versioning
We use SemVer for versioning.
Authors
If you have any questions, feel free to ask one of our authors:
- Boris Wiegand - boris.wiegand@stahl-holding-saar.de
File details
Details for the file prolothar-process-discovery-6.0.0.tar.gz.
File metadata
- Download URL: prolothar-process-discovery-6.0.0.tar.gz
- Upload date:
- Size: 36.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 163986bc6690fcc8524e435f0d9c4a0baee552eb5633a03481eaa12cb364e143
MD5 | ed9561147543137c54745760ac62d4c3
BLAKE2b-256 | c7aae577c1ae910e4ce855f00d58a496f40b89b152c6ad4641df32307192b540
File details
Details for the file prolothar_process_discovery-6.0.0-cp311-cp311-win_amd64.whl.
File metadata
- Download URL: prolothar_process_discovery-6.0.0-cp311-cp311-win_amd64.whl
- Upload date:
- Size: 2.0 MB
- Tags: CPython 3.11, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 734f18bf36e8c584a9ec2f86baad09922ed79fdd193e963059135971cf2943eb
MD5 | ec9dcb41e59f5abf5171e38dc40e559d
BLAKE2b-256 | 31d91f978cebdb64bc8574b9cf90e9b401185593c17d2718144fe7e7fbec30bb
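A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only Python's `hashlib`. The helper names `sha256_of_file` and `matches_expected` are illustrative, and the local path you pass in is whatever location you saved the download to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # read fixed-size chunks so large archives need not fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected('prolothar-process-discovery-6.0.0.tar.gz', '163986bc6690fcc8524e435f0d9c4a0baee552eb5633a03481eaa12cb364e143')` should return `True` for an intact copy of the source distribution.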