
Algorithms for process mining and data mining on event sequences

Project description

Prolothar Process Discovery

Algorithms that discover process behavior by mining sequential data such as process logs.

Based on the publication

Boris Wiegand, Dietrich Klakow, and Jilles Vreeken. Mining easily understandable models from complex event logs. In: Proceedings of the SIAM International Conference on Data Mining (SDM), Virtual Event. 2021, pp. 244-252.

Prerequisites

Python 3.11+

Usage

If you want to run the algorithms on your own data, follow the steps below.

Installing

pip install prolothar-process-discovery

Creating or reading an EventLog

Option 1: you can create an EventLog from a pandas dataframe

# 1) there must be a header line
# 2) each line belongs to one event
# 3) there is one column containing the case ID
# 4) there is one column containing the activity name of the event
# 5) there can be columns for trace and event attributes
import pandas as pd
from prolothar_common.models.eventlog import EventLog
eventlog = EventLog.create_from_pandas_df(
      pd.read_csv('path/to/eventlog.csv', delimiter=','),
      'CaseId', 'Activity',
      trace_attribute_columns=['Customer'],
      event_attribute_columns=['Duration']
)
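To make the expected CSV layout concrete, the following standard-library sketch parses a small in-memory CSV and groups its rows into traces by case ID. It does not use the package's API and is not how EventLog is implemented. The column names mirror the example above, while the sample rows and the helper group_rows_into_traces are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data in the layout described above:
# a header line, one row per event, a case ID column and an activity column.
SAMPLE_CSV = """CaseId,Activity,Customer,Duration
1,receive order,alice,5
1,ship order,alice,12
2,receive order,bob,7
"""


def group_rows_into_traces(csv_text, case_id_column, activity_column):
    """Group event rows by case ID, preserving row order within each case."""
    traces = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        traces[row[case_id_column]].append(row[activity_column])
    return dict(traces)


traces = group_rows_into_traces(SAMPLE_CSV, 'CaseId', 'Activity')
for case_id, activities in traces.items():
    print(case_id, activities)
```

Each key corresponds to one case and each list to the ordered activities of that case, which is the information the case ID and activity columns must provide.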

Option 2: you can create an EventLog from .xes with the help of the pm4py package

from pm4py.objects.log.importer.xes import importer as xes_import_factory
import prolothar_common.pm4py_utils as pm4py_utils
xes = xes_import_factory.apply('path/to/eventlog.xes.gz')
eventlog = pm4py_utils.convert_pm4py_log(xes)

Option 3: you can create an EventLog manually

from prolothar_common.models.eventlog import EventLog, Trace, Event
eventlog = EventLog()
# The case ID (0 in the example) can be any hashable type, e.g. int or string, and must be unique.
eventlog.add_trace(Trace(0, [
      Event('start computer', attributes={'user': 'alice'}),
      Event('drink coffee', attributes={'milk': 'yes', 'grams_of_sugar': 5}),
]))

Discovering a PatternGraph

from prolothar_process_discovery.discovery import Proseqo
from prolothar_process_discovery.discovery import ProSimple

# PatternGraph must be imported as well; its module path is not given in this README.
directly_follows_graph = PatternGraph.create_from_event_log(eventlog)

pattern_graph = Proseqo().mine_dfg(eventlog, directly_follows_graph, verbose=True)
pattern_graph.plot()

pattern_graph = ProSimple().mine_dfg(eventlog, directly_follows_graph, verbose=True)
# we can also plot to a file instead of opening a viewer
pattern_graph.plot(filepath='path/to/your/file', filetype='pdf', view=False)
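The variable name directly_follows_graph refers to a standard process mining notion: an edge from activity a to activity b records that b occurs immediately after a in at least one trace. As a minimal sketch of that idea only, and not of the package's own implementation, the edge frequencies can be counted with the standard library. The helper count_directly_follows and the example traces are hypothetical.

```python
from collections import Counter


def count_directly_follows(traces):
    """Count how often one activity immediately follows another."""
    edges = Counter()
    for activities in traces:
        # Pair every activity with its direct successor in the same trace.
        for source, target in zip(activities, activities[1:]):
            edges[(source, target)] += 1
    return edges


# Hypothetical traces, each given as an ordered list of activity names.
example_traces = [
    ['start computer', 'drink coffee', 'read mail'],
    ['start computer', 'read mail'],
]

edges = count_directly_follows(example_traces)
for (source, target), count in sorted(edges.items()):
    print(f'{source} -> {target}: {count}')
```

Proseqo and ProSimple take such a graph together with the event log as input, as shown in the calls above.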

Development

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Additional Prerequisites

  • make (optional)

Running the tests

make test

Deployment

make clean_package; make package && make publish

Versioning

We use SemVer for versioning.

Authors

If you have any questions, feel free to ask one of our authors:

Download files

Source Distribution

prolothar-process-discovery-6.0.0.tar.gz (36.0 MB)

Built Distribution

prolothar_process_discovery-6.0.0-cp311-cp311-win_amd64.whl (2.0 MB), CPython 3.11, Windows x86-64

File hashes

Hashes for prolothar-process-discovery-6.0.0.tar.gz

Algorithm    Hash digest
SHA256       163986bc6690fcc8524e435f0d9c4a0baee552eb5633a03481eaa12cb364e143
MD5          ed9561147543137c54745760ac62d4c3
BLAKE2b-256  c7aae577c1ae910e4ce855f00d58a496f40b89b152c6ad4641df32307192b540

Hashes for prolothar_process_discovery-6.0.0-cp311-cp311-win_amd64.whl

Algorithm    Hash digest
SHA256       734f18bf36e8c584a9ec2f86baad09922ed79fdd193e963059135971cf2943eb
MD5          ec9dcb41e59f5abf5171e38dc40e559d
BLAKE2b-256  31d91f978cebdb64bc8574b9cf90e9b401185593c17d2718144fe7e7fbec30bb
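A downloaded file can be checked against the digests listed above. The following is a minimal sketch using Python's hashlib. The helper sha256_of_file and the local download path in the usage comment are hypothetical, and the expected digest is the SHA256 value listed above for the source distribution.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as file:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: file.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest listed above for the source distribution.
EXPECTED_SHA256 = '163986bc6690fcc8524e435f0d9c4a0baee552eb5633a03481eaa12cb364e143'

# Usage (the path is a hypothetical download location):
# actual = sha256_of_file('prolothar-process-discovery-6.0.0.tar.gz')
# print('match' if actual == EXPECTED_SHA256 else 'MISMATCH')
```

A mismatch means the file differs from the published one and should not be installed.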
