algorithms for prediction and rule mining on event sequences


Prolothar Rule Mining

Algorithms to learn classification and event sequence prediction rules for event sequence datasets such as process logs.

Based on the publication

Boris Wiegand, Dietrich Klakow, and Jilles Vreeken. Discovering Interpretable Data-to-Sequence Generators. In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI), Virtual Event. 2022, pp. 4237–4244.

Prerequisites

Python 3.11+

Usage

If you want to run the algorithms on your own data, follow the steps below.

Installing

pip install prolothar-rule-mining

Creating or reading a dataset of sequences with metadata

You can create a dataset manually:

from prolothar_common.models.dataset import TargetSequenceDataset
from prolothar_common.models.dataset.instance import TargetSequenceInstance

# define a list of categorical variables and a list of numeric variables
dataset = TargetSequenceDataset(['color'], ['size'])

# add instances, where each instance has three parts:
# 1. a unique hashable ID (e.g. of type int or str)
# 2. a dictionary with attribute names and attribute values
# 3. a (potentially empty) list or tuple of events of type str
dataset.add_instance(TargetSequenceInstance(
    1, {'color': 'red', 'size': 100}, []
))
dataset.add_instance(TargetSequenceInstance(
    2, {'color': 'blue', 'size': 42}, ['A', 'B']
))

Alternatively, you can read a dataset from an .arff file:

from prolothar_common.models.dataset import TargetSequenceDataset

with open('dataset.arff', 'r') as f:
    dataset = TargetSequenceDataset.create_from_arff(f.read(), 'sequence')

Exemplary .arff file:

@RELATION "TestDataset"

@ATTRIBUTE "color" {"blue","red"}
@ATTRIBUTE "size" NUMERIC
@ATTRIBUTE "sequence" {"[]","[A,B]"}

@DATA
"red",100,"[]"
"blue",42,"[A,B]"

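If your data lives in plain Python structures, a file in the layout shown above can be generated with the standard library alone. The following is a minimal sketch, not part of prolothar-rule-mining: the helper name `to_arff` and its parameters are hypothetical, and it assumes that attribute values and event names contain no commas, quotes, or brackets, since it performs no escaping.

```python
def to_arff(relation, categorical, numeric, rows, sequence_attribute="sequence"):
    """Render rows as ARFF text in the layout of the exemplary file.

    rows is a list of (attributes, events) pairs, where attributes is a dict
    and events is a list of str. This helper is hypothetical and does not
    escape special characters.
    """
    def quote(value):
        return f'"{value}"'

    def nominal(values):
        # ARFF nominal domain, e.g. {"blue","red"}
        return "{" + ",".join(quote(v) for v in values) + "}"

    def encode_sequence(events):
        # a sequence becomes one nominal value, e.g. ['A', 'B'] -> [A,B]
        return "[" + ",".join(events) + "]"

    lines = [f"@RELATION {quote(relation)}", ""]
    for name in categorical:
        values = sorted({attributes[name] for attributes, _ in rows})
        lines.append(f"@ATTRIBUTE {quote(name)} {nominal(values)}")
    for name in numeric:
        lines.append(f"@ATTRIBUTE {quote(name)} NUMERIC")
    sequences = sorted({encode_sequence(events) for _, events in rows})
    lines.append(f"@ATTRIBUTE {quote(sequence_attribute)} {nominal(sequences)}")

    lines += ["", "@DATA"]
    for attributes, events in rows:
        cells = [quote(attributes[name]) for name in categorical]
        cells += [str(attributes[name]) for name in numeric]
        cells.append(quote(encode_sequence(events)))
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


rows = [
    ({"color": "red", "size": 100}, []),
    ({"color": "blue", "size": 42}, ["A", "B"]),
]
arff_text = to_arff("TestDataset", ["color"], ["size"], rows)
print(arff_text)
```

The nominal values are listed in sorted order, so the order inside the braces may differ from a hand-written file while the set of values stays the same.
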
Discovering an Event-flow Graph Using ConSequence

from prolothar_rule_mining.rule_miner.data_to_sequence.consequence import ConSequence

consequence = ConSequence()
rules_model = consequence.mine_rules(dataset)

# make predictions: print the actual sequence, then the model output
for instance in dataset:
    print('=================')
    print(instance.get_target_sequence())
    print(rules_model.execute(instance))

# get and plot the event flow graph
graph = rules_model.get_event_flow_graph()
graph.plot()
graph.plot(view=False, filepath='path_to_pdf')

# get and print the classification rule at each node
for node, router in rules_model.get_node_router_table().items():
    print('===============================')
    print(f'rule at node {node}')
    print(router.get_rule())
    # alternative: print(router.get_rule().to_html())
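
Printing the actual and predicted sequences side by side gets hard to read on larger datasets. One way to summarize the comparison is the Levenshtein (edit) distance between the two sequences. The sketch below is not part of prolothar-rule-mining: `edit_distance` is a hypothetical helper, and using it with the model assumes that `rules_model.execute(instance)` returns a list or tuple of event names, which the snippet above suggests but does not state.

```python
def edit_distance(predicted, actual):
    """Levenshtein distance between two event sequences (lists of str)."""
    # previous[j] = distance between the processed prefix of predicted
    # and the first j events of actual
    previous = list(range(len(actual) + 1))
    for i, p in enumerate(predicted, start=1):
        current = [i]
        for j, a in enumerate(actual, start=1):
            current.append(min(
                previous[j] + 1,             # delete p from the prediction
                current[j - 1] + 1,          # insert a into the prediction
                previous[j - 1] + (p != a),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


# stand-in data; with a mined model the pairs could be built as
# (rules_model.execute(instance), instance.get_target_sequence())
pairs = [
    (["A", "B"], ["A", "B"]),
    (["A"], ["A", "B"]),
    ([], ["A", "B"]),
]
distances = [edit_distance(predicted, actual) for predicted, actual in pairs]
print(distances)
print(sum(distances) / len(distances))
```

A distance of 0 means the prediction matches the target sequence exactly, so the share of zeros gives an exact-match accuracy and the mean gives a rough overall error.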

Development

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Additional Prerequisites

make (optional)

Compile Cython code

make cython

Running the tests

make test

Deployment

Change the version in version.txt.
Build and publish the package on PyPI:

make clean_package
make package && make publish

Create and push a tag for this version:

git tag -a [version] -m "describe this version"
git push --tags

Versioning

We use SemVer for versioning.
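
Before tagging a release, it can help to check that the new version string is well-formed and actually newer than the previous one. The following is a minimal sketch under stated assumptions: `parse_version` is a hypothetical helper that is not part of this project, it only accepts the MAJOR.MINOR.PATCH core of SemVer (no pre-release or build metadata), and it assumes that version.txt contains nothing but the version string.

```python
import re

# MAJOR.MINOR.PATCH, each part a non-negative integer without leading zeros
SEMVER_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_version(text):
    """Return (major, minor, patch) or raise ValueError."""
    match = SEMVER_CORE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return tuple(int(part) for part in match.groups())


# tuples compare element by element, which matches SemVer precedence
# for versions without pre-release identifiers
old = parse_version("3.0.1")
new = parse_version("3.0.2\n")  # e.g. the content of version.txt
print(new > old)
```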

Authors

If you have any questions, feel free to contact one of the authors:

Boris Wiegand - boris.wiegand@stahl-holding-saar.de

Release history

3.0.2 (Apr 15, 2024, this version)

3.0.1 (Feb 16, 2024)

3.0.0 (Feb 16, 2024)

Download files

Source Distribution

prolothar-rule-mining-3.0.2.tar.gz (1.3 MB), uploaded Apr 15, 2024

Built Distribution

prolothar_rule_mining-3.0.2-cp311-cp311-win_amd64.whl (2.1 MB), uploaded Apr 15, 2024, CPython 3.11, Windows x86-64

Hashes for prolothar-rule-mining-3.0.2.tar.gz

Algorithm	Hash digest
SHA256	`a11b88d5d9650cd84247d0d8c62a9486d906f42103c7e6e67dc3f6f6d01665f6`
MD5	`089b35643f7688870f94b46b81497d55`
BLAKE2b-256	`a367ca2b9a6856c6c5a56adb00df39fbf30ab8a750c34110f622e888270f1ba8`

Hashes for prolothar_rule_mining-3.0.2-cp311-cp311-win_amd64.whl

Algorithm	Hash digest
SHA256	`4f9f2fa27365164221b4b89f33abbc6e055946cb57a1045b9ab7f8eb5984732f`
MD5	`63d207f8e34701d397ce0bf71e809cc1`
BLAKE2b-256	`01f29a28908622166cf3b8edb7838c951c7d5b174a8bc07f5eaa67e931666c59`