Python algorithm to discover, from an event log, activity instances that are executed in a batch.
Batch Processing Discovery
This technique takes as input an event log (pd.DataFrame) that records the execution of the activities of a process, with enabled, start, and end timestamps and the resource that performed each activity instance. It discovers which activity instances were executed in a batch, together with the characteristics of that batch processing.
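To make the expected input shape concrete, the following sketch builds a tiny event log by hand. The column names (case_id, activity, enabled_time, start_time, end_time, resource) and the data are assumptions chosen for illustration only; the names the library actually expects are configured through the log_ids argument shown in the usage examples.

```python
import pandas as pd

# Hypothetical column names and made-up data, for illustration only.
event_log = pd.DataFrame(
    {
        "case_id": ["c1", "c2", "c3"],
        "activity": ["Check invoice", "Check invoice", "Check invoice"],
        "enabled_time": pd.to_datetime(
            ["2023-01-02 09:00", "2023-01-02 09:10", "2023-01-02 09:20"]
        ),
        "start_time": pd.to_datetime(
            ["2023-01-02 10:00", "2023-01-02 10:05", "2023-01-02 10:10"]
        ),
        "end_time": pd.to_datetime(
            ["2023-01-02 10:05", "2023-01-02 10:10", "2023-01-02 10:15"]
        ),
        "resource": ["Alice", "Alice", "Alice"],
    }
)

# Time each activity instance waited between being enabled and being started.
event_log["waiting_time"] = event_log["start_time"] - event_log["enabled_time"]
print(event_log)
```

In this made-up log, three instances of the same activity become enabled at different times but are all started later, one after another, by the same resource, which is the kind of pattern batch discovery looks for.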
The discovered characteristics are, for each batch processing:
- The activity being executed.
- The resources involved in this batch processing.
- The type of batch processing (sequential, concurrent or parallel). If more than one type is observed, the most common one.
- The frequency of that activity occurring as part of a batch.
- The distribution of batch sizes, i.e., for each size, the number of activity instances executed as a batch with that size.
- The distribution of durations, i.e., for each batch size, the scaling factor of the duration of the activity instances processed in a batch of that size. For example, a factor of 0.7 for batches of size 2 means that each activity instance takes 0.7 times as long as when it is executed individually.
- The firing rules that best describe the start of the batch.
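Some of the characteristics above can be illustrated with plain pandas. The sketch below is not the library's implementation: it assumes a hypothetical batch_id column (empty for instances executed individually) and a duration_min column, and it assumes that the scaling factor is the mean duration within batches of a given size divided by the mean duration of individually executed instances.

```python
import pandas as pd

# Made-up data for a single activity; `batch_id` and `duration_min`
# are hypothetical column names (None = executed individually).
log = pd.DataFrame(
    {
        "batch_id": [None, None, "b1", "b1", "b2", "b2", "b3", "b3", "b3"],
        "duration_min": [10.0, 10.0, 7.0, 7.0, 7.0, 7.0, 5.0, 5.0, 5.0],
    }
)

# Keep only the instances that belong to some batch.
batched = log.dropna(subset=["batch_id"]).copy()

# Size of the batch each instance belongs to.
batched["batch_size"] = batched["batch_id"].map(batched["batch_id"].value_counts())

# Frequency: share of activity instances executed as part of a batch.
batch_frequency = len(batched) / len(log)

# Batch size distribution: number of activity instances per batch size.
size_distribution = batched["batch_size"].value_counts().sort_index().to_dict()

# Duration scaling factor per batch size (assumed definition, see lead-in).
individual_mean = log.loc[log["batch_id"].isna(), "duration_min"].mean()
scaling_factors = (
    batched.groupby("batch_size")["duration_min"].mean() / individual_mean
).to_dict()

print(batch_frequency, size_distribution, scaling_factors)
```

With this data, four instances fall in batches of size 2 and three in a batch of size 3, and the factor for size 2 is 0.7, matching the example given in the list.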
Requirements
- Python v3.9.5+
- PIP v21.1.2+
- Python dependencies: the packages listed in requirements.txt.
Basic Usage
Here we provide a simple example of use with default configuration (see function documentation for more parameters):
import pandas as pd
from batch_processing_discovery.batch_characteristics import discover_batch_processing_and_characteristics
from batch_processing_discovery.config import DEFAULT_CSV_IDS
# Read event log
event_log = pd.read_csv("path/to/event/log.csv.gz")
# Discover batch processing activities and their characteristics
batch_characteristics = discover_batch_processing_and_characteristics(
    event_log=event_log,
    log_ids=DEFAULT_CSV_IDS
)
Discover only batch processing behavior
If you are only interested in discovering batch processing behavior, use the following example (see function documentation for more parameters):
import pandas as pd
from batch_processing_discovery.config import DEFAULT_CSV_IDS
from batch_processing_discovery.discovery import discover_batches
# Read event log
event_log = pd.read_csv("path/to/event/log.csv.gz")
# Discover batch processing behavior only
batched_event_log = discover_batches(
    event_log=event_log,
    log_ids=DEFAULT_CSV_IDS
)
Get batch characteristics with already set batch processing behavior
If you are only interested in getting the batch characteristics from an event log in which the batch behavior is already set, use the following example (see function documentation for more parameters):
import pandas as pd
from batch_processing_discovery.batch_characteristics import discover_batch_characteristics
from batch_processing_discovery.config import DEFAULT_CSV_IDS
# Read event log
event_log = pd.read_csv("path/to/event/log_with_batch_info.csv.gz")
# Compute the characteristics of the batches already identified in the log
batch_characteristics = discover_batch_characteristics(
    event_log=event_log,
    log_ids=DEFAULT_CSV_IDS
)
No enabled time available
If the event log does not contain enabled times, consider using this Python library to estimate them.
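For a quick first approximation, enabled times can also be derived by hand. The sketch below is a naive heuristic and not the method of the library mentioned above: it assumes that an activity instance becomes enabled when the previous activity instance of the same case ends, which ignores concurrency within a case. Column names and data are hypothetical.

```python
import pandas as pd

# Made-up log without enabled times; column names are hypothetical.
log = pd.DataFrame(
    {
        "case_id": ["c1", "c1", "c1", "c2", "c2"],
        "activity": ["A", "B", "C", "A", "B"],
        "start_time": pd.to_datetime(
            [
                "2023-01-02 09:00",
                "2023-01-02 10:00",
                "2023-01-02 11:00",
                "2023-01-02 09:05",
                "2023-01-02 10:00",
            ]
        ),
        "end_time": pd.to_datetime(
            [
                "2023-01-02 09:30",
                "2023-01-02 10:15",
                "2023-01-02 11:20",
                "2023-01-02 09:40",
                "2023-01-02 10:20",
            ]
        ),
    }
)

# Order the activity instances of each case chronologically.
log = log.sort_values(["case_id", "start_time"]).reset_index(drop=True)

# Naive assumption: enabled when the previous instance of the same case ended.
log["enabled_time"] = log.groupby("case_id")["end_time"].shift(1)

# The first instance of each case has no predecessor; fall back to its start.
log["enabled_time"] = log["enabled_time"].fillna(log["start_time"])
print(log)
```

Processes with parallel branches need a more careful estimate, since the previous event of a case is not always the one that enabled the current activity.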
Download files
File details
Details for the file batch_processing_discovery-0.4.4.tar.gz.
File metadata
- Download URL: batch_processing_discovery-0.4.4.tar.gz
- Upload date:
- Size: 13.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 976efd353b0314cc352163c160a83117e304c32ca6cb1538265ecaca0371c8ca
MD5 | d27756496b722f6b0e89337799cdfa62
BLAKE2b-256 | e97cc9a18f23c974910ceeb12576e5bd7f5ba3c94562024d146cf976ea955766
File details
Details for the file batch_processing_discovery-0.4.4-py3-none-any.whl.
File metadata
- Download URL: batch_processing_discovery-0.4.4-py3-none-any.whl
- Upload date:
- Size: 15.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3136dbe2a025077c1f1e6d3a79e3f67a04163b77fdd434f6c7e69ff8acca0e13
MD5 | 1954b3b32afeb6233df89c7a2688b240
BLAKE2b-256 | 8d224647d4c7483b2bc689de3600592fe22339c6367a9a1e75c18a9b777a8109
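A downloaded file can be checked against the SHA256 digests listed above using only the standard library. This is a minimal sketch; the helper names sha256_of_file and matches_published_hash are hypothetical, and the path passed to them is whatever location the file was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling matches_published_hash with the local path of the downloaded archive and the digest from the corresponding table returns True only if the file is byte-for-byte identical to the published one.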