Skip to main content

Rust-based Process Mining

Project description

r4pm

Python bindings for the Rust4PM Project: Process mining in Python with the speed of Rust

This library provides basic import/export of XES/OCEL event data, as well as other exposed functionality from the Rust4PM project (e.g., process discovery algorithms).

Features

  • Fast XES/OCEL Import/Export: Efficient Rust-based import and export of .xes, .xes.gz, and OCEL2 (.xml/.json) files
  • Auto-Generated Bindings: All process_mining functions automatically exposed with full IDE support (autocomplete, type hints, docs)
  • Registry System: Manage data objects and convert between types as needed
  • Polars DataFrames: Polars facilitates the fast transfer of event data from Python to Rust and vice versa

Quick Start

from r4pm import bindings
import r4pm

# Load an OCEL file - returns a registry ID
ocel_id = r4pm.import_item('OCEL', 'data/orders.xml')

# Convert to SlimLinkedOCEL for analysis functions
locel_id = bindings.slim_link_ocel(ocel=ocel_id)

# Get statistics
num = bindings.num_events(ocel=locel_id)
print(f"Events: {num}")

# Discover object-centric DFG
dfg = bindings.discover_dfg_from_ocel(locel_id)
print(f"Discovered DFG for {len(dfg['object_type_to_dfg'])} object types")

# For case-centric event logs:
log_id = r4pm.import_item('EventLog', 'data/log.xes')
case_dfg = bindings.discover_dfg(log_id)

How It Works

Auto-Generated Bindings

All functions from the process_mining Rust library are automatically discovered and exposed as Python functions with:

  • Full type hints for IDE autocomplete
  • Automatic documentation from Rust docs
  • Type validation via JSON schemas

The bindings are organized by module (mirroring the Rust crate structure):

from r4pm import bindings

# Top-level access to all functions
bindings.discover_dfg(event_log=log_id)
bindings.num_events(ocel=locel_id)

# Or use submodules for organization
from r4pm.bindings.discovery.case_centric import dfg
dfg.discover_dfg(event_log=log_id)

Bindings are automatically generated during the Rust build via build.rs.

Registry System

Data is managed through a registry that holds different object types:

  • OCEL - Raw OCEL data
  • SlimLinkedOCEL - Memory-efficient linked OCEL (required by most functions)
  • IndexLinkedOCEL - Indexed OCEL for analysis
  • EventLog - Case-centric event log
  • EventLogActivityProjection - Activity-projected log for discovery
# Load files into registry
ocel_id = r4pm.import_item('OCEL', 'file.xml')
log_id = r4pm.import_item('EventLog', 'file.xes')

# Convert between types (either like this or using r4pm.convert_item)
locel_id = bindings.index_link_ocel(ocel=ocel_id)
proj_id = bindings.log_to_activity_projection(log=log_id)

# List registry contents
for item in r4pm.list_items():
    print(f"{item['id']}: {item['type']}")

Simple Import/Export API

For direct DataFrame operations without the registry, use the df submodule.

XES

import r4pm

# Import returns (DataFrame, log_attributes_json)
xes, attrs = r4pm.df.import_xes("file.xes", date_format="%Y-%m-%d")
r4pm.df.export_xes(xes, "test_data/output.xes")

OCEL

# Returns dict with DataFrames: events, objects, relations, o2o, object_changes
ocel = r4pm.df.import_ocel("file.xml")
print(ocel['events'].shape)
r4pm.df.export_ocel(ocel, "export.xml")

# PM4Py integration (requires pm4py)
ocel_pm4py = r4pm.df.import_ocel_pm4py("file.xml")
print(ocel['events'].shape)
r4pm.df.export_ocel_pm4py(ocel_pm4py, "export.xml")

Development

Setup

# Install Rust: https://rustup.rs/
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install in development mode
pip install maturin
maturin develop --release

How Bindings Are Generated

Python bindings are automatically generated during the Rust build via build.rs. Thus, bindings are always in sync with the Rust code and do not require manual regeneration.

The build script:

  1. Reads function metadata from the process_mining crate
  2. Generates r4pm/bindings/ with typed Python wrappers and .pyi stubs
  3. Organizes functions by their Rust module structure

Building for Release

maturin build --release  # Creates wheels in target/wheels/

The wheel automatically includes the generated bindings.

Running Tests

# Run comprehensive test suite
python test_all.py

# Run simple example
python example.py

The test suite (test_all.py) covers:

  • Automatic type conversion (positional & keyword arguments)
  • Process discovery (DFG, OC-Declare)
  • Registry operations (CRUD, DataFrames, export)
  • Simple Import/Export DataFrame (df) API
  • Edge cases and conversion caching

LICENSE

This package is licensed under either Apache License Version 2.0 or MIT License at your option.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

r4pm-0.5.5a5.tar.gz (80.1 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

r4pm-0.5.5a5-cp314-cp314-win_amd64.whl (26.9 MB view details)

Uploaded CPython 3.14Windows x86-64

r4pm-0.5.5a5-cp314-cp314-manylinux_2_28_x86_64.whl (42.4 MB view details)

Uploaded CPython 3.14manylinux: glibc 2.28+ x86-64

File details

Details for the file r4pm-0.5.5a5.tar.gz.

File metadata

  • Download URL: r4pm-0.5.5a5.tar.gz
  • Upload date:
  • Size: 80.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.13.3

File hashes

Hashes for r4pm-0.5.5a5.tar.gz
Algorithm Hash digest
SHA256 785e690577ec8dd5938a3b70e27fc9f2fb1593c72bc500758457bbac266c5d59
MD5 0c5d4c4041d7718df1796d600dfe5eda
BLAKE2b-256 9d6138e84c2d3889531d631cf4815173534dfa761cf6a521af7561ff577bbdf6

See more details on using hashes here.

File details

Details for the file r4pm-0.5.5a5-cp314-cp314-win_amd64.whl.

File metadata

  • Download URL: r4pm-0.5.5a5-cp314-cp314-win_amd64.whl
  • Upload date:
  • Size: 26.9 MB
  • Tags: CPython 3.14, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.13.3

File hashes

Hashes for r4pm-0.5.5a5-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 1042ed540402e6318a89494e236f6aee942f77d5d59e37bbe0018dccf5cc9583
MD5 7f0233a4193840e200cfafb8bdd79af3
BLAKE2b-256 a8ef623f33adc09b6da87d646b119af8c4b18621a01ea2c558001a382fca140e

See more details on using hashes here.

File details

Details for the file r4pm-0.5.5a5-cp314-cp314-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for r4pm-0.5.5a5-cp314-cp314-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 17923483a0eebdc5d52dc2c51e63249f6f29b5d64738cd88a9944e02a8b158f0
MD5 164352c127aebb60febc5de3ff5e5e99
BLAKE2b-256 d7e8854a73a0cd8bb1c1b1945e782e044f0e5a289bd4c744125c7ea45acc8b7f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page