Rust-based Process Mining

r4pm

Python bindings for the Rust4PM Project: Process mining in Python with the speed of Rust

This library provides basic import/export of XES/OCEL event data, as well as other exposed functionality from the Rust4PM project (e.g., process discovery algorithms).

Features

  • Fast XES/OCEL Import/Export: Efficient Rust-based import and export of .xes, .xes.gz, and OCEL2 (.xml/.json) files
  • Auto-Generated Bindings: All process_mining functions automatically exposed with full IDE support (autocomplete, type hints, docs)
  • Registry System: Manage data objects and convert between types as needed
  • Polars DataFrames: Event data moves between Python and Rust as Polars DataFrames, which keeps transfers fast in both directions

Quick Start

from r4pm import bindings
import r4pm

# Load an OCEL file - returns a registry ID
ocel_id = r4pm.import_item('OCEL', 'data/orders.xml')

# Convert to SlimLinkedOCEL for analysis functions
locel_id = bindings.slim_link_ocel(ocel=ocel_id)

# Get statistics
num = bindings.num_events(ocel=locel_id)
print(f"Events: {num}")

# Discover object-centric DFG
dfg = bindings.discover_dfg_from_ocel(locel_id)
print(f"Discovered DFG for {len(dfg['object_type_to_dfg'])} object types")

# For case-centric event logs:
log_id = r4pm.import_item('EventLog', 'data/log.xes')
case_dfg = bindings.discover_dfg(log_id)
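
For readers new to directly-follows graphs (DFGs), the sketch below shows what such a graph records, using plain Python and a toy log. It does not use r4pm: the function name `directly_follows_counts` and the list-of-traces input are assumptions made for illustration, and the structure returned by `bindings.discover_dfg` may well differ.

```python
from collections import Counter

def directly_follows_counts(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        # Pair each activity with its immediate successor in the same case.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

# Toy log (hypothetical): each inner list is the activity sequence of one case.
toy_log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "approve"],
]

dfg_counts = directly_follows_counts(toy_log)
for (a, b), n in sorted(dfg_counts.items()):
    print(f"{a} -> {b}: {n}")
```

Each key is an edge of the graph and each value is its frequency. An object-centric DFG applies the same idea once per object type.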

How It Works

Auto-Generated Bindings

All functions from the process_mining Rust library are automatically discovered and exposed as Python functions with:

  • Full type hints for IDE autocomplete
  • Automatic documentation from Rust docs
  • Type validation via JSON schemas

Every function is available at the top level of bindings, and also through submodules that mirror the Rust crate structure:

from r4pm import bindings

# Top-level access to all functions
bindings.discover_dfg(event_log=log_id)
bindings.num_events(ocel=locel_id)

# Or use submodules for organization
from r4pm.bindings.discovery.case_centric import dfg
dfg.discover_dfg(event_log=log_id)

Bindings are automatically generated during the Rust build via build.rs.
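
To illustrate the idea behind "type validation via JSON schemas" listed above, here is a minimal, hand-rolled sketch that checks keyword arguments against a small schema. It is not r4pm's validation code: the schema contents, the argument names, and the helper `validate_args` are all hypothetical, and only a tiny subset of JSON Schema is handled.

```python
# Hypothetical schema for one binding's arguments (not taken from r4pm).
ARG_SCHEMA = {
    "type": "object",
    "properties": {
        "event_log": {"type": "string"},
        "min_count": {"type": "integer"},
    },
    "required": ["event_log"],
}

# Maps the JSON Schema type names used above to Python types.
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def validate_args(schema, kwargs):
    """Return a list of problems; an empty list means the call is acceptable."""
    problems = []
    for name in schema.get("required", []):
        if name not in kwargs:
            problems.append(f"missing required argument '{name}'")
    for name, value in kwargs.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument '{name}'")
            continue
        expected = JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        is_bool_mismatch = isinstance(value, bool) and spec["type"] != "boolean"
        if is_bool_mismatch or not isinstance(value, expected):
            problems.append(f"argument '{name}' should be of type {spec['type']}")
    return problems

print(validate_args(ARG_SCHEMA, {"event_log": "log-1"}))
print(validate_args(ARG_SCHEMA, {"min_count": "three"}))
```

Validating before the call crosses into native code means a wrong argument can be reported as a readable Python-side message instead of a low-level error.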

Registry System

Data is managed through a registry that holds different object types:

  • OCEL - Raw OCEL data
  • SlimLinkedOCEL - Memory-efficient linked OCEL (required by most functions)
  • IndexLinkedOCEL - Indexed OCEL for analysis
  • EventLog - Case-centric event log
  • EventLogActivityProjection - Activity-projected log for discovery

# Load files into registry
ocel_id = r4pm.import_item('OCEL', 'file.xml')
log_id = r4pm.import_item('EventLog', 'file.xes')

# Convert between types (either like this or using r4pm.convert_item)
locel_id = bindings.index_link_ocel(ocel=ocel_id)
proj_id = bindings.log_to_activity_projection(log=log_id)

# List registry contents
for item in r4pm.list_items():
    print(f"{item['id']}: {item['type']}")
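
As a rough mental model of the registry, the following pure-Python sketch stores items under generated IDs and caches conversions so that repeating a conversion returns the same ID. The class `ToyRegistry`, its ID format, and the placeholder converter are assumptions for illustration only; the actual registry lives on the Rust side and may behave differently.

```python
import itertools

class ToyRegistry:
    """Illustrative in-memory registry: IDs map to (type name, payload)."""

    def __init__(self):
        self._items = {}
        self._counter = itertools.count(1)
        self._conversions = {}  # (source id, target type) -> converted id

    def add(self, type_name, payload):
        item_id = f"{type_name}-{next(self._counter)}"
        self._items[item_id] = (type_name, payload)
        return item_id

    def convert(self, item_id, target_type, converter):
        # Reuse an earlier conversion instead of recomputing it.
        key = (item_id, target_type)
        if key not in self._conversions:
            _, payload = self._items[item_id]
            self._conversions[key] = self.add(target_type, converter(payload))
        return self._conversions[key]

    def list_items(self):
        return [{"id": i, "type": t} for i, (t, _) in self._items.items()]

def to_tuples(log):
    # Placeholder converter; a real projection would do more than this.
    return [tuple(trace) for trace in log]

registry = ToyRegistry()
toy_log_id = registry.add("EventLog", [["a", "b"], ["a", "c"]])
toy_proj_id = registry.convert(toy_log_id, "EventLogActivityProjection", to_tuples)
toy_proj_again = registry.convert(toy_log_id, "EventLogActivityProjection", to_tuples)

for item in registry.list_items():
    print(f"{item['id']}: {item['type']}")
```

Passing IDs rather than the data itself keeps large logs on one side of the language boundary, which is presumably why the bindings accept registry IDs as arguments.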

Simple Import/Export API

For direct DataFrame operations without the registry, use the df submodule.

XES

import r4pm

# Import returns (DataFrame, log_attributes_json)
xes, attrs = r4pm.df.import_xes("file.xes", date_format="%Y-%m-%d")
r4pm.df.export_xes(xes, "test_data/output.xes")

OCEL

# Returns dict with DataFrames: events, objects, relations, o2o, object_changes
ocel = r4pm.df.import_ocel("file.xml")
print(ocel['events'].shape)
r4pm.df.export_ocel(ocel, "export.xml")

# PM4Py integration (requires pm4py)
ocel_pm4py = r4pm.df.import_ocel_pm4py("file.xml")
print(ocel_pm4py.events.shape)  # assumes a pm4py OCEL object, which exposes its events table as an attribute
r4pm.df.export_ocel_pm4py(ocel_pm4py, "export.xml")
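
In OCEL, the relations table links events to objects. The sketch below shows how that link can be followed, using plain Python rows instead of DataFrames so that it runs without any dependencies. All column names (`event_id`, `object_id`, `object_type`, `activity`) are hypothetical stand-ins; inspect the columns of the DataFrames returned by `r4pm.df.import_ocel` for the real names.

```python
from collections import defaultdict

# Toy stand-ins for three of the tables; all column names are hypothetical.
events = [
    {"event_id": "e1", "activity": "place order"},
    {"event_id": "e2", "activity": "pick item"},
]
objects = [
    {"object_id": "o1", "object_type": "order"},
    {"object_id": "i1", "object_type": "item"},
    {"object_id": "i2", "object_type": "item"},
]
relations = [  # one row per event-to-object link
    {"event_id": "e1", "object_id": "o1"},
    {"event_id": "e1", "object_id": "i1"},
    {"event_id": "e1", "object_id": "i2"},
    {"event_id": "e2", "object_id": "i1"},
]

# Lookup tables from identifier to the attribute of interest.
object_type = {o["object_id"]: o["object_type"] for o in objects}
activity = {e["event_id"]: e["activity"] for e in events}

# Collect, per object type, the activities of events that touch such objects.
activities_per_type = defaultdict(set)
for row in relations:
    activities_per_type[object_type[row["object_id"]]].add(activity[row["event_id"]])

for otype, acts in sorted(activities_per_type.items()):
    print(otype, sorted(acts))
```

With real DataFrames, the same lookup would typically be expressed as joins on the shared identifier columns.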

Development

Setup

# Install Rust: https://rustup.rs/
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install in development mode
pip install maturin
maturin develop --release

How Bindings Are Generated

Python bindings are automatically generated during the Rust build via build.rs. Thus, bindings are always in sync with the Rust code and do not require manual regeneration.

The build script:

  1. Reads function metadata from the process_mining crate
  2. Generates r4pm/bindings/ with typed Python wrappers and .pyi stubs
  3. Organizes functions by their Rust module structure
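
The steps above can be pictured with a small Python sketch that turns function metadata into wrapper source and a matching stub line. The real generation happens in build.rs on the Rust side. The metadata format, the helper names (`render_wrapper`, `render_stub`, `_call_backend`), and the toy function `count_items` are all assumptions made for illustration.

```python
# Hypothetical metadata for one exposed function (not r4pm's real format).
FUNCTION_META = {
    "name": "count_items",
    "module": "toy.stats",
    "doc": "Return the number of items in a sequence.",
    "args": [("items", "list")],
    "returns": "int",
}

def render_wrapper(meta):
    """Render Python source for a typed wrapper that forwards to a backend."""
    params = ", ".join(f"{name}: {typ}" for name, typ in meta["args"])
    call_args = ", ".join(f"{name}={name}" for name, _ in meta["args"])
    return (
        f"def {meta['name']}({params}) -> {meta['returns']}:\n"
        f"    \"\"\"{meta['doc']}\"\"\"\n"
        f"    return _call_backend({meta['name']!r}, {call_args})\n"
    )

def render_stub(meta):
    """Render the matching .pyi stub line."""
    params = ", ".join(f"{name}: {typ}" for name, typ in meta["args"])
    return f"def {meta['name']}({params}) -> {meta['returns']}: ...\n"

def _call_backend(function_name, **kwargs):
    # Stand-in for the call into native code.
    backend = {"count_items": lambda items: len(items)}
    return backend[function_name](**kwargs)

source = render_wrapper(FUNCTION_META)
namespace = {"_call_backend": _call_backend}
exec(source, namespace)  # a build script would write this to a file instead

print(source)
print(render_stub(FUNCTION_META))
print(namespace["count_items"](items=[1, 2, 3]))
```

Because wrappers and stubs are rendered from the same metadata, type hints and docstrings cannot drift apart from the functions they describe.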

Building for Release

maturin build --release  # Creates wheels in target/wheels/

The wheel automatically includes the generated bindings.

Running Tests

# Run comprehensive test suite
python test_all.py

# Run simple example
python example.py

The test suite (test_all.py) covers:

  • Automatic type conversion (positional & keyword arguments)
  • Process discovery (DFG, OC-Declare)
  • Registry operations (CRUD, DataFrames, export)
  • Simple Import/Export DataFrame (df) API
  • Edge cases and conversion caching

LICENSE

This package is licensed under either the Apache License, Version 2.0, or the MIT License, at your option.
