Rust-based Process Mining

r4pm

Python bindings for the Rust4PM Project: Process mining in Python with the speed of Rust

This library provides basic import/export of XES/OCEL event data, as well as other exposed functionality from the Rust4PM project (e.g., process discovery algorithms).

Features

  • Fast XES/OCEL Import/Export: Efficient Rust-based import and export of .xes, .xes.gz, and OCEL2 (.xml/.json) files
  • Auto-Generated Bindings: All process_mining functions automatically exposed with full IDE support (autocomplete, type hints, docs)
  • Registry System: Manage data objects and convert between types as needed
  • Polars DataFrames: Polars facilitates the fast transfer of event data from Python to Rust and vice versa

Quick Start

from r4pm import bindings
import r4pm

# Load an OCEL file - returns a registry ID
ocel_id = r4pm.import_item('OCEL', 'data/orders.xml')

# Convert to SlimLinkedOCEL for analysis functions
locel_id = bindings.slim_link_ocel(ocel=ocel_id)

# Get statistics
num = bindings.num_events(ocel=locel_id)
print(f"Events: {num}")

# Discover object-centric DFG
dfg = bindings.discover_dfg_from_ocel(locel_id)
print(f"Discovered DFG for {len(dfg['object_type_to_dfg'])} object types")

# For case-centric event logs:
log_id = r4pm.import_item('EventLog', 'data/log.xes')
case_dfg = bindings.discover_dfg(log_id)
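
To make concrete what a directly-follows graph (DFG) records, here is a minimal pure-Python sketch on a toy log. It is not how r4pm computes the DFG (that happens in Rust), and the helper name `directly_follows_counts` and the toy data are made up for illustration.

```python
from collections import Counter

def directly_follows_counts(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        # Pair every event with its successor in the same case
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

# Toy log: each inner list is the activity sequence of one case
toy_log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "approve"],
]

dfg_counts = directly_follows_counts(toy_log)
for (a, b), n in sorted(dfg_counts.items()):
    print(f"{a} -> {b}: {n}")
```

An object-centric DFG applies the same idea once per object type, which is why the result above is keyed by `object_type_to_dfg`.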

How It Works

Auto-Generated Bindings

All functions from the process_mining Rust library are automatically discovered and exposed as Python functions with:

  • Full type hints for IDE autocomplete
  • Automatic documentation from Rust docs
  • Type validation via JSON schemas

The bindings are organized by module (mirroring the Rust crate structure):

from r4pm import bindings

# Top-level access to all functions
bindings.discover_dfg(event_log=log_id)
bindings.num_events(ocel=locel_id)

# Or use submodules for organization
from r4pm.bindings.discovery.case_centric import dfg
dfg.discover_dfg(event_log=log_id)

Bindings are automatically generated during the Rust build via build.rs.
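
The generated wrappers themselves are not shown in this README. As a rough illustration of what "type validation via JSON schemas" can look like, the sketch below checks keyword arguments against a small hand-written schema before a call would be forwarded to native code. The schema, the `validate_args` helper, and the assumption that registry IDs are strings are all hypothetical.

```python
# Hypothetical, hand-written schema in the spirit of JSON Schema;
# the real schemas generated for r4pm are not shown in this README.
DISCOVER_DFG_SCHEMA = {
    "type": "object",
    "properties": {"event_log": {"type": "string"}},  # assumes string IDs
    "required": ["event_log"],
}

# Map JSON Schema type names to Python types (only the few needed here)
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def validate_args(schema, kwargs):
    """Raise TypeError if kwargs do not match the (simplified) schema."""
    for name in schema.get("required", []):
        if name not in kwargs:
            raise TypeError(f"missing required argument: {name}")
    for name, value in kwargs.items():
        spec = schema["properties"].get(name)
        if spec is None:
            raise TypeError(f"unexpected argument: {name}")
        if not isinstance(value, JSON_TYPES[spec["type"]]):
            raise TypeError(f"{name} must be of JSON type {spec['type']}")

validate_args(DISCOVER_DFG_SCHEMA, {"event_log": "log-1"})  # passes silently
```

Validating on the Python side means a wrong argument surfaces as an ordinary Python exception with a readable message instead of a failure deep inside the native call.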

Registry System

Data is managed through a registry that holds different object types:

  • OCEL - Raw OCEL data
  • SlimLinkedOCEL - Memory-efficient linked OCEL (required by most functions)
  • IndexLinkedOCEL - Indexed OCEL for analysis
  • EventLog - Case-centric event log
  • EventLogActivityProjection - Activity-projected log for discovery

# Load files into registry
ocel_id = r4pm.import_item('OCEL', 'file.xml')
log_id = r4pm.import_item('EventLog', 'file.xes')

# Convert between types (either like this or using r4pm.convert_item)
locel_id = bindings.index_link_ocel(ocel=ocel_id)
proj_id = bindings.log_to_activity_projection(log=log_id)

# List registry contents
for item in r4pm.list_items():
    print(f"{item['id']}: {item['type']}")
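
The registry idea can be pictured as a table from IDs to typed objects, plus a cache of conversions already performed. The following is a hypothetical pure-Python toy, not r4pm's implementation: `ToyRegistry`, its methods, and the ID format are invented here, and r4pm's actual caching behaviour may differ.

```python
import itertools

class ToyRegistry:
    """Hypothetical in-memory registry: ID -> (type name, object)."""

    def __init__(self):
        self._items = {}
        self._conversions = {}  # (source ID, target type) -> converted ID
        self._counter = itertools.count(1)

    def add(self, type_name, obj):
        item_id = f"{type_name}-{next(self._counter)}"
        self._items[item_id] = (type_name, obj)
        return item_id

    def convert(self, item_id, target_type, converter):
        # Reuse an earlier conversion instead of recomputing it
        key = (item_id, target_type)
        if key not in self._conversions:
            _, obj = self._items[item_id]
            self._conversions[key] = self.add(target_type, converter(obj))
        return self._conversions[key]

    def list_items(self):
        return [{"id": i, "type": t} for i, (t, _) in self._items.items()]

registry = ToyRegistry()
toy_log_id = registry.add("EventLog", [["a", "b"], ["a", "c"]])
# Toy conversion: reduce the log to its set of activities
toy_proj_id = registry.convert(
    toy_log_id, "ActivitySet", lambda log: {a for trace in log for a in trace}
)
for item in registry.list_items():
    print(f"{item['id']}: {item['type']}")
```

Passing IDs around instead of the objects themselves keeps large logs on one side of the language boundary; only the small handle crosses between Python and Rust.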

Simple Import/Export API

For direct DataFrame operations without the registry, use the df submodule.

XES

import r4pm

# Import returns (DataFrame, log_attributes_json)
xes, attrs = r4pm.df.import_xes("file.xes", date_format="%Y-%m-%d")
r4pm.df.export_xes(xes, "test_data/output.xes")
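
For readers unfamiliar with XES, the sketch below shows how a trace-and-event XML structure maps onto a flat table with one row per event. It uses only the standard library and a tiny inline document without the XML namespace that real files often declare. The `case:` prefix for trace attributes follows a common convention and is an assumption about the column names r4pm produces; `flatten_xes` is a made-up helper.

```python
import xml.etree.ElementTree as ET

XES_SNIPPET = """<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="case-1"/>
    <event>
      <string key="concept:name" value="register"/>
      <date key="time:timestamp" value="2024-01-01T09:00:00+00:00"/>
    </event>
    <event>
      <string key="concept:name" value="approve"/>
      <date key="time:timestamp" value="2024-01-02T10:30:00+00:00"/>
    </event>
  </trace>
</log>"""

def flatten_xes(xml_text):
    """Return one dict per event, with trace attributes prefixed by 'case:'."""
    rows = []
    for trace in ET.fromstring(xml_text).iter("trace"):
        # Direct children of <trace> other than <event> are trace attributes
        case_attrs = {
            f"case:{el.get('key')}": el.get("value")
            for el in trace
            if el.tag != "event"
        }
        for event in trace.iter("event"):
            row = {el.get("key"): el.get("value") for el in event}
            rows.append({**case_attrs, **row})
    return rows

xes_rows = flatten_xes(XES_SNIPPET)
for row in xes_rows:
    print(row)
```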

OCEL

# Returns dict with DataFrames: events, objects, relations, o2o, object_changes
ocel = r4pm.df.import_ocel("file.xml")
print(ocel['events'].shape)
r4pm.df.export_ocel(ocel, "export.xml")

# PM4Py integration (requires pm4py)
ocel_pm4py = r4pm.df.import_ocel_pm4py("file.xml")
print(ocel_pm4py.events.shape)  # assumes a pm4py OCEL object, which exposes an events DataFrame
r4pm.df.export_ocel_pm4py(ocel_pm4py, "export.xml")
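
How the OCEL tables relate to each other is easiest to see on toy data. The sketch below uses plain lists of dicts as stand-ins for the events, objects, and relations tables and groups activities by the type of object they touch. The column names (`event_id`, `object_id`, `type`, `activity`) are hypothetical and may not match the columns in the DataFrames r4pm returns.

```python
from collections import defaultdict

# Toy stand-ins for three of the tables; column names are hypothetical
events = [
    {"event_id": "e1", "activity": "place order"},
    {"event_id": "e2", "activity": "ship item"},
]
objects = [
    {"object_id": "o1", "type": "order"},
    {"object_id": "i1", "type": "item"},
]
relations = [
    {"event_id": "e1", "object_id": "o1"},
    {"event_id": "e1", "object_id": "i1"},
    {"event_id": "e2", "object_id": "i1"},
]

object_type = {o["object_id"]: o["type"] for o in objects}
activity = {e["event_id"]: e["activity"] for e in events}

# Group activities by the type of object they touch
activities_per_type = defaultdict(list)
for rel in relations:
    activities_per_type[object_type[rel["object_id"]]].append(activity[rel["event_id"]])

print(dict(activities_per_type))
```

Because one event can relate to several objects, the same event shows up under more than one object type. That is the key difference from a case-centric log, where each event belongs to exactly one case.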

Development

Setup

# Install Rust: https://rustup.rs/
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install in development mode
pip install maturin
maturin develop --release

How Bindings Are Generated

Python bindings are automatically generated during the Rust build via build.rs. Thus, bindings are always in sync with the Rust code and do not require manual regeneration.

The build script:

  1. Reads function metadata from the process_mining crate
  2. Generates r4pm/bindings/ with typed Python wrappers and .pyi stubs
  3. Organizes functions by their Rust module structure
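
The steps above can be pictured with a small code-generation sketch. The real generator lives in build.rs and is written in Rust; the Python version below only illustrates the idea of turning a metadata record into a typed wrapper. The metadata layout, `render_wrapper`, and the `_call_native` dispatch function are all hypothetical.

```python
# Hypothetical metadata record; the real format emitted by the
# process_mining crate is not described in this README.
FUNCTION_META = {
    "name": "discover_dfg",
    "module": "discovery.case_centric.dfg",
    "doc": "Discover a directly-follows graph from an event log.",
    "params": [("event_log", "str")],
    "returns": "dict",
}

def render_wrapper(meta):
    """Render Python source for one typed wrapper function."""
    params = ", ".join(f"{name}: {type_}" for name, type_ in meta["params"])
    args = ", ".join(f"{name}={name}" for name, _ in meta["params"])
    return (
        f"def {meta['name']}({params}) -> {meta['returns']}:\n"
        f'    """{meta["doc"]}"""\n'
        f"    return _call_native({meta['name']!r}, {args})\n"
    )

source = render_wrapper(FUNCTION_META)
compile(source, "<generated>", "exec")  # check the generated code is valid Python
print(source)
```

In this sketch the `module` field would decide which file under r4pm/bindings/ receives the rendered source, mirroring the Rust module path.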

Building for Release

maturin build --release  # Creates wheels in target/wheels/

The wheel automatically includes the generated bindings.

Running Tests

# Run comprehensive test suite
python test_all.py

# Run simple example
python example.py

The test suite (test_all.py) covers:

  • Automatic type conversion (positional & keyword arguments)
  • Process discovery (DFG, OC-Declare)
  • Registry operations (CRUD, DataFrames, export)
  • Simple Import/Export DataFrame (df) API
  • Edge cases and conversion caching

LICENSE

This package is licensed under either the Apache License, Version 2.0, or the MIT License, at your option.
