Rust-based Process Mining

r4pm

Python bindings for the Rust4PM Project: Process mining in Python with the speed of Rust

This library provides basic import/export of XES/OCEL event data, as well as other exposed functionality from the Rust4PM project (e.g., process discovery algorithms).

Features

  • Fast XES/OCEL Import/Export: Efficient Rust-based import and export of .xes, .xes.gz, and OCEL2 (.xml/.json) files
  • Auto-Generated Bindings: All process_mining functions automatically exposed with full IDE support (autocomplete, type hints, docs)
  • Registry System: Manage data objects and convert between types as needed
  • Polars DataFrames: Polars facilitates the fast transfer of event data from Python to Rust and vice versa

Quick Start

from r4pm import bindings
import r4pm

# Load an OCEL file - returns a registry ID
ocel_id = r4pm.import_item('OCEL', 'data/orders.xml')

# Convert to SlimLinkedOCEL for analysis functions
locel_id = bindings.slim_link_ocel(ocel=ocel_id)

# Get statistics
num = bindings.num_events(ocel=locel_id)
print(f"Events: {num}")

# Discover object-centric DFG
dfg = bindings.discover_dfg_from_ocel(locel_id)
print(f"Discovered DFG for {len(dfg['object_type_to_dfg'])} object types")

# For case-centric event logs:
log_id = r4pm.import_item('EventLog', 'data/log.xes')
case_dfg = bindings.discover_dfg(log_id)
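
To make concrete what a directly-follows graph (DFG) records, here is a minimal pure-Python sketch that does not use r4pm at all. The function `discover_dfg_sketch` and the shape of its return value are assumptions made for illustration; the structure returned by `bindings.discover_dfg` may differ.

```python
from collections import Counter


def discover_dfg_sketch(traces):
    """Count directly-follows pairs over traces (lists of activity names).

    Illustrative only: this is not the r4pm implementation.
    """
    edges = Counter()   # (activity_a, activity_b) -> how often b directly follows a
    starts = Counter()  # first activity of each trace
    ends = Counter()    # last activity of each trace
    for trace in traces:
        if not trace:
            continue
        starts[trace[0]] += 1
        ends[trace[-1]] += 1
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return {"edges": edges, "start_activities": starts, "end_activities": ends}


# A toy log with three cases
toy_log = [
    ["create order", "pay order", "ship order"],
    ["create order", "ship order", "pay order"],
    ["create order", "pay order", "ship order"],
]
toy_dfg = discover_dfg_sketch(toy_log)
print(toy_dfg["edges"][("create order", "pay order")])  # 2
```

Each edge weight is simply the number of times one activity is immediately followed by another within the same case.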

How It Works

Auto-Generated Bindings

All functions from the process_mining Rust library are automatically discovered and exposed as Python functions with:

  • Full type hints for IDE autocomplete
  • Automatic documentation from Rust docs
  • Type validation via JSON schemas

The bindings are organized by module (mirroring the Rust crate structure):

from r4pm import bindings

# Top-level access to all functions
bindings.discover_dfg(event_log=log_id)
bindings.num_events(ocel=locel_id)

# Or use submodules for organization
from r4pm.bindings.discovery.case_centric import dfg
dfg.discover_dfg(event_log=log_id)

Bindings are automatically generated during the Rust build via build.rs.
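
As a rough illustration of how a wrapper could combine documentation and schema-based validation, the sketch below builds a Python function from a metadata record. Everything here is hypothetical: the metadata layout, `validate_args`, `make_wrapper`, and `fake_backend` are stand-ins, and the hand-rolled check covers only a tiny subset of JSON Schema. The real generated code may look quite different.

```python
# Hypothetical metadata for one exposed function; the real layout may differ.
NUM_EVENTS_META = {
    "name": "num_events",
    "doc": "Return the number of events in an OCEL.",
    "args_schema": {
        "type": "object",
        "properties": {"ocel": {"type": "string"}},
        "required": ["ocel"],
    },
}

# Only two JSON Schema types are handled in this sketch.
_JSON_TYPES = {"string": str, "integer": int}


def validate_args(schema, kwargs):
    """Check required keys and basic types against a (tiny) JSON schema."""
    missing = [k for k in schema.get("required", []) if k not in kwargs]
    if missing:
        raise TypeError(f"missing required argument(s): {', '.join(missing)}")
    for key, value in kwargs.items():
        prop = schema["properties"].get(key)
        if prop is None:
            raise TypeError(f"unexpected argument: {key}")
        expected = _JSON_TYPES[prop["type"]]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"argument {key!r} must be of JSON type {prop['type']}")


def make_wrapper(meta, backend):
    """Build a documented Python function that validates, then delegates."""
    def wrapper(**kwargs):
        validate_args(meta["args_schema"], kwargs)
        return backend(meta["name"], kwargs)
    wrapper.__name__ = meta["name"]
    wrapper.__doc__ = meta["doc"]
    return wrapper


def fake_backend(name, kwargs):
    # Stand-in for the call into compiled Rust code
    return {"called": name, "args": kwargs}


num_events = make_wrapper(NUM_EVENTS_META, fake_backend)
result = num_events(ocel="some-registry-id")
print(result)
```

Validating before crossing the language boundary means a wrong argument surfaces as an ordinary Python `TypeError` rather than as an error from the native layer.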

Registry System

Data is managed through a registry that holds different object types:

  • OCEL - Raw OCEL data
  • SlimLinkedOCEL - Memory-efficient linked OCEL (required by most functions)
  • IndexLinkedOCEL - Indexed OCEL for analysis
  • EventLog - Case-centric event log
  • EventLogActivityProjection - Activity-projected log for discovery

# Load files into registry
ocel_id = r4pm.import_item('OCEL', 'file.xml')
log_id = r4pm.import_item('EventLog', 'file.xes')

# Convert between types (either like this or using r4pm.convert_item)
locel_id = bindings.index_link_ocel(ocel=ocel_id)
proj_id = bindings.log_to_activity_projection(log=log_id)

# List registry contents
for item in r4pm.list_items():
    print(f"{item['id']}: {item['type']}")
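
The idea of an ID-based registry with typed items and conversions can be sketched in a few lines of plain Python. `ToyRegistry`, its methods, the ID format, and the simplified activity projection below are all assumptions for illustration and do not reflect how r4pm stores data internally.

```python
import itertools


class ToyRegistry:
    """Hypothetical in-memory registry: items are stored by ID with a type tag."""

    def __init__(self):
        self._items = {}        # item_id -> (type_name, value)
        self._converters = {}   # (src_type, dst_type) -> conversion function
        self._counter = itertools.count(1)

    def add(self, type_name, value):
        item_id = f"{type_name}-{next(self._counter)}"
        self._items[item_id] = (type_name, value)
        return item_id

    def register_converter(self, src, dst, fn):
        self._converters[(src, dst)] = fn

    def convert(self, item_id, dst):
        """Convert an item to another type and store the result as a new item."""
        src, value = self._items[item_id]
        fn = self._converters[(src, dst)]
        return self.add(dst, fn(value))

    def list_items(self):
        return [{"id": i, "type": t} for i, (t, _) in self._items.items()]


def project_activities(log):
    """Simplified projection: replace activity names with integer indices."""
    activities = sorted({a for trace in log for a in trace})
    index = {a: i for i, a in enumerate(activities)}
    return {"activities": activities,
            "traces": [[index[a] for a in trace] for trace in log]}


registry = ToyRegistry()
registry.register_converter("EventLog", "EventLogActivityProjection", project_activities)
toy_log_id = registry.add("EventLog", [["a", "b"], ["a", "c"]])
toy_proj_id = registry.convert(toy_log_id, "EventLogActivityProjection")
for item in registry.list_items():
    print(f"{item['id']}: {item['type']}")
```

Passing IDs instead of the objects themselves keeps large event data on one side of the language boundary; only small handles travel back and forth.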

Simple Import/Export API

For direct DataFrame operations without the registry, use the df submodule.

XES

import r4pm

# Import returns (DataFrame, log_attributes_json)
xes, attrs = r4pm.df.import_xes("file.xes", date_format="%Y-%m-%d")
r4pm.df.export_xes(xes, "test_data/output.xes")
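
For intuition about what flattening an XES log into rows involves, the standard-library sketch below parses a tiny XES-like snippet into one dictionary per event. It assumes that `date_format` is a strptime-style pattern applied to timestamp values and that case attributes receive a `case:` prefix; both are assumptions, and `xes_to_rows` is a hypothetical helper, not part of r4pm. Real XES files usually declare an XML namespace, which this sketch ignores.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

XES_SNIPPET = """<log>
  <trace>
    <string key="concept:name" value="case-1"/>
    <event>
      <string key="concept:name" value="register"/>
      <date key="time:timestamp" value="2024-01-05"/>
    </event>
    <event>
      <string key="concept:name" value="approve"/>
      <date key="time:timestamp" value="2024-01-07"/>
    </event>
  </trace>
</log>"""


def xes_to_rows(xml_text, date_format):
    """Flatten traces into one row (dict) per event."""
    rows = []
    root = ET.fromstring(xml_text)
    for trace in root.iter("trace"):
        # Trace-level attributes are copied onto every event of the trace
        case_attrs = {
            f"case:{el.get('key')}": el.get("value")
            for el in trace
            if el.tag != "event"
        }
        for event in trace.iter("event"):
            row = dict(case_attrs)
            for el in event:
                value = el.get("value")
                if el.tag == "date":
                    value = datetime.strptime(value, date_format)
                row[el.get("key")] = value
            rows.append(row)
    return rows


rows = xes_to_rows(XES_SNIPPET, date_format="%Y-%m-%d")
for row in rows:
    print(row["case:concept:name"], row["concept:name"], row["time:timestamp"].date())
```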

OCEL

# Returns dict with DataFrames: events, objects, relations, o2o, object_changes
ocel = r4pm.df.import_ocel("file.xml")
print(ocel['events'].shape)
r4pm.df.export_ocel(ocel, "export.xml")

# PM4Py integration (requires pm4py)
ocel_pm4py = r4pm.df.import_ocel_pm4py("file.xml")
print(ocel_pm4py.events.shape)
r4pm.df.export_ocel_pm4py(ocel_pm4py, "export.xml")
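
To show how events, objects, and event-to-object relations fit together, the following sketch derives a per-object-type directly-follows graph from three small tables. Plain lists of dictionaries stand in for the DataFrames, and all field names (`id`, `activity`, `time`, `type`, `event`, `object`) as well as the function `oc_dfg_sketch` are hypothetical; the actual column names used by r4pm may differ.

```python
from collections import Counter, defaultdict

toy_events = [
    {"id": "e1", "activity": "place order", "time": 1},
    {"id": "e2", "activity": "pick item", "time": 2},
    {"id": "e3", "activity": "ship order", "time": 3},
]
toy_objects = [
    {"id": "o1", "type": "order"},
    {"id": "i1", "type": "item"},
]
toy_relations = [
    {"event": "e1", "object": "o1"},
    {"event": "e1", "object": "i1"},
    {"event": "e2", "object": "i1"},
    {"event": "e3", "object": "o1"},
    {"event": "e3", "object": "i1"},
]


def oc_dfg_sketch(events, objects, relations):
    """Build one directly-follows graph per object type (illustrative only)."""
    event_by_id = {e["id"]: e for e in events}
    type_of = {o["id"]: o["type"] for o in objects}

    # Collect the events each object takes part in
    per_object = defaultdict(list)
    for rel in relations:
        per_object[rel["object"]].append(event_by_id[rel["event"]])

    # For every object, order its events by time and count consecutive pairs
    result = defaultdict(Counter)
    for obj_id, obj_events in per_object.items():
        obj_events.sort(key=lambda e: e["time"])
        for a, b in zip(obj_events, obj_events[1:]):
            result[type_of[obj_id]][(a["activity"], b["activity"])] += 1
    return {obj_type: dict(counts) for obj_type, counts in result.items()}


toy_oc_dfg = oc_dfg_sketch(toy_events, toy_objects, toy_relations)
print(f"Discovered DFG for {len(toy_oc_dfg)} object types")
```

Because the same event can relate to several objects, an activity such as "place order" appears in the graph of every object type it touches.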

Development

Setup

# Install Rust: https://rustup.rs/
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install in development mode
pip install maturin
maturin develop --release

How Bindings Are Generated

Python bindings are automatically generated during the Rust build via build.rs. Thus, bindings are always in sync with the Rust code and do not require manual regeneration.

The build script:

  1. Reads function metadata from the process_mining crate
  2. Generates r4pm/bindings/ with typed Python wrappers and .pyi stubs
  3. Organizes functions by their Rust module structure
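
The three steps above can be sketched as a small generator that turns function metadata into stub text grouped by module path. The real generator runs in Rust inside build.rs; this Python version is only an illustration, and the metadata fields, the argument and return types, and the second example function (`toy.stats.count_things`) are hypothetical.

```python
# Hypothetical function metadata, as the build script might read it
FUNCTIONS = [
    {
        "module": "discovery.case_centric.dfg",
        "name": "discover_dfg",
        "args": [("event_log", "str")],
        "returns": "dict",
        "doc": "Discover a directly-follows graph from an event log.",
    },
    {
        "module": "toy.stats",  # made-up module for illustration
        "name": "count_things",
        "args": [("item", "str"), ("limit", "int")],
        "returns": "int",
        "doc": "Count things in a registry item.",
    },
]


def render_stub(fn):
    """Render one typed stub definition as text."""
    params = ", ".join(f"{name}: {type_}" for name, type_ in fn["args"])
    lines = [
        f"def {fn['name']}({params}) -> {fn['returns']}:",
        f'    """{fn["doc"]}"""',
        "    ...",
    ]
    return "\n".join(lines) + "\n"


def render_files(functions):
    """Group stubs by module and map each module to a .pyi path."""
    by_module = {}
    for fn in functions:
        by_module.setdefault(fn["module"], []).append(render_stub(fn))
    return {
        "r4pm/bindings/" + module.replace(".", "/") + ".pyi": "\n".join(stubs)
        for module, stubs in by_module.items()
    }


files = render_files(FUNCTIONS)
for path, text in files.items():
    print(path)
    print(text)
```

Deriving the file path from the module name is what makes the Python package layout mirror the Rust crate structure.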

Building for Release

maturin build --release  # Creates wheels in target/wheels/

The wheel automatically includes the generated bindings.

Running Tests

# Run comprehensive test suite
python test_all.py

# Run simple example
python example.py

The test suite (test_all.py) covers:

  • Automatic type conversion (positional & keyword arguments)
  • Process discovery (DFG, OC-Declare)
  • Registry operations (CRUD, DataFrames, export)
  • Simple Import/Export DataFrame (df) API
  • Edge cases and conversion caching
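
As a hedged illustration of what automatic conversion with caching could mean, the sketch below converts an item to the type a function expects and remembers the result so that repeated calls do not convert again. `CachingConverter`, its method names, and the toy conversion are assumptions for illustration only; the mechanism used by r4pm is not described here and may work differently.

```python
class CachingConverter:
    """Hypothetical helper: convert on demand, at most once per (item, target type)."""

    def __init__(self, converters):
        self._converters = converters  # {(src_type, dst_type): function}
        self._cache = {}               # {(item_id, dst_type): converted value}
        self.conversions_run = 0       # counts actual conversions, for demonstration

    def get_as(self, item_id, item_type, value, wanted_type):
        if item_type == wanted_type:
            return value  # already the right type, nothing to do
        key = (item_id, wanted_type)
        if key not in self._cache:
            convert = self._converters[(item_type, wanted_type)]
            self._cache[key] = convert(value)
            self.conversions_run += 1
        return self._cache[key]


def link_events(raw_events):
    # Toy stand-in for an expensive conversion: index events by ID
    return {event["id"]: event for event in raw_events}


converter = CachingConverter({("OCEL", "SlimLinkedOCEL"): link_events})
raw = [{"id": "e1"}, {"id": "e2"}]

first = converter.get_as("ocel-1", "OCEL", raw, "SlimLinkedOCEL")
second = converter.get_as("ocel-1", "OCEL", raw, "SlimLinkedOCEL")
print(converter.conversions_run)  # 1
```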

LICENSE

This package is licensed under either the Apache License, Version 2.0, or the MIT License, at your option.
