Python Wrapper for SPMF

SPMF

Python Wrapper for SPMF Java library.

Information

This module contains Python wrappers for the pattern mining algorithms implemented in the SPMF Java library. Each algorithm is exposed as a standalone Python class with a documented and tested API. The wrappers also accept pandas DataFrames directly.

Why? In a Python pipeline, calling Java as an intermediate step is cumbersome. With spmf-wrapper you stay in Python, as though Java were never involved.

Installation

pip install spmf-wrapper

A Java Runtime Environment (JRE) is required to run this wrapper. If no existing installation is detected, JRE v21 is installed automatically at $HOME/.jre/jdk-21.0.2+13-jre using the install-jdk Python module. If you prefer to install a Java runtime manually, follow the instructions here. To test the installation, run the following command in a terminal (the reported version depends on your installation):

> java -version
java version "1.8.0_391"
Java(TM) SE Runtime Environment (build 1.8.0_391-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.391-b13, mixed mode)
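
The same check can be made from Python before a pipeline starts. The snippet below is a minimal sketch, not part of spmf-wrapper: the helper name `java_version_banner` is hypothetical, and it only looks for `java` on the PATH, which may differ from how the wrapper itself detects an installation.

```python
import shutil
import subprocess


def java_version_banner():
    """Return the text printed by `java -version`, or None if Java is not usable.

    Hypothetical helper for illustration; it only inspects the PATH.
    """
    java = shutil.which("java")
    if java is None:
        return None
    try:
        result = subprocess.run(
            [java, "-version"],
            capture_output=True,
            text=True,
            check=False,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    # `java -version` usually writes its banner to stderr, not stdout.
    return (result.stderr or result.stdout).strip()


banner = java_version_banner()
print(banner if banner is not None else "No Java runtime found on PATH")
```

A `None` result here only means that no working `java` executable was found on the PATH.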

Usage

Example:

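import pandas as pd
from spmf import EMMA

# One event per row; column names follow the Input table below (assumed to be what run_pandas expects).
input_df = pd.DataFrame({"Time points": [1, 2, 3, 3, 6, 7, 7, 8, 9, 11], "Itemset": ["a", "a", "a", "b", "a", "a", "b", "c", "b", "d"]})

emma = EMMA(min_support=2, max_window=2, timestamp_present=True, transform=True)
output = emma.run_pandas(input_df)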

Input:

Time points Itemset
0 1 a
1 2 a
2 3 a
3 3 b
4 6 a
5 7 a
6 7 b
7 8 c
8 9 b
9 11 d

Output:

Frequent episode Support
0 a 5
1 b 3
2 a b 2
3 a -> a 3
4 a -> b 2
5 a -> a b 2
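
As a sanity check on the table above, the first three output rows can be reproduced by plain counting. This is a minimal sketch using only the standard library. It is not the EMMA algorithm and does not cover the serial episodes (the `->` rows), and it assumes that the support of a single event, or of events occurring together, is the number of time points at which they appear.

```python
from collections import Counter, defaultdict

# (time point, item) rows copied from the Input table above.
events = [
    (1, "a"), (2, "a"), (3, "a"), (3, "b"), (6, "a"),
    (7, "a"), (7, "b"), (8, "c"), (9, "b"), (11, "d"),
]
MIN_SUPPORT = 2  # same value as min_support in the example

# Support of a single event: how many rows mention it.
item_counts = Counter(item for _, item in events)
frequent_items = {item: n for item, n in item_counts.items() if n >= MIN_SUPPORT}

# Support of "a b": how many time points contain both a and b.
items_at = defaultdict(set)
for time_point, item in events:
    items_at[time_point].add(item)
ab_together = sum(1 for items in items_at.values() if {"a", "b"} <= items)

print(frequent_items)  # {'a': 5, 'b': 3}
print(ab_together)     # 2
```

Items c and d each occur once, below the minimum support of 2, which is consistent with their absence from the output.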

See examples for more details.

For a detailed explanation of the algorithm and parameters, refer to the corresponding webpage in the SPMF documentation.

Implementation Checklist

Sequential Pattern Mining

Algorithm Type Implemented
PrefixSpan Frequent Sequential Pattern
GSP Frequent Sequential Pattern
SPADE Frequent Sequential Pattern
CM-SPADE Frequent Sequential Pattern
SPAM Frequent Sequential Pattern
CM-SPAM Frequent Sequential Pattern
FAST Frequent Sequential Pattern
LAPIN Frequent Sequential Pattern
ClaSP Frequent Closed Sequential Pattern
CM-ClaSP Frequent Closed Sequential Pattern
CloFAST Frequent Closed Sequential Pattern
CloSpan Frequent Closed Sequential Pattern
BIDE+ Frequent Closed Sequential Pattern
Post Processing SPAM or PrefixSpan Frequent Closed Sequential Pattern
MaxSP Frequent Maximal Sequential Pattern
VMSP Frequent Maximal Sequential Pattern
FEAT Frequent Sequential Generator Pattern
FSGP Frequent Sequential Generator Pattern
VGEN Frequent Sequential Generator Pattern
NOSEP Non-overlapping Sequential Pattern
GoKrimp Compressing Sequential Pattern
TKS Top-k Frequent Sequential Pattern
TSP Top-k Frequent Sequential Pattern

Episode Mining

Algorithm Type Implemented
EMMA Frequent Episode
AFEM Frequent Episode
MINEPI Frequent Episode
MINEPI+ Frequent Episode
TKE Top-k Frequent Episodes
MaxFEM Maximal Frequent Episodes
POERM Episode Rules
POERM-ALL Episode Rules
POERMH Episode Rules
NONEPI Episode Rules
TKE-Rules Episode Rules
AFEM-Rules Episode Rules
EMMA-Rules Episode Rules
MINEPI+-Rules Episode Rules
HUE-SPAN High Utility Episodes
US-SPAN High Utility Episodes
TUP Top-k High Utility Episodes

Bibliography

Fournier-Viger, P., Lin, C.W., Gomariz, A., Gueniche, T., Soltani, A., Deng, Z., Lam, H. T. (2016).
The SPMF Open-Source Data Mining Library Version 2.
Proc. 19th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2016) Part III, Springer LNCS 9853, pp. 36-40.
