Python interface to the R package arules
arulespy is a Python module available from PyPI.
The arules module in arulespy provides an easy-to-install Python interface to the R package arules for association rule mining, built with rpy2.
The R arules package implements a comprehensive infrastructure for representing, manipulating and analyzing transaction data and patterns using frequent itemsets and association rules. The package also provides a wide range of interest measures and mining algorithms including the code of Christian Borgelt’s popular and efficient C implementations of the association mining algorithms Apriori and Eclat, and optimized C/C++ code for mining and manipulating association rules using sparse matrix representation.
The arulesViz module provides plot() for visualizing association rules using the R package arulesViz.
arulespy provides Python classes for
- Transactions: Convert pandas dataframes into transaction data
- Rules: Association rules
- Itemsets: Itemsets
with Python-style slicing and len().
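As a minimal sketch of what Python-style slicing and len() allow, the helper below works on any object that supports both. The name preview() is hypothetical and not part of arulespy; a plain list stands in for a Rules object so the snippet runs without R installed.

```python
def preview(collection, n=3):
    """Return the total size and the first n elements.

    Uses only len() and slicing, the two operations the arulespy
    classes are described as supporting.
    """
    return len(collection), collection[0:n]


# Stand-in data (assumption): with arulespy installed, a Rules,
# Itemsets or Transactions object could be passed instead.
size, first = preview(["rule 1", "rule 2", "rule 3", "rule 4"], n=2)
print(size, first)
```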
Most arules functions are interfaced with conversion from the R data structures to Python.
Documentation is available in Python via help(). Detailed documentation for the R package is available online.
Low-level arules functions can also be used directly in the form arules.r.<arules R function>(). The result will be an rpy2 data type.
Transactions, itemsets and rules can be converted manually to Python classes using the helper function a2p().
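The two-step pattern described above (call a low-level function, then convert the result) could be wrapped roughly as follows. This is a sketch under assumptions: call_low_level() is a hypothetical helper, it assumes a2p() is exposed on the arules module, and a stub object replaces the real module so the snippet runs without R. Conversion only makes sense when the R function returns transactions, itemsets or rules.

```python
from types import SimpleNamespace


def call_low_level(arules_module, function_name, *args, **kwargs):
    """Call arules.r.<function_name>() and convert the result with a2p()."""
    r_function = getattr(arules_module.r, function_name)
    r_result = r_function(*args, **kwargs)  # an rpy2 data type
    return arules_module.a2p(r_result)      # converted to a Python class


# Stub standing in for `from arulespy import arules` (assumption):
# `identity` is a placeholder, not a real arules function.
fake_arules = SimpleNamespace(
    r=SimpleNamespace(identity=lambda x: x),
    a2p=lambda obj: ("converted", obj),
)

print(call_low_level(fake_arules, "identity", "r-object"))
```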
Installation
arulespy is based on the Python package rpy2, which requires an R installation. Here are the installation steps:
- Install the latest version of R from https://www.r-project.org/
- Install required libraries/set path depending on your OS:
  - libcurl is needed by R package curl.
    - Ubuntu: sudo apt-get install libcurl4-openssl-dev
    - MacOS: brew install curl
    - Windows: no installation necessary
  - Environment variable R_HOME needs to be set for Windows
- Install arulespy, which will automatically install rpy2 and pandas: pip install arulespy
- Optional: Set the environment variable R_LIBS to decide where R packages are stored. If not set, R will determine a suitable location.
The most likely source of installation problems is rpy2. Run python -m rpy2.situation to check whether R and R's libraries are found.
Details can be found here.
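Before running python -m rpy2.situation, it can help to look at the settings the installation steps mention. The following is a small sketch using only the standard library; r_environment_report() is a hypothetical helper, not part of arulespy or rpy2, and it only reports values without changing anything.

```python
import os
import shutil


def r_environment_report(environ=None):
    """Collect the R-related settings named in the installation steps."""
    env = os.environ if environ is None else environ
    return {
        "R_HOME": env.get("R_HOME"),     # needs to be set on Windows
        "R_LIBS": env.get("R_LIBS"),     # optional R package location
        "R_on_PATH": shutil.which("R"),  # None if no R executable is found
    }


for name, value in r_environment_report().items():
    print(f"{name}: {value if value else '(not set / not found)'}")
```

A missing R_HOME is only a concern on Windows, and an unset R_LIBS is fine because R then picks a location itself.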
Example
from arulespy import arules
import pandas as pd

df = pd.DataFrame(
    [
        [True, True, True],
        [True, False, False],
        [True, True, True],
        [True, False, False],
        [True, True, True]
    ],
    columns=list('ABC'))

# convert dataframe to transactions
trans = arules.transactions(df)

# mine association rules
rules = arules.apriori(trans,
    parameter = arules.parameters({"supp": 0.1, "conf": 0.8}),
    control = arules.parameters({"verbose": False}))

# display the rules
rules.as_df()
   LHS    RHS  support  confidence  coverage      lift  count
1  {}     {A}      1.0         1.0       1.0  1.000000      5
2  {B}    {C}      0.6         1.0       0.6  1.666667      3
3  {C}    {B}      0.6         1.0       0.6  1.666667      3
4  {B}    {A}      0.6         1.0       0.6  1.000000      3
5  {C}    {A}      0.6         1.0       0.6  1.000000      3
6  {B,C}  {A}      0.6         1.0       0.6  1.000000      3
7  {A,B}  {C}      0.6         1.0       0.6  1.666667      3
8  {A,C}  {B}      0.6         1.0       0.6  1.666667      3
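To make the columns in the output above easier to read, the measures for a single rule can be recomputed by hand. The sketch below uses plain Python and the standard textbook definitions of support, confidence, coverage and lift; it does not call arules, and rule_measures() is a hypothetical helper. The boolean rows are the same as in the example dataframe.

```python
columns = list("ABC")
rows = [
    [True, True, True],
    [True, False, False],
    [True, True, True],
    [True, False, False],
    [True, True, True],
]

# Each row becomes the set of items whose column is True.
transactions = [
    {item for item, present in zip(columns, row) if present}
    for row in rows
]


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(set(itemset) <= t for t in transactions)


def rule_measures(lhs, rhs, transactions):
    """Support, confidence, coverage, lift and count for lhs => rhs."""
    n = len(transactions)
    count = count_containing(set(lhs) | set(rhs), transactions)
    support = count / n
    coverage = count_containing(lhs, transactions) / n  # support of the LHS
    confidence = support / coverage
    lift = confidence / (count_containing(rhs, transactions) / n)
    return {
        "support": support,
        "confidence": confidence,
        "coverage": coverage,
        "lift": lift,
        "count": count,
    }


# Rule 2 in the table: {B} => {C}
print(rule_measures({"B"}, {"C"}, transactions))
```

For {B} => {C} this gives support 0.6, confidence 1.0, coverage 0.6, a lift of about 1.67 and a count of 3, which agrees with row 2 of the table.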
References
- Michael Hahsler, Sudheer Chelluboina, Kurt Hornik, and Christian Buchta. The arules R-package ecosystem: Analyzing interesting patterns from large transaction datasets. Journal of Machine Learning Research, 12:1977-1981, 2011.
- Michael Hahsler, Bettina Grün and Kurt Hornik. arules - A Computational Environment for Mining Association Rules and Frequent Item Sets. Journal of Statistical Software, 14(15), 2005.
- Michael Hahsler. A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules, 2015, URL: https://mhahsler.github.io/arules/docs/measures.
- Michael Hahsler. An R Companion for Introduction to Data Mining: Chapter 5, 2021, URL: https://mhahsler.github.io/Introduction_to_Data_Mining_R_Examples/book/