multi-purpose association rules analysis

Arules - multi-purpose association rules

Arules is an open-source Python package for creating association rules over tabular data (a pandas DataFrame). While standard association rules require transactional data, arules treats association rules as an analysis utility for categorical data. The package also supports association rules over continuous data by applying binning methods. Some basic methods are included in the package, and users can define their own binning functions.
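To illustrate how a table of categorical columns can be read as transactional data, here is a minimal sketch using only pandas. It is not the arules implementation: the helper `row_to_items` and the toy values are hypothetical, and the idea is simply that each row becomes a set of (column, value) items.

```python
import pandas as pd

# Toy categorical table (made-up values, not the anes96 data).
df = pd.DataFrame({
    "vote": ["Clinton", "Dole", "Clinton"],
    "educ": ["Some college", "Some college", "Master's degree"],
})


def row_to_items(row):
    """Hypothetical helper: turn one row into a set of (column, value) items."""
    return {(col, val) for col, val in row.items()}


# One "transaction" per row of the table.
transactions = [row_to_items(row) for _, row in df.iterrows()]
```

With this view, a rule such as `{('educ', 'Some college')} -> {('vote', 'Clinton')}` is an ordinary association rule over those item sets.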

Installation

Python 3.6+ | Linux, Mac OS X, Windows

pip install -U arules

Getting Started

Let's create association rules over some tabular data:

import pandas as pd

anes96 = pd.read_csv("anes96.csv")
anes96.head()

| popul | TVnews | selfLR                 | ClinLR            | DoleLR                | PID              | age  | educ                 | income                   | vote    | logpopul           |
|-------|--------|------------------------|-------------------|-----------------------|------------------|------|----------------------|--------------------------|---------|--------------------|
| 0.0   | 7.0    | Extremely Conservative | Extremely liberal | Conservative          | Strong Republica | 36.0 | High school graduate | None or less than $2,999 | Dole    | -2.302585092994045 |
| 190.0 | 1.0    | Slightly liberal       | Slightly liberal  | Slightly conservative | Weak Democrat    | 20.0 | Some college         | None or less than $2,999 | Clinton | 5.247550249494384  |
| 31.0  | 7.0    | Liberal                | Liberal           | Conservative          | Weak Democrat    | 24.0 | Master's degree      | None or less than $2,999 | Clinton | 3.4372078191851885 |
| 83.0  | 4.0    | Slightly liberal       | Moderate          | Slightly conservative | Weak Democrat    | 28.0 | Master's degree      | None or less than $2,999 | Clinton | 4.4200447018614035 |
| 640.0 | 7.0    | Slightly conservative  | Conservative      | Moderate              | Strong Democrat  | 68.0 | Master's degree      | None or less than $2,999 | Clinton | 6.461624414147957  |

Note that the table contains both categorical and continuous fields (the latter can be handled using a selected binning method). Now we use arules to extract association rules according to a specification of interest:

import arules as ar
from arules.utils import five_quantile_based_bins, top_bottom_10, top_5_variant_variables

rules, supp_dict = ar.create_association_rules(anes96, max_cols=2, binning_method=five_quantile_based_bins)
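For intuition about what quantile-based binning does to a continuous column such as `age`, the sketch below uses `pandas.qcut` directly. It is an assumption that `five_quantile_based_bins` behaves along these lines; the helper `quantile_bins` and the sample values are hypothetical and not part of arules.

```python
import pandas as pd

# Toy ages (illustrative values only).
ages = pd.Series([36, 20, 24, 28, 68, 45, 52, 61, 33, 40], name="age")


def quantile_bins(series, n_bins=5):
    """Hypothetical helper: replace each value with the label of its quantile bin."""
    # qcut picks bin edges so that each bin holds roughly the same number of rows.
    return pd.qcut(series, q=n_bins, duplicates="drop").astype(str)


binned = quantile_bins(ages)
```

After binning, the column holds interval labels instead of numbers, so it can be treated like any other categorical field.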

Once the calculation is done, we can present a selection of the rules for analysis:

ar.present_rules_per_consequent(rules, consequent={'vote': 'Clinton'},
                                selection_function=top_5_variant_variables, drop_dups=True,
                                plot=True)

[Figures: PID rules, selfLR rules, ClinLR rules, DoleLR rules, Income rules]

Because we set the consequent to {'vote':'Clinton'}, the presented rules reflect how likely an individual is to vote for Clinton given the respective feature. For example, consider the income variable above: a person with an income of 3,000-4,999 (a group that, according to the bar chart, makes up 1% of the sample) is approximately 1.6 times more likely than average to vote for Clinton, while a person with an income of 90,000-104,999 (4% of the sample, according to the bar chart) is approximately 1.4 times less likely to vote for Clinton.
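The "times more likely than average" reading corresponds to what is usually called the lift of a rule. The sketch below computes it with plain pandas on made-up data. It assumes lift is the quantity being plotted, and the function `lift` is a hypothetical helper, not part of the arules API.

```python
import pandas as pd

# Toy data (made-up values, not the anes96 data).
df = pd.DataFrame({
    "income": ["low", "low", "low", "high", "high", "high", "high", "high"],
    "vote": ["Clinton", "Clinton", "Clinton", "Clinton", "Dole", "Dole", "Dole", "Dole"],
})


def lift(data, antecedent, consequent):
    """Hypothetical helper: support of the antecedent and lift of antecedent -> consequent."""
    a_col, a_val = antecedent
    c_col, c_val = consequent
    has_a = data[a_col] == a_val
    has_c = data[c_col] == c_val
    support_a = has_a.mean()          # share of rows matching the antecedent
    base_rate = has_c.mean()          # P(consequent) over the whole sample
    confidence = has_c[has_a].mean()  # P(consequent | antecedent)
    return support_a, confidence / base_rate


support_low, lift_low = lift(df, ("income", "low"), ("vote", "Clinton"))
support_high, lift_high = lift(df, ("income", "high"), ("vote", "Clinton"))
```

In this toy table the "low" group has a lift of 2.0 (twice as likely as average to vote for Clinton), while the "high" group has a lift of 0.4, which reads as 1 / 0.4 = 2.5 times less likely.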

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Versioning

We use SemVer for versioning.

Authors

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
