Algorithms for association rule mining

Temporal Generalized Association Rules

This library provides four algorithms for association rule mining. Install it as a package with:

pip install TemporalGeneralizedRules

The algorithms are:

  • Apriori
  • Cumulate
  • HTAR
  • HTGAR

These algorithms take a transactional dataset, which is transformed to a vertical format for optimization. The dataset MUST have the following format:

order_id product_name
1 Bread
1 Milk
2 Bread
2 Beer
3 Eggs

Or if timestamps are provided:

order_id timestamp product_name
1 852087600 Bread
1 852087600 Milk
2 854420400 Bread
2 854420400 Beer
3 854420400 Eggs
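As a rough illustration of the vertical format mentioned above, the sketch below maps each item to the set of order ids that contain it. This is not the library's internal code; `to_vertical` is a hypothetical helper written only for this example.

```python
from collections import defaultdict


def to_vertical(rows):
    """Map each item to the set of order ids that contain it."""
    vertical = defaultdict(set)
    for order_id, item in rows:
        vertical[item].add(order_id)
    return dict(vertical)


# Same rows as the untimestamped example table.
rows = [(1, "Bread"), (1, "Milk"), (2, "Bread"), (2, "Beer"), (3, "Eggs")]
vertical = to_vertical(rows)

# In a vertical layout, the orders containing an itemset are the
# intersection of the id sets of its items.
bread_and_milk = vertical["Bread"] & vertical["Milk"]
```

With this layout, counting how many orders contain an itemset reduces to a set intersection, which is why vertical formats are a common optimization.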

For the taxonomy file, use the following format (do not include a header row):

child parent
Bread Dairy
Milk Dairy
Beer Beverage

Use one line for each child, parent pair.

Each field is separated by ","
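The following sketch writes both files with Python's standard `csv` module, using "," as the separator. The helper `write_csv` and the temporary directory are assumptions made for the example. Whether the dataset file should keep the header row shown in the table is not stated here, so the header is optional in the helper and writing it is an assumption.

```python
import csv
import os
import tempfile


def write_csv(path, rows, header=None):
    """Write rows as comma-separated lines, with an optional header row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)


workdir = tempfile.mkdtemp()
dataset_path = os.path.join(workdir, "dataset.csv")
taxonomy_path = os.path.join(workdir, "taxonomy.csv")

# Assumption: the dataset keeps the header row shown in the table.
write_csv(
    dataset_path,
    [(1, "Bread"), (1, "Milk"), (2, "Bread"), (2, "Beer"), (3, "Eggs")],
    header=("order_id", "product_name"),
)

# The taxonomy has no header: one "child,parent" pair per line.
write_csv(
    taxonomy_path,
    [("Bread", "Dairy"), ("Milk", "Dairy"), ("Beer", "Beverage")],
)
```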

TGAR

TGAR is the main class. Instantiate it once and call the algorithms as methods on the instance.

Usage

import TemporalGeneralizedRules

tgar = TemporalGeneralizedRules.TGAR()

Apriori

This algorithm has four parameters:

  • filepath: Path to the dataset CSV file, in the format described above.
  • min_supp: Minimum support threshold.
  • min_conf: Minimum confidence threshold.
  • parallel_count: Optional parameter that enables parallelization in the candidate counting phase of the algorithm. It can make execution faster.

Usage

tgar.apriori("dataset.csv", 0.05, 0.5)
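To make the two thresholds concrete, here is a minimal sketch of the standard definitions of support and confidence. It is independent of the library; `support` and `confidence` are hypothetical helpers, and the sketch assumes the antecedent occurs in at least one transaction.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, over support of antecedent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / support(transactions, antecedent)


# The three orders from the example dataset.
transactions = [{"Bread", "Milk"}, {"Bread", "Beer"}, {"Eggs"}]

supp = support(transactions, {"Bread", "Milk"})       # 1 of 3 orders
conf = confidence(transactions, {"Bread"}, {"Milk"})  # (1/3) / (2/3)
```

A rule such as Bread -> Milk is kept only if its support is at least min_supp and its confidence is at least min_conf.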

Cumulate

This algorithm has six parameters:

  • filepath: Path to the dataset CSV file, in the format described above.
  • taxonomy_filepath: Path to the taxonomy CSV file, in the format described above.
  • min_supp: Minimum support threshold.
  • min_conf: Minimum confidence threshold.
  • min_r: Minimum R-interesting threshold.
  • parallel_count: Optional parameter that enables parallelization in the candidate counting phase of the algorithm. It can make execution faster.

Usage

tgar.cumulate("dataset.csv", "taxonomy.csv", 0.05, 0.5, 1.1)
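Generalized rule mining in the style of Cumulate typically extends each transaction with the ancestors of its items, so that rules can also be found at higher taxonomy levels. The sketch below shows that idea only. It is not the library's implementation; `ancestors` and `extend` are hypothetical helpers, and the taxonomy is assumed to contain no cycles.

```python
def ancestors(item, parent_of):
    """Walk up the taxonomy and collect every ancestor of item."""
    found = []
    while item in parent_of:
        item = parent_of[item]
        found.append(item)
    return found


def extend(transaction, parent_of):
    """Return the transaction plus the ancestors of each of its items."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, parent_of))
    return extended


# child -> parent pairs, as in the example taxonomy file.
parent_of = {"Bread": "Dairy", "Milk": "Dairy", "Beer": "Beverage"}

extended = extend({"Bread", "Beer"}, parent_of)
```

After extension, an order containing Bread and Beer also counts toward the support of Dairy and Beverage.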

HTAR

This algorithm has four parameters:

  • filepath: Path to the dataset CSV file, in the format described above.
  • min_supp: Minimum support threshold.
  • min_conf: Minimum confidence threshold.
  • parallel_count: Optional parameter that enables parallelization in the candidate counting phase of the algorithm. It can make execution faster.

Usage

tgar.htar("dataset.csv", 0.05, 0.5)
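This page does not describe how HTAR uses the timestamp column. As a general illustration of temporal mining, the sketch below groups timestamped rows into calendar months, assuming the timestamps are Unix seconds interpreted in UTC. The helper `group_by_month` and the choice of monthly periods are assumptions for the example, not the library's behavior.

```python
from collections import defaultdict
from datetime import datetime, timezone


def group_by_month(rows):
    """Group (order_id, timestamp, item) rows by (year, month), then by order."""
    periods = defaultdict(lambda: defaultdict(set))
    for order_id, ts, item in rows:
        # Assumption: timestamps are Unix seconds, read as UTC.
        when = datetime.fromtimestamp(ts, tz=timezone.utc)
        periods[(when.year, when.month)][order_id].add(item)
    return periods


# Same rows as the timestamped example table.
rows = [
    (1, 852087600, "Bread"),
    (1, 852087600, "Milk"),
    (2, 854420400, "Bread"),
    (2, 854420400, "Beer"),
    (3, 854420400, "Eggs"),
]
periods = group_by_month(rows)
```

Support can then be measured within each period rather than over the whole dataset, which lets rules that hold only during certain periods surface.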

HTGAR

This algorithm has six parameters:

  • filepath: Path to the dataset CSV file, in the format described above.
  • taxonomy_filepath: Path to the taxonomy CSV file, in the format described above.
  • min_supp: Minimum support threshold.
  • min_conf: Minimum confidence threshold.
  • min_r: Minimum R-interesting threshold.
  • parallel_count: Optional parameter that enables parallelization in the candidate counting phase of the algorithm. It can make execution faster.

Usage

tgar.htgar("dataset.csv", "taxonomy.csv", 0.05, 0.5, 1.1)
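The min_r threshold can be read against the usual notion of R-interestingness for generalized rules: a specialized itemset is interesting if its support is at least R times the support expected from its ancestor itemset. The sketch below follows that common definition, support only. Whether the library applies exactly this test is an assumption, the helper names are hypothetical, and the numbers are made up for illustration.

```python
def expected_support(ancestor_support, specializations):
    """Scale the ancestor itemset's support by supp(item) / supp(parent)
    for every item that was specialized."""
    expected = ancestor_support
    for item_support, parent_support in specializations:
        expected *= item_support / parent_support
    return expected


def is_r_interesting(actual_support, ancestor_support, specializations, min_r):
    """True if the actual support is at least min_r times the expected one."""
    expected = expected_support(ancestor_support, specializations)
    return actual_support >= min_r * expected


# Illustrative numbers only: {Dairy, Beer} has support 0.10,
# Milk has support 0.20 and its parent Dairy has support 0.40.
expected = expected_support(0.10, [(0.20, 0.40)])
interesting = is_r_interesting(0.08, 0.10, [(0.20, 0.40)], min_r=1.1)
```

Here the expected support of {Milk, Beer} is about 0.05, so an actual support of 0.08 passes a min_r of 1.1.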

Pypy

For better performance, we recommend running this package with PyPy, an alternative implementation of Python that is often faster.

https://www.pypy.org/

Bibliography

The algorithms provided in this library were based on the following papers:
