A package for association analysis using the ECLAT method.
pyECLAT
Unlike the Apriori method, the ECLAT method is not based on the calculation of confidence and lift; it is based solely on the support of conjunctions of variables (items).
pyECLAT is a simple package for associating variables based on the support of the different items of a dataframe. The method returns two dictionaries: one with the frequency of occurrence of each item conjunction and one with the support of each item conjunction.
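To make "support of an item conjunction" concrete, here is a minimal sketch in plain Python. It uses the vertical (transaction id set) layout usually associated with ECLAT; it is not taken from the pyECLAT source, and the helper names `build_tidsets` and `conjunction_support` are hypothetical. The transactions are copied from the sample purchase table in the How to use section.

```python
from collections import defaultdict

# Rows copied from the sample purchase table in this README (NaN cells dropped).
transactions = [
    ["milk", "beer", "bread", "butter"],
    ["coffe", "bread", "butter"],
    ["coffe", "bread", "butter"],
    ["milk", "coffe", "bread", "butter"],
    ["beer"],
    ["butter"],
    ["bread"],
    ["bean"],
    ["rice", "bean"],
    ["rice"],
]


def build_tidsets(rows):
    """Map each item to the set of row indexes (transaction ids) containing it."""
    tidsets = defaultdict(set)
    for tid, row in enumerate(rows):
        for item in row:
            tidsets[item].add(tid)
    return dict(tidsets)


def conjunction_support(tidsets, items, n_rows):
    """Return (frequency, support) for rows that contain every item in `items`."""
    common = set.intersection(*(tidsets[item] for item in items))
    return len(common), len(common) / n_rows


tidsets = build_tidsets(transactions)
freq, supp = conjunction_support(tidsets, ["bread", "butter"], len(transactions))
print(freq, supp)  # 4 0.4
```

Bread and butter appear together in rows 0 to 3, so the frequency is 4 and the support is 4/10.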
Install
Via pip
pip3 install pyECLAT
Via github
git clone https://github.com/jeffrichardchemistry/pyECLAT
cd pyECLAT
python3 setup.py install
Dependencies
numpy>=1.17.4, pandas>=0.25.3, tqdm>=4.41.1
How to use
The package ships with two example dataframes, which can be loaded with:
from pyECLAT import Example1, Example2
ex1 = Example1().get()
ex2 = Example2().get()
The working dataframe should look like the one below. In this case, each row represents a customer's purchase at a supermarket.
| | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | milk | beer | bread | butter |
| 1 | coffe | bread | butter | NaN |
| 2 | coffe | bread | butter | NaN |
| 3 | milk | coffe | bread | butter |
| 4 | beer | NaN | NaN | NaN |
| 5 | butter | NaN | NaN | NaN |
| 6 | bread | NaN | NaN | NaN |
| 7 | bean | NaN | NaN | NaN |
| 8 | rice | bean | NaN | NaN |
| 9 | rice | NaN | NaN | NaN |
This package works directly with a pandas dataframe that has no column names. For example, to build your dataframe:
import pandas as pd
dataframe = pd.read_csv('dir/of/file.csv', header=None)
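As a quick check of what `header=None` produces, the sketch below reads a small in-memory CSV instead of a file on disk. The CSV text and the use of `io.StringIO` as a stand-in for a file path are assumptions made for illustration only.

```python
import io

import pandas as pd

# Hypothetical CSV content: one purchase per line, no header row.
csv_text = "milk,beer,bread,butter\ncoffe,bread,butter\nbeer\n"

# header=None keeps the first line as data and numbers the columns 0..n-1.
# Lines with fewer fields than the first line are padded with NaN.
dataframe = pd.read_csv(io.StringIO(csv_text), header=None)

print(dataframe)
```

Note that this relies on the first line being the longest one. With pandas defaults, a later line holding more fields than the first line is expected to raise a parsing error.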
Run ECLAT method:
from pyECLAT import ECLAT
eclat_instance = ECLAT(data=dataframe, verbose=True)  # verbose=True shows a progress bar
Creating eclat_instance automatically generates a binary dataframe, along with other attributes that can be accessed:
eclat_instance.df_bin  # binary dataframe, which can be reused for other analyses
eclat_instance.uniq_  # list with the names of all distinct items
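For readers unfamiliar with the term, a binary representation flags with 1 or 0 whether each item is present in each transaction. The plain-Python sketch below only illustrates that idea; it does not reproduce pyECLAT's code, and the orientation and column order of the real `df_bin` may differ. The names `unique_items` and `binary_rows` are hypothetical.

```python
# Three rows taken from the sample purchase table (NaN cells dropped).
rows = [
    ["milk", "beer", "bread", "butter"],
    ["coffe", "bread", "butter"],
    ["beer"],
]

# Distinct item names, similar in spirit to `uniq_`.
unique_items = sorted({item for row in rows for item in row})

# One list per transaction, one 0/1 flag per distinct item.
binary_rows = [[1 if item in row else 0 for item in unique_items] for row in rows]

print(unique_items)
for flags in binary_rows:
    print(flags)
```

Each inner list lines up with `unique_items`, so the same structure could be loaded into a pandas DataFrame with the item names as column labels.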
eclat_instance.support, eclat_instance.fit and eclat_instance.fit_all are the methods that perform the calculations. Example:
get_ECLAT_indexes, get_ECLAT_supports = eclat_instance.fit(min_support=0.08,
                                                           min_combination=1,
                                                           max_combination=3,
                                                           separator=' & ',
                                                           verbose=True)
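The role of each parameter can be illustrated with a small stand-alone sketch. The function `eclat_sketch` below is hypothetical and is not the library's implementation. It assumes that conjunctions of `min_combination` to `max_combination` items are enumerated, that a conjunction is kept when its support is at least `min_support` (whether the library uses `>=` or `>` is an assumption), and that dictionary keys are item names joined by `separator`. The first dictionary here stores matching row indexes, as the variable name `get_ECLAT_indexes` suggests; the frequency is then the length of each list.

```python
from itertools import combinations


def eclat_sketch(rows, min_support=0.08, min_combination=1,
                 max_combination=3, separator=" & "):
    """Hypothetical illustration of the fit parameters, not pyECLAT's code."""
    n_rows = len(rows)

    # Vertical layout: item -> set of row indexes where it occurs.
    tidsets = {}
    for tid, row in enumerate(rows):
        for item in row:
            tidsets.setdefault(item, set()).add(tid)

    indexes, supports = {}, {}
    for size in range(min_combination, max_combination + 1):
        for combo in combinations(sorted(tidsets), size):
            # Rows containing every item of the conjunction.
            common = set.intersection(*(tidsets[item] for item in combo))
            support = len(common) / n_rows
            if support >= min_support:  # assumed inclusive threshold
                key = separator.join(combo)
                indexes[key] = sorted(common)
                supports[key] = support
    return indexes, supports


# Rows copied from the sample purchase table (NaN cells dropped).
rows = [
    ["milk", "beer", "bread", "butter"],
    ["coffe", "bread", "butter"],
    ["coffe", "bread", "butter"],
    ["milk", "coffe", "bread", "butter"],
    ["beer"], ["butter"], ["bread"], ["bean"], ["rice", "bean"], ["rice"],
]

indexes, supports = eclat_sketch(rows, min_support=0.3,
                                 min_combination=2, max_combination=2)
print(indexes)
print(supports)
```

With these settings only pairs are considered, and only the pairs found in at least 3 of the 10 rows survive the threshold. This brute-force enumeration grows quickly with the number of distinct items, so it is suitable only for small illustrations.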
The documentation and description of each method can be accessed with help():
help(eclat_instance.fit)
help(eclat_instance.fit_all)
help(eclat_instance.support)