Clustering of association rules based on user-defined thresholds.
coar
coar is an implementation of clustering of association rules based on user-defined thresholds.
Installation
Use the package manager pip to install coar.
pip install coar
Usage
The example below clusters association rules mined with CleverMiner, using a modified version of the CleverMiner quickstart example. Install cleverminer first:
pip install cleverminer
Mining association rules using cleverminer:
# imports
import json
import pandas as pd
from cleverminer import cleverminer

# getting the source file
df = pd.read_csv(
    'https://www.cleverminer.org/hotel.zip',
    encoding='cp1250',
    sep='\t'
)

# selecting the columns
df = df[['VTypeOfVisit', 'GState', 'GCity']]

# mining association rules
clm = cleverminer(
    df=df, proc='4ftMiner',
    quantifiers={'conf': 0.6, 'Base': 50},
    ante={
        'attributes': [
            {'name': 'GState', 'type': 'subset', 'minlen': 1, 'maxlen': 1},
            {'name': 'GCity', 'type': 'subset', 'minlen': 1, 'maxlen': 1},
        ], 'minlen': 1, 'maxlen': 2, 'type': 'con'},
    succ={
        'attributes': [
            {'name': 'VTypeOfVisit', 'type': 'subset', 'minlen': 1, 'maxlen': 1}
        ], 'minlen': 1, 'maxlen': 1, 'type': 'con'},
)

# saving rules to file
with open('rules.json', 'w') as save_file:
    save_file.write(json.dumps(clm.rulelist))
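The clustering step reads only a few fields from each saved rule: `cedents_str['ante']`, `cedents_str['succ']`, `params['rel_base']` and `params['conf']`. The following sketch shows that extraction on a single hand-written rule. The attribute values and numbers are made up for illustration, the helper `rule_to_record` is hypothetical, and real entries of `clm.rulelist` may well carry additional fields.

```python
import json

# Hypothetical rule containing only the fields the clustering example reads.
sample_rules = [{
    'cedents_str': {
        'ante': 'GState(CA) & GCity(Los Angeles)',
        'succ': 'VTypeOfVisit(Business)',
    },
    'params': {'rel_base': 0.12, 'conf': 0.71},
}]


def rule_to_record(rule):
    """Turn one saved rule into a flat record with cedents as sets."""
    return {
        'antecedent': set(rule['cedents_str']['ante'].split(' & ')),
        'succedent': set(rule['cedents_str']['succ'].split(' & ')),
        'support': rule['params']['rel_base'],
        'confidence': rule['params']['conf'],
    }


# Round-trip through JSON, mirroring the save to and load from rules.json.
loaded = json.loads(json.dumps(sample_rules))
records = [rule_to_record(rule) for rule in loaded]
```

Splitting on `' & '` yields one string per attribute, so a two-attribute antecedent becomes a two-element set.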
Clustering rules using coar:
# imports
import json
import pandas as pd
from coar.cluster import agglomerative_clustering, cluster_representative

# loading rules (the context manager closes the file afterwards)
with open('rules.json') as rule_file:
    rule_list = json.load(rule_file)

# creating dataframe: one row per rule, cedents stored as sets of attribute strings
df = pd.DataFrame.from_records([{
    'antecedent': set(rule['cedents_str']['ante'].split(' & ')),
    'succedent': set(rule['cedents_str']['succ'].split(' & ')),
    'support': rule['params']['rel_base'],
    'confidence': rule['params']['conf']
} for rule in rule_list])

# clustering
clustering = agglomerative_clustering(
    df,
    abs_ante_attr_diff_threshold=1,
    abs_succ_attr_diff_threshold=0,
    abs_supp_diff_threshold=1,
    abs_conf_diff_threshold=1,
)

# getting cluster representatives
clusters_repr = cluster_representative(clustering)
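The four thresholds are easiest to read as limits on how far two rules may differ and still be grouped together. The sketch below is one plausible interpretation and not coar's actual implementation: the helper `within_thresholds` is hypothetical, and measuring the attribute difference as the size of the symmetric set difference is an assumption.

```python
def within_thresholds(a, b, ante_attr=1, succ_attr=0, supp=1.0, conf=1.0):
    """Hypothetical check: True if rules a and b differ by no more than each threshold."""
    return (
        # number of attributes present in one antecedent but not the other
        len(a['antecedent'] ^ b['antecedent']) <= ante_attr
        and len(a['succedent'] ^ b['succedent']) <= succ_attr
        and abs(a['support'] - b['support']) <= supp
        and abs(a['confidence'] - b['confidence']) <= conf
    )


# Made-up rules for illustration only.
r1 = {'antecedent': {'GState(CA)'},
      'succedent': {'VTypeOfVisit(Business)'},
      'support': 0.12, 'confidence': 0.71}
r2 = {'antecedent': {'GState(CA)', 'GCity(Los Angeles)'},
      'succedent': {'VTypeOfVisit(Business)'},
      'support': 0.10, 'confidence': 0.75}
r3 = {'antecedent': {'GState(CA)'},
      'succedent': {'VTypeOfVisit(Leisure)'},
      'support': 0.12, 'confidence': 0.71}

close = within_thresholds(r1, r2)  # antecedents differ by one attribute
far = within_thresholds(r1, r3)    # succedents differ, but that threshold is 0
```

Under this reading, support and confidence thresholds of 1 impose no effective restriction, since both measures lie between 0 and 1, so the example above would group rules by attribute overlap alone.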
Contributing
If you find a bug 🐛, please open a bug report. If you have an idea for an improvement, new feature or enhancement 🚀, please open a feature request.
License