so4gp · PyPI

Some features may not work without JavaScript. Please try enabling it if you encounter problems.

Some Python optimization algorithms for mining gradual patterns.

These details have not been verified by PyPI

Project links

Meta

License: MIT License (MIT License Copyright (c) 2021 Dickson Owuor (Ph.D.) Permission is hereby granted, free of charge,...)
Author: Dickson Owuor
Maintainer: Dickson Owuor
Tags gradual patterns, GRAANK, ant-colony-optimization, data-mining, swarm-intelligence
Requires: Python >=3.9

Development Status
- 4 - Beta
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3.9
Topic
- Scientific/Engineering :: Artificial Intelligence

Project description

SO4GP stands for: "Some Optimizations for Gradual Patterns". SO4GP applies optimizations such as swarm intelligence, HDF5 chunks, cluster analysis and many others in order to improve the efficiency of extracting gradual patterns. It provides Python algorithm implementations for these optimization techniques. The algorithm implementations include:

(Classical) GRAANK algorithm for extracting GPs
Ant Colony Optimization algorithm for extracting GPs
Genetic Algorithm for extracting GPs
Particle Swarm Optimization algorithm for extracting GPs
Random Search algorithm for extracting GPs
Local Search algorithm for extracting GPs
Clustering-based algorithm for extracting GPs

A GP (Gradual Pattern) is a set of gradual items (GI) and its quality is measured by its computed support value. For example given a data set with 3 columns (age, salary, cars) and 10 objects. A GP may take the form: {age+, salary-} with a support of 0.8. This implies that 8 out of 10 objects have the values of column age 'increasing' and column 'salary' decreasing.

Install Requirements

Before running so4gp, make sure you install the following Python Packages:

pip3 install numpy>=1.23.2 pandas>=1.4.4 python-dateutil>=2.8.2 ypstruct>=0.0.2 scikit-learn>=1.1.2

Usage

In order to run each algorithm for the purpose of extracting GPs, follow the instructions that follow.

First and foremost, import the so4gp python package via:

import so4gp as sgp

1. GRAdual rANKing Algorithm for GPs (GRAANK)

This is the classical approach (initially proposed by Anne Laurent) for mining gradual patterns. All the remaining algorithms are variants of this algorithm.

mine_obj = sgp.GRAANK(data_source=f_path, min_sup=0.5, eq=False)
gp_json = mine_obj.discover()
print(gp_json)

where you specify the parameters as follows:

data_src - [required] data source {either a file in csv format or a Pandas DataFrame}
min_sup - [optional] minimum support default = 0.5
eq - [optional] encode equal values as gradual default = False

2. Ant Colony Optimization for GPs (ACO-GRAANK)

In this approach, it is assumed that every column can be converted into gradual item (GI). If the GI is valid (i.e. its computed support is greater than the minimum support threshold) then it is either increasing or decreasing (+ or -), otherwise it is irrelevant (x). Therefore, a pheromone matrix is built using the number of columns and the possible variations (increasing, decreasing, irrelevant) or (+, -, x). The algorithm starts by randomly generating GP candidates using the pheromone matrix, each candidate is validated by confirming that its computed support is greater or equal to the minimum support threshold. The valid GPs are used to update the pheromone levels and better candidates are generated.

mine_obj = sgp.AntGRAANK(data_src)
gp_json = mine_obj.discover()
print(gp_json)

where you specify the parameters as follows:

data_src - [required] data source {either a file in csv format or a Pandas DataFrame}
min_sup - [optional] minimum support default = 0.5
max_iteration - [optional] maximum number of iterations default = 1
evaporation_factor - [optional] evaporation factor default = 0.5

3. Genetic Algorithm for GPs (GA-GRAANK)

In this approach, it is assumed that every GP candidate may be represented as a binary gene (or individual) that has a unique position and cost. The cost is derived from the computed support of that candidate, the higher the support value the lower the cost. The aim of the algorithm is search through a population of individuals (or candidates) and find those with the lowest cost as efficiently as possible.

mine_obj = sgp.GeneticGRAANK(data_src)
gp_json = mine_obj.discover()
print(gp_json)

where you specify the parameters as follows:

data_src - [required] data source {either a file in csv format or a Pandas DataFrame}
min_sup - [optional] minimum support default = 0.5
max_iteration - [optional] maximum number of algorithm iterations default = 1
n_pop - [optional] initial population default = 5
pc - [optional] offspring population multiple default = 0.5
gamma - [optional] crossover rate default = 1
mu - [optional] mutation rate default = 0.9
sigma - [optional] mutation rate default = 0.9

4. Particle Swarm Optimization for GPs (PSO-GRAANK)

In this approach, it is assumed that every GP candidate may be represented as a particle that has a unique position and fitness. The fitness is derived from the computed support of that candidate, the higher the support value the higher the fitness. The aim of the algorithm is search through a population of particles (or candidates) and find those with the highest fitness as efficiently as possible.

mine_obj = sgp.ParticleGRAANK(data_src)
gp_json = mine_obj.discover()
print(gp_json)

where you specify the parameters as follows:

data_src - [required] data source {either a file in csv format or a Pandas DataFrame}
min_sup - [optional] minimum support default = 0.5
max_iteration - [optional] maximum number of algorithm iterations default = 1
n_particles - [optional] initial particle population default = 5
velocity - [optional] particle velocity default = 0.9
coeff_p - [optional] personal coefficient rate default = 0.01
coeff_g - [optional] global coefficient default = 0.9

5. Local Search for GPs (LS-GRAANK)

In this approach, it is assumed that every GP candidate may be represented as a position that has a cost value associated with it. The cost is derived from the computed support of that candidate, the higher the support value the lower the cost. The aim of the algorithm is search through group of positions and find those with the lowest cost as efficiently as possible.

mine_obj = sgp.HillClimbingGRAANK(data_src, min_sup)
gp_json = mine_obj.discover()
print(gp_json)

where you specify the parameters as follows:

data_src - [required] data source {either a file in csv format or a Pandas DataFrame}
min_sup - [optional] minimum support default = 0.5
max_iteration - [optional] maximum number of algorithm iterations default = 1
step_size - [optional] step size default = 0.5

6. Random Search for GPs (RS-GRAANK)

In this approach, it is assumed that every GP candidate may be represented as a position that has a cost value associated with it. The cost is derived from the computed support of that candidate, the higher the support value the lower the cost. The aim of the algorithm is search through group of positions and find those with the lowest cost as efficiently as possible.

import so4gp as sgp

mine_obj = sgp.RandomGRAANK(data_src, min_sup)
gp_json = mine_obj.discover()
print(gp_json)

where you specify the parameters as follows:

data_src - [required] data source {either a file in csv format or a Pandas DataFrame}
min_sup - [optional] minimum support default = 0.5
max_iteration - [optional] maximum number of algorithm iterations default = 1

7. Clustering algorithm for GPs (Clu-BFS)

We borrow the net-win concept used in the work 'Clustering Using Pairwise Comparisons' proposed by R. Srikant to the problem of extracting gradual patterns (GPs). In order to mine for GPs, each feature yields 2 gradual items which we use to construct a bitmap matrix comparing each row to each other (i.e., (r1,r2), (r1,r3), (r1,r4), (r2,r3), (r2,r4), (r3,r4)).

In this approach, we convert the bitmap matrices into 'net-win vectors'. Finally, we apply spectral clustering to determine which gradual items belong to the same group based on the similarity of net-win vectors. Gradual items in the same cluster should have almost similar score vector.

import so4gp as sgp

mine_obj = sgp.ClusterGP(data_source=data_src, min_sup=0.5, e_prob=0.1)
gp_json = mine_obj.discover()
print(gp_json)

where you specify the parameters as follows:

data_src - [required] data source {either a file in csv format or a Pandas DataFrame}
min_sup - [optional] minimum support default = 0.5
e_probability - [optional] erasure probability default = 0.5
max_iteration - [optional] maximum iterations for estimating score vectors default = 10

Sample Output

The default output is the format of JSON:

{
	"Algorithm": "RS-GRAANK",
	"Best Patterns": [
            [["Age+", "Salary+"], 0.6], 
            [["Expenses-", "Age+", "Salary+"], 0.6]
	],
	"Iterations": 20
}

References

Owuor, D., Runkler T., Laurent A., Menya E., Orero J (2021), Ant Colony Optimization for Mining Gradual Patterns. International Journal of Machine Learning and Cybernetics. https://doi.org/10.1007/s13042-021-01390-w
Dickson Owuor, Anne Laurent, and Joseph Orero (2019). Mining Fuzzy-temporal Gradual Patterns. In the proceedings of the 2019 IEEE International Conference on Fuzzy Systems (FuzzIEEE). IEEE. https://doi.org/10.1109/FUZZ-IEEE.2019.8858883.
Laurent A., Lesot MJ., Rifqi M. (2009) GRAANK: Exploiting Rank Correlations for Extracting Gradual Itemsets. In: Andreasen T., Yager R.R., Bulskov H., Christiansen H., Larsen H.L. (eds) Flexible Query Answering Systems. FQAS 2009. Lecture Notes in Computer Science, vol 5822. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04957-6_33

Project details

These details have not been verified by PyPI

Project links

Meta

License: MIT License (MIT License Copyright (c) 2021 Dickson Owuor (Ph.D.) Permission is hereby granted, free of charge,...)
Author: Dickson Owuor
Maintainer: Dickson Owuor
Tags gradual patterns, GRAANK, ant-colony-optimization, data-mining, swarm-intelligence
Requires: Python >=3.9

Development Status
- 4 - Beta
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3.9
Topic
- Scientific/Engineering :: Artificial Intelligence

Release history Release notifications | RSS feed

0.6.5

Nov 13, 2024

0.6.4

Nov 12, 2024

0.6.3

Nov 9, 2024

0.6.2

Nov 9, 2024

0.6.1

Nov 9, 2024

0.6.0

Nov 8, 2024

0.5.9

Nov 8, 2024

0.5.8

Nov 7, 2024

0.5.7

Nov 7, 2024

This version

0.5.6

Nov 7, 2024

0.5.5

Nov 5, 2024

0.5.4

Nov 4, 2024

0.5.3

Nov 4, 2024

0.5.2

Oct 30, 2024

0.5.1

Oct 29, 2024

0.5.0

Oct 29, 2024

0.4.9

Oct 29, 2024

0.4.7

Oct 26, 2024

0.4.6.5

Jul 31, 2024

0.4.6

Jul 31, 2024

0.4.5

Jul 29, 2024

0.4.4

Jul 22, 2024

0.4.3

Jul 22, 2024

0.4.2

Jul 17, 2024

0.4.1

Jul 4, 2024

0.4.0

Jul 4, 2024

0.3.9

Jul 12, 2023

0.3.8

Oct 27, 2022

0.3.7

Oct 19, 2022

0.3.6

Oct 19, 2022

0.3.5

Oct 17, 2022

0.3.4

Oct 14, 2022

0.3.3

Oct 13, 2022

0.3.2

Oct 7, 2022

0.3.1

Oct 4, 2022

0.3.0

Oct 4, 2022

0.2.9

Sep 16, 2022

0.2.8

Sep 8, 2022

0.2.7

Sep 8, 2022

0.2.6

Sep 1, 2022

0.2.5

Aug 31, 2022

0.2.4

Jul 8, 2022

0.2.3

Jun 15, 2022

0.2.2

Apr 23, 2022

0.2.1

Apr 23, 2022

0.2.0

Apr 23, 2022

0.1.9

Apr 20, 2022

0.1.8

Apr 20, 2022

0.1.7

Mar 17, 2022

0.1.6

Mar 2, 2022

0.1.5

Mar 1, 2022

0.1.4

Jan 11, 2022

0.1.3

Jan 11, 2022

0.1.2

Jan 11, 2022

0.1.1

Jan 11, 2022

0.1.0

Jan 6, 2022

0.0.7

Jan 6, 2022

0.0.6

Jan 6, 2022

0.0.5

Sep 15, 2021

0.0.4

Sep 15, 2021

0.0.3

Jul 22, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

so4gp-0.5.6.tar.gz (40.1 kB view details)

Uploaded Nov 7, 2024 Source

Built Distribution

so4gp-0.5.6-py3-none-any.whl (39.8 kB view details)

Uploaded Nov 7, 2024 Python 3

File details

Details for the file so4gp-0.5.6.tar.gz.

File metadata

Download URL: so4gp-0.5.6.tar.gz
Upload date: Nov 7, 2024
Size: 40.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for so4gp-0.5.6.tar.gz
Algorithm	Hash digest
SHA256	`97fec21ef8a9ef6e98f77835db7d4a1a89bec6fc633fdf98d14963f445e75096`
MD5	`9f6c0b7a13db8f2c07e1380794921fe4`
BLAKE2b-256	`d96914436933debafd5bcc6f940f15c2670a3c0c0509f8ceba20f3f9841f83be`

See more details on using hashes here.

File details

Details for the file so4gp-0.5.6-py3-none-any.whl.

File metadata

Download URL: so4gp-0.5.6-py3-none-any.whl
Upload date: Nov 7, 2024
Size: 39.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for so4gp-0.5.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4e0c12984c9868b565b22c25bb13dd7c39fdaa0317b3c040178c61c0672a6b46`
MD5	`04b5df9cd8cca9d36092d8aff96bbe98`
BLAKE2b-256	`e62f50a70df0fd458038c0192e90fdbbe06b3549325ccc9225740f9f80af6438`

See more details on using hashes here.

Supported by

AWS

AWS Cloud computing and Security Sponsor

Datadog

Datadog Monitoring

Fastly

Google

Google Download Analytics

Microsoft

Microsoft PSF Sponsor

Pingdom

Pingdom Monitoring

Sentry

Sentry Error logging

StatusPage

StatusPage Status page