This software is being developed at the University of Aizu, Aizu-Wakamatsu, Fukushima, Japan

Project description

Introduction


PAttern MIning (PAMI) is a Python library containing several algorithms to discover user interest-based patterns in a wide spectrum of datasets across multiple computing platforms. Useful links for using this library are provided below:

  1. Youtube tutorial https://www.youtube.com/playlist?list=PLKP768gjVJmDer6MajaLbwtfC9ULVuaCZ

  2. Tutorials (Notebooks) https://github.com/UdayLab/PAMI/tree/main/notebooks

  3. User manual https://udaylab.github.io/PAMI/manuals/index.html

  4. Coders manual https://udaylab.github.io/PAMI/codersManual/index.html

  5. Code documentation

  6. Datasets https://u-aizu.ac.jp/~udayrage/datasets.html

  7. Discussions on PAMI usage https://github.com/UdayLab/PAMI/discussions

  8. Report issues https://github.com/UdayLab/PAMI/issues

Recent Updates


  • Version 2023.07.07: New algorithms: cuApriori, cuAprioriBit, cuEclat, cuEclatBit, gPPMiner, cuGPFMiner, FPStream, HUPMS, SHUPGrowth. New code to generate synthetic databases.
  • Version 2023.06.20: Fuzzy Partial Periodic, Periodic Patterns in High Utility, Code Documentation, help() function Update
  • Version 2023.03.01: prefixSpan and SPADE

Total number of algorithms: 83

Features


  • ✅ Well-tested and production-ready
  • 🔋 Highly optimized to our best effort, light-weight, and energy-efficient
  • 👀 Proper code documentation
  • 🍼 Ample examples of using various algorithms in the ./notebooks folder
  • 🤖 Works with AI libraries such as TensorFlow, PyTorch, and sklearn
  • ⚡️ Supports CUDA and PySpark
  • 🖥️ Operating System Independence
  • 🔬 Knowledge discovery in static data and streams
  • 🐎 Snappy
  • 🐻 Ease of use

Maintenance


Installation

  1. Installing the basic pami package (recommended)

    pip install pami

  2. Installing the pami package on a GPU machine that supports CUDA

    pip install 'pami[gpu]'

  3. Installing the pami package in a distributed network environment that supports Spark

    pip install 'pami[spark]'

  4. Installing the pami package for development purposes

    pip install 'pami[dev]'

  5. Installing the complete pami library

    pip install 'pami[all]'
    

Upgrade

    pip install --upgrade pami

Uninstallation

    pip uninstall pami 

Information

    pip show pami
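
The same check can be made from inside Python. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of PAMI.

```python
from importlib import metadata


def installed_version(package: str = "pami"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print("pami is not installed" if version is None else f"pami {version}")
```

This reads the same distribution metadata that `pip show` reports, so it is a convenient guard at the top of a notebook.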

Try your first PAMI program


    $ python
    # first import pami
    from PAMI.frequentPattern.basic import FPGrowth as alg

    fileURL = "https://u-aizu.ac.jp/~udayrage/datasets/transactionalDatabases/Transactional_T10I4D100K.csv"
    minSup = 300
    obj = alg.FPGrowth(iFile=fileURL, minSup=minSup, sep='\t')
    obj.startMine()
    obj.save('frequentPatternsAtMinSupCount300.txt')
    frequentPatternsDF = obj.getPatternsAsDataFrame()
    print('Total No of patterns: ' + str(len(frequentPatternsDF)))  # print the total number of patterns
    print('Runtime: ' + str(obj.getRuntime()))  # measure the runtime
    print('Memory (RSS): ' + str(obj.getMemoryRSS()))
    print('Memory (USS): ' + str(obj.getMemoryUSS()))

Output:

    Frequent patterns were generated successfully using frequentPatternGrowth algorithm
    Total No of patterns: 4540
    Runtime: 8.749667644500732
    Memory (RSS): 522911744
    Memory (USS): 475353088
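
To make the role of `minSup` concrete, here is a small brute-force sketch that counts how many transactions contain each itemset and keeps those whose count reaches the threshold. It is illustrative only and is not how PAMI's FPGrowth works internally; the function `frequent_patterns` and the toy database are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(transactions, min_sup):
    """Brute-force frequent itemset counting (exponential; toy data only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every non-empty subset of the transaction once.
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_sup}


# Hypothetical toy database: one transaction per line, items separated by tabs.
raw = "a\tb\tc\na\tb\nb\tc\na\tc\tb"
toy_db = [line.split("\t") for line in raw.splitlines()]
patterns = frequent_patterns(toy_db, min_sup=3)
```

With `min_sup=3`, the itemsets `a`, `b`, `c`, `{a, b}`, and `{b, c}` survive, while `{a, c}` and `{a, b, c}` (each in only two transactions) are dropped.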

Evaluation:


  1. We compared the Apriori implementations of three Python libraries: PAMI, mlxtend, and efficient-apriori.
  2. Transactional_T10I4D100K.csv, a transactional database downloaded from PAMI, was used as the input file for all libraries.
  3. The minimum support values and the separator were the same for all libraries.
  • The performance of the Apriori algorithm is shown in the graphical results below:
  1. Comparing the patterns generated by different Python libraries for the Apriori algorithm
  2. Evaluating the runtime of the Apriori algorithm across different Python libraries
  3. Comparing the memory consumption of the Apriori algorithm across different Python libraries
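
A comparison of this kind can be sketched with a small timing harness from the standard library. This is not the script used for the evaluation above; the helper `measure` is hypothetical, and `tracemalloc` reports only the peak of Python-level allocations, which is a different quantity from the RSS and USS figures that PAMI prints.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once; return (result, seconds elapsed, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


# Stand-in workload; in a real comparison this would be a call into each library.
result, seconds, peak_bytes = measure(sorted, range(100_000, 0, -1))
```

For a fair comparison, each library would be run on the same input file with the same minimum support and separator, as described in the list above.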

For more information, we have uploaded the evaluation file in two formats:

Reading Material


For more examples, refer to the YouTube tutorial playlist.

License


GitHub license

Documentation


The official documentation is hosted on PAMI.

Background


The idea and motivation to develop PAMI came from the Kitsuregawa Lab at the University of Tokyo. Work on PAMI started at the University of Aizu in 2020 and has been under active development since then.

Getting Help


For any queries, the best place to go is GitHub Issues.

Discussion and Development


The primary platform for discussing development-related matters is our GitHub repository, maintained by the university lab. We encourage our team members and contributors to use this platform for a wide range of discussions, including bug reports, feature requests, design decisions, and implementation details.

Contribution to PAMI


We invite and encourage all community members to contribute, report bugs, fix bugs, enhance documentation, propose improvements, and share their creative ideas.

Tutorials


1. Pattern mining in binary transactional databases

1.1. Frequent pattern mining: Sample

  • Basic: Apriori, FP-growth, ECLAT, ECLAT-bitSet, ECLAT-diffset
  • Closed: CHARM
  • Maximal: maxFP-growth
  • Top-k: FAE
  • CUDA: cudaAprioriGCT, cudaAprioriTID, cudaEclatGCT
  • pyspark: parallelApriori, parallelFPGrowth, parallelECLAT

1.2. Relative frequent pattern mining: Sample

  • Basic: RSFP-growth

1.3. Frequent pattern with multiple minimum support: Sample

  • Basic: CFPGrowth, CFPGrowth++

1.4. Correlated pattern mining: Sample

  • Basic: CoMine, CoMine++

1.5. Fault-tolerant frequent pattern mining (under development)

  • Basic: FTApriori, FTFPGrowth (under development)

1.6. Coverage pattern mining (under development)

  • Basic: CMine, CMine++

2. Pattern mining in binary temporal databases

2.1. Periodic-frequent pattern mining: Sample

  • Basic: PFP-growth, PFP-growth++, PS-growth, PFP-ECLAT, PFPM-Compliments
  • Closed: CPFP
  • Maximal: maxPF-growth
  • Top-K: kPFPMiner, Topk-PFP
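
As background for this family of algorithms, the sketch below illustrates the usual notion of a periodic-frequent pattern: it must occur often enough (support at least `min_sup`) and at regular enough intervals (largest gap between consecutive occurrences at most `max_per`). The function names and the convention that the database starts at timestamp 0 are assumptions made for this example; the sketch is not taken from PAMI's code.

```python
def max_periodicity(timestamps, db_end, db_start=0):
    """Largest gap between consecutive occurrences, including both database ends."""
    points = [db_start] + sorted(timestamps) + [db_end]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def is_periodic_frequent(timestamps, db_end, min_sup, max_per):
    """True if the pattern is both frequent and periodic under the given thresholds."""
    return (len(timestamps) >= min_sup
            and max_periodicity(timestamps, db_end) <= max_per)


# Hypothetical pattern occurring at these timestamps in a database ending at time 10.
occurrences = [1, 3, 4, 7, 9]
largest_gap = max_periodicity(occurrences, db_end=10)
```

Here the gaps are 1, 2, 1, 3, 2, and 1, so the pattern qualifies with `max_per=3` but not with `max_per=2`.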

2.2. Local periodic pattern mining: Sample

  • Basic: LPPGrowth (under development), LPPMBreadth (under development), LPPMDepth (under development)

2.3. Partial periodic-frequent pattern mining: Sample

  • Basic: GPF-growth, PPF-DFS, GPPF-DFS

2.4. Partial periodic pattern mining: Sample

  • Basic: 3P-growth, 3P-ECLAT, G3P-Growth
  • Closed: 3P-close
  • Maximal: max3P-growth
  • topK: topK-3P growth
  • CUDA: cuGPPMiner (under development), gPPMiner (under development)

2.5. Periodic correlated pattern mining: Sample

  • Basic: EPCP-growth

2.6. Stable periodic pattern mining: Sample

  • Basic: SPP-growth, SPP-ECLAT
  • TopK: TSPIN

2.7. Recurring pattern mining: Sample

  • Basic: RPgrowth

3. Mining patterns from binary Geo-referenced (or spatiotemporal) databases

3.1. Geo-referenced frequent pattern mining: Sample

  • Basic: spatialECLAT, FSP-growth

3.2. Geo-referenced periodic frequent pattern mining: Sample

  • Basic: GPFPMiner, PFS-ECLAT, ST-ECLAT

3.3. Geo-referenced partial periodic pattern mining: Sample

  • Basic: STECLAT

4. Mining patterns from Utility (or non-binary) databases

4.1. High utility pattern mining: Sample

  • Basic: EFIM, HMiner, UPGrowth
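
As background, high utility pattern mining ranks itemsets by a utility value (for example, profit) rather than by frequency alone. The sketch below computes the utility of one itemset as the sum, over all transactions containing every item of the itemset, of quantity times unit profit. The in-memory representation, the function `itemset_utility`, and the toy numbers are assumptions made for illustration; they do not reflect PAMI's input file format or its algorithms.

```python
def itemset_utility(itemset, transactions, unit_profit):
    """Total utility of an itemset over all transactions that contain it."""
    total = 0
    for quantities in transactions:
        # Only transactions containing every item of the itemset contribute.
        if all(item in quantities for item in itemset):
            total += sum(quantities[item] * unit_profit[item] for item in itemset)
    return total


# Hypothetical data: each transaction maps an item to its purchased quantity.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]
unit_profit = {"a": 5, "b": 2, "c": 1}

utility_ab = itemset_utility({"a", "b"}, transactions, unit_profit)
utility_a = itemset_utility({"a"}, transactions, unit_profit)
```

An itemset would count as high utility when this value meets a user-specified minimum utility threshold. Note that `{a, b}` reaches 21 while `{a}` alone reaches 20, so utility does not shrink as itemsets grow the way support does.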

4.2. High utility frequent pattern mining: Sample

  • Basic: HUFIM

4.3. High utility geo-referenced frequent pattern mining: Sample

  • Basic: SHUFIM

4.4. High utility spatial pattern mining: Sample

  • Basic: HDSHIM, SHUIM
  • topk: TKSHUIM

4.5. Relative High utility pattern mining: Sample

  • Basic: RHUIM

4.6. Weighted frequent pattern mining: Sample

  • Basic: WFIM

4.7. Weighted frequent regular pattern mining: Sample

  • Basic: WFRIMiner

4.8. Weighted frequent neighbourhood pattern mining: Sample

  • Basic: SSWFPGrowth

5. Mining patterns from fuzzy transactional/temporal/geo-referenced databases

5.1. Fuzzy Frequent pattern mining: Sample

  • Basic: FFI-Miner

5.2. Fuzzy correlated pattern mining: Sample

  • Basic: FCP-growth

5.3. Fuzzy geo-referenced frequent pattern mining: Sample

  • Basic: FFSP-Miner

5.4. Fuzzy periodic frequent pattern mining: Sample

  • Basic: FPFP-Miner

5.5. Fuzzy geo-referenced periodic frequent pattern mining: Sample

  • Basic: FGPFP-Miner (under development)

6. Mining patterns from uncertain transactional/temporal/geo-referenced databases

6.1. Uncertain frequent pattern mining: Sample

  • Basic: PUF, TubeP, TubeS, UVEclat
  • top-k: TUFP

6.2. Uncertain periodic frequent pattern mining: Sample

  • Basic: UPFP-growth, UPFP-growth++

6.3. Uncertain Weighted frequent pattern mining: Sample

  • Basic: WUFIM

7. Mining patterns from sequence databases

7.1. Sequence frequent pattern mining: Sample

  • Basic: SPADE, PrefixSpan

7.2. Geo-referenced Frequent Sequence Pattern mining

  • Basic: GFSP-Miner (under development)

8. Mining patterns from multiple timeseries databases

8.1. Partial periodic pattern mining (under development)

  • Basic: PP-Growth (under development)

9. Mining interesting patterns from Streams

9.1. Frequent pattern mining

  • Basic: to be written

9.2. High utility pattern mining

  • Basic: HUPMS

10. Mining patterns from contiguous character sequences (E.g., DNA, Genome, and Game sequences)

10.1. Contiguous Frequent Patterns

  • Basic: PositionMining

11. Mining patterns from Graphs

11.1. Frequent sub-graph mining

  • Basic: Gspan
  • topk: TKG

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pami-2024.4.17.1.tar.gz (580.1 kB)

Uploaded Source

Built Distribution

pami-2024.4.17.1-py3-none-any.whl (962.8 kB)

Uploaded Python 3
