Based on https://github.com/pbharrin/machinelearninginaction/blob/master/Ch12/fpGrowth.py
Project description
uy
To install: pip install uy
Description
The uy package implements the FP-Growth algorithm for frequent itemset mining. Unlike Apriori, FP-Growth does not generate candidate itemsets; it compresses the transactions into an FP-tree and mines the frequent itemsets directly from that tree. The package provides functions to construct the FP-tree, update it, and mine it. Frequent itemset mining underlies tasks such as market basket analysis, association rule learning, and anomaly detection.
Main Components
- treeNode: A class representing a node in the FP-tree. Each node contains links to parent and child nodes, a count of occurrences, and methods to manage the node's data.
- createTree: A function to build the FP-tree from the dataset. It also constructs a header table that helps in tree traversal.
- updateTree: Used to add items to the FP-tree during its construction.
- mineTree: Mines the frequent itemsets from a constructed FP-tree, using the header table to locate every node of each item.
- loadSimpDat: A utility function to load a simple example dataset.
- createInitSet: Converts a list of transactions into the dictionary format expected by createTree.
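To make the roles of treeNode and updateTree concrete, here is a minimal sketch of an FP-tree node and an insertion routine. It is not the package's code: the names Node and insert are hypothetical, and the header table is simplified to a dictionary of node lists, which may differ from how uy links nodes internally.

```python
class Node:
    """Hypothetical FP-tree node: item name, count, parent link, children."""

    def __init__(self, name, count, parent):
        self.name = name
        self.count = count
        self.parent = parent
        self.children = {}  # item name -> child Node

    def inc(self, count):
        # Add to the number of transactions passing through this node
        self.count += count


def insert(items, node, header, count):
    """Insert an already ordered list of items below `node`.

    `header` maps each item to the list of tree nodes carrying it
    (a simplification assumed for this sketch).
    """
    for item in items:
        child = node.children.get(item)
        if child is None:
            # New branch: create the node and register it in the header
            child = Node(item, count, node)
            node.children[item] = child
            header.setdefault(item, []).append(child)
        else:
            # Shared prefix: only the count grows
            child.inc(count)
        node = child


root = Node("root", 0, None)
header = {}
insert(["z", "x", "y"], root, header, 1)
insert(["z", "x"], root, header, 1)
insert(["z", "r"], root, header, 1)

print(root.children["z"].count)                # 3
print(root.children["z"].children["x"].count)  # 2
```

Transactions that share a prefix reuse the same nodes, which is where the compression of the FP-tree comes from.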
Usage Examples
Loading Data and Creating Initial Set
from uy import loadSimpDat, createInitSet
# Load example data
simpDat = loadSimpDat()
# Create initial set from data
initSet = createInitSet(simpDat)
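As a rough picture of the dictionary format, the sketch below maps each distinct transaction (as a frozenset) to the number of times it occurs. The function name to_init_set is hypothetical, and summing duplicates is an assumption of this sketch; the package's createInitSet may count duplicates differently.

```python
from collections import Counter


def to_init_set(transactions):
    """Hypothetical stand-in: map each transaction (frozenset) to its count."""
    # frozenset makes a transaction hashable and ignores item order
    return dict(Counter(frozenset(t) for t in transactions))


transactions = [["a", "b"], ["b", "a"], ["a", "c"]]
init_set = to_init_set(transactions)

for trans, count in init_set.items():
    print(sorted(trans), count)
```

Using frozenset keys means ["a", "b"] and ["b", "a"] are treated as the same transaction.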
Building the FP-Tree
from uy import createTree
# Minimum support
minSup = 3
# Create FP-tree and header table
myFPtree, myHeaderTab = createTree(initSet, minSup)
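Before any transaction enters an FP-tree, the standard algorithm drops items below the minimum support and orders the remaining items by descending frequency. The sketch below illustrates that preprocessing step in isolation. The helper names are hypothetical, and treating minSup as an inclusive threshold (count >= minSup) is an assumption, not something confirmed for uy.

```python
from collections import Counter


def frequent_item_counts(init_set, min_sup):
    """Count each item across all transactions; keep those with count >= min_sup."""
    counts = Counter()
    for trans, n in init_set.items():
        for item in trans:
            counts[item] += n
    return {item: c for item, c in counts.items() if c >= min_sup}


def order_transaction(trans, freq):
    """Keep only frequent items, most frequent first (ties broken by name)."""
    kept = [item for item in trans if item in freq]
    return sorted(kept, key=lambda item: (-freq[item], item))


init_set = {frozenset("abc"): 1, frozenset("ab"): 2, frozenset("bd"): 1}
freq = frequent_item_counts(init_set, min_sup=2)
ordered = order_transaction(frozenset("abc"), freq)

print(freq)
print(ordered)
```

Ordering by frequency puts the most common items near the root, so more transactions share prefixes and the tree stays small.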
Mining Frequent Itemsets
from uy import mineTree
# List to hold the mined frequent itemsets
freqItems = []
# Mine the tree
mineTree(myFPtree, myHeaderTab, minSup, set(), freqItems)
# Print the frequent itemsets
print(freqItems)
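On small datasets, the mined result can be cross-checked against a brute-force enumeration. The sketch below does not use uy; brute_force_frequent is a hypothetical helper that tests every possible itemset, and it assumes support is counted per transaction with an inclusive threshold. It is exponential in the number of items, so it is only suitable for toy data.

```python
from itertools import combinations


def brute_force_frequent(transactions, min_sup):
    """Return {itemset: support} for every itemset with support >= min_sup."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            # Support = number of transactions containing the whole itemset
            support = sum(1 for t in transactions if itemset <= set(t))
            if support >= min_sup:
                result[itemset] = support
    return result


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
frequent = brute_force_frequent(transactions, min_sup=2)

for itemset, support in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

Comparing the set of itemsets from this helper with freqItems is a quick sanity check that the tree was built and mined as expected.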
Documentation
Class: treeNode
- __init__(self, nameValue, numOccur, parentNode): Initialize a new tree node.
- inc(self, numOccur): Increment the count of occurrences for the node.
- disp(self, ind=1): Display the subtree rooted at this node.
- __str__(self, ind=1): Return a string representation of the subtree rooted at this node.
- __repr__(self, ind=1): Return the string representation for interactive environments.
Function: createTree
- createTree(dataSet, minSup=1): Create the FP-tree from the dataset. It returns the root of the FP-tree and the header table.
Function: updateTree
- updateTree(items, inTree, headerTable, count): Update the FP-tree with given items.
Function: mineTree
- mineTree(inTree, headerTable, minSup, preFix, freqItemList): Mine the FP-tree to find frequent itemsets that meet the minimum support.
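In the standard FP-Growth algorithm, mining an item starts by collecting its conditional pattern base: the prefix paths leading to every node of that item, each weighted by the node's count. The sketch below shows that step on a hand-built tree. The Node class and both helper functions are hypothetical and simplified; they are not taken from the package.

```python
class Node:
    """Hypothetical minimal node: only what prefix-path walking needs."""

    def __init__(self, name, count, parent):
        self.name = name
        self.count = count
        self.parent = parent


def prefix_path(node):
    """Collect the item names between `node` and the root (both excluded)."""
    path = []
    node = node.parent
    while node is not None and node.parent is not None:
        path.append(node.name)
        node = node.parent
    return path


def conditional_pattern_base(nodes):
    """Map each non-empty prefix path (frozenset) to the summed node counts."""
    base = {}
    for node in nodes:
        path = prefix_path(node)
        if path:
            key = frozenset(path)
            base[key] = base.get(key, 0) + node.count
    return base


# Hand-built tree: root -> z(3) -> x(2), and root -> z(3) -> r(1) -> x(1)
root = Node("root", 0, None)
z = Node("z", 3, root)
x1 = Node("x", 2, z)
r = Node("r", 1, z)
x2 = Node("x", 1, r)

base = conditional_pattern_base([x1, x2])
for path, count in base.items():
    print(sorted(path), count)
```

A conditional FP-tree is then built from this pattern base and mined recursively, which is how longer itemsets ending in the same item are found.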
Function: loadSimpDat
- loadSimpDat(): Load a simple hardcoded dataset for demonstration purposes.
Function: createInitSet
- createInitSet(dataSet): Convert dataset into a format suitable for the FP-tree construction.
Together, these functions and classes cover the full FP-Growth workflow: converting transactions into the initial set, building the FP-tree, and mining the frequent itemsets from it.
Download files
File details
Details for the file uy-0.0.5.tar.gz.
File metadata
- Download URL: uy-0.0.5.tar.gz
- Upload date:
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b35c74bec575dc90a1b2fc2ee5a31dd01790ba9ebe66c5ada490dd6e8720f6f2 |
| MD5 | d1a4bc4b4bce40514d1b9194cd477abb |
| BLAKE2b-256 | efbfe1ee9c135b55adfd225d00cbdf26f1c5cb44dc4a454ebd2521a19410f8a2 |
File details
Details for the file uy-0.0.5-py3-none-any.whl.
File metadata
- Download URL: uy-0.0.5-py3-none-any.whl
- Upload date:
- Size: 8.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 45c8d9c917ead951ca76d2bc27468f1f3bbeb4d4d10ce0e71ccd9868e29b32bb |
| MD5 | 8c65f5802ae1a0d93615972e5b61ee91 |
| BLAKE2b-256 | 11b9e480f9066d5ad1cf09f73ee9b06221265fff96b6a5f74c84aa780fc4c118 |