Based on https://github.com/pbharrin/machinelearninginaction/blob/master/Ch12/fpGrowth.py

Project description

uy

To install: pip install uy

Description

The uy package implements the FP-Growth algorithm for frequent itemset mining. FP-Growth compresses the transactions into a prefix tree (the FP-tree) and mines that tree directly, which avoids the costly candidate-set generation of algorithms like Apriori. This implementation includes functions to construct the FP-tree, update it, and mine the frequent itemsets from it. Frequent itemsets are the starting point for tasks such as market basket analysis, association rule learning, and anomaly detection.

Main Components

  • treeNode: A class representing a node in the FP-tree. Each node contains links to parent and child nodes, a count of occurrences, and methods to manage the node's data.
  • createTree: A function to build the FP-tree from the dataset. It also constructs a header table that helps in tree traversal.
  • updateTree: Used to add items to the FP-tree during its construction.
  • mineTree: Once the FP-tree is constructed, this function is used to mine the frequent itemsets from the tree using the header table.
  • loadSimpDat: A utility function to load a simple example dataset.
  • createInitSet: Converts a list of transactions into a dictionary format expected by createTree.
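
To make the node and update roles above concrete, here is a minimal standalone sketch of an FP-tree node and of prefix-sharing insertion. It does not use or reproduce the package's code: the names `Node`, `insert`, and `node_link` are hypothetical and chosen only for illustration.

```python
class Node:
    """Illustrative FP-tree node (hypothetical, not the package's treeNode)."""

    def __init__(self, name, count, parent):
        self.name = name        # item label stored at this node
        self.count = count      # number of transactions passing through this node
        self.parent = parent    # link back toward the root
        self.children = {}      # item label -> child Node
        self.node_link = None   # next node holding the same item (unused in this sketch)

    def inc(self, count):
        self.count += count


def insert(items, node, count=1):
    """Insert one ordered transaction below `node`, sharing common prefixes."""
    for item in items:
        child = node.children.get(item)
        if child is None:
            child = Node(item, 0, node)
            node.children[item] = child
        child.inc(count)
        node = child


root = Node("root", 0, None)
insert(["z", "x", "y"], root)
insert(["z", "x", "t"], root)

print(root.children["z"].count)                            # 2
print(sorted(root.children["z"].children["x"].children))   # ['t', 'y']
```

Both transactions share the path z, x, so those nodes are stored once with a count of 2. That sharing is what makes the tree a compressed view of the dataset.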

Usage Examples

Loading Data and Creating Initial Set

from uy import loadSimpDat, createInitSet

# Load example data
simpDat = loadSimpDat()

# Create initial set from data
initSet = createInitSet(simpDat)
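
The exact dictionary layout is not spelled out here. The upstream fpGrowth.py script appears to key the dictionary by a frozenset of each transaction's items, with a count as the value. Assuming uy keeps that shape, the conversion can be sketched as follows; `to_init_set` is a hypothetical stand-in, not the package's function, and unlike a plain assignment it counts duplicate transactions.

```python
from collections import Counter


def to_init_set(transactions):
    """Map each distinct transaction (as a frozenset) to how often it occurs."""
    return dict(Counter(frozenset(t) for t in transactions))


# Hypothetical toy data; item order inside a transaction does not matter.
transactions = [["a", "b"], ["b", "a"], ["c"]]
init_set = to_init_set(transactions)

print(init_set[frozenset({"a", "b"})])  # 2
print(len(init_set))                    # 2
```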

Building the FP-Tree

from uy import createTree

# Minimum support
minSup = 3

# Create FP-tree and header table
myFPtree, myHeaderTab = createTree(initSet, minSup)
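
In the standard FP-Growth formulation, tree construction starts with a first pass that counts the support of every item, drops items below the minimum support, and reorders each transaction by descending support. The sketch below illustrates that pass on its own. `frequent_items` and `order_transaction` are hypothetical helpers, and the input is assumed to be a dictionary of frozenset to count; how createTree does this internally is not shown in this description.

```python
from collections import Counter


def frequent_items(init_set, min_sup):
    """Count the support of each item and keep those meeting min_sup."""
    support = Counter()
    for trans, count in init_set.items():
        for item in trans:
            support[item] += count
    return {item: n for item, n in support.items() if n >= min_sup}


def order_transaction(trans, support):
    """Drop infrequent items, then sort by descending support (ties by name)."""
    kept = [item for item in trans if item in support]
    return sorted(kept, key=lambda item: (-support[item], item))


# Hypothetical toy data in the assumed frozenset -> count format.
init_set = {
    frozenset({"a", "b", "c"}): 1,
    frozenset({"a", "b"}): 1,
    frozenset({"a", "d"}): 1,
}
support = frequent_items(init_set, min_sup=2)

print(support == {"a": 3, "b": 2})                           # True
print(order_transaction(frozenset({"c", "b", "a"}), support))  # ['a', 'b']
```

Ordering every transaction the same way is what lets transactions with common frequent items share a prefix in the tree.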

Mining Frequent Itemsets

from uy import mineTree

# List to hold the mined frequent itemsets
freqItems = []

# Mine the tree
mineTree(myFPtree, myHeaderTab, minSup, set(), freqItems)

# Print the frequent itemsets
print(freqItems)
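
On a small dataset, the mined result can be cross-checked against a brute-force enumeration. The sketch below is independent of the package and takes exponential time, so it is only suitable for a handful of distinct items; `brute_force_frequent` is a hypothetical helper.

```python
from itertools import combinations


def brute_force_frequent(transactions, min_sup):
    """Return {itemset: support} for every itemset with support >= min_sup."""
    sets = [frozenset(t) for t in transactions]
    items = sorted(set().union(*sets))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Support = number of transactions containing every item of the candidate.
            support = sum(1 for s in sets if candidate <= s)
            if support >= min_sup:
                result[candidate] = support
    return result


# Hypothetical toy data.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "d"]]
expected = brute_force_frequent(transactions, min_sup=2)

print(len(expected))                     # 3
print(expected[frozenset({"a", "b"})])   # 2
```

Assuming freqItems holds one set per itemset, as in the upstream script, comparing `{frozenset(s) for s in freqItems}` with `set(expected)` for the same data and threshold should show whether the two approaches agree.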

Documentation

Class: treeNode

  • __init__(self, nameValue, numOccur, parentNode): Initialize a new tree node.
  • inc(self, numOccur): Increment the count of occurrences for the node.
  • disp(self, ind=1): Display the subtree rooted at this node.
  • __str__(self, ind=1): Return a string representation of the subtree rooted at this node.
  • __repr__(self, ind=1): Return the string representation for interactive environments.

Function: createTree

  • createTree(dataSet, minSup=1): Create the FP-tree from the dataset. It returns the root of the FP-tree and the header table.

Function: updateTree

  • updateTree(items, inTree, headerTable, count): Update the FP-tree with given items.

Function: mineTree

  • mineTree(inTree, headerTable, minSup, preFix, freqItemList): Mine the FP-tree to find frequent itemsets that meet the minimum support.
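
Mining in FP-Growth rests on the conditional pattern base of an item: the prefixes that precede the item along each path of the tree, with their counts. The sketch below computes the same prefixes from support-ordered transaction lists instead of walking a tree, which keeps it short. `conditional_pattern_base` is a hypothetical helper and is not how the package is known to implement mineTree.

```python
from collections import Counter


def conditional_pattern_base(ordered_transactions, item):
    """Count the prefixes that appear before `item` in each ordered transaction."""
    base = Counter()
    for trans in ordered_transactions:
        if item in trans:
            prefix = tuple(trans[:trans.index(item)])
            if prefix:  # an empty prefix carries no pattern information
                base[prefix] += 1
    return base


# Hypothetical transactions, already filtered and ordered by descending support.
ordered = [["z", "x", "y"], ["z", "x"], ["z"], ["x", "y"]]
base = conditional_pattern_base(ordered, "y")

print(base[("z", "x")])  # 1
print(base[("x",)])      # 1
```

A conditional FP-tree is then built from these prefixes and mined recursively, with the item added to the current prefix set at each level.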

Function: loadSimpDat

  • loadSimpDat(): Load a simple hardcoded dataset for demonstration purposes.

Function: createInitSet

  • createInitSet(dataSet): Convert dataset into a format suitable for the FP-tree construction.

By using these functions and classes, users can perform efficient frequent itemset mining in various datasets, which is a foundational technique in many data mining applications.
