Sequence mining SPAM algorithm implementation

Sequence-mining

Description

This package implements the SPAM algorithm described in the paper "Sequential PAttern Mining using A Bitmap Representation" by Jay Ayres, Johannes Gehrke, Tomi Yiu, and Jason Flannick.
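
To give a feel for the bitmap representation the paper's title refers to, here is a minimal sketch in plain Python. It is not this package's code: the helper names (`build_bitmaps`, `s_step_transform`, `bitmap_and`, `support`) and the toy database are made up for illustration, and bit lists stand in for the packed bitmaps a real implementation would use.

```python
from typing import Dict, List

Sequence = List[List[int]]   # a sequence is a list of transactions
Bitmap = List[List[int]]     # one bit list per sequence, one bit per transaction


def build_bitmaps(sequences: List[Sequence]) -> Dict[int, Bitmap]:
    """Map each item to its bitmap: bit t of sequence s is 1 if transaction t contains the item."""
    items = sorted({i for s in sequences for t in s for i in t})
    return {item: [[1 if item in t else 0 for t in s] for s in sequences]
            for item in items}


def s_step_transform(bitmap: Bitmap) -> Bitmap:
    """Per sequence, clear every bit up to and including the first 1 and set all later bits."""
    out = []
    for bits in bitmap:
        if 1 in bits:
            k = bits.index(1)
            out.append([0] * (k + 1) + [1] * (len(bits) - k - 1))
        else:
            out.append([0] * len(bits))
    return out


def bitmap_and(a: Bitmap, b: Bitmap) -> Bitmap:
    """Bitwise AND of two bitmaps built over the same database."""
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def support(bitmap: Bitmap) -> int:
    """Number of sequences whose section of the bitmap has at least one set bit."""
    return sum(1 for bits in bitmap if any(bits))


# Hypothetical toy database with three sequences
toy_db = [
    [[1, 2], [3], [1]],
    [[1], [2, 3]],
    [[3], [1, 2]],
]
bitmaps = build_bitmaps(toy_db)

# Sequence extension: item 1, followed in a later transaction by item 3
s_ext = bitmap_and(s_step_transform(bitmaps[1]), bitmaps[3])
# Itemset extension: items 1 and 2 in the same transaction
i_ext = bitmap_and(bitmaps[1], bitmaps[2])

print(support(s_ext), support(i_ext))
```

The point of the representation is that extending a pattern costs one transform plus one AND, and counting support never requires rescanning the original sequences.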

This package is largely a Python translation of the Java SPAM implementation written by Philippe Fournier-Viger, which can be found in the spmf repository. For more information about the spmf implementation, see the web page Mining Frequent Sequential Patterns Using The CM-SPAM Algorithm.

Installation

pip3 install sequence-mining

Usage

from sequence_mining.spam import SpamAlgo

# Input: List[List[List[int]]], i.e. a list of sequences, each sequence a list of transactions, each transaction a list of item ids
# Item ids within a transaction are expected to be sorted in ascending order
sequences = [
    [[0, 2, 10, 13, 14, 15, 18, 20], [2, 7, 12, 15, 17, 19], [6, 12, 19], [0, 3, 4, 6, 15], [1, 3, 10, 13, 15],
     [8, 10], [4, 8, 9, 10]],
    [[9, 10, 17], [4], [0, 1, 2, 3, 4, 5, 12, 13, 19], [0, 1, 5, 10, 17, 18], [4, 7, 12], [2, 8, 9, 13, 15, 16, 19],
     [3, 5, 6, 9, 11, 13, 18, 19], [2, 5, 9, 10, 13, 16, 20], [2, 3, 6]],
    [[0, 9, 10, 13, 14, 19, 20], [0, 1, 9, 15, 17], [1, 7, 11, 12, 15, 20], [7, 9, 10, 11, 14, 18], [0, 10],
     [5, 13, 15], [1, 5, 9, 15], [1, 5, 7, 8, 19], [2, 6, 11, 14, 16], [3, 10, 11, 12]],
    [[15], [6, 9, 10, 12, 13, 15, 16], [13, 16]]
]

algo = SpamAlgo(0.7)
algo.spam(sequences)
# print mined sequences
print(algo.frequent_items)
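
The README does not say what the `0.7` passed to `SpamAlgo` means. Assuming it is a relative minimum support, as in spmf, a pattern is kept only if it occurs in at least 70% of the input sequences. The sketch below checks that by brute force. It does not use this package: `contains`, `relative_support`, and the toy database are hypothetical names and data for illustration only.

```python
from math import ceil
from typing import List

Transaction = List[int]
Sequence = List[Transaction]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """True if the pattern's itemsets occur, in order, as subsets of distinct transactions."""
    pos = 0
    for itemset in pattern:
        needed = set(itemset)
        # Greedily advance to the first transaction that covers this itemset
        while pos < len(sequence) and not needed.issubset(sequence[pos]):
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next itemset must match a strictly later transaction
    return True


def relative_support(db: List[Sequence], pattern: Sequence) -> float:
    """Fraction of sequences in db that contain the pattern."""
    return sum(contains(s, pattern) for s in db) / len(db)


# Hypothetical toy database with four sequences
toy_db = [
    [[1, 2], [3], [1]],
    [[1], [2, 3]],
    [[3], [1, 2]],
    [[1], [3]],
]

min_support = 0.7                             # assumed meaning of SpamAlgo(0.7)
min_count = ceil(min_support * len(toy_db))   # sequences a pattern must appear in

print(min_count)
print(relative_support(toy_db, [[1], [3]]))   # item 1, later followed by item 3
print(relative_support(toy_db, [[1, 2]]))     # items 1 and 2 in the same transaction
```

Under this assumption, the pattern "1 then 3" would be reported as frequent for the toy data and the itemset {1, 2} would not. A brute-force check like this is handy for spot-checking a few mined patterns on small inputs.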

Download files

Source Distribution

sequence_mining-0.0.3.tar.gz (6.0 kB view details)

Uploaded Source

Built Distribution

sequence_mining-0.0.3-py3-none-any.whl (7.2 kB view details)

Uploaded Python 3

File details

Details for the file sequence_mining-0.0.3.tar.gz.

File metadata

  • Download URL: sequence_mining-0.0.3.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.6

File hashes

Hashes for sequence_mining-0.0.3.tar.gz
Algorithm Hash digest
SHA256 aef3bed4dc44dfdac064671d56b1241e741e64271368ec4563ad33b635cd48ca
MD5 156c1a1152cb76f6fcbc2cf56192ac64
BLAKE2b-256 2fb0d1051d60f34505938a84f5fe56e623af13cb696d843de3493c2a6426f14c

See more details on using hashes here.

File details

Details for the file sequence_mining-0.0.3-py3-none-any.whl.

File hashes

Hashes for sequence_mining-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 ba24bbeeca971ebc20df0ac545c9a5f4a6d0e8e9c70d2cd07a53d23977664c5a
MD5 1a58ee049c395dfe8aeaac334eef88bb
BLAKE2b-256 f523006bae822d844e94f9fa488389f5718c252ec71ec1fb9b5ecf27ebfe1f31
