Seq2Pat: Sequence-to-Pattern Generation Library
Seq2Pat is a research library for sequence-to-pattern generation to discover sequential patterns that occur frequently in large sequence databases. The library supports constraint-based reasoning to specify desired properties over patterns.
From an algorithmic perspective, the library takes advantage of multi-valued decision diagrams. It is based on the state-of-the-art approach for sequential pattern mining from Hosseininasab et al. (AAAI 2019).
From an implementation perspective, the library is written in Cython that brings together the efficiency of a low-level C++ backend and the expressiveness of a high-level Python public interface.
Seq2Pat is developed as a collaboration between Fidelity Investments and the Tepper School of Business at CMU.
Quick Start
# Example to show how to find frequent sequential patterns
# from a given sequence database subject to constraints
from sequential.seq2pat import Seq2Pat, Attribute
# Seq2Pat over 3 sequences
seq2pat = Seq2Pat(sequences=[["A", "A", "B", "A", "D"],
                             ["C", "B", "A"],
                             ["C", "A", "C", "D"]])
# Price attribute corresponding to each item
price = Attribute(values=[[5, 5, 3, 8, 2],
                          [1, 3, 3],
                          [4, 5, 2, 1]])
# Average price constraint
seq2pat.add_constraint(3 <= price.average() <= 4)
# Patterns that occur at least twice (A-D)
patterns = seq2pat.get_patterns(min_frequency=2)
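To see why A-D is the pattern expected in the example above, the following brute-force sketch counts, in plain Python, how many sequences contain a pattern as a subsequence with at least one occurrence whose average price falls in the given range. It does not use Seq2Pat and is not how the library works internally (the library relies on multi-valued decision diagrams); `count_support` is a hypothetical helper written only for illustration, and counting a sequence once when any occurrence satisfies the constraint is an assumption about the intended semantics.

```python
from itertools import combinations


def count_support(pattern, sequences, values, low, high):
    """Hypothetical helper: count the sequences that contain `pattern` as a
    subsequence with at least one occurrence whose average attribute value
    lies in [low, high]. Brute force, for illustration only."""
    support = 0
    for seq, vals in zip(sequences, values):
        # Enumerate every ordered choice of positions of the right length
        for idx in combinations(range(len(seq)), len(pattern)):
            if [seq[i] for i in idx] != list(pattern):
                continue
            avg = sum(vals[i] for i in idx) / len(idx)
            if low <= avg <= high:
                support += 1
                break  # count each sequence at most once
    return support


sequences = [["A", "A", "B", "A", "D"],
             ["C", "B", "A"],
             ["C", "A", "C", "D"]]
prices = [[5, 5, 3, 8, 2],
          [1, 3, 3],
          [4, 5, 2, 1]]

# A-D occurs in the first sequence (prices 5 and 2, average 3.5)
# and in the third sequence (prices 5 and 1, average 3.0)
support_ad = count_support(["A", "D"], sequences, prices, 3, 4)
print(support_ad)  # 2
```

This enumeration grows exponentially with sequence length, so it is only suitable for checking small examples by hand.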
Available Constraints
The library offers various constraint types, including a number of non-monotone constraints.
- Average: This constraint specifies the average value of an attribute across all events in a pattern.
- Gap: This constraint specifies the difference between the attribute values of every two consecutive events in a pattern.
- Median: This constraint specifies the median value of an attribute across all events in a pattern.
- Span: This constraint specifies the difference between the maximum and the minimum value of an attribute across all events in a pattern.
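The quantities these four constraints restrict can be sketched in plain Python for the attribute values of a single pattern occurrence. This is only an illustration of the definitions above, not the library's implementation; `constraint_quantities` is a hypothetical helper, and taking each gap as the later value minus the earlier one is an assumption about sign convention.

```python
from statistics import mean, median


def constraint_quantities(values):
    """Hypothetical helper: compute the quantities that the Average, Gap,
    Median, and Span constraints restrict, for one list of attribute values."""
    # Assumed convention: gap = later value minus earlier value
    gaps = [later - earlier for earlier, later in zip(values, values[1:])]
    return {
        "average": mean(values),
        "gaps": gaps,
        "median": median(values),
        "span": max(values) - min(values),
    }


# Attribute values of the events in one pattern occurrence
q = constraint_quantities([4, 5, 2, 1])
print(q["average"], q["gaps"], q["median"], q["span"])
```

A constraint then bounds one of these quantities from below, from above, or both, as in the average price constraint of the Quick Start.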
Usage Examples
Examples of how to use the available constraints can be found in the Jupyter Notebook.
Installation
Seq2Pat can be installed from source by following the instructions in our documentation.
The installation consists of two main steps:
- Build the backend
- Install the library
Requirements
The library requires Python 3.6+, the Cython package, and a C++ compiler.
See requirements.txt for dependencies.
Support
Please submit bug reports and feature requests as Issues.
License
Seq2Pat is licensed under the GNU General Public License (GPL), version 2.0.