Seq2Pat: Sequence-to-Pattern Generation Library
Seq2Pat is a research library for sequence-to-pattern generation to discover sequential patterns that occur frequently in large sequence databases. The library supports constraint-based reasoning to specify desired properties over patterns.
From an algorithmic perspective, the library takes advantage of multi-valued decision diagrams. It is based on the state-of-the-art approach for sequential pattern mining from Hosseininasab et al. (AAAI 2019).
From an implementation perspective, the library is written in Cython, which brings together the efficiency of a low-level C++ backend and the expressiveness of a high-level Python public interface.
Seq2Pat is developed as a joint collaboration between Fidelity Investments and the Tepper School of Business at CMU. Documentation is available at fidelity.github.io/seq2pat.
Quick Start
# Example to show how to find frequent sequential patterns
# from a given sequence database subject to constraints
from sequential.seq2pat import Seq2Pat, Attribute

# Seq2Pat over 3 sequences
seq2pat = Seq2Pat(sequences=[["A", "A", "B", "A", "D"],
                             ["C", "B", "A"],
                             ["C", "A", "C", "D"]])

# Price attribute corresponding to each item
price = Attribute(values=[[5, 5, 3, 8, 2],
                          [1, 3, 3],
                          [4, 5, 2, 1]])

# Average price constraint
seq2pat.add_constraint(3 <= price.average() <= 4)

# Patterns that occur at least twice (A-D)
patterns = seq2pat.get_patterns(min_frequency=2)
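To make the Quick Start result easier to verify by hand, the following is a minimal brute-force sketch in plain Python. It is not how Seq2Pat works internally (the library uses multi-valued decision diagrams), and it rests on assumptions that are not stated in the text above: a pattern is any ordered, not necessarily contiguous, subsequence of length two or more, and a sequence supports a pattern if at least one of its occurrences has an average price within the bounds. The function name `brute_force_patterns` and its parameters are hypothetical.

```python
from itertools import combinations

# Same toy data as in the Quick Start
sequences = [["A", "A", "B", "A", "D"],
             ["C", "B", "A"],
             ["C", "A", "C", "D"]]
prices = [[5, 5, 3, 8, 2],
          [1, 3, 3],
          [4, 5, 2, 1]]


def brute_force_patterns(sequences, values, low, high, min_frequency, min_length=2):
    """Hypothetical helper: enumerate every subsequence and count supporting sequences.

    Assumption: a sequence supports a pattern if at least one occurrence of the
    pattern has an average attribute value in [low, high].
    """
    support = {}
    for seq, vals in zip(sequences, values):
        satisfied = set()  # patterns with a satisfying occurrence in this sequence
        for length in range(min_length, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                avg = sum(vals[i] for i in idx) / length
                if low <= avg <= high:
                    satisfied.add(tuple(seq[i] for i in idx))
        # Each sequence contributes at most once to a pattern's frequency
        for pattern in satisfied:
            support[pattern] = support.get(pattern, 0) + 1
    return {p: c for p, c in support.items() if c >= min_frequency}


result = brute_force_patterns(sequences, prices, low=3, high=4, min_frequency=2)
print(result)  # {('A', 'D'): 2}
```

Under these assumptions the only surviving pattern is A-D, which matches the comment in the Quick Start. The enumeration is exponential in the sequence length, so it is only useful as a sanity check on tiny inputs.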
Available Constraints
The library offers various constraint types, including a number of non-monotone constraints.
- Average: This constraint specifies the average value of an attribute across all events in a pattern.
- Gap: This constraint specifies the difference between the attribute values of every two consecutive events in a pattern.
- Median: This constraint specifies the median value of an attribute across all events in a pattern.
- Span: This constraint specifies the difference between the maximum and the minimum value of an attribute across all events in a pattern.
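As a rough illustration of what each constraint measures, the sketch below computes the four quantities for the attribute values of a single pattern occurrence. It is plain Python rather than the Seq2Pat API; the helper name `constraint_values` is hypothetical, and the sign convention for the gap (later value minus earlier value) is an assumption.

```python
from statistics import median


def constraint_values(values):
    """Hypothetical helper: quantities bounded by the four constraint types.

    `values` holds the attribute values of the events in one pattern occurrence,
    in order of occurrence.
    """
    # Gap: difference between every two consecutive events
    # (assumed here to be later minus earlier)
    gaps = [later - earlier for earlier, later in zip(values, values[1:])]
    return {
        "average": sum(values) / len(values),
        "gaps": gaps,
        "median": median(values),
        "span": max(values) - min(values),
    }


# Prices of the occurrence A-B-D in the first Quick Start sequence
stats = constraint_values([5, 3, 2])
print(stats["gaps"], stats["median"], stats["span"])  # [-2, -1] 3 3
```

A constraint then places a lower bound, an upper bound, or both on one of these quantities, as the Quick Start does for the average.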
Usage Examples
Examples of how to use the available constraints can be found in the Jupyter Notebook.
Installation
Seq2Pat can be installed from PyPI using pip install seq2pat. It can also be installed from source by following the instructions in our documentation.
Requirements
The library requires Python 3.6+, the Cython package, and a C++ compiler.
See requirements.txt for dependencies.
Support
Please submit bug reports and feature requests as Issues.
License
Seq2Pat is licensed under the GNU General Public License, version 2.0 (GPL-2.0).