A Python package implementing the Generalized Sequential Pattern (GSP) algorithm with concurrency support.

seq_tool

seq_tool is a Python package that implements the Generalized Sequential Pattern (GSP) algorithm. It was originally developed as part of the Course Sequencing Analysis Tool (cSAT) to analyze the order in which students take courses, and has since been extended to support more general use cases. It is designed for applications that need to find frequent patterns in ordered data, such as course sequencing or other time-ordered event data.

With the concurrency option, the package groups items by a specified time granularity. It provides both a command-line interface (CLI) and a graphical user interface (GUI).
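The grouping idea can be illustrated with a minimal sketch. It is not the package's own code: the function names `period_key` and `group_events`, the (entity, date, item) row layout, and the "month" and "quarter" granularities are assumptions made for illustration only. Items that fall in the same period end up in one itemset, and each entity gets an ordered list of itemsets.

```python
from collections import defaultdict
from datetime import date


def period_key(d: date, granularity: str) -> tuple[int, int]:
    """Map a date to a (year, period) bucket. Hypothetical helper."""
    if granularity == "month":
        return (d.year, d.month)
    if granularity == "quarter":
        return (d.year, (d.month - 1) // 3 + 1)
    raise ValueError(f"unsupported granularity: {granularity}")


def group_events(events, granularity="month"):
    """Turn (entity, date, item) rows into an ordered list of itemsets per entity."""
    buckets = defaultdict(lambda: defaultdict(set))
    for entity, d, item in events:
        buckets[entity][period_key(d, granularity)].add(item)
    # Sort the period keys so itemsets come out in chronological order.
    return {
        entity: [frozenset(periods[key]) for key in sorted(periods)]
        for entity, periods in buckets.items()
    }


# Made-up rows, assumed layout: (student id, date, course code).
events = [
    ("s1", date(2023, 1, 15), "BIO101"),
    ("s1", date(2023, 2, 3), "CHEM101"),
    ("s1", date(2023, 5, 20), "BIO201"),
]
by_quarter = group_events(events, "quarter")
by_month = group_events(events, "month")
```

With quarterly granularity the two first-quarter courses share one itemset; with monthly granularity every course is its own itemset.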

Features

  • GSP Algorithm: Analyze sequential patterns using the Generalized Sequential Pattern (GSP) algorithm.
  • Granularity-Based Grouping: Use concurrency to group items by a specified time granularity, such as semesters, quarters, or months.
  • Command-Line Interface: Run the GSP algorithm from the terminal for efficient scripting and automation.
  • Graphical User Interface: Easily configure and run the algorithm using an interactive graphical interface.
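For readers new to GSP, the sketch below shows only its support-counting step: deciding whether a candidate pattern occurs in a sequence of itemsets, and counting how many sequences contain it. It omits candidate generation and time constraints, and it is not taken from seq_tool; the names `contains` and `support` and the sample data are hypothetical.

```python
def contains(sequence, pattern):
    """Return True if each itemset in pattern is a subset of some itemset
    in sequence, with the matches occurring in the same order."""
    pos = 0
    for wanted in pattern:
        # Advance to the first remaining itemset that covers `wanted`.
        while pos < len(sequence) and not wanted <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern element must match a later itemset
    return True


def support(sequences, pattern):
    """Count the sequences that contain the pattern."""
    return sum(contains(seq, pattern) for seq in sequences)


# Made-up database: one ordered list of itemsets per entity.
db = [
    [frozenset({"A"}), frozenset({"B", "C"}), frozenset({"D"})],
    [frozenset({"A", "B"}), frozenset({"D"})],
    [frozenset({"B"}), frozenset({"A"})],
]
a_then_d = support(db, [frozenset({"A"}), frozenset({"D"})])
a_then_b = support(db, [frozenset({"A"}), frozenset({"B"})])
```

A pattern counts as frequent when its support meets the chosen threshold; the full algorithm repeats this count for progressively longer candidates.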

Installation

Install from PyPI using the command line:

python3 -m pip install seq-tool

Usage

Command-Line Interface

You can run the GSP algorithm using the CLI. Here’s an example:

csat-cli -i data.csv -s 50,100 -c BIO,CHEM --mode separate -o results --concurrency

For more detailed instructions and examples, please refer to the cSAT Manual.
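For automation, the same command can be assembled and launched from Python. This is a sketch under stated assumptions: it uses only the flags shown in the example above, and the meanings given to them in the parameter names (input file, support thresholds, categories, mode, output) are inferred from that example, not confirmed by the manual. `build_command` and `run_if_available` are hypothetical helper names.

```python
import os
import shutil
import subprocess


def build_command(input_path, supports, categories, mode, output, concurrency=False):
    """Assemble the csat-cli argument list from the flags in the example above."""
    cmd = [
        "csat-cli",
        "-i", input_path,
        "-s", ",".join(str(s) for s in supports),
        "-c", ",".join(categories),
        "--mode", mode,
        "-o", output,
    ]
    if concurrency:
        cmd.append("--concurrency")
    return cmd


def run_if_available(cmd, input_path):
    """Run the command only if csat-cli is on PATH and the input file exists."""
    if shutil.which(cmd[0]) is None or not os.path.exists(input_path):
        return False
    subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
    return True


cmd = build_command("data.csv", [50, 100], ["BIO", "CHEM"], "separate", "results",
                    concurrency=True)
```

Passing the arguments as a list avoids shell quoting problems; call `run_if_available(cmd, "data.csv")` to execute it.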

Graphical User Interface

Launch the GUI for an easy-to-use interface:

csat-gui

The GUI allows you to:

  • Load your data file.
  • Set support thresholds and categories.
  • Group items based on granularity (e.g., semester or month).

Requirements

  • Python 3.10 or later
  • Dependencies are automatically installed when you run python3 -m pip install seq-tool.
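A quick way to confirm both requirements from Python is sketched below. It relies only on the standard library; the helper name `installed_version` is hypothetical, and the distribution name `seq-tool` is taken from the install command above.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


python_ok = sys.version_info >= (3, 10)  # the stated minimum Python version
seq_tool_version = installed_version("seq-tool")

print("Python 3.10 or later:", python_ok)
print("seq-tool version:", seq_tool_version or "not installed")
```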

Data Requirements

To understand the required data format, refer to the Data Dictionary.

Example Datasets

Example datasets for testing and exploring the cSAT are available here on Google Drive.

Development Roadmap

  • Current: Testing and validating the general case for sequential pattern analysis, and determining how to include time (possibly as a span) so that the output is easier to interpret.
  • Future: Finalize packaging and prepare the toolkit for distribution on PyPI (Python Package Index). Additionally, explore ways to optimize the GSP algorithm, such as parallel execution, to improve performance on large datasets.

License

This project is licensed under the MIT License - see the LICENSE file for details.
