A Python package implementing the Generalized Sequential Pattern (GSP) algorithm with concurrency support.
seq_tool
seq_tool is a Python package that implements the Generalized Sequential Pattern (GSP) algorithm. Originally developed as part of the Course Sequencing Analysis Tool (CSAT) to analyze and sequence student course data, the toolkit has since been extended to support more general use cases. It is designed for applications where analyzing sequential patterns is essential, such as course sequencing or other ordered event data.
The package supports grouping items based on a specified granularity using concurrency and provides both a command-line interface (CLI) and a graphical user interface (GUI).
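As a rough illustration of granularity-based grouping, the sketch below buckets dated records into one itemset per time period, so that items falling in the same period are treated as occurring together. This reading of "concurrency" is an assumption, and `period_key`, `group_by_period`, and the record layout are hypothetical names chosen for the example; they are not part of the seq_tool API.

```python
from collections import defaultdict
from datetime import date


def period_key(d: date, granularity: str) -> tuple:
    """Map a date to a sortable period key for the chosen granularity."""
    if granularity == "month":
        return (d.year, d.month)
    if granularity == "quarter":
        return (d.year, (d.month - 1) // 3 + 1)
    raise ValueError(f"unsupported granularity: {granularity}")


def group_by_period(records, granularity="month"):
    """Turn (entity, date, item) records into an ordered list of itemsets per entity."""
    buckets = defaultdict(lambda: defaultdict(set))
    for entity, d, item in records:
        buckets[entity][period_key(d, granularity)].add(item)
    # Sort the periods chronologically; each period becomes one itemset.
    return {
        entity: [frozenset(periods[k]) for k in sorted(periods)]
        for entity, periods in buckets.items()
    }


# Hypothetical sample data: (student, enrollment date, course)
records = [
    ("s1", date(2023, 1, 10), "BIO101"),
    ("s1", date(2023, 2, 5), "CHEM101"),
    ("s1", date(2023, 5, 20), "BIO201"),
    ("s2", date(2023, 1, 15), "CHEM101"),
    ("s2", date(2023, 4, 1), "BIO101"),
]

by_quarter = group_by_period(records, granularity="quarter")
by_month = group_by_period(records, granularity="month")
```

With quarterly granularity, the two courses `s1` took in January and February land in the same itemset; with monthly granularity they stay in separate, consecutive itemsets.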
Features
- GSP Algorithm: Analyze sequential patterns using the Generalized Sequential Pattern (GSP) algorithm.
- Granularity-Based Grouping: Use concurrency to group items by a specified time granularity, such as semesters (quarters) or months.
- Command-Line Interface: Run the GSP algorithm from the terminal for efficient scripting and automation.
- Graphical User Interface: Easily configure and run the algorithm using an interactive graphical interface.
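To make the idea behind GSP concrete, here is a minimal level-wise sketch: it counts how many sequences contain a candidate pattern in order, keeps the candidates that meet a minimum support, and extends the survivors by one element. It is a deliberate simplification (pattern elements are single items, and there is no join/prune step or time constraint), not the full GSP algorithm and not seq_tool's implementation. All function names are hypothetical.

```python
def contains(sequence, pattern):
    """Return True if every pattern element is a subset of a distinct
    sequence element, in the same order (greedy earliest match)."""
    pos = 0
    for element in sequence:
        if pos < len(pattern) and pattern[pos] <= element:
            pos += 1
    return pos == len(pattern)


def support(sequences, pattern):
    """Number of sequences that contain the pattern."""
    return sum(contains(seq, pattern) for seq in sequences)


def frequent_patterns(sequences, min_support, max_length=3):
    """Level-wise search over patterns made of single-item elements."""
    items = sorted({item for seq in sequences for element in seq for item in element})
    singles = [(frozenset([i]),) for i in items]
    level = [p for p in singles if support(sequences, p) >= min_support]
    frequent_singles = list(level)
    result = {}
    length = 1
    while level:
        for p in level:
            result[p] = support(sequences, p)
        if length == max_length:
            break
        # Extend each frequent pattern by one frequent item, then filter by support.
        candidates = [p + s for p in level for s in frequent_singles]
        level = [c for c in candidates if support(sequences, c) >= min_support]
        length += 1
    return result


# Hypothetical input: one ordered list of itemsets per student.
sequences = [
    [frozenset({"BIO101"}), frozenset({"CHEM101"}), frozenset({"BIO201"})],
    [frozenset({"BIO101", "CHEM101"}), frozenset({"BIO201"})],
    [frozenset({"CHEM101"}), frozenset({"BIO101"})],
]

patterns = frequent_patterns(sequences, min_support=2)
for pattern, count in patterns.items():
    print(" -> ".join(",".join(sorted(e)) for e in pattern), count)
```

In this sample, the pattern BIO101 followed by BIO201 is found in two of the three sequences, while BIO101 followed by CHEM101 appears in only one and is discarded at a minimum support of 2.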
Installation
Install from the command line via PyPI:
pip install seq-tool
Usage
Command-Line Interface
You can run the GSP algorithm using the CLI. Here’s an example:
seq-cli -i data.csv -s 50,100 -c BIO,CHEM --mode separate -o results --concurrency
For more detailed instructions and examples, please refer to the CSAT Manual.
Graphical User Interface
Launch the GUI for an easy-to-use interface:
seq-gui
The GUI allows you to:
- Load your data file.
- Set support thresholds and categories.
- Group items based on granularity (e.g., semester or month).
Requirements
- Python 3.10 or later
- Dependencies are installed automatically when you run pip install seq-tool.
Data Requirements
To understand the required data format, refer to the Data Dictionary.
Example Datasets
Example datasets for testing and exploring the CSAT are available on Google Drive.
Development Roadmap
- Current: Profiling runtime and exploring optimizations, such as parallel execution, to improve performance on large datasets.
- Future: Determining how to incorporate time information (possibly a time span) to make the output easier to interpret.
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file seq_tool-0.0.4.tar.gz.
File metadata
- Download URL: seq_tool-0.0.4.tar.gz
- Upload date:
- Size: 19.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 82f8a091eb0c687158e5b2f602805dbfbb7db966304ca976bf6f95bb6791221b |
| MD5 | f615f18cff67d7467e66711cec9d4a3d |
| BLAKE2b-256 | 53e2cdf35ae99db544fb7060960ba84ef304735165284d6f9712b3af411b8069 |
File details
Details for the file seq_tool-0.0.4-py3-none-any.whl.
File metadata
- Download URL: seq_tool-0.0.4-py3-none-any.whl
- Upload date:
- Size: 20.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a8035dee1b0c330c3d0017716b1f585d8c4a95239b7327c38a8acce52c86435 |
| MD5 | 7b3a9ca3a8ae9200cd0b13a5b45a92a3 |
| BLAKE2b-256 | 375e664677aaff08a7dc998c1c7d10f78e68aa02710c9eba9f2750e4822e5f64 |