Yet Another Sequence Analytics Toolkit - A modern Python library for sequence analysis with polars and plotnine

yasqat

Yet Another Sequence Analytics Toolkit

A modern Python library for sequence analysis, inspired by TanaT and TraMineR.

Features

  • Polars-based data structures for fast sequence manipulation
  • Multiple sequence types: StateSequence, EventSequence, IntervalSequence with bidirectional conversion
  • Distance metrics: Optimal Matching, Hamming, LCS, DTW, SoftDTW, LCP, RLCP, Chi2, Euclidean, DHD, TWED, OMloc, OMspell, OMstran, NMS, NMSMST, SVRspell
  • Substitution costs: Constant, transition-rate, indels, indelslog, future (chi-squared), features (Gower distance)
  • Clustering: Hierarchical clustering, PAM (k-medoids), CLARA (sampling-based PAM)
  • Cluster quality: Silhouette scores (ASW), Point Biserial Correlation, Hubert's Gamma, R-squared, PAM range analysis, distance to center
  • Representative sequences: Extract representatives by centrality, frequency, or density
  • Discrepancy analysis: Pseudo-ANOVA (pseudo-F, pseudo-R2) with permutation tests, multi-factor ANOVA
  • Dissimilarity trees: Recursive partitioning of distance matrices by covariates
  • State recoding: Merge or rename states with automatic alphabet rebuild
  • Filtering: Length, time, state-based, and pattern filtering
  • Data I/O: CSV, JSON, Parquet, and DataFrame loading (Hive/Spark/Arrow) with polars
  • Trajectory: Multi-sequence entity analysis
  • Descriptive statistics: Entropy, transition rates, complexity, turbulence, normalized turbulence, spell counts, visited states, modal states, sequence frequencies, log-probabilities, subsequence count
  • Normative indicators: Volatility, precarity, insecurity, degradation, badness, integration, proportion positive
  • Frequent subsequence mining: Apriori-like discovery with support thresholds
  • Visualization: Index plots, distribution plots, frequency plots, spell duration plots, timeline, modal state plots, mean time plots, parallel coordinate plots
  • Synthetic data generation: Generate realistic user journey data
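A few of the concepts named in the list above (Hamming distance, Optimal Matching, entropy) can be illustrated in plain Python. This is a minimal sketch for intuition only: it does not use yasqat, the function names are hypothetical, and the indel cost of 1 and constant substitution cost of 2 are values chosen for illustration rather than yasqat's documented defaults.

```python
from collections import Counter
from math import log


def hamming(a, b):
    """Count positions where two equal-length state sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def optimal_matching(a, b, indel=1.0, sub=2.0):
    """Edit distance with a constant substitution cost (illustrative costs)."""
    # prev[j] holds the cost of turning the first i-1 states of a
    # into the first j states of b.
    prev = [j * indel for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel]
        for j, y in enumerate(b, start=1):
            cost = 0.0 if x == y else sub
            curr.append(
                min(
                    prev[j] + indel,      # delete x from a
                    curr[j - 1] + indel,  # insert y into a
                    prev[j - 1] + cost,   # match or substitute
                )
            )
        prev = curr
    return prev[-1]


def state_entropy(seq):
    """Shannon entropy (natural log) of the state distribution in one sequence."""
    n = len(seq)
    return -sum((c / n) * log(c / n) for c in Counter(seq).values())


if __name__ == "__main__":
    s1 = list("AABBC")
    s2 = list("ABBCC")
    print(hamming(s1, s2))           # 2
    print(optimal_matching(s1, s2))  # 2.0
    print(state_entropy(list("AABB")))
```

Here the two sequences differ at two positions, and one deletion plus one insertion aligns them, so both distances come out at 2 under these costs. A library implementation would typically compute such distances pairwise over a whole sequence set and return a distance matrix.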

Installation

pip install yasqat
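
To confirm that the install landed in the active environment, a quick check with the standard library can help. This is a minimal sketch that relies only on importlib.metadata; the helper name installed_version is hypothetical and not part of yasqat.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("yasqat")
    if found:
        print(f"yasqat {found}")
    else:
        print("yasqat is not installed in this environment")
```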

For development

# Clone and install with dev dependencies
git clone https://github.com/rexarski/yasqat.git
cd yasqat
uv venv
source .venv/bin/activate   # or activate.fish on fish shell
uv pip install -e ".[dev]"

Development

# Run tests
pytest tests/ -v --cov=src/yasqat

# Lint and format
ruff check src/ tests/
ruff format src/ tests/

# Type check
mypy src/

Documentation

The documentation site lives in docs/ and is built with Quarto. The live site is published to rexarski.github.io/yasqat automatically on every push to main.

Prerequisites

# 1. Install Quarto (macOS; needs your password)
brew install --cask quarto

# 2. Install dev dependencies (includes quartodoc)
uv pip install -e ".[dev]"

Preview locally

# Live-reload preview in the browser
quarto preview docs/

Render to static HTML

# Outputs to docs/_site/
quarto render docs/

Open docs/_site/index.html in a browser to inspect the result.

Regenerate API reference

The docs/api/ pages are currently hand-authored because quartodoc is blocked by a pydantic.v1 / Python 3.14 incompatibility. Once that is resolved, or if you run this project with Python 3.11, you can regenerate them automatically:

cd docs/
quartodoc build
cd ..
quarto render docs/

Commit the regenerated docs/api/ files; the CI workflow does not run quartodoc (API pages are pre-committed).
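
The Python-version caveat above can be checked before invoking quartodoc. The sketch below is an assumption-laden guard, not part of the project: the helper name quartodoc_supported is hypothetical, and it assumes the pydantic.v1 incompatibility affects Python 3.14 and later. The README only confirms that 3.11 works, so 3.12 and 3.13 are treated as usable here without verification.

```python
import sys


def quartodoc_supported(version_info=sys.version_info) -> bool:
    """Return False on interpreters where quartodoc is reported as blocked."""
    # Assumption: the pydantic.v1 incompatibility applies to Python 3.14+.
    return tuple(version_info[:2]) < (3, 14)


if __name__ == "__main__":
    if quartodoc_supported():
        print("quartodoc build should be possible on this interpreter")
    else:
        print("quartodoc is blocked here; keep the hand-authored docs/api/ pages")
```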

Deployment

Pushing to main triggers .github/workflows/deploy-docs.yml, which renders the site and pushes docs/_site/ to the gh-pages branch. No manual step is required after the initial GitHub Pages setup:

One-time setup: Repository Settings → Pages → Source: gh-pages branch, / (root).

Adding or editing pages

What to change                      Where
Site structure, navbar, sidebar     docs/_quarto.yml
Styles and theme                    docs/styles.scss
Landing page                        docs/index.qmd
Tutorials                           docs/tutorials/*.qmd
API reference                       docs/api/*.qmd
Changelog                           docs/changelog.qmd

License

MIT License - see LICENSE for details.

Acknowledgments

Download files

Source distribution: yasqat-0.3.1.tar.gz (95.8 kB)

Built distribution: yasqat-0.3.1-py3-none-any.whl (91.6 kB, Python 3)

