Sequentia

A machine learning interface for isolated sequence classification algorithms in Python.

Introduction

Sequential data is one of the most commonly observed forms of data. It ranges from time series (sequences of observations occurring through time) to non-temporal sequences such as DNA nucleotides. Time series such as audio signals and stock prices are of particular interest, because patterns that change over time present many opportunities and challenges for machine learning.

This library specifically aims to tackle classification problems for isolated sequences by creating an interface to a number of classification algorithms.

Although these types of sequences may sound very specific, you probably encounter some of them on a regular basis!

Some examples of classification problems for isolated sequences include classifying:

  • a word utterance by its speech audio signal,
  • a hand-written character according to its pen-tip trajectory,
  • a hand or head gesture in a video or motion-capture recording.

Features

Sequentia supports multivariate observation sequences of varying duration through the following methods:

Classification algorithms

  • Hidden Markov Models (via Pomegranate [1])
    • Learning with the Baum-Welch algorithm [2]
    • Multivariate Gaussian emissions
    • Gaussian Mixture Model emissions (full and diagonal covariances)
    • Left-right and ergodic topologies
  • Approximate Dynamic Time Warping k-Nearest Neighbors (implemented with FastDTW [3])
    • Custom distance-weighted predictions
    • Multi-processed predictions
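
To make the DTW k-NN idea above more concrete, here is a minimal pure-Python sketch. It is not Sequentia's API: the names `dtw_distance` and `knn_classify`, the `weighting` parameter, and the toy data are all hypothetical. It also computes exact DTW rather than the FastDTW approximation [3], and it leaves out multi-processing.

```python
import math
from collections import defaultdict


def dtw_distance(a, b):
    """Exact DTW distance between two sequences of equal-dimension vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance (Python 3.8+)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_classify(query, train_seqs, train_labels, k=3, weighting=lambda d: 1.0):
    """Vote among the k nearest training sequences under DTW.

    `weighting` maps a distance to a vote weight (uniform by default).
    """
    scored = [(dtw_distance(query, s), y) for s, y in zip(train_seqs, train_labels)]
    scored.sort(key=lambda pair: pair[0])
    votes = defaultdict(float)
    for dist, label in scored[:k]:
        votes[label] += weighting(dist)
    return max(votes, key=votes.get)


# Toy data: univariate sequences of different lengths, one tuple per observation.
train_seqs = [
    [(0.0,), (1.0,), (2.0,), (3.0,)],
    [(0.0,), (0.0,), (1.0,), (2.0,), (3.0,)],
    [(3.0,), (2.0,), (1.0,), (0.0,)],
    [(3.0,), (2.0,), (2.0,), (1.0,), (0.0,)],
]
train_labels = ["rising", "rising", "falling", "falling"]
query = [(0.0,), (1.0,), (1.0,), (2.0,), (3.0,)]

# Closer neighbours get larger votes with an inverse-distance weighting.
prediction = knn_classify(query, train_seqs, train_labels, k=3,
                          weighting=lambda d: 1.0 / (1.0 + d))
```

Because DTW aligns sequences by stretching and compressing them in time, the query matches both "rising" examples despite the differing lengths.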


Figure: Example of a classification algorithm, a multi-class HMM isolated sequence classifier.
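
A common way to build a multi-class HMM classifier is to keep one HMM per class and pick the class whose model assigns the highest likelihood to the sequence. The sketch below illustrates that idea in pure Python, on the assumption that this is the general scheme intended here. It does not use Sequentia or Pomegranate. The parameters are set by hand rather than learned with Baum-Welch, the emissions are univariate Gaussians for brevity, and all names (`log_likelihood`, `classify`, `models`) are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def safe_log(p):
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_likelihood(seq, hmm):
    """Forward algorithm in log space: log P(seq | hmm)."""
    start, trans, means, variances = hmm
    n = len(start)
    alpha = [safe_log(start[s]) + log_gaussian(seq[0], means[s], variances[s])
             for s in range(n)]
    for x in seq[1:]:
        alpha = [
            logsumexp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
            + log_gaussian(x, means[s], variances[s])
            for s in range(n)
        ]
    return logsumexp(alpha)


def classify(seq, models):
    """Return the label of the model with the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(seq, models[label]))


# Left-right topology: a state can only stay put or move forward.
LEFT_RIGHT = [[0.7, 0.3],
              [0.0, 1.0]]
START = [1.0, 0.0]  # always begin in the first state

# One hand-set model per class: (start, transitions, means, variances).
models = {
    "rising": (START, LEFT_RIGHT, [0.0, 3.0], [1.0, 1.0]),
    "falling": (START, LEFT_RIGHT, [3.0, 0.0], [1.0, 1.0]),
}

rising_seq = [0.1, 0.4, 2.8, 3.1]
falling_seq = [3.2, 2.9, 0.3, -0.1]
predictions = [classify(rising_seq, models), classify(falling_seq, models)]
```

An ergodic topology would differ only in the transition matrix, which would allow moves between any pair of states.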

Preprocessing methods

  • Centering, standardization and min-max scaling
  • Decimation and mean downsampling
  • Mean and median filtering
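
As a rough illustration of the transforms listed above, the following pure-Python sketch applies each one to a single feature of a sequence. The function names are hypothetical and are not Sequentia's API. Decimation is taken here to mean keeping every n-th observation, which may differ from the library's definition. For multivariate sequences, the same operations would be applied to each feature separately.

```python
from statistics import mean, median, pstdev


def center(seq):
    """Subtract the mean so the sequence has zero mean."""
    mu = mean(seq)
    return [x - mu for x in seq]


def standardize(seq):
    """Zero mean and unit variance (assumes the sequence is not constant)."""
    mu, sigma = mean(seq), pstdev(seq)
    return [(x - mu) / sigma for x in seq]


def min_max_scale(seq, low=0.0, high=1.0):
    """Rescale linearly into [low, high] (assumes the sequence is not constant)."""
    lo, hi = min(seq), max(seq)
    return [low + (x - lo) * (high - low) / (hi - lo) for x in seq]


def decimate(seq, factor):
    """Keep every `factor`-th observation."""
    return seq[::factor]


def mean_downsample(seq, factor):
    """Replace each block of `factor` observations with its mean."""
    return [mean(seq[i:i + factor]) for i in range(0, len(seq), factor)]


def sliding_filter(seq, width, reducer):
    """Apply `reducer` (e.g. mean or median) over a centered window.

    Windows are truncated at the ends of the sequence.
    """
    half = width // 2
    return [reducer(seq[max(0, i - half):i + half + 1]) for i in range(len(seq))]


seq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noisy = [1.0, 9.0, 2.0, 3.0, 100.0]

halved = decimate(seq, 2)                      # every second observation
block_means = mean_downsample(seq, 2)          # mean of each pair
smoothed = sliding_filter(noisy, 3, median)    # median filter, window of 3
```

Median filtering is less sensitive to isolated spikes than mean filtering, since a single outlier cannot dominate a window's median.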

Installation

pip install sequentia

Documentation

Documentation for the package is available on Read the Docs.

Tutorials and examples

For tutorials and examples on the usage of Sequentia, look at the notebooks here.

References

[1] Jacob Schreiber. "pomegranate: Fast and Flexible Probabilistic Modeling in Python." Journal of Machine Learning Research 18 (2018), no. 164, pp. 1-6.
[2] Lawrence R. Rabiner. "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition." Proceedings of the IEEE 77 (1989), no. 2, pp. 257-286.
[3] Stan Salvador and Philip Chan. "FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space." Intelligent Data Analysis 11 (2007), no. 5, pp. 561-580.

Contributors

All contributions to this repository are greatly appreciated. Contribution guidelines can be found here.

Edwin Onuonga

Sequentia © 2019-2021, Edwin Onuonga - Released under the MIT License.
Authored and maintained by Edwin Onuonga.
