
Package for Hidden Markov and Hidden Semi-Markov Models

Project description

Chad Hidden Markov Models (ChadHMM)


NOTE: This package is still in its early stages; the documentation might not yet reflect every method described here, so please feel free to contribute and make it more coherent.


About

This repository was created as an attempt to learn and recreate parameter estimation for Hidden Markov Models using the PyTorch library. Included are models with Categorical and Gaussian emissions for both Hidden Markov Models (HMM) and Hidden Semi-Markov Models (HSMM). As an extension, I am also working on models whose parameter estimation depends on a set of external variables. These are referred to as Contextual HMMs (or Parametric/Conditional HMMs), where the emission probabilities or distribution parameters are influenced by a context that may be time-dependent or time-independent.
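To illustrate the estimation problem the package tackles, here is a minimal forward-algorithm log-likelihood for a categorical-emission HMM in plain PyTorch. This is a standalone sketch for intuition only; none of these names are part of the ChadHMM API:

```python
import torch

def hmm_log_likelihood(obs, log_pi, log_A, log_B):
    """Forward algorithm in log space for a categorical-emission HMM.

    obs    -- LongTensor of observed symbols, shape (T,)
    log_pi -- log initial state probabilities, shape (S,)
    log_A  -- log transition matrix, shape (S, S)
    log_B  -- log emission matrix, shape (S, K)
    """
    # log_alpha[s] = log P(obs[:t+1], state_t = s)
    log_alpha = log_pi + log_B[:, obs[0]]
    for t in range(1, len(obs)):
        # marginalize over the previous state, then emit the next symbol
        log_alpha = torch.logsumexp(log_alpha.unsqueeze(1) + log_A, dim=0) + log_B[:, obs[t]]
    # marginalize over the final state to get log P(obs)
    return torch.logsumexp(log_alpha, dim=0)

# A tiny two-state, two-symbol example
pi = torch.tensor([0.6, 0.4])                    # initial state probabilities
A = torch.tensor([[0.7, 0.3], [0.4, 0.6]])       # transition matrix
B = torch.tensor([[0.9, 0.1], [0.2, 0.8]])       # emission matrix
obs = torch.tensor([0, 1, 0])
ll = hmm_log_likelihood(obs, pi.log(), A.log(), B.log())
print(ll)
```

EM training then amounts to alternating this kind of forward (and backward) pass with re-estimation of `pi`, `A`, and `B` from the resulting posteriors.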

Documentation of the parameter estimation and model descriptions is meant to live in the docs folder (currently empty). There are also usage examples, particularly on financial time series, focusing on sequence prediction as well as possible interpretations of the model parameters.

Getting Started

To get a local copy up and running, follow these simple steps.

  1. Install from PyPI
    $ pip install chadhmm
    
  2. Clone the repo
    $ git clone https://github.com/GarroshIcecream/ChadHMM.git
    

Usage

Please refer to the docs for a more detailed guide on how to create, train, and predict sequences using Hidden Markov Models. There is also a section dedicated to visualizing the model parameters as well as its sequence predictions.

Below is an example of training and inference with MultinomialHMM:

from chadhmm import MultinomialHMM
from chadhmm.utilities import constraints
import torch

# Initialize Multinomial HMM with 6 states and 4 emissions
hmm = MultinomialHMM(
  n_states=6,
  n_features=4,
  n_trials=2,
  transitions=constraints.Transitions.ERGODIC
)

# Mock the example data and one-hot encode it
train_seq = torch.randint(0,hmm.n_features,(1000,))
one_hot = hmm.n_trials * torch.nn.functional.one_hot(train_seq,4)

# Fit the model using the EM algorithm, assuming two sequences of lengths 400 and 600,
# with a single initialization (n_init=1)
hmm.fit(X=one_hot,max_iter=5,lengths=[400,600],n_init=1,verbose=False)

# Compute log likelihood of generated sequence (set by_sample=False for joint log likelihood)
log_likes = hmm.score(
  one_hot,
  lengths=[400,500,100], 
  by_sample=True
)
print(log_likes)

# Compute the Akaike Information Criterion for each sequence (AIC, BIC or HQC available)
ics = hmm.ic(
  one_hot,
  lengths=[400,500,100],
  criterion=constraints.InformCriteria.AIC
)
print(ics)

# Get the most likely sequence using Viterbi algorithm (MAP also available)
viterbi_path = hmm.predict(
  one_hot,
  lengths=[400,500,100],
  algorithm='viterbi'
)
print(viterbi_path)
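The `algorithm='viterbi'` option above refers to the classic max-product recursion for recovering the most likely state path. As a standalone illustration (not the package's internal implementation), a minimal version in PyTorch looks like:

```python
import torch

def viterbi(obs, log_pi, log_A, log_B):
    """Most likely state path for a categorical-emission HMM.

    obs: (T,) LongTensor of symbols; log_pi: (S,); log_A: (S, S); log_B: (S, K).
    """
    T, S = len(obs), log_pi.shape[0]
    delta = log_pi + log_B[:, obs[0]]        # best log score ending in each state
    backptr = torch.zeros(T, S, dtype=torch.long)
    for t in range(1, T):
        scores = delta.unsqueeze(1) + log_A  # scores[i, j]: arrive at j from i
        delta, backptr[t] = scores.max(dim=0)
        delta = delta + log_B[:, obs[t]]
    # Backtrack from the best final state
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return list(reversed(path))

log_pi = torch.tensor([0.6, 0.4]).log()
log_A = torch.tensor([[0.7, 0.3], [0.4, 0.6]]).log()
log_B = torch.tensor([[0.9, 0.1], [0.2, 0.8]]).log()
path = viterbi(torch.tensor([0, 0, 1, 1]), log_pi, log_A, log_B)
print(path)  # -> [0, 0, 1, 1]: state 0 favors symbol 0, state 1 favors symbol 1
```

Running the recursion in log space, as here, avoids the numerical underflow that products of many small probabilities would otherwise cause on long sequences.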

Roadmap

  • Hidden Semi-Markov Models
    • Fix computation of posteriors
    • Implementation of Viterbi algorithm for HSMM
  • Integration of contextual models
    • Time dependent context
    • Contextual Variables for covariances using GEM (Generalized Expectation Maximization algorithm)
    • Contextual variables for Multinomial emissions
  • Use wrapped distributions instead of tensors of parameters (e.g. a Categorical distribution instead of a tensor of logits)
    • Implement different types of covariance matrices
    • Connect that with degrees of freedom
  • Improve the docs with examples
    • Application on financial time series prediction
  • Support for CUDA training
  • Support different types of Transition Matrices - semi, left-to-right and ergodic
  • Support for wider range of emissions distributions
  • K-Means for Gaussian means initialization
  • Code base refactor, abstractions might be confusing
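For the "K-Means for Gaussian means initialization" item above, the idea is to seed the Gaussian emission means from cluster centroids rather than at random. A rough sketch of what that could look like, using plain Lloyd k-means in PyTorch (illustrative only, with hypothetical names, not the planned ChadHMM implementation):

```python
import torch

def kmeans_init(X, n_states, n_iter=20, seed=0):
    """Initialize Gaussian emission means via plain Lloyd k-means.

    X: (N, D) observations; returns (n_states, D) centroids.
    """
    g = torch.Generator().manual_seed(seed)
    # Start from n_states randomly chosen observations
    centroids = X[torch.randperm(len(X), generator=g)[:n_states]].clone()
    for _ in range(n_iter):
        # Assign each point to its nearest centroid
        assign = torch.cdist(X, centroids).argmin(dim=1)
        # Move each centroid to the mean of its assigned points
        for k in range(n_states):
            members = X[assign == k]
            if len(members) > 0:
                centroids[k] = members.mean(dim=0)
    return centroids

# Two well-separated blobs around (0, 0) and (5, 5)
torch.manual_seed(0)
X = torch.cat([torch.randn(100, 2), torch.randn(100, 2) + 5.0])
means = kmeans_init(X, n_states=2)
```

Seeding the means this way typically gives EM a much better starting point than random initialization, since each state begins responsible for a distinct region of the data.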

See the open issues for a full list of proposed features (and known issues).

Unit Tests

If you want to run the unit tests, execute the following command:

$ make tests


License

This project is licensed under the MIT License - see the LICENSE file for details.

