A Python package for Hidden Markov Models (HMMs) with fast training and decoding

Chinese version of the README

FastHMM

FastHMM is a Python package that implements a Hidden Markov Model (HMM) with fast training and decoding.

Python version

Tested with Python 3.

Install

pip

pip install FastHMM

source

pip install git+https://github.com/312shan/FastHMM.git

Usage

from FastHMM.hmm import HMMModel

# Train the model on tagged sentences, then predict tags for a new sentence
hmm_model = HMMModel()
hmm_model.train_one_line([("我", "r"), ("爱", "v"), ("北京", "ns"), ("天安门", "ns")])
hmm_model.train_one_line([("你", "r"), ("去", "v"), ("深圳", "ns")])
result = hmm_model.predict(["俺", "爱", "广州"])
print(result)

# Save the model, load it back, and predict again
hmm_model.save_model()
hmm_model = HMMModel().load_model()
result = hmm_model.predict(["我们", "爱", "深圳"])
print(result)

Output:

[('俺', 'r'), ('爱', 'v'), ('广州', 'ns')]
[('我们', 'r'), ('爱', 'v'), ('深圳', 'ns')]
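
To illustrate what training on tagged sentences involves in general, here is a minimal sketch of the counts a first-order HMM tagger is usually estimated from. It is not FastHMM's internal code: the function name `count_hmm_statistics` and the three count tables are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def count_hmm_statistics(tagged_sentences):
    """Collect the raw counts a first-order HMM tagger can be estimated from.

    Hypothetical helper for illustration; not part of the FastHMM API.
    """
    start_counts = Counter()                   # tag of the first word in a sentence
    transition_counts = defaultdict(Counter)   # previous tag -> current tag
    emission_counts = defaultdict(Counter)     # tag -> word
    for sentence in tagged_sentences:
        previous_tag = None
        for word, tag in sentence:
            emission_counts[tag][word] += 1
            if previous_tag is None:
                start_counts[tag] += 1
            else:
                transition_counts[previous_tag][tag] += 1
            previous_tag = tag
    return start_counts, transition_counts, emission_counts


corpus = [
    [("我", "r"), ("爱", "v"), ("北京", "ns"), ("天安门", "ns")],
    [("你", "r"), ("去", "v"), ("深圳", "ns")],
]
start, trans, emit = count_hmm_statistics(corpus)
print(start["r"])         # 2: both sentences start with tag "r"
print(trans["v"]["ns"])   # 2: "v" is followed by "ns" in both sentences
print(trans["ns"]["ns"])  # 1: only in the first sentence
```

Normalizing each table (usually with some smoothing for unseen words) gives the start, transition, and emission probabilities that decoding works with.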

Performance:

Tested on the People's Daily (人民日报) dataset:

python .test/test_postagging.py

Output:

train size 18484 ,test_size 1000
finish training
eval result: 
predict 57929 tags, 54228 correct,  accuracy 0.9361114467710473
runtime : 370.1029086 seconds
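
The accuracy figure is the number of correctly predicted tags divided by the total number of predicted tags. The sketch below shows that calculation under the assumption that predictions and gold annotations are lists of `(word, tag)` sentences of equal length; `tagging_accuracy` is a hypothetical helper, not the code in the test script.

```python
def tagging_accuracy(predicted_sentences, gold_sentences):
    """Return (correct, total, accuracy) over all tags.

    Hypothetical helper for illustration; assumes aligned (word, tag) pairs.
    """
    correct = 0
    total = 0
    for predicted, gold in zip(predicted_sentences, gold_sentences):
        for (_, predicted_tag), (_, gold_tag) in zip(predicted, gold):
            total += 1
            if predicted_tag == gold_tag:
                correct += 1
    accuracy = correct / total if total else 0.0
    return correct, total, accuracy


# Toy data: one of the three tags is wrong.
predicted = [[("我", "r"), ("爱", "v"), ("深圳", "v")]]
gold = [[("我", "r"), ("爱", "v"), ("深圳", "ns")]]
print(tagging_accuracy(predicted, gold))

# The reported figure follows from the counts in the log above.
print(54228 / 57929)
```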

Most of the runtime is spent in the decoding stage. I tried several ways of implementing the Viterbi algorithm, and the implementation currently in use is the fastest of them. If you have suggestions for improving the decoding algorithm, please let me know. Thank you very much.
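
For readers unfamiliar with the decoding step, here is a minimal, unoptimized sketch of Viterbi decoding in log space. It is a textbook-style illustration, not the implementation used in FastHMM: the function `viterbi`, the dictionary layout of the probability tables, and the `unknown` penalty for missing entries are all assumptions made for this example.

```python
import math


def viterbi(words, tags, log_start, log_trans, log_emit, unknown=-1e9):
    """Return the most likely (word, tag) sequence under a first-order HMM.

    log_start[tag], log_trans[prev][tag], and log_emit[tag][word] hold log
    probabilities; missing entries fall back to the `unknown` penalty.
    Hypothetical sketch for illustration; not part of the FastHMM API.
    """
    if not words:
        return []

    def trans(prev, tag):
        return log_trans.get(prev, {}).get(tag, unknown)

    def emit(tag, word):
        return log_emit.get(tag, {}).get(word, unknown)

    # best[t][tag]: best log score of any tag path ending in `tag` at position t
    best = [{tag: log_start.get(tag, unknown) + emit(tag, words[0]) for tag in tags}]
    back = [{}]  # back[t][tag]: previous tag on that best path
    for t in range(1, len(words)):
        scores, pointers = {}, {}
        for tag in tags:
            prev = max(tags, key=lambda p: best[t - 1][p] + trans(p, tag))
            scores[tag] = best[t - 1][prev] + trans(prev, tag) + emit(tag, words[t])
            pointers[tag] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for t in range(len(words) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return list(zip(words, path))


# Toy probability tables (made up for this example).
tags = ["r", "v", "ns"]
log_start = {"r": math.log(1.0)}
log_trans = {"r": {"v": 0.0}, "v": {"ns": 0.0}, "ns": {"ns": 0.0}}
log_emit = {
    "r": {"我": math.log(0.5), "你": math.log(0.5)},
    "v": {"爱": math.log(0.5), "去": math.log(0.5)},
    "ns": {"北京": math.log(1 / 3), "天安门": math.log(1 / 3), "深圳": math.log(1 / 3)},
}
print(viterbi(["我", "爱", "深圳"], tags, log_start, log_trans, log_emit))
# [('我', 'r'), ('爱', 'v'), ('深圳', 'ns')]
```

This version costs on the order of (sentence length) × (number of tags)² dictionary lookups per sentence, which gives a sense of why decoding tends to dominate the runtime of an HMM tagger.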

Reference

MicroHMM
Hidden Markov model
Viterbi algorithm

Download files

Files for FastHMM, version 0.1.2

Filename (size), file type, Python version
FastHMM-0.1.2-py3-none-any.whl (6.8 kB), Wheel, py3
FastHMM-0.1.2.tar.gz (5.8 kB), Source, None