A part of speech tagger based on Hidden Markov models
Project description
PyPOS - Python Part-of-Speech tagger
This is a project, which allows its users to assign part of speech tags to words in a sentence.
Tagged token by token (punctuation included), that sentence comes out as:
dt vbz dt nn , wdt vbz prp$ nns to vb nn in nn nns to nns in dt nn .
PyPOS uses Hidden Markov Models and Viterbi decoding to determine the most likely sequence of POS tags for a given sequence of words.
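To make the decoding step concrete, here is a minimal sketch of Viterbi decoding over a first-order HMM. It is not PyPOS's actual implementation: the function name `viterbi`, the dictionary-based probability tables, the `floor` smoothing constant, and the toy model are all assumptions made for illustration.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely tag sequence for `words` under a first-order HMM.

    start_p[t]    : probability that a sentence starts with tag t
    trans_p[p][t] : probability of tag t following tag p
    emit_p[t][w]  : probability of tag t emitting word w
    Missing entries are treated as `floor` so that log() stays defined.
    """
    def lp(p):
        return math.log(max(p, floor))

    if not words:
        return []

    # best[i][t]: log-probability of the best path that ends in tag t at position i
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p.get(t, {}).get(words[0], 0.0))
             for t in tags}]
    # back[i][t]: the previous tag on that best path
    back = [{}]

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p.get(p, {}).get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p.get(t, {}).get(words[i], 0.0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Toy model (made-up numbers, not taken from PyPOS or any corpus).
tags = ['dt', 'nn', 'vbz']
start_p = {'dt': 0.8, 'nn': 0.1, 'vbz': 0.1}
trans_p = {
    'dt': {'dt': 0.05, 'nn': 0.9, 'vbz': 0.05},
    'nn': {'dt': 0.1, 'nn': 0.1, 'vbz': 0.8},
    'vbz': {'dt': 0.7, 'nn': 0.2, 'vbz': 0.1},
}
emit_p = {
    'dt': {'the': 0.7, 'a': 0.3},
    'nn': {'dog': 0.5, 'walk': 0.3, 'barks': 0.2},
    'vbz': {'barks': 0.6, 'walks': 0.4},
}

print(viterbi(['the', 'dog', 'barks'], tags, start_p, trans_p, emit_p))
# ['dt', 'nn', 'vbz']
```

The sketch works in log space so that long sentences do not underflow, and it keeps only one back-pointer per tag and position, which is what makes the search linear in sentence length.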
Usage
Installation
Requires Python 3.6 or higher
pip3 install pypos
Training
from pypos import PartOfSpeechTagger, PartOfSpeechDataset

tagger = PartOfSpeechTagger()
ds = PartOfSpeechDataset.load('train.txt')
tagger.fit(ds).save('tagger.p')
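For readers curious what fitting an HMM tagger typically involves, the following is a rough sketch of relative-frequency estimation from a tagged corpus. It is only an assumption about the general technique, not a description of what `fit` does inside PyPOS. The function name `estimate_hmm` and the in-memory corpus format (a list of sentences, each a list of `(word, tag)` pairs) are hypothetical, and the layout of `train.txt` is not documented here.

```python
from collections import Counter, defaultdict


def estimate_hmm(tagged_sentences):
    """Estimate start, transition and emission probabilities by counting."""
    start = Counter()             # how often each tag opens a sentence
    trans = defaultdict(Counter)  # trans[prev][tag]: tag-to-tag counts
    emit = defaultdict(Counter)   # emit[tag][word]: tag-to-word counts

    for sentence in tagged_sentences:
        prev = None
        for word, tag in sentence:
            if prev is None:
                start[tag] += 1
            else:
                trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            prev = tag

    def normalize(counter):
        # Turn raw counts into probabilities that sum to 1.
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    return (
        normalize(start),
        {tag: normalize(counts) for tag, counts in trans.items()},
        {tag: normalize(counts) for tag, counts in emit.items()},
    )


# Tiny hand-made corpus, for illustration only.
corpus = [
    [('The', 'dt'), ('dog', 'nn'), ('barks', 'vbz')],
    [('A', 'dt'), ('dog', 'nn'), ('walks', 'vbz')],
]
start_p, trans_p, emit_p = estimate_hmm(corpus)
print(emit_p['dt'])
```

Plain counting assigns zero probability to anything unseen in training, so a practical tagger would normally add some form of smoothing on top of this.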
Tagging
from pypos import PartOfSpeechTagger

tagger = PartOfSpeechTagger.load('tagger.p')

# Reproducing the results shown above:
sentence = 'This is a project, which allows its users to assign part of speech tags to words in a sentence.'
tokens = tagger.tokenize(sentence)
tags = tagger.tag(sentence, human_readable=False)
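Once tokens and tags are available, they can be lined up for inspection. The sketch below does not call PyPOS. It uses the tokens and tags from the example at the top of this page as literal data, and it assumes that tokenizing and tagging produce two sequences of equal length, which matches the 21 tokens and 21 tags shown there. The helper name `pair_tokens_with_tags` is hypothetical.

```python
# Literal data copied from the example sentence and its tag line.
tokens = ("This is a project , which allows its users to assign "
          "part of speech tags to words in a sentence .").split()
tags = "dt vbz dt nn , wdt vbz prp$ nns to vb nn in nn nns to nns in dt nn .".split()


def pair_tokens_with_tags(tokens, tags):
    """Pair each token with its tag, refusing mismatched lengths."""
    if len(tokens) != len(tags):
        raise ValueError(f"{len(tokens)} tokens but {len(tags)} tags")
    return list(zip(tokens, tags))


pairs = pair_tokens_with_tags(tokens, tags)
print(" ".join(f"{token}/{tag}" for token, tag in pairs))
```

The explicit length check is there because `zip` silently truncates to the shorter input, which would hide a tokenization mismatch.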
Download files
Filename | Size | File type | Python version |
---|---|---|---|
pypos-0.1.2-py3-none-any.whl | 5.1 kB | Wheel | py3 |
pypos-0.1.2.tar.gz | 3.7 kB | Source | None |