A part-of-speech tagger based on hidden Markov models
Project description
PyPOS - Python Part-of-Speech tagger
This is a project, which allows its users to assign part of speech tags to words in a sentence .
dt vbz dt nn , wdt vbz prp$ nns to vb nn in nn nns to nns in dt nn .
PyPOS uses Hidden Markov Models and Viterbi decoding to determine the most likely sequence of POS tags for a given sequence of words.
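To make the decoding step concrete, here is a minimal sketch of Viterbi decoding over a first-order HMM. It is not PyPOS's internal code: the function `viterbi`, the toy probability tables, and the `unk` fallback for unseen words are all assumptions made for illustration.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Most likely tag sequence for `words` under a first-order HMM.

    Hypothetical helper: probabilities are plain dicts, and (tag, word)
    pairs missing from `emit_p` fall back to the small constant `unk`.
    """
    # best[i][t]: log-probability of the best path ending in tag t at position i
    best = [{t: math.log(start_p[t]) + math.log(emit_p[t].get(words[0], unk))
             for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + math.log(trans_p[p][t])) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + math.log(emit_p[t].get(words[i], unk))
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Toy model with made-up numbers, for illustration only.
tags = ["dt", "nn", "vbz"]
start_p = {"dt": 0.6, "nn": 0.2, "vbz": 0.2}
trans_p = {
    "dt": {"dt": 0.1, "nn": 0.8, "vbz": 0.1},
    "nn": {"dt": 0.2, "nn": 0.2, "vbz": 0.6},
    "vbz": {"dt": 0.7, "nn": 0.2, "vbz": 0.1},
}
emit_p = {
    "dt": {"this": 0.4, "a": 0.6},
    "nn": {"project": 0.9, "this": 0.1},
    "vbz": {"is": 1.0},
}
result = viterbi(["this", "is", "a", "project"], tags, start_p, trans_p, emit_p)
print(" ".join(result))
```

Working in log space avoids numeric underflow on long sentences, since the path score is a product of many small probabilities.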
Usage
Installation
Requires Python 3.6 or higher
pip3 install pypos
Training
from pypos import PartOfSpeechTagger, PartOfSpeechDataset
tagger = PartOfSpeechTagger()
ds = PartOfSpeechDataset.load('train.txt')
tagger.fit(ds).save('tagger.p')
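For intuition about what fitting an HMM tagger generally involves, the sketch below estimates start, transition, and emission probabilities from tagged sentences by relative frequency. This is an assumption about the general technique, not a description of what `fit` does internally; `estimate_hmm` and the in-memory corpus are hypothetical, and the format of `train.txt` is not shown here.

```python
from collections import Counter, defaultdict

def estimate_hmm(tagged_sentences):
    """Relative-frequency HMM estimates from lists of (word, tag) pairs.

    Hypothetical helper; applies no smoothing, so unseen events get no entry.
    """
    start = Counter()
    trans = defaultdict(Counter)
    emit = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev = None
        for word, tag in sentence:
            emit[tag][word.lower()] += 1
            if prev is None:
                start[tag] += 1       # tag opens a sentence
            else:
                trans[prev][tag] += 1  # tag follows `prev`
            prev = tag

    def normalize(counter):
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    return (
        normalize(start),
        {tag: normalize(c) for tag, c in trans.items()},
        {tag: normalize(c) for tag, c in emit.items()},
    )

# Tiny made-up corpus, for illustration only.
corpus = [
    [("This", "dt"), ("is", "vbz"), ("a", "dt"), ("project", "nn")],
    [("A", "dt"), ("sentence", "nn")],
]
start_p, trans_p, emit_p = estimate_hmm(corpus)
print(trans_p["dt"])
```

A practical tagger would usually add smoothing so that unseen words and tag transitions do not receive zero probability.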
Tagging
from pypos import PartOfSpeechTagger
tagger = PartOfSpeechTagger.load('tagger.p')
# Reproducing the results shown above:
sentence = 'This is a project, which allows its users to assign part of speech tags to words in a sentence.'
tokens = tagger.tokenize(sentence)
tags = tagger.tag(sentence, human_readable=False)
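The tagged example near the top of this page lines up one tag per token. The sketch below pairs the two using plain Python lists copied from that example; it does not call PyPOS, because the return types of `tokenize` and `tag` are not documented here, so treating them as parallel sequences is an assumption.

```python
# Tokens and tags copied from the example near the top of this page.
tokens = ("This is a project , which allows its users to assign part of "
          "speech tags to words in a sentence .").split()
tags = "dt vbz dt nn , wdt vbz prp$ nns to vb nn in nn nns to nns in dt nn .".split()

# One tag per token, so the two sequences can be zipped together.
assert len(tokens) == len(tags)
pairs = list(zip(tokens, tags))

for word, tag in pairs:
    print(f"{word}/{tag}")
```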
Download files
Source Distribution
pypos-0.1.2.tar.gz (3.7 kB)
Built Distribution
pypos-0.1.2-py3-none-any.whl (5.1 kB)
File details
Details for the file pypos-0.1.2.tar.gz.
File metadata
- Download URL: pypos-0.1.2.tar.gz
- Upload date:
- Size: 3.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.18.4 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.23.3 CPython/3.6.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | d5acb4f4968f88877a8626b8cee877d2b3a56fca57987cf08da1346cc540a1cd
MD5 | 6e39ef0e2c2581e3ff9af5c344814b0f
BLAKE2b-256 | 9fda9b94d9e0926dc960107492cc29c023a7aeebba326a574aa4816e351ce183
File details
Details for the file pypos-0.1.2-py3-none-any.whl.
File metadata
- Download URL: pypos-0.1.2-py3-none-any.whl
- Upload date:
- Size: 5.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.18.4 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.23.3 CPython/3.6.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 062344d6354f88fe2f3e3301803e6eb6402482205efc46783752b597f24f6b75
MD5 | d2e09b7f25c554d9241392b7613bf431
BLAKE2b-256 | dac224a8f9e47d0f2e03a1e68afb75b3660dcc58622e1bb9069d3c8bad1b32bd