
A part-of-speech tagger based on Hidden Markov Models

Project description

PyPOS - Python Part-of-Speech tagger

This is  a  project, which allows its  users to assign part of speech tags to words in a  sentence .
dt   vbz dt nn     , wdt   vbz    prp$ nns   to vb     nn   in nn     nns  to nns   in dt nn       .

PyPOS uses Hidden Markov Models and Viterbi decoding to determine the most likely sequence of POS tags for a given sequence of words.
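To make the decoding step concrete, here is a minimal sketch of Viterbi decoding over a hand-written toy HMM. It is not PyPOS's actual implementation: the function name `viterbi`, the `floor` parameter, and every probability in the tables are assumptions chosen for illustration. It works in log space and floors zero probabilities so that unseen words do not produce `log(0)`.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely tag sequence for `words` under a toy HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events stay finite.
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best tag path ending in tag t at word i.
    # back[i][t]: the previous tag on that best path.
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model (all numbers are made up for this example).
TAGS = ['dt', 'nn', 'vbz']
START_P = {'dt': 0.6, 'nn': 0.3, 'vbz': 0.1}
TRANS_P = {
    'dt': {'dt': 0.05, 'nn': 0.7, 'vbz': 0.25},
    'nn': {'dt': 0.1, 'nn': 0.3, 'vbz': 0.6},
    'vbz': {'dt': 0.7, 'nn': 0.2, 'vbz': 0.1},
}
EMIT_P = {
    'dt': {'this': 0.4, 'a': 0.6},
    'nn': {'project': 0.5, 'this': 0.1},
    'vbz': {'is': 1.0},
}

print(viterbi(['this', 'is', 'a', 'project'], TAGS, START_P, TRANS_P, EMIT_P))
# ['dt', 'vbz', 'dt', 'nn']
```

The dynamic program keeps only the best path per tag at each position, so the cost grows linearly with sentence length and quadratically with the number of tags.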

Usage

Installation

Requires Python 3.6 or higher

pip3 install pypos

Training

from pypos import PartOfSpeechTagger, PartOfSpeechDataset
tagger = PartOfSpeechTagger()
# Load the training data from disk.
ds = PartOfSpeechDataset.load('train.txt')
# Fit the tagger on the dataset, then save the trained model.
tagger.fit(ds).save('tagger.p')
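This description does not say what `fit` does internally or what format `train.txt` must have. As a rough illustration of how an HMM tagger is commonly trained, the sketch below estimates start, transition, and emission probabilities by counting over a tiny hand-tagged corpus. The function `estimate_hmm` and the in-memory corpus format are hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def estimate_hmm(tagged_sentences):
    """Estimate HMM parameters by relative frequency.

    `tagged_sentences` is a list of sentences, each a list of (word, tag) pairs.
    Returns (start_p, trans_p, emit_p) as plain dictionaries.
    """
    start = Counter()              # tag of the first word in each sentence
    trans = defaultdict(Counter)   # trans[prev_tag][tag]
    emit = defaultdict(Counter)    # emit[tag][word]

    for sentence in tagged_sentences:
        prev = None
        for word, tag in sentence:
            emit[tag][word.lower()] += 1
            if prev is None:
                start[tag] += 1
            else:
                trans[prev][tag] += 1
            prev = tag

    def normalize(counter):
        # Turn raw counts into a probability distribution.
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    return (
        normalize(start),
        {tag: normalize(counts) for tag, counts in trans.items()},
        {tag: normalize(counts) for tag, counts in emit.items()},
    )


# Hypothetical two-sentence corpus.
corpus = [
    [('This', 'dt'), ('is', 'vbz'), ('a', 'dt'), ('project', 'nn')],
    [('A', 'dt'), ('sentence', 'nn')],
]
start_p, trans_p, emit_p = estimate_hmm(corpus)
print(start_p)        # {'dt': 1.0}
print(emit_p['nn'])   # {'project': 0.5, 'sentence': 0.5}
```

A real tagger would usually add smoothing or an unknown-word model, since pure counts assign zero probability to anything not seen in training.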

Tagging

from pypos import PartOfSpeechTagger
# Load the model saved during training.
tagger = PartOfSpeechTagger.load('tagger.p')

# Reproducing the results shown above:
sentence = 'This is a project, which allows its users to assign part of speech tags to words in a sentence.'
tokens = tagger.tokenize(sentence)
tags = tagger.tag(sentence, human_readable=False)
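The two-row display at the top of this description can be rebuilt from a list of tokens and a list of tags. The helper below is a sketch, not part of PyPOS: `format_aligned` is a hypothetical name, and it assumes tokens and tags arrive as equal-length lists of strings, which this description does not confirm for the return values of `tokenize` and `tag`. The example uses hard-coded lists so that it runs without a trained model.

```python
def format_aligned(tokens, tags):
    """Render tokens and tags as two rows with each tag under its token."""
    if len(tokens) != len(tags):
        raise ValueError('tokens and tags must have the same length')
    # Each column is as wide as the longer of the token and its tag.
    widths = [max(len(tok), len(tag)) for tok, tag in zip(tokens, tags)]
    top = ' '.join(tok.ljust(w) for tok, w in zip(tokens, widths))
    bottom = ' '.join(tag.ljust(w) for tag, w in zip(tags, widths))
    return top.rstrip() + '\n' + bottom.rstrip()


print(format_aligned(['This', 'is', 'a', 'project'], ['dt', 'vbz', 'dt', 'nn']))
# This is  a  project
# dt   vbz dt nn
```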

Download files


Files for pypos, version 0.1.2

Filename                      Size    File type  Python version
pypos-0.1.2-py3-none-any.whl  5.1 kB  Wheel      py3
pypos-0.1.2.tar.gz            3.7 kB  Source     None
