uztagger | Uzbek Morphological Part of Speech (POS) Tagging on Python

uztagger

https://pypi.org/project/uztagger
https://github.com/UlugbekSalaev/uztagger

uztagger tags each word of an Uzbek sentence with a morphological Part of Speech (POS) tag, using the word's morphemes and a small lexicon. The tool provides a POS tagset and a tagging method. It is written as a Python library and published on PyPI. You can use it directly in a Python project, or from projects in other programming languages via the API.

About project

The tool assigns morphological Part of Speech (POS) tags to the words of an Uzbek sentence based on their morphemes. It includes a POS tagset and a tagging method.

Quick links

Demo

You can try the tagger through the web interface.

Features

  • Tagging
  • POS tag list
  • Help function

Usage

There are three ways to use uztagger:

  • pip
  • API
  • Web interface

pip installation

To install uztagger, simply run:

pip install uztagger

After installation, use it in Python as follows:

# import the library
from uztagger import Tagger
# create an object 
tagger = Tagger()
# call tagging method
tagger.pos_tag('Bizlar bugun maktabga bormoqchimiz.')
# output
[('Bizlar', 'NOUN'), ('bugun', 'NOUN'), ('maktabga', 'NOUN'), ('bormoqchimiz', 'VERB'), ('.', 'PUNC')]

API

API configurations:

  • Method: GET
  • Response type: string
  • URL: https://uzbeknlp.herokuapp.com/postagging
    • Parameters: text:string
  • Sample Request: https://uzbeknlp.herokuapp.com/postagging?text=Ular%20maktabga%20borayaptilar.
  • Sample output: [("Ular","NOUN"),("maktabga",""),("borayaptilar",""),(".","PUNC")]
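
As a rough illustration of the request described above, the following sketch builds the sample URL with Python's standard library and parses a response shaped like the sample output. It assumes the service is reachable and that the response body is always a list literal like the sample. The helper names build_url, parse_response, and tag_via_api are hypothetical and are not part of uztagger.

```python
import ast
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_URL = "https://uzbeknlp.herokuapp.com/postagging"


def build_url(text):
    """Return the GET URL for the given text, with spaces encoded as %20."""
    return API_URL + "?" + urlencode({"text": text}, quote_via=quote)


def parse_response(body):
    """Parse a response string shaped like the sample output.

    Assumption: the body is a list literal of (token, tag) pairs.
    """
    return [tuple(pair) for pair in ast.literal_eval(body)]


def tag_via_api(text, timeout=10):
    """Send the request and return the parsed (token, tag) pairs."""
    with urlopen(build_url(text), timeout=timeout) as response:
        return parse_response(response.read().decode("utf-8"))


# Offline demonstration using the sample request and sample output above.
sample_url = build_url("Ular maktabga borayaptilar.")
sample_tags = parse_response(
    '[("Ular","NOUN"),("maktabga",""),("borayaptilar",""),(".","PUNC")]'
)
print(sample_url)
print(sample_tags)
```

Calling tag_via_api('Ular maktabga borayaptilar.') would perform the actual request. Note that the sample output contains empty tags for some tokens, so calling code should be prepared to handle an empty string as the tag.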

Web-UI

A web interface was created to make the library easy to use. You can use the web interface here.

POS tag list

The tagger uses the following POS tags:
NOUN Noun
VERB Verb
ADJ Adjective
NUM Numeric
ADV Adverb
PRN Pronoun
CNJ Conjunction
ADP Adposition
PRT Particle
INTJ Interjection
MOD Modal
IMIT Imitation
AUX Auxiliary verb
PPN Proper noun
PUNC Punctuation
SYM Symbol
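
For use in code, the tag list above can be kept as a simple lookup table. This is a minimal sketch: the POS_TAGS dictionary and the describe helper are hypothetical names for illustration and are not provided by uztagger.

```python
# Tag codes and descriptions copied from the POS tag list above.
POS_TAGS = {
    "NOUN": "Noun",
    "VERB": "Verb",
    "ADJ": "Adjective",
    "NUM": "Numeric",
    "ADV": "Adverb",
    "PRN": "Pronoun",
    "CNJ": "Conjunction",
    "ADP": "Adposition",
    "PRT": "Particle",
    "INTJ": "Interjection",
    "MOD": "Modal",
    "IMIT": "Imitation",
    "AUX": "Auxiliary verb",
    "PPN": "Proper noun",
    "PUNC": "Punctuation",
    "SYM": "Symbol",
}


def describe(tagged):
    """Attach a readable description to each (token, tag) pair."""
    return [(token, tag, POS_TAGS.get(tag, "Unknown tag")) for token, tag in tagged]


described = describe([("maktabga", "NOUN"), (".", "PUNC"), ("borayaptilar", "")])
for token, tag, name in described:
    print(token, tag, name)
```

Tags that are missing from the table, including an empty tag, fall back to "Unknown tag" instead of raising an error.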

Result Explaining

The pos_tag method returns a list containing one tuple per token of the text, in the format (token, pos). For the available tags, see the POS tag list section above.

Result from the pos_tag method:

[('Bizlar', 'NOUN'), ('bugun', 'NOUN'), ('maktabga', 'NOUN'), ('bormoqchimiz', 'VERB'), ('.', 'PUNC')]
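
Because the result is a plain list of (token, pos) tuples, it can be processed with ordinary Python tools. The sketch below hard-codes the result shown above so it runs without uztagger installed; the variable names are illustrative only.

```python
from collections import Counter

# The result shown above, as returned by pos_tag for the example sentence.
result = [
    ('Bizlar', 'NOUN'),
    ('bugun', 'NOUN'),
    ('maktabga', 'NOUN'),
    ('bormoqchimiz', 'VERB'),
    ('.', 'PUNC'),
]

# Count how often each tag occurs in the sentence.
tag_counts = Counter(tag for _, tag in result)

# Keep only the word tokens by dropping punctuation.
words = [token for token, tag in result if tag != 'PUNC']

# Print one token per line with its tag.
for token, tag in result:
    print(f"{token}\t{tag}")
```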

Documentation

See here.

Citation

@misc{uztagger,
  title={{uztagger}: Morphological Part of Speech Tagger Tool for Uzbek},
  url={https://github.com/UlugbekSalaev/uztagger},
  note={Software available from https://github.com/UlugbekSalaev/uztagger},
  author={Ulugbek Salaev},
  year={2022},
}

Contact

For help and feedback, please feel free to contact the author.


Built Distribution

uztagger-0.0.7-py3-none-any.whl

  • Size: 351.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

Hashes for uztagger-0.0.7-py3-none-any.whl

  • SHA256: e25efdfb2ca0b9e1488be5c9822b912a7fa6faf9f8be0fd03e9954c7a8a8ab06
  • MD5: 8a98f52c1078548451430cd076eb7538
  • BLAKE2b-256: 54b3b82278a1a5889eaca97dae2ed9815af190159b7f496f1be804299b5de5f3
