uztagger | Uzbek Morphological Part of Speech (POS) Tagging in Python

uztagger

https://pypi.org/project/uztagger
https://github.com/UlugbekSalaev/uztagger

uztagger tags the words of an Uzbek sentence with morphological Part of Speech (POS) tags, based on morphemes and a limited lexicon. The tool includes a POS tagset and a tagging method. It is written as a Python library and published on PyPI. It can be used directly in Python projects, or from projects in other programming languages via the API.

Features

  • Tagging
  • POS tag list
  • Help function

Usage

There are three ways to run uztagger:

  • pip
  • API
  • Web interface

pip installation

To install uztagger, simply run:

pip install uztagger

After installation, use it in Python as follows:

# import the library
from uztagger import Tagger
# create a tagger object
tagger = Tagger()
# call the tagging method; it returns a list of (token, pos) tuples
print(tagger.pos_tag('Bizlar bugun maktabga bormoqchimiz.'))
# output
# [('Bizlar', 'NOUN'), ('bugun', 'NOUN'), ('maktabga', 'NOUN'), ('bormoqchimiz', 'VERB'), ('.', 'PUNC')]

API

API configurations:

  • Method: GET
  • Response type: string
  • URL: https://nlp.urdu.uz:8080/uztagger/pos_tag
    • Parameters: text:string
  • Sample Request: https://nlp.urdu.uz:8080/uztagger/pos_tag?text=Ular%20maktabga%20borayaptilar.
  • Sample output: [("Ular","NOUN"),("maktabga",""),("borayaptilar",""),(".","PUNC")]
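
The request and response described above can be sketched with the Python standard library alone. This is a minimal sketch, not the project's own client: the helper names `build_url`, `parse_response`, and `pos_tag_remote` are hypothetical, and parsing with `ast.literal_eval` assumes the response string always has the tuple-list shape shown in the sample output. No network call is made unless `pos_tag_remote` is invoked.

```python
import ast
from urllib.parse import urlencode, quote
from urllib.request import urlopen

# Endpoint URL taken from the API configuration above
API_URL = "https://nlp.urdu.uz:8080/uztagger/pos_tag"

def build_url(text):
    # quote_via=quote encodes spaces as %20, matching the sample request
    return API_URL + "?" + urlencode({"text": text}, quote_via=quote)

def parse_response(body):
    # Assumption: the body is a string shaped like the sample output,
    # i.e. a list of (token, tag) tuples written as Python literals
    return ast.literal_eval(body)

def pos_tag_remote(text, timeout=10):
    # Hypothetical wrapper: sends the GET request and parses the string reply
    with urlopen(build_url(text), timeout=timeout) as resp:
        return parse_response(resp.read().decode("utf-8"))

# Offline demonstration using the sample request and sample output above
sample = '[("Ular","NOUN"),("maktabga",""),("borayaptilar",""),(".","PUNC")]'
print(build_url("Ular maktabga borayaptilar."))
print(parse_response(sample))
```

Note that in the sample output some tokens carry an empty tag string, so client code should not assume every tag is one of the listed POS tags.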

Web-UI

The web interface was created to make the library easy to use without writing code.

POS tag list

The tagger uses the following POS tags:
NOUN Noun {Ot}
VERB Verb {Fe'l}
ADJ Adjective {Sifat}
NUM Numeric {Son}
ADV Adverb {Ravish}
PRN Pronoun {Olmosh}
CNJ Conjunction {Bog'lovchi}
ADP Adposition {Ko'makchi}
PRT Particle {Yuklama}
INTJ Interjection {Undov}
MOD Modal {Modal}
IMIT Imitation {Taqlid}
AUX Auxiliary verb {Yordamchi fe'l}
PPN Proper noun {Atoqli ot}
PUNC Punctuation {Tinish belgi}
SYM Symbol {Belgi}
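
For code that consumes tagger output, the list above can be kept as a lookup table. The sketch below copies the tags and names from the list; the names `POS_TAGS` and `describe` are illustrative and are not part of the uztagger library.

```python
# Tag -> (English name, Uzbek name), copied from the POS tag list above
POS_TAGS = {
    "NOUN": ("Noun", "Ot"),
    "VERB": ("Verb", "Fe'l"),
    "ADJ": ("Adjective", "Sifat"),
    "NUM": ("Numeric", "Son"),
    "ADV": ("Adverb", "Ravish"),
    "PRN": ("Pronoun", "Olmosh"),
    "CNJ": ("Conjunction", "Bog'lovchi"),
    "ADP": ("Adposition", "Ko'makchi"),
    "PRT": ("Particle", "Yuklama"),
    "INTJ": ("Interjection", "Undov"),
    "MOD": ("Modal", "Modal"),
    "IMIT": ("Imitation", "Taqlid"),
    "AUX": ("Auxiliary verb", "Yordamchi fe'l"),
    "PPN": ("Proper noun", "Atoqli ot"),
    "PUNC": ("Punctuation", "Tinish belgi"),
    "SYM": ("Symbol", "Belgi"),
}

def describe(tag):
    # Return a readable label, or flag tags that are not in the list
    if tag not in POS_TAGS:
        return f"{tag!r}: unknown tag"
    english, uzbek = POS_TAGS[tag]
    return f"{tag}: {english} ({uzbek})"

print(describe("ADP"))
print(describe(""))
```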

Result explanation

The pos_tag method returns a list containing one tuple per token of the text, in the format (token, pos). For the possible pos values, see the POS tag list section above.

Result from the pos_tag method

[('Bizlar', 'NOUN'), ('bugun', 'NOUN'), ('maktabga', 'NOUN'), ('bormoqchimiz', 'VERB'), ('.', 'PUNC')]
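
Because the result is a plain list of (token, pos) tuples, it can be processed with ordinary Python tools. The sketch below works on the example result shown above, copied in as data so that it runs without the library installed; the variable names are illustrative.

```python
from collections import Counter

# Example result copied from above
result = [('Bizlar', 'NOUN'), ('bugun', 'NOUN'), ('maktabga', 'NOUN'),
          ('bormoqchimiz', 'VERB'), ('.', 'PUNC')]

# Count how often each POS tag occurs
tag_counts = Counter(tag for _, tag in result)

# Keep only the word tokens, dropping punctuation
words = [token for token, tag in result if tag != 'PUNC']

print(tag_counts)
print(words)
```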

Documentation

See here.

Citation

@misc{uztagger,
  title={{uztagger}: Morphological Part of Speech Tagger Tool for Uzbek},
  url={https://github.com/UlugbekSalaev/uztagger},
  note={Software available from https://github.com/UlugbekSalaev/uztagger},
  author={Ulugbek Salaev},
  year={2022},
}

Contact

For help and feedback, please feel free to contact the author.
