UzLemmatizer: A Stemmer and Lemmatizer Tool for Uzbek Language
UzLemmatizer
https://pypi.org/project/UzLemmatizer
https://github.com/UlugbekSalaev/UzLemmatizer
UzLemmatizer identifies the stem and lemma of Uzbek words, together with a POS tag, based on their morphemes. It is written as a Python library and published on PyPI. You can use it directly in a Python project, or from projects in other programming languages via the API.
About project
The UzLemmatizer project deals with Uzbek word morphology, the study of word forms. The tool extracts the stem and lemma of an Uzbek word based on its morphemes. The result also contains a predicted POS tag for the given token.
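To make the idea of morpheme-based stemming concrete, here is a toy sketch that peels suffixes off the end of a word. It is not the algorithm UzLemmatizer uses: the function name `toy_split`, the two-entry suffix list, and the minimum stem length are all assumptions chosen only so that the example word from this README can be split.

```python
# Toy illustration only: NOT the algorithm used by UzLemmatizer.
# The suffix list is a tiny hand-picked assumption for this one example.
TOY_SUFFIXES = ("da", "im")  # locative "-da", 1st-person possessive "-im"


def toy_split(word, suffixes=TOY_SUFFIXES, min_stem_len=3):
    """Peel known suffixes off the end of a word, right to left.

    Returns (stem, morphemes), where morphemes lists the stem followed by
    the stripped suffixes in their original order.
    """
    stripped = []
    changed = True
    while changed:
        changed = False
        for suffix in suffixes:
            # Only strip if a stem of reasonable length would remain.
            if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
                stripped.insert(0, suffix)
                word = word[: -len(suffix)]
                changed = True
                break
    return word, [word] + stripped


stem, morphemes = toy_split("maktabimda")
print(stem, morphemes)  # maktab ['maktab', 'im', 'da']
```

A real analyzer needs a much larger suffix inventory and rules for ambiguous endings, which is what the library is meant to provide.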
Quick links
Demo
You can try the tool through the web interface.
Features
- Stemmer
- Lemmatizer
- Lemmatizer with POS tag
- Extract Morphemes list
- Predict POS tag
Usage
There are three ways to run UzLemmatizer:
- pip
- API
- Web interface
pip installation
To install UzLemmatizer, simply run:
pip install UzLemmatizer
After installation, use it in Python as follows:
# import the library
from UzLemmatizer import UzLemmatizer
# create an object
analyzer = UzLemmatizer.UzLemmatizer()
# call stem method
analyzer.stem('maktabimda')
# call lemmatize method
analyzer.lemmatize('maktabimda')
# call lemmatize method with POS tag
analyzer.lemmatize('maktabimda', analyzer.POS.NOUN)
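When several words need processing, the calls above can be wrapped in a small helper. The sketch below is a hypothetical convenience function, not part of the library: `analyze_words` is an invented name, and it assumes only that the analyzer object offers the `stem` and `lemmatize` methods shown above.

```python
def analyze_words(analyzer, words, pos=None):
    """Return one dict per word with its stem and lemma.

    `analyzer` is any object exposing stem(word) and lemmatize(word[, pos]),
    such as the object created in the usage example above.
    """
    results = []
    for word in words:
        # Pass the POS tag only when the caller supplied one.
        if pos is None:
            lemma = analyzer.lemmatize(word)
        else:
            lemma = analyzer.lemmatize(word, pos)
        results.append({"word": word, "stem": analyzer.stem(word), "lemma": lemma})
    return results
```

With the package installed, it would be called as `analyze_words(analyzer, ['maktabimda'])`, using the `analyzer` object created earlier.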
API
API configurations:
- Method: GET
- Response type: string
- URL: https://uz-translit.herokuapp.com/stem
  - Parameters: word:string
  - Sample Request: https://uztranslit.herokuapp.com/stem?word=maktabimda
- URL: https://uz-translit.herokuapp.com/lemmatize
  - Parameters: word:string, pos:string
  - Sample Request: https://uztranslit.herokuapp.com/lemmatize?word=maktabimda&pos=NOUN
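The request URLs described above can be assembled with the standard library. The following is a minimal sketch: `build_request_url` and `DEFAULT_BASE_URL` are hypothetical names, and the host is left as a parameter because this section spells it two different ways. The sketch only builds the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Host taken from the "URL" entries above; kept as a parameter because the
# sample requests spell it differently.
DEFAULT_BASE_URL = "https://uz-translit.herokuapp.com"


def build_request_url(endpoint, word, pos=None, base_url=DEFAULT_BASE_URL):
    """Build the GET URL for the 'stem' or 'lemmatize' endpoint."""
    if endpoint not in ("stem", "lemmatize"):
        raise ValueError("endpoint must be 'stem' or 'lemmatize'")
    params = {"word": word}
    if pos is not None:
        # Only the lemmatize endpoint is documented as taking a pos parameter.
        if endpoint != "lemmatize":
            raise ValueError("pos is only accepted by 'lemmatize'")
        params["pos"] = pos
    # urlencode percent-escapes the values and joins them with '&'.
    return f"{base_url}/{endpoint}?{urlencode(params)}"


print(build_request_url("lemmatize", "maktabimda", pos="NOUN"))
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen`; according to the configuration above, the response is a plain string.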
Web-UI
The web interface was created to make the library easy to use. You can use the web interface here.
Options
When you use the PyPI package or the API, pass one of the following options as the POS tag of a word. The pos parameter of the lemmatize() method is optional.
- NOUN: Noun
- VERB: Verb
- ADJ: Adjective
- NUM: Numeral
- PRN: Pronoun
- ADV: Adverb
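If POS tags come from user input, it can help to check them against the options listed above before calling the library or the API. The sketch below is one hypothetical way to do that: `PosTag` and `normalize_pos` are illustrative names and are not part of UzLemmatizer.

```python
from enum import Enum


class PosTag(Enum):
    """Hypothetical mirror of the POS options listed above."""
    NOUN = "Noun"
    VERB = "Verb"
    ADJ = "Adjective"
    NUM = "Numeral"
    PRN = "Pronoun"
    ADV = "Adverb"


def normalize_pos(tag):
    """Return the upper-case tag name, or None if no tag was given."""
    if tag is None:
        # The pos parameter is optional, so a missing tag is valid.
        return None
    name = tag.strip().upper()
    if name not in PosTag.__members__:
        valid = ", ".join(PosTag.__members__)
        raise ValueError(f"unknown POS tag {tag!r}; expected one of: {valid}")
    return name
```

For example, `normalize_pos(" noun ")` returns `"NOUN"`, which matches the form used in the sample API request.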
Result Explaining
Each method, stem and lemmatize, returns a single word as a string: the stem and the lemma of the given word, respectively.
Documentation
See here.
Citation
@misc{UzLemmatizer,
title={{UzLemmatizer}: Stemmer and Lemmatizer Tool for Uzbek Language},
url={https://github.com/UlugbekSalaev/UzLemmatizer},
note={Software available from https://github.com/UlugbekSalaev/UzLemmatizer},
author={Ulugbek Salaev},
year={2022},
}
Contact
For help and feedback, please feel free to contact the author.