Finds the lemma of Uzbek words
Project description
Authors
Author1: Maksudbek
Author2: Dasturbek
Lemma & Lemmatization
The package finds the lemmas of Uzbek words using a dictionary.
The process of finding a lemma is called lemmatization.
There are four approaches to lemmatization: rule-based, dictionary-based, model-based, and hybrid.
This package implements a dictionary-based lemmatization algorithm.
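To illustrate the dictionary-based approach in general terms, here is a minimal sketch. It is not the package's actual implementation: the `TOY_DICTIONARY` contents, the `toy_lemmatize` name, and the fallback of returning the word unchanged are all assumptions made for this example.

```python
# Hypothetical toy dictionary mapping inflected forms to lemmas.
# The real package's dictionary structure is not shown here.
TOY_DICTIONARY = {
    "kelganlar": "kelmoq",
    "kitoblar": "kitob",
    "bordim": "bormoq",
}


def toy_lemmatize(word, dictionary=TOY_DICTIONARY):
    """Look up a word form in the dictionary and return its lemma.

    Assumption: unknown words are returned unchanged (lowercased).
    """
    key = word.strip().lower()
    return dictionary.get(key, key)


if __name__ == "__main__":
    print(toy_lemmatize("kelganlar"))
```

A lookup of this kind is fast and predictable, but it can only handle word forms that are present in the dictionary, which is the usual trade-off against rule-based and model-based approaches.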
Install & Clone
pip install UzbekLemma
git clone https://github.com/ddasturbek/UzbekLemma.git
Usage
import UzbekLemma as UL
print(UL.lemmatize("kelganlar"))  # kelmoq
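The example above lemmatizes a single word. For a whole sentence, one possible approach is to split the text into tokens and lemmatize each one in turn. The helper below is a sketch: `lemmatize_text` is a hypothetical name, and it assumes the lemmatizer is a callable that takes one word and returns a string, so that `UL.lemmatize` can be passed in.

```python
# Punctuation stripped from token edges. The apostrophe is deliberately
# excluded because Uzbek Latin letters such as o' and g' use it.
EDGE_PUNCTUATION = ".,!?;:\"()[]"


def lemmatize_text(text, lemmatize):
    """Apply a single-word lemmatizer to each whitespace-separated token.

    `lemmatize` is any callable taking one word and returning its lemma.
    """
    lemmas = []
    for token in text.split():
        word = token.strip(EDGE_PUNCTUATION)
        if word:  # skip tokens that were only punctuation
            lemmas.append(lemmatize(word))
    return lemmas
```

With the package installed, this could be called as `lemmatize_text("...", UL.lemmatize)`. How the package itself treats punctuation or multi-word input is not documented here, which is why the sketch cleans each token before passing it on.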
The algorithm flowchart
The dictionary structure
Scientific field
Patent
Some results of the program
Corpus & Results
We collected an equal number of texts from 23 different fields and stored them as a corpus.
We then ran the program on every file in the corpus and obtained the following results.
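The exact evaluation procedure is not described here. As one plausible way to summarize such a test, the sketch below computes the share of correctly lemmatized words per field. The `accuracy_by_field` name, the input layout (field name mapped to pairs of word and expected lemma), and the accuracy metric itself are assumptions for illustration only.

```python
def accuracy_by_field(samples, lemmatize):
    """Return the fraction of correctly lemmatized words for each field.

    samples: dict mapping a field name to a list of (word, expected_lemma)
             pairs (hypothetical layout).
    lemmatize: callable taking one word and returning its lemma.
    """
    results = {}
    for field, pairs in samples.items():
        if not pairs:
            continue  # avoid division by zero for empty fields
        correct = sum(1 for word, expected in pairs if lemmatize(word) == expected)
        results[field] = correct / len(pairs)
    return results
```

Reporting one number per field makes it easier to see whether the dictionary covers some subject areas better than others.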