The Uzbek Natural Language Toolkit (NLTK) is a Python package for natural language processing.

Project description

uznltk

uznltk is a lightweight, convenient NLP (Natural Language Processing) library for the Uzbek language. It covers text cleaning, morphological analysis, conversion between numbers and text, syllable splitting, and a number of other functions.

🔧 Install

pip install uznltk

🚀 Usage

from uznltk import *

📚 Functions

clean_text(text)

Normalizes the apostrophe-like characters of the Uzbek Latin script: the mark in the letters g‘ and o‘, and the tutuq belgisi ( ’ ).

clean_text("O'zbekistonda ta'lim kuchli rivojlanmoqda")
# Result: "O‘zbekistonda ta’lim kuchli rivojlanmoqda"
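
To illustrate the kind of normalization shown above, here is a minimal sketch written independently of uznltk. The function name `normalize_apostrophes` and the rule it applies are assumptions, not the library's implementation: a straight apostrophe after o or g is taken to belong to the letter (U+2018), and any other one is taken to be a tutuq belgisi (U+2019).

```python
import re

OKINA = "\u2018"  # the mark used in o‘ and g‘
TUTUQ = "\u2019"  # the tutuq belgisi, as in ta’lim


def normalize_apostrophes(text: str) -> str:
    """Hypothetical helper: replace typewriter apostrophes with Uzbek ones."""
    # An apostrophe or backtick right after o/g is part of the letter o‘ or g‘.
    text = re.sub(r"([oOgG])['`]", r"\1" + OKINA, text)
    # Whatever straight apostrophes remain are treated as the tutuq belgisi.
    return re.sub(r"['`]", TUTUQ, text)


print(normalize_apostrophes("O'zbekistonda ta'lim kuchli rivojlanmoqda"))
# O‘zbekistonda ta’lim kuchli rivojlanmoqda
```

This simple rule misreads the rare words in which a tutuq belgisi directly follows o or g, so a production cleaner would need a dictionary or extra context.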

solid_sign(text)

Returns a list of the words that contain the tutuq belgisi ( ’ ). Words whose only apostrophe belongs to o‘ or g‘ are not included.

solid_sign("ta'lim bo'lishi oldindan ma'lum edi")
# Result: ['ta’lim', 'ma’lum']
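
The filtering in this example can be approximated with a lookbehind, as in the sketch below. It is an independent illustration, not uznltk's code; the name `words_with_tutuq` is hypothetical, and the sketch assumes that an apostrophe not preceded by o or g is a tutuq belgisi.

```python
import re

TUTUQ = "\u2019"


def words_with_tutuq(text: str) -> list[str]:
    """Hypothetical helper: collect words that contain a tutuq belgisi."""
    found = []
    for word in text.split():
        # Match an apostrophe only when the previous character is not o/g.
        if re.search(r"(?<![oOgG])['\u2019]", word):
            found.append(word.replace("'", TUTUQ))
    return found


print(words_with_tutuq("ta'lim bo'lishi oldindan ma'lum edi"))
# ['ta’lim', 'ma’lum']
```

Punctuation attached to a word is kept as-is in this sketch, so input should be cleaned first if that matters.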

lemmatize(text) and stem_word(text)

Extracts the stem of a word.

lemmatize("mexanizatorlashtirilganlardan")
# Result: "mexanizatorlashtirilgan"
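
The example removes a plural marker and a case ending. The sketch below shows that one step in isolation, assuming a naive suffix-stripping approach; it is not how uznltk is known to work, and the names `strip_suffix` and `strip_case_and_plural` are hypothetical. It knows only the case endings -ning, -ni, -ga, -da, -dan and the plural -lar.

```python
CASE_SUFFIXES = ("ning", "dan", "ni", "ga", "da")
PLURAL_SUFFIX = "lar"
MIN_STEM_LENGTH = 3  # assumption: never cut a word below three letters


def strip_suffix(word: str, suffixes: tuple[str, ...]) -> str:
    """Remove the first matching suffix if enough of the word remains."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


def strip_case_and_plural(word: str) -> str:
    """Strip one case ending, then one plural marker."""
    word = strip_suffix(word, CASE_SUFFIXES)
    return strip_suffix(word, (PLURAL_SUFFIX,))


print(strip_case_and_plural("mexanizatorlashtirilganlardan"))
# mexanizatorlashtirilgan
```

Pure suffix stripping also cuts words whose stem happens to end in one of these strings, which is why real lemmatizers usually combine rules with a lexicon.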

number_to_text(number)

Converts a number to Uzbek text.

number_to_text(54)
# Result: "ellik to‘rt"
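
For readers curious how such a conversion can work, here is a small independent sketch limited to 0 through 999. The function name `number_to_words` is hypothetical, and the choice to write hundreds with an explicit digit word ("bir yuz" for 100) is an assumption; uznltk may format them differently.

```python
UNITS = ["", "bir", "ikki", "uch", "to‘rt", "besh",
         "olti", "yetti", "sakkiz", "to‘qqiz"]
TENS = ["", "o‘n", "yigirma", "o‘ttiz", "qirq", "ellik",
        "oltmish", "yetmish", "sakson", "to‘qson"]


def number_to_words(n: int) -> str:
    """Hypothetical helper: spell out an integer from 0 to 999 in Uzbek."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n == 0:
        return "nol"
    hundreds, rest = divmod(n, 100)
    tens, units = divmod(rest, 10)
    parts = []
    if hundreds:
        parts += [UNITS[hundreds], "yuz"]
    if tens:
        parts.append(TENS[tens])
    if units:
        parts.append(UNITS[units])
    return " ".join(parts)


print(number_to_words(54))   # ellik to‘rt
print(number_to_words(215))  # ikki yuz o‘n besh
```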

text_to_number(text)

Converts a number in text to numeric form.

text_to_number("yetmish olti")
# Result: 76
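
The reverse direction can be sketched with a running total, as below. This is an illustration under stated assumptions rather than uznltk's parser: `words_to_number` is a hypothetical name, only the multipliers "yuz" (100) and "ming" (1000) are supported, and any kind of apostrophe in the input is accepted.

```python
UNITS = {"bir": 1, "ikki": 2, "uch": 3, "to‘rt": 4, "besh": 5,
         "olti": 6, "yetti": 7, "sakkiz": 8, "to‘qqiz": 9}
TENS = {"o‘n": 10, "yigirma": 20, "o‘ttiz": 30, "qirq": 40, "ellik": 50,
        "oltmish": 60, "yetmish": 70, "sakson": 80, "to‘qson": 90}


def words_to_number(text: str) -> int:
    """Hypothetical helper: parse Uzbek numeral words into an integer."""
    total, current = 0, 0
    for raw in text.lower().split():
        # Accept ' and ’ as substitutes for the ‘ used in o‘.
        word = raw.replace("'", "‘").replace("’", "‘")
        if word == "nol":
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "yuz":
            current = max(current, 1) * 100  # bare "yuz" means 100
        elif word == "ming":
            total += max(current, 1) * 1000  # bare "ming" means 1000
            current = 0
        else:
            raise ValueError(f"unknown numeral word: {raw!r}")
    return total + current


print(words_to_number("yetmish olti"))            # 76
print(words_to_number("ikki ming bir yuz besh"))  # 2105
```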

download(name)

Downloads various resources (e.g. books, news).

download("book")

clean_stopword(text)

Removes stop words from the text.

clean_stopword("salom dunyo, biz sen va u bilan bugun maktabga bordik")
# Result: "salom dunyo, bugun maktabga bordik"
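
Stop-word removal of this kind usually comes down to filtering tokens against a word list. The sketch below is independent of uznltk; the list `STOP_WORDS` is a tiny hypothetical sample chosen to match the example, not the library's actual list, and `remove_stop_words` is an illustrative name.

```python
import string

# Hypothetical sample list; a real stop-word list is much longer.
STOP_WORDS = {"biz", "sen", "va", "u", "bilan"}


def remove_stop_words(text: str) -> str:
    """Drop tokens whose punctuation-free, lowercased form is a stop word."""
    kept = [
        token
        for token in text.split()
        if token.strip(string.punctuation).lower() not in STOP_WORDS
    ]
    return " ".join(kept)


print(remove_stop_words("salom dunyo, biz sen va u bilan bugun maktabga bordik"))
# salom dunyo, bugun maktabga bordik
```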

syllables(text)

Divides words into syllables.

syllables("Bizga ma’lum ishlar yuz bermoqda!")
# Result: ['Biz-ga', 'ma’-lum', 'ish-lar', 'yuz', 'ber-moq-da!']

hyphenation(text)

Returns a list of variants of the text, each showing one possible hyphenation point.

hyphenation("salom dunyo")
# Result: ['sa-lom dunyo', 'salom dun-yo']

count_syllable(text)

Counts the number of syllables in the text.

count_syllable("Salom Dunyo")
# Result: 4
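
A common approximation for counting syllables in Uzbek Latin text is to count vowel letters, since each syllable normally contains exactly one vowel. The sketch below takes that approach; it is an assumption about a reasonable method, not a description of uznltk's internals, and `count_syllables` is a hypothetical name. The vowel o‘ is counted through its base letter o.

```python
VOWELS = set("aeiou")


def count_syllables(text: str) -> int:
    """Hypothetical helper: estimate syllables by counting vowel letters."""
    return sum(1 for ch in text.lower() if ch in VOWELS)


print(count_syllables("Salom Dunyo"))  # 4
```

The estimate can be off for loanwords and abbreviations, and it does not handle Cyrillic text.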

count_text(text)

Counts the number of words in the text.

count_text("Salom Dunyo")
# Result: 2

split_sentences(text)

Splits the text into a list of sentences.

split_sentences("Salom Dunyo. Bugun ob-havo qisman bulutli")
# Result: ['Salom Dunyo', 'Bugun ob-havo qisman bulutli']
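
A minimal way to get output of this shape is to split on sentence-ending punctuation, as sketched below. This is an independent illustration, with `split_into_sentences` as a hypothetical name; it treats every run of ".", "!" or "?" as a boundary, so abbreviations and decimal numbers would be split incorrectly.

```python
import re


def split_into_sentences(text: str) -> list[str]:
    """Hypothetical helper: split on runs of ., ! or ? and drop empty pieces."""
    pieces = re.split(r"[.!?]+", text)
    return [piece.strip() for piece in pieces if piece.strip()]


print(split_into_sentences("Salom Dunyo. Bugun ob-havo qisman bulutli"))
# ['Salom Dunyo', 'Bugun ob-havo qisman bulutli']
```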

split_words(text)

Returns a list of only the words in the text, skipping IP addresses, email addresses, emoji, and URLs.

split_words("sen 192.168.1.18 va helloworld@example.com elektron manzilidasan. Manba https://pypi.org")
# Result: ['sen', 'va', 'elektron', 'manzilidasan', 'Manba']
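
One way to reproduce this behaviour is to delete URLs, email addresses, and IPv4 addresses first and then collect runs of letters. The sketch below does that with regular expressions; the patterns and the name `extract_words` are assumptions made for illustration, not uznltk's own rules.

```python
import re

URL = re.compile(r"https?://\S+")
EMAIL = re.compile(r"\S+@\S+")
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
# Letters only, optionally joined by an apostrophe or hyphen (ta’lim, ob-havo).
WORD = re.compile(r"[^\W\d_]+(?:['\u2018\u2019-][^\W\d_]+)*")


def extract_words(text: str) -> list[str]:
    """Hypothetical helper: return plain words, skipping URLs, emails, IPs."""
    for pattern in (URL, EMAIL, IPV4):
        text = pattern.sub(" ", text)
    return WORD.findall(text)


print(extract_words(
    "sen 192.168.1.18 va helloworld@example.com elektron manzilidasan. "
    "Manba https://pypi.org"
))
# ['sen', 'va', 'elektron', 'manzilidasan', 'Manba']
```

Emoji are dropped without a dedicated pattern because they are not letters and so never match `WORD`.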

💡 Information

  • The library is designed specifically for the Uzbek language.
  • It includes basic NLP components such as number processing, lemmatization, and syllable splitting.

Download files

Source Distribution

uznltk-0.0.14.tar.gz (9.7 kB)

Uploaded Source

Built Distribution

uznltk-0.0.14-py3-none-any.whl (9.2 kB)

Uploaded Python 3

File details

Details for the file uznltk-0.0.14.tar.gz.

File metadata

  • Download URL: uznltk-0.0.14.tar.gz
  • Upload date:
  • Size: 9.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for uznltk-0.0.14.tar.gz
Algorithm Hash digest
SHA256 0d17694e5f14211953f9bcdd405d467621558f6aab41ea114a8f9dd19ce5e8bb
MD5 e94b0cd5d6a4feed78d1fbf966b4b7a3
BLAKE2b-256 35210321ee61cb70fbf3c835169c0d6c462fea3f063ac417e7980a8ecf15fa70

File details

Details for the file uznltk-0.0.14-py3-none-any.whl.

File metadata

  • Download URL: uznltk-0.0.14-py3-none-any.whl
  • Upload date:
  • Size: 9.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for uznltk-0.0.14-py3-none-any.whl
Algorithm Hash digest
SHA256 dbdc87f1079448cb8d170cfbc2168c52840fa702d577617bc3854c94788bda20
MD5 948f6cc4002917bb5c1ff2114d1cf67d
BLAKE2b-256 0b119a96831b838d28655ae26fc0ebc896b4d8bc2b342b9a7cb0e3c9c0fe82ab
