
A Python package for tokenizing and normalizing texts written mainly in the Igbo language


What Is Igbo Text

Igbo Text is a library for tokenizing and normalizing texts written mainly in the Igbo language.
It implements the tokenization and normalization algorithm described in the paper "Analysis and Representation of Igbo Text Document for a Text-Based System" by Ifeanyi-Reuben Nkechi J., Ugwu Chidiebere, and Adegbola Tunde.

Installation

$ pip install igbo-text

Examples

Normalization

from igbo_text import IgboText

# Create IgboText class instance
igbo_text = IgboText()

# normalize text 
text = "Ọ nà-ezò nnukwu mmīri n'iro?"
normalized_text = igbo_text.normalize(text, convert_to_lower=True, remove_abbreviations=True)
print(normalized_text)

When the code above is executed, the output will be

ọ na ezo nnukwu mmiri n iro

Upper case characters can be left alone by setting convert_to_lower=False.

Abbreviations can be left alone by setting remove_abbreviations=False.
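
To illustrate the kind of transformation seen in the output above (nà becomes na, mmīri becomes mmiri, while ọ keeps its dot), here is a minimal standalone sketch of tone-mark removal using only the standard library. It is not the igbo-text implementation: the function name strip_tone_marks and the set of marks treated as tone marks are assumptions made for illustration, and lowercasing, punctuation, and abbreviation handling are left out.

```python
import unicodedata

# Combining marks treated as tone marks in this sketch (an assumption,
# not taken from the igbo-text source): grave, acute, macron.
TONE_MARKS = {"\u0300", "\u0301", "\u0304"}


def strip_tone_marks(text: str) -> str:
    """Remove tone marks while keeping dotted letters such as ọ, ị and ụ."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop only the tone marks; the dot below (U+0323) is not in the set.
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # NFC recombines what is left, e.g. "o" + dot below becomes "ọ" again.
    return unicodedata.normalize("NFC", kept)


print(strip_tone_marks("Ọ nà-ezò nnukwu mmīri n'iro?"))
```

Decomposing first matters because a letter such as ọ̀ carries both a dot below and a tone mark; filtering the decomposed form removes the tone mark without touching the dot.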

Tokenization

from igbo_text import IgboText

# Create IgboText class instance
igbo_text = IgboText()

# tokenize text
text = "Ndị Fàda kwènyèrè n'atọ̀ n'ime otù."
tokenized_text = igbo_text.tokenize(text)
print(tokenized_text)

When the code above is executed, the output will be

["Ndị", "Fada", "kwenyere", "n'", "atọ", "n'", "ime", "otu", "."]

You can convert all upper case characters to lower case by setting convert_to_lower=True.
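
The token boundaries in the output above (n' kept as its own token, the final full stop split off) can be approximated with a small regular expression. The sketch below is a rough illustration only, not the library's algorithm: the name sketch_tokenize and the pattern are assumptions, and it does not remove tone marks or handle abbreviations.

```python
import re
from typing import List

# A word is a run of letters/digits plus any combining marks
# (U+0300 to U+036F), so decomposed letters stay in one piece.
WORD = r"[\w\u0300-\u036f]+"

# One token is either a word ending in an apostrophe (e.g. "n'"),
# a plain word, or a single punctuation character.
TOKEN_PATTERN = re.compile(rf"{WORD}'|{WORD}|[^\w\s]")


def sketch_tokenize(text: str) -> List[str]:
    """Split text into words, apostrophe-final clitics, and punctuation."""
    return TOKEN_PATTERN.findall(text)


print(sketch_tokenize("Ndị Fada kwenyere n'atọ n'ime otu."))
```

The apostrophe alternative is tried first so that n' is emitted as a single token before the following word is matched. Any word followed by an apostrophe is split this way, which is a simplification.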
