A Nepali language processing library

NepaliKit

NepaliKit is a Python library for natural language processing tasks in the Nepali language.

Installation

You can install NepaliKit using pip:

.. code-block:: bash

    pip install NepaliKit

Features

NepaliKit provides the following features:

  • Tokenization: Tokenize Nepali text using the SentencePiece tokenizer.
  • Preprocessing: Clean and preprocess Nepali text data, for example by removing HTML tags and special characters.
  • Stopword Management: Load a Nepali stopword list, and add or remove entries from it.
  • Sentence Operations: Segment Nepali text into sentences based on punctuation marks.
  • SentencePiece Model Training: Train custom SentencePiece models for Nepali text data.
  • Utility Functions: Various utility functions for text processing and manipulation.
  • Integration with PyTorch: Utilities for integrating with PyTorch for machine learning tasks.
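
The sentence segmentation feature listed above has no usage example in this README, so here is a minimal standalone sketch of the general idea using only the standard library. It is not NepaliKit's API: the function name `split_sentences` and the choice of sentence-ending marks (the Devanagari danda, the question mark, and the exclamation mark) are assumptions made for illustration.

```python
import re

# Assumed sentence-ending marks: Devanagari danda (।), '?', and '!'.
# The lookbehind keeps the mark attached to the sentence it ends.
_SENTENCE_END = re.compile(r"(?<=[।?!])\s+")


def split_sentences(text: str) -> list[str]:
    """Split text on whitespace that follows a sentence-ending mark.

    Hypothetical helper for illustration; not part of NepaliKit.
    """
    return [part.strip() for part in _SENTENCE_END.split(text) if part.strip()]


text = "म घर जान्छु। तिमी कहाँ छौ? राम्रो छ!"
sentences = split_sentences(text)
for sentence in sentences:
    print(sentence)
```

A rule this simple will mis-split abbreviations and quoted speech, so the library's own sentence operations may behave differently.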

Usage

Tokenization Example

from NepaliKit.tokenization import SentencePieceTokenizer

text = "नमस्ते, के छ खबर?"  # "Hello, what's the news?"
tokenizer = SentencePieceTokenizer()
tokens = tokenizer.tokenize(text)
print(tokens)

Preprocessing Example

from NepaliKit.preprocessing import remove_html_tags, remove_special_characters

text = "<p>नमस्ते, के छ खबर?</p>"
clean_text = remove_html_tags(text)
clean_text = remove_special_characters(clean_text)
print(clean_text)
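
To make the cleaning steps concrete, the following standalone sketch shows one way such helpers could work using only Python's `re` module. It is an assumption-laden illustration, not NepaliKit's implementation: the names `strip_html_tags` and `strip_special_characters` are hypothetical, and "special characters" is taken here to mean anything outside the Devanagari Unicode block and whitespace.

```python
import re


def strip_html_tags(text: str) -> str:
    """Remove anything that looks like an HTML tag (hypothetical helper)."""
    return re.sub(r"<[^>]+>", "", text)


def strip_special_characters(text: str) -> str:
    """Keep Devanagari characters (U+0900 to U+097F) and whitespace only.

    Assumption: everything else, including ASCII punctuation, counts as
    a special character. The danda (।) lies inside the block and is kept.
    """
    cleaned = re.sub(r"[^\u0900-\u097F\s]", "", text)
    # Collapse any runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", cleaned).strip()


raw = "<p>नमस्ते, के छ खबर?</p>"
clean = strip_special_characters(strip_html_tags(raw))
print(clean)
```

A regex-based tag stripper is adequate for simple markup but not for arbitrary HTML; a real parser is the safer choice for scraped pages.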

Stopword Example

from NepaliKit.manage_stopwords import load_stopwords, add_stopword, remove_stopword

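# Load stopwords from the given path (replace with your own location)
stopwords = load_stopwords('/path/to/stopword/directory')
# Add a stopword; the argument is a placeholder meaning "new_stopword"
add_stopword('नयाँ_स्टापवर्ड')
# Remove a stopword; the argument is a placeholder meaning "some_stopword"
remove_stopword('कुनै_स्टापवर्ड')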
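
The example above manages the stopword list but does not show it being applied to text. As a rough sketch of that step, the snippet below filters whitespace-separated tokens against a set of stopwords. The function `filter_stopwords` and the three sample stopwords are illustrative assumptions, not part of NepaliKit and not its bundled list.

```python
def filter_stopwords(tokens: list[str], stopwords: set[str]) -> list[str]:
    """Return the tokens that are not in the stopword set (hypothetical helper)."""
    return [token for token in tokens if token not in stopwords]


# Illustrative stopwords only; a real list would come from the library or a file.
sample_stopwords = {"र", "छ", "के"}

tokens = "नमस्ते के छ खबर".split()
filtered = filter_stopwords(tokens, sample_stopwords)
print(filtered)
```

Using a set for the stopwords keeps each membership check constant-time, which matters once the list grows to hundreds of entries.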

License

This project is licensed under the MIT License.

