A Nepali language processing library

NepaliKit

NepaliKit is a Python library for natural language processing tasks in the Nepali language.

Installation

You can install NepaliKit using pip:

pip install NepaliKit

Features

NepaliKit provides the following features:

  • Tokenization: Tokenize Nepali text using the SentencePiece tokenizer.
  • Preprocessing: Clean and preprocess Nepali text data, for example by removing HTML tags and special characters.
  • Stopword Management: Load Nepali stopword lists, and add or remove individual stopwords.
  • Sentence Operations: Segment Nepali text into sentences based on punctuation marks.
  • SentencePiece Model Training: Train custom SentencePiece models for Nepali text data.
  • Utility Functions: Various utility functions for text processing and manipulation.
  • Integration with PyTorch: Utilities for integrating with PyTorch for machine learning tasks.
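
The sentence segmentation feature listed above has no usage example in this description, so the following is only a rough standalone sketch of punctuation-based splitting using the Python standard library. It does not use NepaliKit's own API. The function name `split_sentences` and the choice of sentence-ending marks (danda `।`, `?`, `!`) are assumptions made for illustration.

```python
import re

# Assumption for this sketch: a sentence ends at a danda (।), "?" or "!"
# that is followed by whitespace. NepaliKit's own rules may differ.
_SENTENCE_END = re.compile(r"(?<=[।?!])\s+")


def split_sentences(text):
    """Split text after each sentence-ending mark, keeping the mark."""
    parts = _SENTENCE_END.split(text)
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    for sentence in split_sentences("म घर जान्छु। तिमी कहाँ छौ? राम्रो!"):
        print(sentence)
```

A lookbehind is used so that the punctuation mark stays attached to the sentence it ends. Abbreviations and quoted speech would need extra handling that this sketch does not attempt.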

Usage

Tokenization Example

from NepaliKit.tokenization import SentencePieceTokenizer

text = "नमस्ते, के छ खबर?"
# Create the tokenizer and split the text into tokens
tokenizer = SentencePieceTokenizer()
tokens = tokenizer.tokenize(text)
print(tokens)

Preprocessing Example

from NepaliKit.preprocessing import remove_html_tags, remove_special_characters

text = "<p>नमस्ते, के छ खबर?</p>"
# Strip the HTML markup first, then remove special characters
clean_text = remove_html_tags(text)
clean_text = remove_special_characters(clean_text)
print(clean_text)
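
To make the two cleaning steps more concrete, here is a minimal standard-library sketch of what tag stripping and special-character removal can look like. It is not NepaliKit's implementation. The helper names `strip_tags` and `strip_special` are hypothetical, and the set of characters kept (the Devanagari block, ASCII letters and digits, whitespace) is an assumption that the library may not share.

```python
import html
import re

# Anything that looks like an HTML tag, e.g. <p> or </div>.
_TAG = re.compile(r"<[^>]+>")
# Assumption: keep the Devanagari block (U+0900 to U+097F, which includes
# the danda), ASCII letters and digits, and whitespace; replace the rest.
# Note that this also drops zero-width joiners (U+200C, U+200D).
_NOT_ALLOWED = re.compile(r"[^\u0900-\u097F0-9A-Za-z\s]")


def strip_tags(text):
    """Replace tags with spaces and decode entities such as &amp;."""
    return html.unescape(_TAG.sub(" ", text))


def strip_special(text):
    """Replace disallowed characters with spaces and collapse whitespace."""
    cleaned = _NOT_ALLOWED.sub(" ", text)
    return re.sub(r"\s+", " ", cleaned).strip()


if __name__ == "__main__":
    print(strip_special(strip_tags("<p>नमस्ते, के छ खबर?</p>")))
```

A regular expression is enough for simple markup like the example above. Malformed or deeply nested HTML is better handled by a real parser.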

Stopword Example

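from NepaliKit.manage_stopwords import load_stopwords, add_stopword, remove_stopword

# Load the stopword list; replace the placeholder path with your own directory
stopwords = load_stopwords('/path/to/stopword/directory')
# The strings below are placeholders; substitute real Nepali words
add_stopword('नयाँ_स्टापवर्ड')
remove_stopword('कुनै_स्टापवर्ड')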
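
Once a stopword list is loaded, a typical next step is to filter tokens against it. The sketch below shows that step in plain Python, independent of NepaliKit. The function name `remove_stopwords` is hypothetical, and the three-word stopword set is a hand-picked sample for illustration, not the list shipped with the library.

```python
def remove_stopwords(tokens, stopwords):
    """Return the tokens that are not in the stopword collection."""
    stopword_set = set(stopwords)  # set lookup keeps the filter fast
    return [token for token in tokens if token not in stopword_set]


if __name__ == "__main__":
    # Tiny illustrative sample: "र" (and), "छ" (is), "को" (of).
    sample_stopwords = {"र", "छ", "को"}
    tokens = "राम र श्याम को घर छ".split()
    print(remove_stopwords(tokens, sample_stopwords))
```

Whitespace splitting is used here only to keep the example short. In practice the tokens would come from a tokenizer such as the one shown in the tokenization example.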

License

This project is licensed under the MIT License.
