
Simple NLP pipeline

Project description

SWT-NLP PACKAGE

KEYTERM EXTRACTION

install package

pip install swt-nlp

DEMO

load demo content

from swt.nlp.basis import keyterm_extractor
from tests.keyterm_extraction.test_keyterm_extraction_fit_modules import Mockup

# corpus in format list of plain text
small_content = Mockup.small_corpus()
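# small_content[:5] = [
# 'อยากกระโดดน้ำที่แม่น้ำโขง',  # "I want to jump into the water at the Mekong River"
# 'แม่น้ำที่จังหวัดกาญจนบุรีนี่สุดยอดมาก',  # "The river in Kanchanaburi province is really great"
# 'เหล้ามีหลายยี่ห้อ แสงโสม แม่น้ำโขงหรืออะไรก็มีหมดเลย',  # "Liquor comes in many brands: SangSom, Mekong River, or whatever, it is all there"
# 'ข้อความนี้เกี่ยวกับชิมช็อปใช้',  # "This message is about Chim Shop Chai"
# 'รัฐบาลผลักดันชิมช็อปใช้มากขึ้น']  # "The government is pushing Chim Shop Chai harder"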

# extract new terms
kt = keyterm_extractor()
# - to use a custom tokenizer, pass your own callable as `tokenizer`
# - this example uses word_tokenize from pythainlp with keep_whitespace=False
# from pythainlp.tokenize import word_tokenize
# custom_tokenizer = lambda t: word_tokenize(t, keep_whitespace=False)  # your own callable tokenizer function
# kt = keyterm_extractor(tokenizer=custom_tokenizer)
kt.fit(small_content)
new_terms = kt.extract()
# new_terms = ['ชิมช็อปใช้', 'แม่น้ำโขง']  # 'Chim Shop Chai', 'Mekong River'
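
custom tokenizer sketch

The commented lines above pass a callable as `tokenizer`. As a minimal sketch that does not depend on pythainlp, the function below shows the shape such a callable might take. It assumes (this is not confirmed by the package itself) that `keyterm_extractor` only needs a function mapping one string to a list of token strings. The name `whitespace_tokenizer` is hypothetical.

```python
from typing import List


def whitespace_tokenizer(text: str) -> List[str]:
    """Hypothetical tokenizer: split on runs of whitespace, dropping empty tokens.

    Assumption: the `tokenizer` argument accepts any callable str -> list of str.
    Thai text has no spaces between words, so this is only useful for showing
    the expected call shape, not for real Thai segmentation.
    """
    return text.split()


if __name__ == "__main__":
    print(whitespace_tokenizer("เหล้ามีหลายยี่ห้อ แสงโสม แม่น้ำโขง"))
```

Under that assumption it would be passed the same way as the pythainlp example: keyterm_extractor(tokenizer=whitespace_tokenizer).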

HOW TO BUILD AND UPLOAD THE PACKAGE TO PYPI

prerequisites

pip install setuptools wheel tqdm twine

build and upload package

# build the source distribution (tar.gz) into dist/
python setup.py sdist
# upload the package to the PyPI server
# ({package.tar.gz} is a placeholder for the file created in dist/)
python -m twine upload dist/{package.tar.gz} --verbose
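
locate the built file

The `{package.tar.gz}` placeholder stands for whatever file the build step wrote to dist/. As a rough sketch, the helper below picks the most recently modified file with a given suffix, so the exact file name does not have to be typed by hand. The function name `newest_artifact` is hypothetical and not part of swt-nlp or twine.

```python
from pathlib import Path
from typing import Optional


def newest_artifact(dist_dir: str, suffix: str) -> Optional[Path]:
    """Return the most recently modified file in dist_dir whose name ends with suffix.

    Returns None if the directory does not exist or holds no matching file.
    """
    root = Path(dist_dir)
    if not root.is_dir():
        return None
    candidates = [p for p in root.iterdir() if p.is_file() and p.name.endswith(suffix)]
    if not candidates:
        return None
    # Pick by modification time, so the file from the latest build wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # Prints the path to pass to `twine upload`, or None if nothing was built yet.
    print(newest_artifact("dist", ".tar.gz"))
```

The same helper works for wheels by passing ".whl" as the suffix.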

install package

# install or upgrade to the latest version
pip install swt-nlp --upgrade
# install a specific version, bypassing the pip cache
pip install swt-nlp==0.0.11 --no-cache-dir
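
check the installed version

After installing a pinned version it can help to confirm which version actually landed in the environment. A minimal sketch using the standard library's importlib.metadata (Python 3.8+) follows. The helper name `installed_version` is hypothetical, and "swt-nlp" is the distribution name used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed version string, or None if swt-nlp is not installed.
    print(installed_version("swt-nlp"))
```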

install package from a wheel

# build the wheel into dist/
python setup.py bdist_wheel

# install the package from the wheel
# ({package.whl} is a placeholder for the file created in dist/)
# use --force-reinstall if needed
pip install dist/{package.whl}

Download files

Source Distribution

swt-nlp-0.0.53.tar.gz (25.4 kB)

Uploaded Source

File details

Details for the file swt-nlp-0.0.53.tar.gz.

File metadata

  • Download URL: swt-nlp-0.0.53.tar.gz
  • Upload date:
  • Size: 25.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.7

File hashes

Hashes for swt-nlp-0.0.53.tar.gz
Algorithm    Hash digest
SHA256       eb7e4a0648873e42ac78c9544051571176a89b182f29b7273ced5a347a3b3769
MD5          f5deac1683ea6a7fb3b60eba3fe8713a
BLAKE2b-256  889caf480458bcfb5fcc5a4d15886aad214d263711b8ae016de0efb4a45f277b
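
verify a downloaded file

The SHA256 digest listed above can be compared against a downloaded copy of swt-nlp-0.0.53.tar.gz. A minimal sketch using the standard library's hashlib is shown here. The helper names `sha256_of` and `matches_expected` are hypothetical, and the local file path is whatever the archive was saved as.

```python
import hashlib

# SHA256 digest for swt-nlp-0.0.53.tar.gz, copied from the table above.
EXPECTED_SHA256 = "eb7e4a0648873e42ac78c9544051571176a89b182f29b7273ced5a347a3b3769"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.strip().lower()
```

Calling matches_expected with the path of the downloaded archive returns True only if the file is byte-for-byte identical to the one these hashes were computed from.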
