Set of whole-word (independent) stop words in Korean.
Project description
ko-ww-stopwords
This is a set of whole-word (independent) stop words in Korean. Whole-word stop words are easy to identify, whereas dependent stop words are difficult to identify without a part-of-speech tagger.
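The distinction can be illustrated with a small sketch. It assumes that "dependent" refers to elements such as the subject particle 이, which attaches to the preceding word. The word set and token list below are hypothetical stand-ins, not data from this package.

```python
# Hypothetical stand-in set for illustration; the real list ships with the package.
whole_word_stop_words = {"우선"}

# Assumed example of a dependent element: the subject particle 이,
# which attaches directly to the word before it.
dependent_particle = "이"

# 고양이 ("cat") is a bare noun; 사람이 is 사람 ("person") + the particle 이.
tokens = ["우선", "고양이", "사람이"]

# Whole-word stop words can be found by direct set lookup.
whole_word_hits = [t for t in tokens if t in whole_word_stop_words]

# A naive suffix check cannot tell a particle from a syllable that is
# simply part of the word, so it flags both 고양이 and 사람이.
suffix_hits = [t for t in tokens if t.endswith(dependent_particle)]

print(whole_word_hits)
print(suffix_hits)
```

The suffix check flags 고양이 even though its final 이 is part of the noun itself, which is why separating dependent elements reliably tends to require a part-of-speech tagger.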
Code Sample
from ko_ww_stopwords.stop_words import ko_ww_stop_words
from ko_ww_stopwords.tools import is_stop_word, strip_outer_punct
print(ko_ww_stop_words)
# is_stop_word(word)
# Returns True if word is a whole-word stop word.
print("우선 is_stop_word -> {}".format(is_stop_word("우선")))
print("서울 is_stop_word -> {}".format(is_stop_word("서울")))
# strip_outer_punct(word)
# Strips leading and trailing punctuation marks from word.
raw_str = "(우선)"
print("raw_str is_stop_word -> {}".format(is_stop_word(raw_str)))
normalized_str = strip_outer_punct(raw_str)
print("normalized_str is_stop_word -> {}".format(is_stop_word(normalized_str)))
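As a rough sketch of how the two helpers might be combined to filter a whitespace-tokenized sentence, the example below runs without the package installed. `SAMPLE_STOP_WORDS`, `strip_outer_punct_sketch`, and `remove_stop_words` are hypothetical names introduced for illustration. The stand-in stripper only handles ASCII punctuation and may behave differently from the package's `strip_outer_punct`.

```python
import string

# Hypothetical stand-in for ko_ww_stop_words; the real set ships with the package.
SAMPLE_STOP_WORDS = {"우선", "그리고", "하지만"}


def strip_outer_punct_sketch(word):
    """Assumed approximation of strip_outer_punct: strips ASCII punctuation only."""
    return word.strip(string.punctuation)


def remove_stop_words(sentence, stop_words=SAMPLE_STOP_WORDS):
    """Return the whitespace-separated tokens that are not whole-word stop words."""
    kept = []
    for token in sentence.split():
        # Normalize before the lookup so that "(우선)" matches "우선".
        if strip_outer_punct_sketch(token) not in stop_words:
            kept.append(token)  # keep the token in its original form
    return kept


filtered = remove_stop_words("(우선) 서울에 갑니다. 그리고 부산에 갑니다.")
print(filtered)
```

With the real package, the lookup inside the loop would presumably become `is_stop_word(strip_outer_punct(token))`.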
Other Packages
If you need a Korean sentence tokenizer, please see https://github.com/Rairye/kr-sentence
File details
Details for the file ko_ww_stopwords-0.0.1.tar.gz.
File metadata
- Download URL: ko_ww_stopwords-0.0.1.tar.gz
- Upload date:
- Size: 3.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.7.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | fe1752cac3184a121f2cd6cb0414838c9ecc8c8ad7596c6bfffd84732a6cedea
MD5 | 65334ca10f0114773bcf7f6f843222a7
BLAKE2b-256 | 17a5b9a3fec799004d4b82bbd20c7f0583bbf4cc37b70cc3d1b8a4f4c1ac9f32
File details
Details for the file ko_ww_stopwords-0.0.1-py3-none-any.whl.
File metadata
- Download URL: ko_ww_stopwords-0.0.1-py3-none-any.whl
- Upload date:
- Size: 4.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.7.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 88883ee851b6c3be7468241b0fcf0bc0fe327d003b73a809710be04f1193627e
MD5 | fe339f0b34368636393e0c98b92650f1
BLAKE2b-256 | f3d7f966ec69731af9c16c5becde0b41ea2d064aa35f084910552f1a0f73ec6d