KOrean Rpc-based Handy Application for Language-processing
Korhal
Korhal (KOrean Rpc-based Handy Application for Language-processing) is a Python wrapper for several Korean part-of-speech taggers.
How to install
pip install korhal
Available taggers
- KOMORAN with korhal.komoran
- Hannanum with korhal.hannanum
- Open-source Korean Text Processor with korhal.openkoreantext
How to use
from korhal.komoran import tokenize
result = tokenize("집에 가서 잠을 자고 싶다")
# result => [Token(text=집,pos=NNG), Token(text=에,pos=JKB), Token(text=가,pos=VV), Token(text=아서,pos=EC), Token(text=잠,pos=NNG), Token(text=을,pos=JKO), Token(text=자,pos=VV), Token(text=고,pos=EC), Token(text=싶,pos=VX), Token(text=다,pos=EC)]
print(result[0].text) # => 집
print(result[0].pos) # => NNG
nouns = [token.text for token in result if token.pos.startswith('N')]
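The same filtering idea extends to other tag groups. The following is a minimal sketch that runs without korhal installed: `Token` here is a hypothetical stand-in that assumes only the `text` and `pos` fields visible in the example output, `select_by_pos` is an illustrative helper and not part of korhal, and the sample list is copied from the output above rather than produced by a tagger.

```python
from collections import namedtuple

# Hypothetical stand-in for korhal's token type; only the `text` and `pos`
# fields shown in the example output are assumed.
Token = namedtuple("Token", ["text", "pos"])

# Sample data copied from the example output, not produced by a tagger.
sample = [
    Token("집", "NNG"), Token("에", "JKB"), Token("가", "VV"), Token("아서", "EC"),
    Token("잠", "NNG"), Token("을", "JKO"), Token("자", "VV"), Token("고", "EC"),
    Token("싶", "VX"), Token("다", "EC"),
]

def select_by_pos(tokens, prefixes):
    """Return the text of every token whose POS tag starts with one of `prefixes`."""
    return [t.text for t in tokens if t.pos.startswith(tuple(prefixes))]

nouns = select_by_pos(sample, ["N"])   # noun tags such as NNG
verbs = select_by_pos(sample, ["VV"])  # main verbs only; VX is excluded
print(nouns)
print(verbs)
```

Passing a list of prefixes keeps the helper usable for combined queries, for example `["N", "VV"]` to collect nouns and main verbs in one pass.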
Asynchronous methods
With korhal.aio, you can use asynchronous methods. On multi-core systems, this can slightly improve performance when processing large amounts of text.
from korhal.aio.openkoreantext import tokenize
texts = ['달디단 맛있는 케이크가 있었다', '솜사탕 같이 귀여운 구름']
futures = [tokenize(text) for text in texts]
results = [f.result() for f in futures]
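The submit-everything-then-collect pattern used above can be illustrated with the standard library alone. This is only a sketch of the general technique and makes no claim about how korhal.aio is implemented: `fake_tokenize` is a hypothetical placeholder that splits on whitespace, and `ThreadPoolExecutor` stands in for whatever mechanism produces the futures.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_tokenize(text):
    # Hypothetical placeholder for a real tagger call: splits on whitespace only.
    return text.split()

texts = ['달디단 맛있는 케이크가 있었다', '솜사탕 같이 귀여운 구름']

with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit every text first so the calls can overlap...
    futures = [pool.submit(fake_tokenize, text) for text in texts]
    # ...then collect the results in the same order as the inputs.
    results = [f.result() for f in futures]

print(results)
```

Calling `result()` only after all work has been submitted is what allows the calls to run concurrently; calling it inside the first loop would process the texts one at a time.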
Thanks to