Korean Tokenizer
Korean tokenizer, sentence classification, and spacing model.
References
- https://github.com/hyunwoongko/kss
- https://github.com/likejazz/korean-sentence-splitter
- https://github.com/open-korean-text/open-korean-text
- https://github.com/jeongukjae/korean-spacing-model
- https://littlefoxdiary.tistory.com/42
- https://github.com/bab2min/kiwipiepy
- http://semantics.kr/%ED%95%9C%EA%B5%AD%EC%96%B4-%ED%98%95%ED%83%9C%EC%86%8C-%EB%B6%84%EC%84%9D%EA%B8%B0-%EB%B3%84-%EB%AC%B8%EC%9E%A5-%EB%B6%84%EB%A6%AC-%EC%84%B1%EB%8A%A5%EB%B9%84%EA%B5%90/
- https://bab2min.tistory.com/669
- https://github.com/songys/AwesomeKorean_Data
- https://github.com/kakao/khaiii/wiki/CNN-%EB%AA%A8%EB%8D%B8
Download files
- Source Distribution: kotokenizer-0.1.0.tar.gz (6.1 kB)
- Built Distribution: kotokenizer-0.1.0-py3-none-any.whl (6.1 kB)
File details
Details for the file kotokenizer-0.1.0.tar.gz.
File metadata
- Download URL: kotokenizer-0.1.0.tar.gz
- Upload date:
- Size: 6.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6ccaecfc832d469122e5694fa61c2566732bf6482f295db79f97fa9dc158e47c
MD5 | 05d548b714b0dd0deb5e2b499aa6ab4a
BLAKE2b-256 | cac5888c58c902f53057d6e50abaff2cdbe534f9f0068f2d8936c75c52f43141
File details
Details for the file kotokenizer-0.1.0-py3-none-any.whl.
File metadata
- Download URL: kotokenizer-0.1.0-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 74d1764e8c56c68d7524b879782130c9be45f9e17ed10eccbe6e7746b22e6457
MD5 | 115ba84c6a9de74fe169f554a39ac630
BLAKE2b-256 | 7bccb0207cc6e093f039af979fd01d2dc0826bacf340225760179eb92f5c28ed
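The digests listed above can be checked against a downloaded file before installing. A minimal sketch using Python's standard-library `hashlib`; the local file path is an assumption about where the download landed, not something the package specifies:

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 8192) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical usage: compare against the SHA256 listed for the sdist above.
expected = "6ccaecfc832d469122e5694fa61c2566732bf6482f295db79f97fa9dc158e47c"
# sha256_of("kotokenizer-0.1.0.tar.gz") == expected
```

The same check applies to the wheel with its own digest; `pip` performs an equivalent hash comparison automatically when hashes are pinned in a requirements file.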