kuzukiri
A simple text segmenter
What's this?
This is a Python library for splitting Japanese text into sentences.
Features
- Text segmentation by simple rules.
- Rule-based, with no machine learning,
- so the results are predictable.
- Comparatively fast: it is written in Rust.
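To illustrate what rule-based segmentation means in general, here is a minimal pure-Python sketch. It is not kuzukiri's implementation: the function name `split_sentences` and the single rule it uses (a sentence ends at a run of 。, ！ or ？) are assumptions made for this example only, and the library's actual rules may differ.

```python
import re
from typing import List

# Assumed rule for illustration only: a sentence is any run of text
# followed by one or more terminators (。！？). Text left over at the
# end without a terminator is returned as a final sentence.
_SENTENCE = re.compile(r"[^。！？]*[。！？]+|[^。！？]+$")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences using the single rule above."""
    return [m.group(0) for m in _SENTENCE.finditer(text)]


print(split_sentences("これはテストです。文分割します。"))
# ['これはテストです。', '文分割します。']
```

Because the only logic is a fixed pattern, the same input always gives the same output, which is the predictability that a rule-based approach offers over a learned model.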
Install
from PyPI
pip install kuzukiri
from source code
pip install setuptools-rust
python setup.py install
Usage
import kuzukiri
segmenter = kuzukiri.Segmenter()
text = "これはテストです。文分割します。"
sentences = segmenter.split(text)
print(sentences) # => ['これはテストです。', '文分割します。']
For details, see the examples and tests directories.
License
MIT
Dependencies
- PyO3: to build the Rust code as a Python extension.
- unicode_normalization crate: for NFKC normalization.
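For readers unfamiliar with NFKC, the sketch below shows what this normalization form does to typical Japanese input, using Python's standard `unicodedata` module rather than the Rust crate. The helper name `normalize_nfkc` is hypothetical, and how kuzukiri applies normalization internally is not described here, so treat this only as an illustration of the transformation itself.

```python
import unicodedata


def normalize_nfkc(text: str) -> str:
    """Return the NFKC-normalized form of text."""
    return unicodedata.normalize("NFKC", text)


# Half-width katakana become full-width katakana.
print(normalize_nfkc("ﾃｽﾄ"))     # テスト
# Full-width digits become ASCII digits.
print(normalize_nfkc("１２３"))  # 123
# The half-width ideographic full stop becomes the regular one.
print(normalize_nfkc("｡"))       # 。
```

Normalizing first means that splitting rules only need to handle one form of each character, for example a single full stop instead of both its half-width and full-width variants.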