This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is an advanced, extensible, regular-expression-based tokeniser written in C++ (https://languagemachines.github.io/ucto).
Download files
Download the file for your platform.
Source Distribution
python-ucto-0.6.0.tar.gz (76.4 kB)
Built Distributions
Hashes for python_ucto-0.6.0-cp311-cp311-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a88ae8b66f80b794d2775c1ee42a9d1fc11e464c4666b92ca8b5b554c05cfd52
MD5 | 5d7cff732d766b6b916fd6a510f7a3ab
BLAKE2b-256 | fcfd170ab0ce296ac7a2c6683571577df2cd822fafc2a838a47aff4858e4e116
Hashes for python_ucto-0.6.0-cp310-cp310-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a876f80b4aaa8ee0ef5039ddc36535d097b294a1e058b9421738262ae61972af
MD5 | e0f5c0edbced0c47d28376fbd139801e
BLAKE2b-256 | 19decc73efd2e2cced29bf964d686902bbcb8eb490afeb467249495e9efdcab8
Hashes for python_ucto-0.6.0-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 31228019f9fc0a38aa2497c765643fdaf394105886def51735f4146f2bacd9f1
MD5 | 17c2bc48893f826b34bdbaea7351926c
BLAKE2b-256 | eca8941f93141522596fc3df7400c5307ffcf4e7e67b1716310ad26ae124831e
Hashes for python_ucto-0.6.0-cp39-cp39-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | fed6c8c3d292911feb24ff786ebfdaec862889c430c9ca53409aa5464991a111
MD5 | fecbea9fe88fab149242dba117533ba0
BLAKE2b-256 | 9bff72997b3d96904e4ac728a5709fd785ae4c818e1502f2fb6a070e24226738
Hashes for python_ucto-0.6.0-cp39-cp39-manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | d3b64ea5125c654375f4ad2a86ea210ee3e5f206b3b867d7aa6a117b4703faf1
MD5 | bae02b0ffe436adc011bc5e5dfb0a4bc
BLAKE2b-256 | fefe1c4687edbd872d466905f22f1ded27ff055933794aa0d317cbe1c61b85e9
Hashes for python_ucto-0.6.0-cp38-cp38-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 5f0604a45f09cc4a3288b69226209ddb6bcc3c6612f41f849ed5323feda53f57
MD5 | b2c7b7f18695bb168b9cc9b53f92e696
BLAKE2b-256 | a4a6ba572d4e77940ba08fa5d9ec16c7fbfc6c98197dffb746896a6b22c28f71
Hashes for python_ucto-0.6.0-cp38-cp38-manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 431e959d58ed335d5fcb9c074724c28cdbf5bbd4893c93f58d6da4600fd48689
MD5 | 16e5240eeaf9d58f0fcea012976e0e2b
BLAKE2b-256 | 140c412dfcc37f6d26eaf4a65ee09b2a85610b5ba64ce0094308a0f5c6a886e3
Hashes for python_ucto-0.6.0-cp37-cp37m-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | cdc1046a5baac0811aac8a94d6da2f28cdf17f29004e3288e9470750e2b0ce10
MD5 | 0cb5c3397b9932f3d7d8eab114353ad1
BLAKE2b-256 | 77a7d3ed392a16bc99c9d0106e83e2ab3f14aaae242738735f451b2dad234e4e
Hashes for python_ucto-0.6.0-cp37-cp37m-manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | ac5c181880f1dc101368f3f60984d16f74c0de0113974b7c45892cb724485f67
MD5 | 880057aa5839555ab0a54e7472d064a4
BLAKE2b-256 | 4b53bde1dd02ab0ba437f8cad4dfe84a9c71794eb0bca83237ed11607b0870eb
Hashes for python_ucto-0.6.0-cp36-cp36m-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a69b9eed6921519109df2307389bb74f32a026454d79fa8710d4c8887899a092
MD5 | 95b76136b3d34d95f50b19271ecafa00
BLAKE2b-256 | 1e45f24c112e8f95369b393d2cbb0abd2867bc73a9e3ecfe9bc45f734aec5011
Hashes for python_ucto-0.6.0-cp36-cp36m-manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | e0ba2967971005d7b60cc5dc87beb59ae22fb5328cc28a05d19d8be2c17f08c0
MD5 | 8a0faad638bbaf4b7eb24ce03e9796ca
BLAKE2b-256 | d4e7e8c75d64bd14ac29c4667e8917eeb71a7a571b8de694f552d56a97272d3f
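The digests listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using only Python's standard library, not part of python-ucto itself. The helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local path in the usage section assumes the cp311 musllinux wheel was saved to the current directory.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


if __name__ == "__main__":
    # Hypothetical local path: assumes the wheel was downloaded to the working directory.
    wheel = Path("python_ucto-0.6.0-cp311-cp311-musllinux_1_1_x86_64.whl")
    # SHA256 digest copied from the table for this wheel.
    published = "a88ae8b66f80b794d2775c1ee42a9d1fc11e464c4666b92ca8b5b554c05cfd52"
    if wheel.exists():
        ok = matches_published_digest(wheel, published)
        print("digest matches" if ok else "digest MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

SHA256 is used here rather than MD5 because MD5 is no longer considered collision-resistant. The same pattern should work for the BLAKE2b-256 column by swapping in `hashlib.blake2b(digest_size=32)`.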