This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is an advanced, extensible, regular-expression-based tokeniser written in C++ (https://languagemachines.github.io/ucto).
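To illustrate why regular-expression-based tokenisation takes more care than splitting on whitespace, here is a minimal sketch using only Python's standard `re` module. It does not use Ucto or this binding; the pattern and the `tokenize` helper are hypothetical, simplified stand-ins, not Ucto's actual rules.

```python
import re

# Hypothetical, simplified rules for illustration only.
# These are NOT Ucto's rules; alternatives are tried left to right.
TOKEN_PATTERN = re.compile(
    r"""
    https?://\S+             # URLs stay in one piece
    | \d+(?:[.,]\d+)+        # numbers with decimal or thousands separators
    | \w+(?:[-']\w+)*        # words, including hyphenated forms and contractions
    | [^\w\s]                # any other single non-space character (punctuation)
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Return the list of tokens matched by TOKEN_PATTERN, in order."""
    return TOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    print(tokenize("Mr. Smith's e-mail costs 3.50 euros, doesn't it?"))
```

Note that this sketch splits the abbreviation "Mr." into two tokens. Handling abbreviations, sentence boundaries, and language-specific conventions is the kind of work a dedicated tokeniser takes on.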
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
python-ucto-0.6.5.tar.gz (79.0 kB)
Built Distributions
Hashes for python_ucto-0.6.5-cp311-cp311-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | d7a06001ac59e72f3ae52fbffacc741ef70052864be2da7c90056af9f1b7d3b2
MD5 | 06811d2c59f00db4d0d2aca3a5eb7f8b
BLAKE2b-256 | 595cb14ff1529735772ed665d5cc7e7adb24445bc1c40023b247a7bdc252c299
Hashes for python_ucto-0.6.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 4ddb9f16a3f8722498a5f45970108377128c49ad338335bd4b43dbc2024cf63d
MD5 | d83c68b241a56d2c6a95314fb67d7843
BLAKE2b-256 | a589c64e34260fd958000c7ab18af18d09d8595f17acd6ad145ebab2eba3c6a3
Hashes for python_ucto-0.6.5-cp311-cp311-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | adb3c1573feaf15c022f46333f2265d77510efece40197136d05ce1d1088373a
MD5 | e394cd04ec68536a6becadb68dde3e48
BLAKE2b-256 | 82ced16a1f54fc509e9ad62d4fa0fdd746036f385f65108310b9c30921609822
Hashes for python_ucto-0.6.5-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | c64750cbe883fb614a5e82906264b366ca5551571b93457e1fcabdf64b1d5cb9
MD5 | e300ec816b94d81d75419a5b13bf74dc
BLAKE2b-256 | c44533b62088b65e02d712c165a7ef6531541107bb3df67b60895075325f5ca4
Hashes for python_ucto-0.6.5-cp310-cp310-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | d2240794863007abe39ceb92c726bf6af45122fb2d85a080b604dccfe6d4ac6f
MD5 | 844dc8d543bb836bf8db5b61941f0539
BLAKE2b-256 | 5353d45cfa1fcf1eef09e5c2c93a8ce5f145b8a4a134ebe242f2786520934cb2
Hashes for python_ucto-0.6.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 140ea562e3cbd83f095b67e3ac567cadf8984205ce9aaf3495c7133590ef61d5
MD5 | 573e6ef25b3d7e014ce970908d18939b
BLAKE2b-256 | f4289ea2c657d81bd3d0b94a6853ff8376e0596020253c5ed68a0dbb1871dc55
Hashes for python_ucto-0.6.5-cp310-cp310-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | fae60230f9e836e511eefe14b2870464f5b072e463fbbb547687d2bceafa0b89
MD5 | 5710eedc8e3a0b907435ad1f309f8aac
BLAKE2b-256 | 7741f15a72ed7bc01fa7164999c4d671222ac5001a3a7cf52efbe82a8866cba0
Hashes for python_ucto-0.6.5-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 25aaf92dc9b9eeaad48bc477c609ade4140a7e8457091c1aae59220967155fc5
MD5 | ef99f3f01576a247839d338c380d3c6c
BLAKE2b-256 | 48844512a9b80c50c5d26b79336e288cea24856e4e7acbe4946cce2e8d970720
Hashes for python_ucto-0.6.5-cp39-cp39-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a8647650a1e3253296571d4379b23ea8508fe183919ef7e87ac079eb75a38f2c
MD5 | b776e04bf3a07656f2b08ca3f7666e32
BLAKE2b-256 | 3f5d1a921e6232846e84b48c7a675d11982b22202b51352abeb8d726eee26afa
Hashes for python_ucto-0.6.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | b81b8fc0433a2a75ef540aba50859406f38d03a6df72663a0124f1682993c7cf
MD5 | a465a31ecdbba6abc61ec59a2db9aa23
BLAKE2b-256 | 28d98f8ae47248c7173ee2abbdaea6880f899d75c6b371a01fb0c7e89f7b26e9
Hashes for python_ucto-0.6.5-cp38-cp38-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 9dc9479139be58568dd170ea735c5da9a357337324499deb7d3e824fa17d50bb
MD5 | af0babf209b8754df11c3668804582ff
BLAKE2b-256 | d1704737d7a2ae72528191ee4894d1962bc121e7d2b56706ec3b0597cca89b3e
Hashes for python_ucto-0.6.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 3d0ce806f3257282fb27cbe2c8b648157d5c6292a60b9ec5c8d575f146232215
MD5 | 3343ee0cce9326859de5521755cdc879
BLAKE2b-256 | 936f9f071be85bed9e6a603a04552f6e110cdd2cadacecb8c2138407e2b66156
Hashes for python_ucto-0.6.5-cp38-cp38-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 63026458725d36beebc04f85a6e19858717358a51309e1e77a35f35292e09563
MD5 | cc5bebcdac2c3d3939a2ea6246852d45
BLAKE2b-256 | ab83bdd54c487646210229009226be91cbe0c8cc9d49ac33595e4f41f4b2749a
Hashes for python_ucto-0.6.5-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | c69c0d8be6b5a418072c8d57bb2665eac03e7d628e286f87bc017b4b23689b28
MD5 | 8624b1be874fb5d96219457ff5235378
BLAKE2b-256 | e293d88bab7ee7dbaa849aab79d1263442e92b8e842111632a86416587b3bd10
Hashes for python_ucto-0.6.5-cp37-cp37m-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 5cd30747dbaa2d912c8e88377e90d84e86fe18ad2c549fc9fb9795e30b01b33a
MD5 | c368378089347454fa5169005bf0c71c
BLAKE2b-256 | 7136816178cd2a038e1fede234ddbaaa74fad5790a3e00e6cf6891884f28b352
Hashes for python_ucto-0.6.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 2278ad7865205881d5d6aafb7ecdb2f02945cbfe98e3d86d93daa1a73ec8ba95
MD5 | b3fc47073ee655c248a4ce8f74039b0e
BLAKE2b-256 | fb2e826764a39fff80259eb9b348a45ac60990912595e54e90628d5bac1e2070
Hashes for python_ucto-0.6.5-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a3b48b9d890392764c8d800586b9d956d344a895d5b5851054da69ecf9dd1870
MD5 | 1309efa92e180ee4b49c83050b4fda05
BLAKE2b-256 | 5f3175bc1054837e931753d6ddc2023af0470ff46455ce5bfe29dafce4bfe009
Hashes for python_ucto-0.6.5-cp36-cp36m-musllinux_1_1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 9169c49c8edcc5fbf6640f6d8e239110f4c5ffe749af8855dcbbd3e0c70ecb53
MD5 | 06916b017900b31ba87603671fb139c0
BLAKE2b-256 | cdfdbe4b689ad6098a5c83a362f6d753fdf32d7cec57d781b286c8f220389f94
Hashes for python_ucto-0.6.5-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | ad8a1a8283c39b4a8619d3525558293248a90ce91e40c28c0f72f800d8c6102a
MD5 | 67b4dc3ae38ef7cf5a25ca69531b0689
BLAKE2b-256 | fe28e19d67e6be9d96a38d52240957bd98f0ae43412d1dc16512d58c5304728c
Hashes for python_ucto-0.6.5-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 1625a45dd654241b495a43336202334d4215450e3472ab24f05e5a201b866d6d
MD5 | ea646741bd3cb30f110526f277c52186
BLAKE2b-256 | 6ff02e0314a2586941490823f1dc224c35d1afaaa29beffe7008a5ff11938378
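The SHA256 digests listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of` and `matches_published_digest` are hypothetical, and the local file path in the usage comment is an assumption about where you saved the download.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example usage (assumes the wheel was saved to the current directory):
# ok = matches_published_digest(
#     "python_ucto-0.6.5-cp311-cp311-musllinux_1_1_x86_64.whl",
#     "d7a06001ac59e72f3ae52fbffacc741ef70052864be2da7c90056af9f1b7d3b2",
# )
```

Reading in chunks keeps memory use constant regardless of file size. pip can also enforce digests itself through its hash-checking mode, which may be preferable for automated installs.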