Concatenated-word segmentation Python library written in Rust
About The Project
A fast concatenated-word segmentation library written in Rust, inspired by wordninja and wordsegment. The Python bindings use PyO3 to interact with the Rust package.
Installation
pip3 install pywordsegment
Usage
import pywordsegment
# The internal UNIGRAMS and BIGRAMS corpora are lazily initialized
# once for the whole module. Creating multiple WordSegmenter instances
# does not create new dictionaries.
# Segment a concatenated word into its parts
pywordsegment.WordSegmenter.segment(
text="theusashops",
)
# ["the", "usa", "shops"]
# Check whether the substring appears as a whole segment
# in the segmented text.
pywordsegment.WordSegmenter.exist_as_segment(
substring="inter",
text="internationalairport",
)
# False
pywordsegment.WordSegmenter.exist_as_segment(
substring="inter",
text="intermilan",
)
# True
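To illustrate the idea behind the two calls above, here is a minimal pure-Python sketch of unigram-based segmentation. It is not the library's implementation: the real package is written in Rust and uses its own bundled unigram and bigram corpora. The `TOY_COUNTS` table, the unknown-word penalty, and the reading of `exist_as_segment` as "the substring equals one of the produced segments" are all assumptions made for this example.

```python
import math

# Hypothetical toy vocabulary with made-up counts (an assumption for this
# sketch; the real library ships its own unigram and bigram corpora).
TOY_COUNTS = {
    "the": 500, "usa": 60, "us": 120, "a": 400, "shops": 30, "shop": 40,
    "hops": 5, "inter": 15, "milan": 15, "international": 40,
    "national": 30, "airport": 30, "air": 40, "port": 25, "in": 300,
}
TOTAL = sum(TOY_COUNTS.values())
MAX_WORD_LEN = max(len(word) for word in TOY_COUNTS)


def word_cost(word):
    """Negative log-probability of a known word, heavy penalty otherwise."""
    count = TOY_COUNTS.get(word)
    if count is None:
        return 20.0 * len(word)  # arbitrary penalty for unknown pieces
    return -math.log(count / TOTAL)


def segment(text):
    """Split text into the lowest-cost sequence of words (dynamic programming)."""
    text = text.lower()
    n = len(text)
    best_cost = [0.0] + [math.inf] * n  # best_cost[i]: cheapest split of text[:i]
    back = [0] * (n + 1)                # back[i]: start index of the last word
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            cost = best_cost[start] + word_cost(text[start:end])
            if cost < best_cost[end]:
                best_cost[end] = cost
                back[end] = start
    # Walk the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


def exist_as_segment(substring, text):
    """Assumed semantics: True if substring is one of the segments of text."""
    return substring in segment(text)


print(segment("theusashops"))
print(exist_as_segment("inter", "internationalairport"))
print(exist_as_segment("inter", "intermilan"))
```

With this toy table, "inter" is a prefix of "internationalairport" but not one of its segments, which is why a plain substring check would not be enough.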
License
Distributed under the MIT License. See LICENSE for more information.
Contact
Gal Ben David - gal@intsights.com
Project Link: https://github.com/intsights/pywordsegment
Download files
Built Distributions
File details
Details for the file pywordsegment-0.4.3-cp311-none-win_amd64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp311-none-win_amd64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.11, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 95136eb3101f484faeb762d353036531f6124c8f8bec17d8109f02b1d3516594
MD5 | 0547613c35e38e4da6c38694de4dfdab
BLAKE2b-256 | 1e129f3f6f282f272a3261df6f73e7d1a52a5025b383ed288b9cbef3f37ac7ee
File details
Details for the file pywordsegment-0.4.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1505fa8db755cac4e7bc103cc9d13cc48a7e9585a65d54447067bd3c3e23d240
MD5 | fa9924c9b48d32587491d31ad8b13970
BLAKE2b-256 | 55a8ca3c4b614defc62d65808e7c604f8516f321a1a0805b3b712abc80d7bcd6
File details
Details for the file pywordsegment-0.4.3-cp311-cp311-macosx_11_0_arm64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp311-cp311-macosx_11_0_arm64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.11, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 52e85ce00c09a5bbc7eb4a375ac8a54d99e30abd7e6f1b110277ce1a4de52952
MD5 | 7f31cd94a48de128b30cdf88105f6374
BLAKE2b-256 | 0005c0287f48ca6f484ffe8f718efe804858041aac2eeb5a03e6b267f4521703
File details
Details for the file pywordsegment-0.4.3-cp311-cp311-macosx_10_7_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp311-cp311-macosx_10_7_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.11, macOS 10.7+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | a12be14c5fe9cf60b0cb37187a2b4ab93164c6096cbe522bc25be83fce59d6f2
MD5 | c993ae922fa66b17aa9b048bd1fbcee8
BLAKE2b-256 | f62db0daaa48ad73fd28c7b7c69b0251000b717f1984ec29b44d506981e95e52
File details
Details for the file pywordsegment-0.4.3-cp310-none-win_amd64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp310-none-win_amd64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 62dadee684ab40090bbdf96dd548c6e1b8e17254360d23494bd7e68b56c37eac
MD5 | 6b4d94f0e224324beb41d70c4495fb0e
BLAKE2b-256 | 76642205fb5307ecaf0f42b0b10e4fe6e0ccfed1bc3183eb2c24b1f8390f06cf
File details
Details for the file pywordsegment-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4a013b1b1056f8f9124481ef18c608b12045115fe512e8eab736f075476d7fa5
MD5 | 57ca8cd623cfcad2a41e86fea5a13e77
BLAKE2b-256 | 5e14a9f0db6cafb559e95745d46388975660265a31a471dbb8e89c48b1d7130f
File details
Details for the file pywordsegment-0.4.3-cp310-cp310-macosx_11_0_arm64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp310-cp310-macosx_11_0_arm64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.10, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 64b334a66b915bc386c93f50f1be9d39406d59aa9736128f30cec958fe4ac736
MD5 | 5b6d2ccd91ee45ef5051d861e6900006
BLAKE2b-256 | 7c3534c5308b6849651ea5c1e79f5ff506fe420bdd0f48e3a3ce7013ff2ffbe6
File details
Details for the file pywordsegment-0.4.3-cp310-cp310-macosx_10_7_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp310-cp310-macosx_10_7_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.10, macOS 10.7+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 82cb6bdd41379cd557bbd72115424c3bfd8d42e90b644de47ac6156e730f0cb5
MD5 | 8f9e8e6304712989545d38e986c6e2a4
BLAKE2b-256 | ba40969e9d7d424ede6986a85ecbee72820039596656f189ec00da8bfa86f18a
File details
Details for the file pywordsegment-0.4.3-cp39-none-win_amd64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp39-none-win_amd64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.9, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | b4c4fedbda9e15e456ac0d1553fc28dad5573de44a7b32847a2d362a1253c9cd
MD5 | 3f89c082ed1f0dfb91347c55f91aa688
BLAKE2b-256 | 81cbf0426a59feff1e654321ea5276531212ecad73d47562dfbf098554a52f43
File details
Details for the file pywordsegment-0.4.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | e2b3698bb72b6882044c5e23aedb2b6cbf28294a26a4670afb543c2e6516f629
MD5 | 5be38faa151c65811c179a8028a2af08
BLAKE2b-256 | aa5674610afff5ab01cda688ee4f2abb7e8c9a3ec0ddec5e478fe3d4f26cfcc5
File details
Details for the file pywordsegment-0.4.3-cp39-cp39-macosx_11_0_arm64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp39-cp39-macosx_11_0_arm64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.9, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | b3d13c22732d952a90e0d1c5200aca7bf81aa1036a9abe2bb0cf20435c26b170
MD5 | 4b7a77d609ad510e1cc301b957f21022
BLAKE2b-256 | 38d0797a68001ce6ccff44d94da891792280c5a81fa46177a28b7f1a566b1191
File details
Details for the file pywordsegment-0.4.3-cp39-cp39-macosx_10_7_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp39-cp39-macosx_10_7_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.9, macOS 10.7+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | b044a381aa0826798a204dab0477347c813feeb56a3a496d9d7242d5094839a2
MD5 | 464d54fbd9443e924ad0908489f5d2ca
BLAKE2b-256 | c7bd3c73e876971bb5310560cd1725333045109149736f6a87f42dec3813b5c9
File details
Details for the file pywordsegment-0.4.3-cp38-none-win_amd64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp38-none-win_amd64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.8, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8b686e43fa390cd9ff7ccabd1bd100217bf713c59f73f40e9f896e4826f227d9
MD5 | 60901da75d62ef7e84afe5ce1048c00c
BLAKE2b-256 | 8d54ab63e12e2c2e03acb93c9261b152209e0f2f4641f123404aa0266dcc3e7f
File details
Details for the file pywordsegment-0.4.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.8, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 777ecc405a68b97ef4c33e14f37a03c405491a65cc927d81f55dbeef21adc83d
MD5 | d0b3301eb37d5274a00a18ffff6c9cf1
BLAKE2b-256 | 6accb3317463991756ba2d38e7fdff56d527ff826559fc9c3997ae3431975ffa
File details
Details for the file pywordsegment-0.4.3-cp38-cp38-macosx_11_0_arm64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp38-cp38-macosx_11_0_arm64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.8, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | e80447c683501eee5e612990c8ad26329e2997adf65f85a6a07acb0a3c8c940d
MD5 | 3714277544dba80207ae4a51a4454848
BLAKE2b-256 | 0b8296bd1d6006a7b2f10ce558175d424e87d3504f265f17679408bf36c5ae20
File details
Details for the file pywordsegment-0.4.3-cp38-cp38-macosx_10_7_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp38-cp38-macosx_10_7_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.8, macOS 10.7+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 423b4b19c512432c3f6dd469fa56720ab5e612f04289eb0b172f40ed71127f5b
MD5 | bcbb76db979bb6279515801c62f62b58
BLAKE2b-256 | d1dc7c4fffe87c3d3adbfbf9d9dbc97a8c41437307643dc6eb2e6fee72053222
File details
Details for the file pywordsegment-0.4.3-cp37-none-win_amd64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp37-none-win_amd64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.7, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | cfa950833fdee4b9be9afdda2e542f3cba727eab80c101e14cc6101df8213da7
MD5 | f7e20c7842cdc29f3d6b3c177922f81a
BLAKE2b-256 | ca2bd34c3f8e67e4b8641a1f6da87b78e6ffa5f97561ae9003763c5849505855
File details
Details for the file pywordsegment-0.4.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.7m, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | e5d07c0845d71faa602872c3e84deb27b625b2811aa409ed8a1fa6fcfe0339b8
MD5 | 84a0a4a4d4a4e724688c11b2f4739446
BLAKE2b-256 | 134cf5abefefa0f810300c736b95122776029bd85d77648f22b2fa99fd0be13e
File details
Details for the file pywordsegment-0.4.3-cp37-cp37m-macosx_11_0_arm64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp37-cp37m-macosx_11_0_arm64.whl
- Upload date:
- Size: 9.6 MB
- Tags: CPython 3.7m, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4765d31d4c7c4ea294fc51ca918995204fbb175e05864bda884ead247d338d0f
MD5 | 788a784b6acd7289c05ad9aae0072241
BLAKE2b-256 | 7e6fc84123f7a8c17cbd770d0e9754624e1f86cec4bed1182d637a80b6ece571
File details
Details for the file pywordsegment-0.4.3-cp37-cp37m-macosx_10_7_x86_64.whl.
File metadata
- Download URL: pywordsegment-0.4.3-cp37-cp37m-macosx_10_7_x86_64.whl
- Upload date:
- Size: 9.7 MB
- Tags: CPython 3.7m, macOS 10.7+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.12.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 99f53e061aed13282c2db07b460230987070c01d3da743221c32355748a47f39
MD5 | bcc552c08a3858babf0df463dff4b0cc
BLAKE2b-256 | 203c7173e57883fc7587ba9e35875fbedf68df0c2320d5eb0aef95f3b5e24a8c