
Concatenated-word segmentation Python library written in Rust

About The Project

A fast concatenated-word segmentation library written in Rust, inspired by wordninja and wordsegment. The Python binding uses PyO3 to call into the Rust implementation.
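
To illustrate what concatenated-word segmentation involves, here is a minimal pure-Python sketch of a frequency-based segmenter using dynamic programming. It is not pywordsegment's Rust implementation: `TOY_UNIGRAMS`, `word_score`, and `toy_segment` are hypothetical names invented for this example, the counts are made up, and only unigram scores are used, whereas the library's usage notes mention both unigram and bigram corpora.

```python
import math

# Hypothetical toy counts for illustration only; NOT the library's real corpus.
TOY_UNIGRAMS = {
    "the": 500, "usa": 50, "shops": 20, "shop": 30, "us": 80,
    "a": 300, "as": 200, "hops": 5, "he": 100,
}
TOTAL = sum(TOY_UNIGRAMS.values())


def word_score(word):
    """Log-probability of a word; unknown strings get a length-based penalty."""
    count = TOY_UNIGRAMS.get(word)
    if count is not None:
        return math.log(count / TOTAL)
    # Assumed penalty: each extra character makes an unknown string 10x less likely.
    return -math.log(TOTAL) - len(word) * math.log(10)


def toy_segment(text):
    """Split text into the word sequence with the highest total score."""
    n = len(text)
    # best[i] = (best score for text[:i], start index of the last word)
    best = [(0.0, 0)] + [(-math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(end):
            score = best[start][0] + word_score(text[start:end])
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


print(toy_segment("theusashops"))
```

With these toy counts, the split "the", "usa", "shops" scores higher than alternatives such as "the", "us", "a", "shops", because every extra word adds another negative log-probability term.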

Installation

pip3 install pywordsegment

Usage

import pywordsegment

# The internal UNIGRAMS and BIGRAMS corpora are lazily initialized
# once for the whole module, so creating multiple WordSegmenter
# instances does not create new dictionaries.

# Segment a concatenated string into its words
pywordsegment.WordSegmenter.segment(
    text="theusashops",
)
# ["the", "usa", "shops"]


# Check whether the substring appears as a whole segment of text,
# not merely as a run of characters inside it.
pywordsegment.WordSegmenter.exist_as_segment(
    substring="inter",
    text="internationalairport",
)
# False

pywordsegment.WordSegmenter.exist_as_segment(
    substring="inter",
    text="intermilan",
)
# True
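
The difference between a plain substring test and a whole-segment test can be sketched in a few lines of pure Python. This is an illustration of the documented behaviour only, not the library's code: `exists_as_whole_segment` and `ASSUMED_SEGMENTS` are hypothetical names, and the segmentations in the table are assumptions chosen to be consistent with the results shown above.

```python
# Hypothetical segmentations standing in for the segmenter's output.
# They are assumptions consistent with the documented True/False results.
ASSUMED_SEGMENTS = {
    "internationalairport": ["international", "airport"],
    "intermilan": ["inter", "milan"],
}


def exists_as_whole_segment(substring, text, segment=ASSUMED_SEGMENTS.__getitem__):
    """Return True only if substring is one of the words text is split into."""
    return substring in segment(text)


for text in ("internationalairport", "intermilan"):
    # A plain substring check matches both texts; the segment check does not.
    print(text, "inter" in text, exists_as_whole_segment("inter", text))
```

Both strings contain the characters "inter", but only in "intermilan" does "inter" stand as a word of its own after segmentation.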

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Gal Ben David - gal@intsights.com

Project Link: https://github.com/intsights/pywordsegment

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

pywordsegment-0.4.3-cp311-none-win_amd64.whl (9.6 MB): CPython 3.11, Windows x86-64
pywordsegment-0.4.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.7 MB): CPython 3.11, manylinux: glibc 2.17+ x86-64
pywordsegment-0.4.3-cp311-cp311-macosx_11_0_arm64.whl (9.6 MB): CPython 3.11, macOS 11.0+ ARM64
pywordsegment-0.4.3-cp311-cp311-macosx_10_7_x86_64.whl (9.7 MB): CPython 3.11, macOS 10.7+ x86-64
pywordsegment-0.4.3-cp310-none-win_amd64.whl (9.6 MB): CPython 3.10, Windows x86-64
pywordsegment-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.7 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
pywordsegment-0.4.3-cp310-cp310-macosx_11_0_arm64.whl (9.6 MB): CPython 3.10, macOS 11.0+ ARM64
pywordsegment-0.4.3-cp310-cp310-macosx_10_7_x86_64.whl (9.7 MB): CPython 3.10, macOS 10.7+ x86-64
pywordsegment-0.4.3-cp39-none-win_amd64.whl (9.6 MB): CPython 3.9, Windows x86-64
pywordsegment-0.4.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.7 MB): CPython 3.9, manylinux: glibc 2.17+ x86-64
pywordsegment-0.4.3-cp39-cp39-macosx_11_0_arm64.whl (9.6 MB): CPython 3.9, macOS 11.0+ ARM64
pywordsegment-0.4.3-cp39-cp39-macosx_10_7_x86_64.whl (9.7 MB): CPython 3.9, macOS 10.7+ x86-64
pywordsegment-0.4.3-cp38-none-win_amd64.whl (9.6 MB): CPython 3.8, Windows x86-64
pywordsegment-0.4.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.7 MB): CPython 3.8, manylinux: glibc 2.17+ x86-64
pywordsegment-0.4.3-cp38-cp38-macosx_11_0_arm64.whl (9.6 MB): CPython 3.8, macOS 11.0+ ARM64
pywordsegment-0.4.3-cp38-cp38-macosx_10_7_x86_64.whl (9.7 MB): CPython 3.8, macOS 10.7+ x86-64
pywordsegment-0.4.3-cp37-none-win_amd64.whl (9.6 MB): CPython 3.7, Windows x86-64
pywordsegment-0.4.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.7 MB): CPython 3.7m, manylinux: glibc 2.17+ x86-64
pywordsegment-0.4.3-cp37-cp37m-macosx_11_0_arm64.whl (9.6 MB): CPython 3.7m, macOS 11.0+ ARM64
pywordsegment-0.4.3-cp37-cp37m-macosx_10_7_x86_64.whl (9.7 MB): CPython 3.7m, macOS 10.7+ x86-64

