seg-text

pytest | python | Code style: black | License: MIT | PyPI version

Segment multilingual text to sentences

Currently supports Python 3.8 only, because of the vtext package it uses.

Pre-install the fasttext whl for Windows

seg-text depends on fastlid, which in turn depends on fasttext. Installing fasttext requires a C++ compiler.
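Before installing anything, it can help to check whether fasttext and fastlid are already importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of seg-text.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Report which of the dependencies named above are already installed.
    for name in ("fasttext", "fastlid"):
        status = "found" if has_module(name) else "missing"
        print(f"{name}: {status}")
```

If fasttext is reported as missing on Windows and no C++ compiler is available, the pre-built whl route is the one to take.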

On Windows without a C++ compiler, prebuilt whl packages can be downloaded from https://www.lfd.uci.edu/~gohlke/pythonlibs/ and installed as follows (for example, for Python 3.8 amd64):

pip install fasttext-0.9.2-cp38-cp38-win_amd64.whl
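The whl file name encodes the Python version and platform it was built for, so the file has to match the local interpreter. The sketch below rebuilds the file name from its parts to make that structure visible. The helper names are hypothetical, and it assumes the ABI tag equals the Python tag, which holds for CPython 3.8 and later. It does not check whether a given whl is actually available for download.

```python
import sys


def wheel_filename(package: str, version: str, python_tag: str, platform_tag: str) -> str:
    """Build a wheel file name of the form name-version-python-abi-platform.whl.

    Assumption: the ABI tag is the same as the Python tag (true for CPython 3.8+).
    """
    return f"{package}-{version}-{python_tag}-{python_tag}-{platform_tag}.whl"


def current_python_tag() -> str:
    """Return the CPython tag of the running interpreter, e.g. 'cp38'."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


print(wheel_filename("fasttext", "0.9.2", "cp38", "win_amd64"))
# fasttext-0.9.2-cp38-cp38-win_amd64.whl

# Compare this against the tag in the whl file name before installing.
print(current_python_tag())
```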

Install seg-text

pip install seg-text
# or pip install git+https://github.com/ffreemt/seg-text
# or poetry add git+https://github.com/ffreemt/seg-text

Use seg-text

from seg_text import seg_text

print(seg_text(" text 1\n test 2. Test 3"))
# ["text 1", "test 2.", "Test 3"]

text = """ “元宇宙”,英文為“Metaverse”。該詞出自1992年;的科幻小說《雪崩》。 """
print(seg_text(text))
# ["“元宇宙”,英文為“Metaverse”。", "該詞出自1992年;的科幻小說《雪崩》。"]

# [;:] is a regular expression (a character class) that matches either ; or :
# without the brackets, ;: would match only the two-character sequence ;:

print(seg_text(text, extra="[;:]"))
# ["“元宇宙”,英文為“Metaverse”。", "該詞出自1992年;", "的科幻小說《雪崩》。"]
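The difference between `[;:]` and `;:` can be checked with the standard `re` module alone. This is only an illustration of the regex semantics, not of how seg_text applies `extra` internally; the helper `split_after` is hypothetical.

```python
import re


def split_after(pattern: str, text: str) -> list:
    """Split `text` right after each match of `pattern`, keeping the delimiter.

    The pattern is placed in a lookbehind, so it must have a fixed width.
    """
    return [part for part in re.split(f"(?<={pattern})", text) if part]


sample = "a;b:c;:d"

# Character class: each ; and each : matches on its own.
print(re.findall(r"[;:]", sample))
# [';', ':', ';', ':']

# Plain sequence: only the two characters ;: together match.
print(re.findall(r";:", sample))
# [';:']

print(split_after("[;:]", "a;b:c"))
# ['a;', 'b:', 'c']

print(split_after(";:", "a;b:c"))
# ['a;b:c']
```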

Refer to seg_text.py for more details.

Download files

Source Distribution

seg_text-0.1.2.tar.gz (4.9 kB view details)

Uploaded Source

Built Distribution

seg_text-0.1.2-py3-none-any.whl (5.3 kB view details)

Uploaded Python 3

File details

Details for the file seg_text-0.1.2.tar.gz.

File metadata

  • Download URL: seg_text-0.1.2.tar.gz
  • Size: 4.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.8.5 Windows/10

File hashes

Hashes for seg_text-0.1.2.tar.gz
Algorithm Hash digest
SHA256 37332d6fa755659aba3d93faa1248242c56a5a84e1f9332990802226ad9c4ca8
MD5 a12708adee5cd85f745bc7f621a98c3f
BLAKE2b-256 15f8e2f034ef9fdc67d368bd28c5b3764d8b0f90f284ccb3c56252a87135f2a2

File details

Details for the file seg_text-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: seg_text-0.1.2-py3-none-any.whl
  • Size: 5.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.8.5 Windows/10

File hashes

Hashes for seg_text-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 9e67af219b81259d916a11708799ef52bb7d765f9d8010028dd8d48e053eda17
MD5 b5ef35ce0dea344ea84eb601cfe78c60
BLAKE2b-256 b81a79347846f465e44efb916cab87999e7f9179ba4515578835d8f4eeadc2be
