naive-text

Text utils in Python.

Installation

pip install -U naive_text

Usage

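from naive_text import TextNormalizer

tn = TextNormalizer()
# Sample sentence in Traditional Chinese: "When a conversion has two or more
# candidate words, the program only uses the first one."
text = '當轉換有兩個以上的字詞可能時,程式只會使用第一個。'
# Convert to Simplified Chinese, with full-width conversion requested for ','.
print(tn.normalize(text, to_simplified=True, to_full_width=True, to_full_width_chars=[',']))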

Outputs:

当转换有两个以上的字词可能时,程式只会使用第一个。
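
For readers unfamiliar with full-width conversion, here is a minimal sketch of one common approach. It is not naive_text's implementation: `to_full_width` and its `chars` parameter are hypothetical names, and `chars` only mirrors what `to_full_width_chars` appears to do in the call above. The sketch relies on the Unicode layout, where the full-width forms U+FF01 to U+FF5E sit at a fixed offset of 0xFEE0 from ASCII 0x21 to 0x7E.

```python
# Illustrative helper only; not part of naive_text.
FULL_WIDTH_OFFSET = 0xFEE0  # distance from ASCII '!' (U+0021) to U+FF01


def to_full_width(text, chars=None):
    """Widen ASCII characters; if `chars` is given, widen only those."""
    out = []
    for ch in text:
        wanted = chars is None or ch in chars
        if wanted and ch == ' ':
            out.append('\u3000')  # ideographic (full-width) space
        elif wanted and '!' <= ch <= '~':
            out.append(chr(ord(ch) + FULL_WIDTH_OFFSET))
        else:
            out.append(ch)  # leave everything else untouched
    return ''.join(out)


# Only the comma is widened; the space and letters stay as they are.
print(to_full_width('Hi, there', chars=[',']))
# With no restriction, every printable ASCII character is widened.
print(to_full_width('A1!'))
```

Restricting the conversion to a short list of characters keeps Latin letters and digits readable while still normalizing punctuation.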

Sentence segmentation with SentenceSegmenter:

from naive_text import SentenceSegmenter

s = SentenceSegmenter()
paragraph = [
    "史卡肯表示:「我今天打的和当初在温布登打的一样, 除了这一次幸运之神落在我这边以外。",
    "他说:「其实在温布登时最后的胜利也有可能属于我,因为当时打到了第五盘却仍然僵持在二十比十八的对峙。",
    "菲利普西斯在当初的温布登比赛中,在面对史卡肯时曾经发出四十四个爱司球,但是为他搏得「重炮手」美誉的发球,并没有在今天的球赛中助他一臂之力。",
    "菲利普西斯在第一盘第七局以三十比四十落后时,竟然击出双发失误;另外在第九局他又再度犯下双发失误球,让史卡肯得以坐拥两次的破发点,并且顺利赢得第一盘。",
    "在这场历时六十六分钟的比赛里,史卡肯表示:「我大力主攻他的第二发球,同时我也对他的第一发球施压,使我取得更多的机会。」",
    "这也是史卡肯和菲利普西斯在六度对峙中的第二次获胜。"]
paragraph = ''.join(paragraph)

for idx, sent in s.cut(paragraph):
    print('No.{} sentence: {}'.format(idx, sent))

Outputs:

No.1 sentence: 他说:「其实在温布登时最后的胜利也有可能属于我,因为当时打到了第五盘却仍然僵持在二十比十八的对峙。
No.2 sentence: 菲利普西斯在当初的温布登比赛中,在面对史卡肯时曾经发出四十四个爱司球,但是为他搏得「重炮手」美誉的发球,并没有在今天的球赛中助他一臂之力。
No.3 sentence: 菲利普西斯在第一盘第七局以三十比四十落后时,竟然击出双发失误;
No.4 sentence: 另外在第九局他又再度犯下双发失误球,让史卡肯得以坐拥两次的破发点,并且顺利赢得第一盘。
No.5 sentence: 在这场历时六十六分钟的比赛里,史卡肯表示:「我大力主攻他的第二发球,同时我也对他的第一发球施压,使我取得更多的机会。」
No.6 sentence: 这也是史卡肯和菲利普西斯在六度对峙中的第二次获胜。
No.7 sentence: 
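
To illustrate the general idea of punctuation-based segmentation, the following is a rough sketch using a regular expression. It is an assumption about how such a segmenter could work, not the code behind `SentenceSegmenter`: the standalone `cut` function, the `_SENTENCE` pattern, and the chosen set of terminators are all hypothetical. Sentences end at 。!?; (or their half-width forms), and closing quotes that follow the terminator stay attached to the sentence.

```python
import re

# One sentence = text up to a run of terminators, plus any closing quotes
# or brackets right after it. The second alternative catches trailing text
# that has no terminator at all.
_SENTENCE = re.compile(
    r'[^。!?;!?;]*[。!?;!?;]+[」』”’)]*|[^。!?;!?;]+$'
)


def cut(paragraph):
    """Yield (index, sentence) pairs, numbering sentences from 1."""
    sentences = (m.group().strip() for m in _SENTENCE.finditer(paragraph))
    # Skip segments that are empty after stripping whitespace.
    for idx, sent in enumerate((s for s in sentences if s), start=1):
        yield idx, sent


for idx, sent in cut('今天下雨。他说:「走吧!」我们出发;好'):
    print('No.{} sentence: {}'.format(idx, sent))
```

Unlike the sample output above, this sketch never yields an empty final sentence, because blank segments are filtered out before numbering.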

Download files

Source Distribution

naive_text-0.0.1.tar.gz (3.7 kB)

Uploaded Source

Built Distribution

naive_text-0.0.1-py3-none-any.whl (8.1 kB)

Uploaded Python 3

File details

Details for the file naive_text-0.0.1.tar.gz.

File metadata

  • Download URL: naive_text-0.0.1.tar.gz
  • Size: 3.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.9.1

File hashes

Hashes for naive_text-0.0.1.tar.gz
Algorithm    Hash digest
SHA256       8691ccf9e9a7012a76d25ef8e9d8de513e3e04e6bb9769559decfed700b97418
MD5          f9d73052ff92a0a0bc3561e6ad5885e9
BLAKE2b-256  2fd228b9fe75b3bd104471416c4f45724135298e397f93bb799e7612eb692549

File details

Details for the file naive_text-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: naive_text-0.0.1-py3-none-any.whl
  • Size: 8.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.9.1

File hashes

Hashes for naive_text-0.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       4b8543bbc7b74ce36a69dfc0fb2b4731d358b4ab7b3d0f342fe15795106667f3
MD5          bd037b7e82900ae622b2d8acd4187e35
BLAKE2b-256  0ed312611f09c89855f563c8bfe101d0c0cd001b59cfa75f0add6c1086eb7cf2
