
Fast tokenizer

Project description

Tokenizer

A fast tokenizer for multiple languages.

Build a package

python setup.py bdist_wheel
twine upload dist/*

Use Locally

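# EncoderLoader must be imported first; the import path is not shown here.
# Sample text: "Liu Qiangdong is a famous entrepreneur. He founded JD.com."
x1 = '<a>刘强东是一个著名企业家。</a> 他创建了京东。'
t = EncoderLoader.load_tokenizer('bert-base-chinese-zh_v4-10K')
# Tokenize the same input once per supported mode.
print(t.tokenize(x1, mode='char'))
print(t.tokenize(x1, mode='word'))
print(t.tokenize(x1, mode='all'))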

Download files


Source Distributions

No source distribution files are available for this release.

Built Distribution

soco_tokenizer-1.1-py3-none-any.whl (12.8 kB)

Uploaded Python 3

File details

Details for the file soco_tokenizer-1.1-py3-none-any.whl.

File metadata

  • Download URL: soco_tokenizer-1.1-py3-none-any.whl
  • Upload date:
  • Size: 12.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.1 pkginfo/1.6.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for soco_tokenizer-1.1-py3-none-any.whl
  • SHA256: 8eff09f3c9c9bac2d8fdd333db31a041b821de9466e3b0512c53f99d4b1bb73d
  • MD5: c79000f9dfae702660c94e5fdc5816f4
  • BLAKE2b-256: eaa2aaa216c040b565f0296605b24292e185e49b08f542ae586dffa4065dd28f
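
The published digests can be used to check a downloaded wheel before installing it. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are illustrative and not part of this package, and the local file path is an assumption about where the wheel was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for soco_tokenizer-1.1-py3-none-any.whl (copied from the hashes above).
EXPECTED_SHA256 = "8eff09f3c9c9bac2d8fdd333db31a041b821de9466e3b0512c53f99d4b1bb73d"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the wheel was downloaded into the current directory.
    wheel = Path("soco_tokenizer-1.1-py3-none-any.whl")
    if wheel.exists():
        print("hash OK" if matches_published_hash(wheel) else "hash MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

A mismatch means the file differs from the one that was uploaded and should not be installed.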

