FlashBertTokenizer implementation with C++ backend

Project description

flash-tokenizer

Flash BERT tokenizer implementation with C++ backend.

Installation

Install the released package from PyPI:

pip install flash-tokenizer

Or build and install from source:

git clone https://github.com/springkim/flash-tokenizer.git
cd flash-tokenizer
pip install .

Usage

from flash_tokenizer import FlashBertTokenizer
tokenizer = FlashBertTokenizer("path/to/vocab.txt", do_lower_case=True)
# Tokenize text
ids = tokenizer("Hello, world!")
print(ids)
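
The snippet above needs a vocab.txt file, and the page does not describe how the C++ backend uses it. As a rough illustration only, the sketch below assumes the standard BERT conventions: one token per line in vocab.txt, a token's ID equal to its line index, and greedy longest-match-first WordPiece splitting with "##" marking continuation pieces. The helper names load_vocab and wordpiece and the toy vocabulary are hypothetical and are not part of flash-tokenizer.

```python
import tempfile
from pathlib import Path


def load_vocab(path):
    """Map each token to its line index (assumed standard BERT vocab.txt layout)."""
    with open(path, encoding="utf-8") as f:
        return {line.rstrip("\n"): i for i, line in enumerate(f)}


def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is found in the vocabulary.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry the "##" prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk]  # no piece matched, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, written to a temporary file for the demo.
toy_tokens = ["[PAD]", "[UNK]", "hello", "world", "token", "##izer", ",", "!"]
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = Path(tmp) / "vocab.txt"
    vocab_path.write_text("\n".join(toy_tokens) + "\n", encoding="utf-8")
    vocab = load_vocab(vocab_path)

pieces = wordpiece("tokenizer", vocab)
ids = [vocab[p] for p in pieces]
print(pieces, ids)  # ['token', '##izer'] [4, 5]
```

A full BERT tokenizer also lowercases (when do_lower_case is set), splits on whitespace and punctuation, and adds special tokens before this step; those stages are omitted here.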

Download files

Source Distribution

flash_tokenizer-0.2.0.tar.gz (6.2 kB view details)

Uploaded Source

Built Distribution

flash_tokenizer-0.2.0-cp312-cp312-macosx_15_0_arm64.whl (72.2 kB view details)

Uploaded CPython 3.12, macOS 15.0+ ARM64

File details

Details for the file flash_tokenizer-0.2.0.tar.gz.

File metadata

  • Download URL: flash_tokenizer-0.2.0.tar.gz
  • Size: 6.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for flash_tokenizer-0.2.0.tar.gz
Algorithm Hash digest
SHA256 31ac66ae64d81e61f53189c67f56872d515ba8a25eaf932cfd0413a2aa188324
MD5 7b53f9ef5c13b2e8de26c69917709d4f
BLAKE2b-256 38e2c79a9e661239b458986d9cadf13dfc5f7ffa78f1b306202018df061e86fc
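
The digests above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the helper names sha256_of_file and matches_published_hash are illustrative, and the local path of the downloaded sdist is an assumption.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# SHA256 listed above for flash_tokenizer-0.2.0.tar.gz.
EXPECTED_SHA256 = "31ac66ae64d81e61f53189c67f56872d515ba8a25eaf932cfd0413a2aa188324"

# Assumed download location; adjust to wherever the file was saved.
sdist = Path("flash_tokenizer-0.2.0.tar.gz")
if sdist.exists():
    ok = matches_published_hash(sdist, EXPECTED_SHA256)
    print("hash OK" if ok else "hash MISMATCH")
```

The same check applies to the wheel by substituting its file name and SHA256 digest.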

File details

Details for the file flash_tokenizer-0.2.0-cp312-cp312-macosx_15_0_arm64.whl.

File hashes

Hashes for flash_tokenizer-0.2.0-cp312-cp312-macosx_15_0_arm64.whl
Algorithm Hash digest
SHA256 681c019979976ff01860f777a362cc065f75ad80f7b29aa67ba2a3e601c3103f
MD5 2d5a7c3c657dc3192ae3dcf0897f335c
BLAKE2b-256 3eb3dfc646a082eaa0d8627a334fdd2ed027305e1b5afc819a4634467afda343