Library for manipulating an existing tokenizer.

Project description

Tokenizer-Changer

Python script for manipulating an existing tokenizer.

The solution was tested on the Llama3-8B tokenizer.

Installation

Installation from PyPI:

pip install tokenizerchanger

Requirements

  • Python 3.9+
  • tokenizers>=0.21.0
  • transformers>=4.47.0
  • tqdm>=4.66.4
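
The minimum versions listed above can be checked in an existing environment before installing. The following is a minimal sketch using only the standard library; the helper names (parse_version, is_older, check_requirements) are hypothetical and are not part of TokenizerChanger, and the version comparison is deliberately simplified (it looks only at the leading numeric components, so a pre-release such as 4.47.0rc1 is treated like 4.47.0).

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the requirements list above.
MINIMUMS = {
    "tokenizers": "0.21.0",
    "transformers": "4.47.0",
    "tqdm": "4.66.4",
}


def parse_version(text):
    """Turn '4.47.0' or '4.47.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component (e.g. 'dev0')
        parts.append(int(match.group()))
    return tuple(parts)


def is_older(installed, minimum):
    """Return True if 'installed' is strictly below 'minimum'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '4.47' and '4.47.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def check_requirements(minimums=MINIMUMS, get_version=version):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python 3.9+ is required")
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            problems.append(f"{name} is not installed (need >={minimum})")
            continue
        if is_older(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for line in check_requirements() or ["All listed requirements are satisfied."]:
        print(line)
```

In practice pip resolves these constraints on its own during installation; a check like this is mainly useful for diagnosing an environment where the packages were installed by other means.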

Docs

https://tokenizer-changer.readthedocs.io/en/latest/

Download files

Source Distribution

TokenizerChanger-1.0.4.tar.gz (9.1 kB)

Uploaded Source

Built Distribution

TokenizerChanger-1.0.4-py3-none-any.whl (9.7 kB)

Uploaded Python 3

File details

Details for the file TokenizerChanger-1.0.4.tar.gz.

File metadata

  • Download URL: TokenizerChanger-1.0.4.tar.gz
  • Upload date:
  • Size: 9.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for TokenizerChanger-1.0.4.tar.gz

Algorithm    Hash digest
SHA256       ea222655bf2daee7259e177fa4493bb49c75c9299d17476607d053ed02a2c119
MD5          d63f13745124adb30d20267ad96088e9
BLAKE2b-256  bb84f62dc8ac41798d3963f62cd90a31a5b8760b3b7f24e7d7bd534d3f148d88
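
A downloaded file can be compared against the digests listed above. The sketch below uses Python's hashlib and assumes the source distribution has been saved in the current directory under its published file name; the function names (file_digests, matches_published) are hypothetical helpers for illustration, not part of TokenizerChanger or pip. BLAKE2b-256 is computed here as BLAKE2b with a 32-byte digest.

```python
import hashlib
from pathlib import Path

# Digests copied from the table above for TokenizerChanger-1.0.4.tar.gz.
PUBLISHED_SDIST = {
    "SHA256": "ea222655bf2daee7259e177fa4493bb49c75c9299d17476607d053ed02a2c119",
    "MD5": "d63f13745124adb30d20267ad96088e9",
    "BLAKE2b-256": "bb84f62dc8ac41798d3963f62cd90a31a5b8760b3b7f24e7d7bd534d3f148d88",
}


def file_digests(path, chunk_size=1 << 20):
    """Compute all three digests in a single pass over the file."""
    hashers = {
        "SHA256": hashlib.sha256(),
        # MD5 is used only as a checksum here, not for security.
        "MD5": hashlib.md5(usedforsecurity=False),
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_published(path, published):
    """Return True if every published digest matches the file on disk."""
    actual = file_digests(path)
    return all(actual[name] == digest.lower() for name, digest in published.items())


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust as needed.
    archive = Path("TokenizerChanger-1.0.4.tar.gz")
    if archive.exists():
        print("digests match:", matches_published(archive, PUBLISHED_SDIST))
    else:
        print(f"{archive} not found; download it first")
```

SHA256 is the digest that matters for integrity checking; MD5 is included only because the listing above publishes it.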

File details

Details for the file TokenizerChanger-1.0.4-py3-none-any.whl.

File hashes

Hashes for TokenizerChanger-1.0.4-py3-none-any.whl

Algorithm    Hash digest
SHA256       0c4a5213ecbfd10da3618d934ac4548a3dbbe1722a6e2a3aa40a33931566a960
MD5          50d83b388fb7320377cb13bc07e47cbc
BLAKE2b-256  5f1601d32ec93cfa2af1502344eccdf39fb72b2ad04e0bd7d49785c071e5a1c6
