
Project description


Tokeniser

Tokeniser is a lightweight Python package for simple, efficient token counting in text. It identifies tokens with regular expressions, offering a straightforward estimate without relying on complex NLP models.

Installation

To install Tokeniser, you can use pip:

pip install tokeniser

Usage

Tokeniser is easy to use in your Python scripts. Here's a basic example:

import tokeniser

text = "Hello, World!"
token_count = tokeniser.estimate_tokens(text)
print(f"Number of tokens: {token_count}")

This package is ideal for scenarios where a simple token count is needed, without the overhead of more complex NLP tools.
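The README does not show the package's internals, only that it counts tokens with regular expressions. As an illustration of that approach, a minimal counter in the same spirit might look like the sketch below (count_tokens is a hypothetical stand-in, not the tokeniser API):

```python
import re

def count_tokens(text):
    # Rough token count: each word run and each punctuation mark is one token.
    return len(re.findall(r"\w+|[^\w\s]", text))

count_tokens("Hello, World!")  # 4 tokens: 'Hello', ',', 'World', '!'
```

The actual package may split text differently; for real use, call tokeniser.estimate_tokens as shown above.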

Features

  • Simple and efficient token counting using regular expressions.
  • Lightweight with no dependencies on large NLP models or frameworks.
  • Versatile for use in various text processing tasks.

Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

License

This project is licensed under the MIT License.

Project details


Download files

Download the file for your platform.

Source Distribution

tokeniser-0.0.3.tar.gz (2.8 kB)

Uploaded Source

Built Distribution

tokeniser-0.0.3-py3-none-any.whl (3.2 kB)

Uploaded Python 3

File details

Details for the file tokeniser-0.0.3.tar.gz.

File metadata

  • Download URL: tokeniser-0.0.3.tar.gz
  • Upload date:
  • Size: 2.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.9

File hashes

Hashes for tokeniser-0.0.3.tar.gz
Algorithm Hash digest
SHA256 5d3160809f4ea9288b93aeff67fe0f22bccc63fd729173df591a5b8b65543c95
MD5 fdde3f89d5b3f6cb15fd5df9d17f9734
BLAKE2b-256 07ea1548d27059f09987588d2a9dcf3fe75a9f058e59215d07a257ebc5d4386f

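The digests published above can be checked locally with Python's standard hashlib module. This is a generic sketch; verify_sha256 is an illustrative helper, not part of the tokeniser package:

```python
import hashlib

def verify_sha256(path, expected_hex):
    """Return True if the file's SHA-256 digest matches expected_hex."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files don't need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex.lower()

# Example: check a downloaded sdist against the digest published above.
# verify_sha256("tokeniser-0.0.3.tar.gz",
#               "5d3160809f4ea9288b93aeff67fe0f22bccc63fd729173df591a5b8b65543c95")
```

The same check works for the wheel below, using its own SHA256 value.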

File details

Details for the file tokeniser-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: tokeniser-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 3.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.9

File hashes

Hashes for tokeniser-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 7940ab3b2a02b8b02307805c2cc53cf7c591fd9b106c963ad017349cf65330f0
MD5 f2ce4884c26af5da24f8c4cdaf6a006c
BLAKE2b-256 240d8777b5942fb608ac3b6a81427658d7336667e61656e927f62ad9ca800518

