
Project description


Tokeniser

Tokeniser is a lightweight Python package designed for simple and efficient token counting in text. It uses regular expressions to identify tokens, providing a straightforward approach to tokenization without relying on complex NLP models.

Installation

To install Tokeniser, you can use pip:

pip install tokeniser

Usage

Tokeniser is easy to use in your Python scripts. Here's a basic example:

import tokeniser

text = "Hello, World!"
token_count = tokeniser.estimate_tokens(text)
print(f"Number of tokens: {token_count}")

This package is ideal for scenarios where a simple token count is needed, without the overhead of more complex NLP tools.

Features

  • Simple and efficient token counting using regular expressions.
  • Lightweight with no dependencies on large NLP models or frameworks.
  • Versatile for use in various text processing tasks.
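The package's actual regular expression is not documented here, but a minimal sketch of regex-based token counting, assuming a hypothetical pattern that treats runs of word characters as one token and each punctuation mark as its own token, might look like:

```python
import re

# Assumed pattern (not necessarily the one Tokeniser uses):
# a run of word characters, or any single non-word, non-space character.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")

def estimate_tokens(text):
    """Count tokens by collecting every match of the pattern."""
    return len(TOKEN_PATTERN.findall(text))

print(estimate_tokens("Hello, World!"))  # "Hello", ",", "World", "!" -> 4
```

Because it is a single compiled pattern over the raw string, an approach like this runs in one pass with no model files or external dependencies, which matches the lightweight design described above.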

Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

License

This project is licensed under the MIT License.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tokeniser-2025.5.190811.tar.gz (2.9 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tokeniser-2025.5.190811-py3-none-any.whl (3.4 kB)

Uploaded Python 3

File details

Details for the file tokeniser-2025.5.190811.tar.gz.

File metadata

  • Download URL: tokeniser-2025.5.190811.tar.gz
  • Upload date:
  • Size: 2.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.11

File hashes

Hashes for tokeniser-2025.5.190811.tar.gz
  • SHA256: 0b0e43e95ea421135e4787de327c79a4b0023c4e1e0247bc0aa274f320b1c67f
  • MD5: b1b6fafea168ba5be1353a2d1de8476c
  • BLAKE2b-256: 49f21c20f1fca1a970310794e7746ee5b8a824d302c4f4d6d55b6ac8a2e8b07d


File details

Details for the file tokeniser-2025.5.190811-py3-none-any.whl.

File metadata

File hashes

Hashes for tokeniser-2025.5.190811-py3-none-any.whl
  • SHA256: cfd85a9634282ac049ef12b6380868f208c19b87d4c1a68a60f868d7e1745b6d
  • MD5: 199d0ff92c098d932e535554d24ae29d
  • BLAKE2b-256: 93bff37cc689b528e455dab8327f9eedcfd4af8373f2e51ffaeadec79da901b5

