A package to count tokens in input text using OpenAI's tiktoken library.

gptwc: wc for GPT tokens

A simple utility for counting tokens. The wc utility counts words or characters; gptwc works the same way but counts tokens. A token is typically longer than a single character but shorter than a word.

Use gptwc to check the number of tokens in a string so that you stay under the token limit (e.g. 4097) of your large language model API. It uses tiktoken.
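
The same check can be done from Python. The following is a minimal sketch, not gptwc's actual source: the function names `count_tokens` and `fits_within_limit` and the `encode` hook are hypothetical, and the 4097 default simply mirrors the example limit mentioned above. It assumes tiktoken is installed (`pip install tiktoken`).

```python
from typing import Callable, Optional, Sequence


def count_tokens(
    text: str,
    model: str = "text-davinci-003",
    encode: Optional[Callable[[str], Sequence]] = None,
) -> int:
    """Return the number of tokens in `text`.

    `encode` is a hypothetical hook for supplying your own tokenizer
    function. When it is omitted, tiktoken's encoding for `model` is used.
    """
    if encode is None:
        import tiktoken  # third-party dependency, imported lazily

        encode = tiktoken.encoding_for_model(model).encode
    return len(encode(text))


def fits_within_limit(
    text: str,
    limit: int = 4097,  # example limit; check your model's documentation
    model: str = "text-davinci-003",
    encode: Optional[Callable[[str], Sequence]] = None,
) -> bool:
    """Return True when `text` uses at most `limit` tokens."""
    return count_tokens(text, model=model, encode=encode) <= limit
```

Calling `fits_within_limit(prompt)` before sending `prompt` to an API gives an early warning that the text may be too long. Keep in mind that the limit usually covers the completion as well as the prompt.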

Installation

$ pip install gptwc

$ echo "Simple is better than complex." | gptwc
7

Example Usage

$ cat LICENSE | gptwc
257
$ cat LICENSE | wc -c
1059
$ cat LICENSE | wc -w
165
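
As a quick illustration of how tokens sit between characters and words, the ratios implied by the LICENSE figures above can be worked out directly. This is plain arithmetic on the numbers shown, and the ratios will differ for other texts and other models.

```python
# Figures copied from the LICENSE example above.
tokens, chars, words = 257, 1059, 165

chars_per_token = chars / tokens   # average token length in characters
tokens_per_word = tokens / words   # average number of tokens per word

print(f"{chars_per_token:.2f} characters per token")  # 4.12
print(f"{tokens_per_word:.2f} tokens per word")       # 1.56
```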


$ curl -s 'https://gist.githubusercontent.com/phillipj/4944029/raw/75ba2243dd5ec2875f629bf5d79f6c1e4b5a8b46/alice_in_wonderland.txt' | wc -w
26470

$ curl -s 'https://gist.githubusercontent.com/phillipj/4944029/raw/75ba2243dd5ec2875f629bf5d79f6c1e4b5a8b46/alice_in_wonderland.txt' | gptwc
40085


$ cat LICENSE | gptwc --model text-davinci-003
257
$ cat LICENSE | gptwc --model gpt-3.5-turbo
201
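
The counts differ because each model name maps to its own tokenizer encoding. A rough sketch of inspecting this with tiktoken follows. The helper `compare_models` and its `get_encoding` hook are hypothetical names introduced for illustration, and this is not necessarily how gptwc resolves `--model` internally. It assumes tiktoken is installed.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def compare_models(
    text: str,
    models: Iterable[str],
    get_encoding: Optional[Callable] = None,
) -> Dict[str, Tuple[str, int]]:
    """Map each model name to (encoding name, token count) for `text`.

    `get_encoding` is a hypothetical hook; by default it is
    tiktoken.encoding_for_model.
    """
    if get_encoding is None:
        import tiktoken  # third-party dependency, imported lazily

        get_encoding = tiktoken.encoding_for_model
    results = {}
    for model in models:
        encoding = get_encoding(model)
        results[model] = (encoding.name, len(encoding.encode(text)))
    return results


if __name__ == "__main__":
    sample = "Simple is better than complex."
    for model, (name, count) in compare_models(
        sample, ["text-davinci-003", "gpt-3.5-turbo"]
    ).items():
        print(f"{model}: {name}, {count} tokens")
```

At the time of writing, tiktoken resolves text-davinci-003 to the p50k_base encoding and gpt-3.5-turbo to cl100k_base, which is why the same file yields two different totals.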


$ cat README.md | pbcopy
$ gptwc -c
517

Options

usage: gptwc [-h] [--files0-from F] [--model MODEL] [-c] [--version] [FILE ...]

Count tokens in text files using OpenAI's tiktoken library.

positional arguments:
  FILE             Text files to count tokens in

options:
  -h, --help       show this help message and exit
  --files0-from F  Read input from the files specified by NUL-terminated names in file F
  --model MODEL    Model name to use for tokenization (default: text-davinci-003)
  -c, --clipboard  Read input from the system clipboard
  --version        show program's version number and exit
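
For readers unfamiliar with the NUL-terminated format that --files0-from expects, here is a small sketch of how such a list is commonly parsed. The function names `parse_files0` and `read_files0` are hypothetical, and this is an illustration of the format rather than gptwc's own implementation.

```python
import os
from typing import List


def parse_files0(data: bytes) -> List[str]:
    """Split NUL-separated bytes into file names, skipping empty entries."""
    return [os.fsdecode(name) for name in data.split(b"\0") if name]


def read_files0(list_path: str) -> List[str]:
    """Read a file containing NUL-terminated names and return them as a list."""
    with open(list_path, "rb") as handle:
        return parse_files0(handle.read())
```

NUL is used as the separator because it cannot appear in a file name, so names containing spaces or newlines survive intact. A list in this format is what `find . -name '*.md' -print0 > names.txt` writes, and the resulting file could then be passed as `gptwc --files0-from names.txt`.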
