A package to count tokens in input text using OpenAI's tiktoken library.
Project description
gptwc: wc for GPT tokens
A simple utility for counting tokens.
The wc utility counts words or characters. The gptwc utility works similarly but counts tokens.
Tokens are usually smaller than words but larger than characters.
Use gptwc to check the number of tokens in a string so that you stay under the token limit (e.g. 4097) of your large language model API. It uses tiktoken.
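The counting step can be sketched in Python. This is a minimal illustration, not gptwc's actual source: `tiktoken_encoder`, `count_tokens`, and `fits_limit` are hypothetical names, and the encoder is passed in as a plain callable so the logic can be tried without tiktoken installed. The only tiktoken calls used are `tiktoken.encoding_for_model` and the `encode` method of the encoding it returns.

```python
from typing import Callable, Sequence

# Any function that turns text into a sequence of tokens.
Encoder = Callable[[str], Sequence]


def tiktoken_encoder(model: str = "text-davinci-003") -> Encoder:
    """Return the encode function tiktoken associates with `model`.

    Needs `pip install tiktoken`; imported lazily so the helpers below
    also work with any other encoder.
    """
    import tiktoken

    return tiktoken.encoding_for_model(model).encode


def count_tokens(text: str, encode: Encoder) -> int:
    """Count the tokens that `encode` produces for `text`."""
    return len(encode(text))


def fits_limit(text: str, encode: Encoder, limit: int = 4097) -> bool:
    """True if `text` stays within `limit` tokens (4097 is the example limit above)."""
    return count_tokens(text, encode) <= limit


if __name__ == "__main__":
    # Stand-in encoder (whitespace splitting) so the demo runs without tiktoken.
    print(count_tokens("Simple is better than complex.", str.split))
```

With tiktoken installed, `count_tokens(text, tiktoken_encoder("gpt-3.5-turbo"))` would use a real tokenizer instead of the whitespace stand-in. Counts depend on the model, because different models use different encodings.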
Installation
$ pip install gptwc
$ echo "Simple is better than complex." | gptwc
7
Example Usage
$ cat LICENSE | gptwc
257
$ cat LICENSE | wc -c
1059
$ cat LICENSE | wc -w
165
$ curl -s 'https://gist.githubusercontent.com/phillipj/4944029/raw/75ba2243dd5ec2875f629bf5d79f6c1e4b5a8b46/alice_in_wonderland.txt' | wc -w
26470
$ curl -s 'https://gist.githubusercontent.com/phillipj/4944029/raw/75ba2243dd5ec2875f629bf5d79f6c1e4b5a8b46/alice_in_wonderland.txt' | gptwc
40085
$ cat LICENSE | gptwc --model text-davinci-003
257
$ cat LICENSE | gptwc --model gpt-3.5-turbo
201
$ cat README.md | pbcopy
$ gptwc -c
517
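To relate the three kinds of counts shown above, here is a small Python sketch. The helper names are hypothetical, and `wc_bytes` and `wc_words` only approximate what `wc -c` and `wc -w` report (Python's notion of whitespace is not guaranteed to match `wc` in every locale). The ratios use the LICENSE figures from the session above.

```python
def wc_bytes(text: str) -> int:
    """Roughly `wc -c`: number of bytes in the UTF-8 encoding of the text."""
    return len(text.encode("utf-8"))


def wc_words(text: str) -> int:
    """Roughly `wc -w`: number of whitespace-separated words."""
    return len(text.split())


def tokens_per_word(tokens: int, words: int) -> float:
    """Average number of tokens per word."""
    return tokens / words


def bytes_per_token(n_bytes: int, tokens: int) -> float:
    """Average number of bytes covered by one token."""
    return n_bytes / tokens


if __name__ == "__main__":
    # LICENSE figures from the examples: 1059 bytes, 165 words, 257 tokens.
    print(round(tokens_per_word(257, 165), 2))   # 1.56
    print(round(bytes_per_token(1059, 257), 2))  # 4.12
```

For that file, a token is on average shorter than a word and longer than a single character, which matches the description at the top.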
Options
usage: gptwc [-h] [--files0-from F] [--model MODEL] [-c] [--version] [FILE ...]

Count tokens in text files using OpenAI's tiktoken library.

positional arguments:
  FILE             Text files to count tokens in

options:
  -h, --help       show this help message and exit
  --files0-from F  Read input from the files specified by NUL-terminated names in file F
  --model MODEL    Model name to use for tokenization (default: text-davinci-003)
  -c, --clipboard  Read input from the system clipboard
  --version        show program's version number and exit
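The option list above maps naturally onto Python's `argparse`. The following is a sketch reconstructed from the help text alone, not gptwc's real source: `build_parser` and `split_nul_names` are hypothetical names, and the version string is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented gptwc options."""
    parser = argparse.ArgumentParser(
        prog="gptwc",
        description="Count tokens in text files using OpenAI's tiktoken library.",
    )
    parser.add_argument("files", metavar="FILE", nargs="*",
                        help="Text files to count tokens in")
    parser.add_argument("--files0-from", metavar="F",
                        help="Read input from the files specified by "
                             "NUL-terminated names in file F")
    parser.add_argument("--model", default="text-davinci-003",
                        help="Model name to use for tokenization "
                             "(default: %(default)s)")
    parser.add_argument("-c", "--clipboard", action="store_true",
                        help="Read input from the system clipboard")
    # Assumption: the real tool reports its own installed version here.
    parser.add_argument("--version", action="version", version="%(prog)s 1.2.2")
    return parser


def split_nul_names(data: bytes) -> list[str]:
    """Split a NUL-terminated list of file names (as `find -print0` emits)."""
    return [name.decode() for name in data.split(b"\0") if name]
```

NUL-terminated names are useful because file names may contain spaces or newlines but never a NUL byte, so a list built this way can be split unambiguously.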
Download files
Source Distribution
gptwc-1.2.2.tar.gz (3.4 kB)
Built Distributions
gptwc-1.2.2-py3-none-any.whl (3.9 kB)
File details
Details for the file gptwc-1.2.2.tar.gz.
File metadata
- Download URL: gptwc-1.2.2.tar.gz
- Upload date:
- Size: 3.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 09f917d77037bc847621f19346f2923111500a2ce7a28ee77a85e0cb4d796292
MD5 | 4fdb0fe0df4e8cb208a941ea23d93f97
BLAKE2b-256 | 5aced02d2f3ca866162488418fd4d4f19a8e72ef0314a3eb23ef00ee92ba63c7
File details
Details for the file gptwc-1.2.2-py3-none-any.whl.
File metadata
- Download URL: gptwc-1.2.2-py3-none-any.whl
- Upload date:
- Size: 3.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | f9561dfbe147f4e9f68cd2a6d8e4bce12dfbc5664139a6aff77748d1635ec784
MD5 | a45fb55e2bcb7db2628bda5f661456c6
BLAKE2b-256 | 819dab880ab50d1e11f20783d985e440277ef3044b62a35d43028f39b827d224
File details
Details for the file gptwc-1.2.2-py2.py3-none-any.whl.
File metadata
- Download URL: gptwc-1.2.2-py2.py3-none-any.whl
- Upload date:
- Size: 3.9 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | b29322b1fda96aa196e18096836a6a359d589e3953c4b6717a1ae1240d4f390e
MD5 | 01e735a34b824bdf8422f94bdfad6d43
BLAKE2b-256 | e2cfd00a84bfd3364e9ef7e09dc949e311564e993e2f03dbfe43ec4a935a38e5
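A downloaded file can be checked against the SHA256 digests listed above with Python's standard `hashlib`. This is a minimal sketch; `sha256_of` and `matches` are hypothetical helper names, and the path in the usage note assumes the file was saved to the current directory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """True if the file's SHA256 equals `expected_hex` (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("gptwc-1.2.2.tar.gz", "09f917d77037bc847621f19346f2923111500a2ce7a28ee77a85e0cb4d796292")` should return `True` for an unmodified copy of the source distribution.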