A package to count tokens in input text using OpenAI's tiktoken library.
gptwc: wc for GPT tokens
A simple utility for counting tokens.
It's like wc, which counts words, except that it uses tiktoken to count tokens.
It's useful for checking the number of tokens in a string so that you stay under a model's token limit (e.g. 4097 for the GPT-3 API).
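As a rough illustration of the limit check described above, the following Python sketch counts tokens and compares the count against a limit. The names `default_encoder`, `count_tokens`, and `fits_within_limit` are hypothetical and are not part of gptwc; the 4097 limit and the text-davinci-003 model are taken from this README. It assumes tiktoken is installed and able to load its encoding data, and it is not a description of how gptwc itself is implemented.

```python
from typing import Callable, List, Optional

Encoder = Callable[[str], List[int]]


def default_encoder(model: str = "text-davinci-003") -> Encoder:
    """Return tiktoken's encode function for `model` (requires tiktoken)."""
    import tiktoken  # imported lazily so the helpers below also work with a custom encoder

    return tiktoken.encoding_for_model(model).encode


def count_tokens(text: str, encode: Optional[Encoder] = None) -> int:
    """Count tokens in `text`; uses tiktoken when no encoder is passed in."""
    if encode is None:
        encode = default_encoder()
    return len(encode(text))


def fits_within_limit(text: str, limit: int = 4097,
                      encode: Optional[Encoder] = None) -> bool:
    """Return True when `text` has at most `limit` tokens."""
    return count_tokens(text, encode) <= limit
```

With tiktoken available, `fits_within_limit(prompt)` would report whether a hypothetical `prompt` string stays within the 4097-token limit mentioned above.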
usage: gptwc [-h] [--files0-from F] [--model MODEL] [-c] [--version] [FILE ...]

Count tokens in text files using OpenAI's tiktoken library.

positional arguments:
  FILE             Text files to count tokens in

options:
  -h, --help       show this help message and exit
  --files0-from F  Read input from the files specified by NUL-terminated names in file F
  --model MODEL    Model name to use for tokenization (default: text-davinci-003)
  -c, --clipboard  Read input from the system clipboard
  --version        show program's version number and exit
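The --files0-from option expects a file whose contents are file names, each terminated by a NUL byte. The sketch below shows one way such a list could be parsed in Python; `read_files0` is a hypothetical helper written for illustration and is not taken from gptwc's source.

```python
import os
from pathlib import Path
from typing import List


def read_files0(list_path: str) -> List[str]:
    """Return the file names stored, NUL-terminated, in `list_path`."""
    raw = Path(list_path).read_bytes()
    # Splitting on NUL leaves an empty item after the final terminator;
    # empty items are skipped.
    return [os.fsdecode(name) for name in raw.split(b"\0") if name]
```

A list in this format can be produced with, for example, `find . -name '*.md' -print0 > names.txt`. Because NUL cannot appear in a file name, names containing spaces or newlines stay intact.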
Example Usage:
$ cat README.md | wc -w
54
$ cat README.md | gptwc
180
$ curl -s 'https://gist.githubusercontent.com/phillipj/4944029/raw/75ba2243dd5ec2875f629bf5d79f6c1e4b5a8b46/alice_in_wonderland.txt' | wc -w
26470
$ curl -s 'https://gist.githubusercontent.com/phillipj/4944029/raw/75ba2243dd5ec2875f629bf5d79f6c1e4b5a8b46/alice_in_wonderland.txt' | gptwc
40085
$ cat README.md | gptwc --model text-davinci-003
517
$ cat README.md | gptwc --model gpt-3.5-turbo
434
$ cat README.md | pbcopy
$ gptwc -c
517
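The counts for the same file differ between models because tiktoken maps model names to different encodings (text-davinci-003 to p50k_base, gpt-3.5-turbo to cl100k_base). The Python sketch below compares counts across models; `tiktoken_encoder` and `count_per_model` are hypothetical names used for illustration, and the code assumes tiktoken is installed and can load its encoding data.

```python
from typing import Callable, Dict, Iterable, List

Encoder = Callable[[str], List[int]]


def tiktoken_encoder(model: str) -> Encoder:
    """Look up the encode function tiktoken associates with `model`."""
    import tiktoken  # imported lazily; only needed for the default lookup

    return tiktoken.encoding_for_model(model).encode


def count_per_model(
    text: str,
    models: Iterable[str],
    encoder_for: Callable[[str], Encoder] = tiktoken_encoder,
) -> Dict[str, int]:
    """Return a mapping from each model name to the token count of `text`."""
    return {model: len(encoder_for(model)(text)) for model in models}
```

For example, `count_per_model(open("README.md").read(), ["text-davinci-003", "gpt-3.5-turbo"])` would return one count per model for the same text.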