Encoder/decoder and token counter for GPT-3
Project description
An OpenAI GPT-3 helper library for encoding/decoding strings and counting tokens.
Counting tokens gives the same output as OpenAI's tokenizer.
Tested with Python 2.7.12, 2.7.18, and all 3.x versions.
Installing
pip install gpt3_tokenizer
Examples
Encoding/decoding a string
import gpt3_tokenizer
a_string = "That's my beautiful and sweet string"
encoded = gpt3_tokenizer.encode(a_string) # outputs [2504, 338, 616, 4950, 290, 6029, 4731]
decoded = gpt3_tokenizer.decode(encoded) # outputs "That's my beautiful and sweet string"
Counting tokens
import gpt3_tokenizer
a_string = "That's my beautiful and sweet string"
tokens_count = gpt3_tokenizer.count_tokens(a_string) # outputs 7
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
gpt3_tokenizer-0.1.3.tar.gz (560.7 kB)
Built Distribution
gpt3_tokenizer-0.1.3-py2.py3-none-any.whl
Hashes for gpt3_tokenizer-0.1.3-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5af4b2b7f0ec533792cf133a66feab6c482a0434721abecc608e5b26d7b98e7f
MD5 | 1b365c634f166655dd84070790da8c41
BLAKE2b-256 | 1ea79b825973eb7933cec48dfdce0db81ef6e901971f6e8bb2c440617676e2c0