LLM utility for real-time replacement processing of streaming tokens

TokFlow

A utility that outputs tokens generated by a large language model (LLM) while applying string replacements sequentially.

How it works

Tokens arrive one after another in small pieces, as shown below.

["He","llo"," ","t","h","ere","!<","N","L>m","y ","nam","e"," ","is"," tokfl","ow.","<","N","L>N","ice"," to ","me","et you."]

The input tokens are output as they arrive, with every occurrence of <NL> replaced by \n.

You can specify any string to be replaced. Moreover, you can specify multiple replacement targets.
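
To illustrate why buffering is needed at all, here is a small sketch that does not use TokFlow. The token list is a made-up example. It shows that replacing inside each token independently misses a marker that is split across tokens, which is the situation the token list above demonstrates.

```python
# Hypothetical token stream in which the marker "<NL>" is split across tokens.
tokens = ["Hi!<", "N", "L>Bye"]

# Naive approach: replace inside each token independently.
# No single token contains the full marker, so nothing is replaced.
naive = "".join(token.replace("<NL>", "\n") for token in tokens)

# Reference result: replace once the whole text is known (no streaming).
expected = "".join(tokens).replace("<NL>", "\n")

print(repr(naive))     # 'Hi!<NL>Bye'
print(repr(expected))  # 'Hi!\nBye'
```

A streaming replacer has to produce the second result while still emitting text as early as possible.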

What is this library for?

I developed this to replace special tokens on the fly while a large language model (a form of generative AI) generates text sequentially, but it can also be used for other kinds of string stream processing.

Install

pip install tokflow

Usage

import time
from tokflow import TokFlow

TOKEN_GENERATOR_MOCK = ["He", "llo", " ", "t", "h", "ere", "!<", "N", "L>m", "y ", "nam", "e", " ", "is", " tokfl", "ow.",
                  "<", "N", "L>N", "ice", " to ", "me", "et you."]

# Replace "<NL>" with "\n". "<NL>" is called the "search target string".
# Multiple replacement conditions can be specified.
tokf = TokFlow([("<NL>", "\n")])

for input_token in TOKEN_GENERATOR_MOCK:

    # Feed the tokens in one at a time.
    # While the input could still be part of a "search target string",
    # it is buffered, so output_token may be an empty string for a while.
    output_token = tokf.put(input_token)

    print(f"{output_token}", end="", flush=True)

    # Wait briefly to make the sequential generation visible.
    time.sleep(0.3)


# Remember to output the remaining buffer at the very end. The buffer may be an empty string.
print(f"{tokf.flush()}", end="", flush=True)

Processing

About internal processing

Tokens are read sequentially in real time. Each token is appended to the tokens read so far, which are held in the "token buffer". When a pre-specified string (the "search target string") appears in the token buffer, it is replaced with another string (the "replacement string").

Because tokens arrive in pieces, the token buffer at any intermediate stage holds either text unrelated to the search target string or a partial match of it. As soon as the buffer contents can no longer become a search target string, they are returned as the method's return value. While the buffer contents could still become a search target string, the return value stays an empty string until either the search target string appears in full or the match is ruled out.

Buffering only while a match is still possible lets most tokens be displayed as they arrive, with output delayed only when a replacement may be needed. This is what enables stream processing.
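
The buffering idea can be sketched in a few lines of Python. This is an illustrative approximation only, not TokFlow's actual implementation: the class name `StreamReplacer` and its helper `_held_length` are hypothetical, and only the `put`/`flush` method names mirror the usage example. It holds back the longest buffer suffix that could still grow into a search target string and emits everything before it. For simplicity it applies the replacement pairs one after another and does not handle overlapping search targets.

```python
class StreamReplacer:
    """Hypothetical sketch of buffered stream replacement (not TokFlow's code)."""

    def __init__(self, pairs):
        # pairs: list of (search_target, replacement) tuples.
        self.pairs = list(pairs)
        self.buffer = ""

    def _held_length(self):
        # Length of the longest buffer suffix that is a proper prefix
        # of some search target, i.e. a match that is still possible.
        longest = 0
        for search, _ in self.pairs:
            upper = min(len(search) - 1, len(self.buffer))
            for k in range(upper, 0, -1):
                if self.buffer.endswith(search[:k]):
                    longest = max(longest, k)
                    break
        return longest

    def put(self, token):
        # Append the token, replace any complete matches, then emit
        # everything except the part that might still become a match.
        self.buffer += token
        for search, replacement in self.pairs:
            self.buffer = self.buffer.replace(search, replacement)
        cut = len(self.buffer) - self._held_length()
        out, self.buffer = self.buffer[:cut], self.buffer[cut:]
        return out

    def flush(self):
        # Emit whatever is still held back (possibly an empty string).
        out, self.buffer = self.buffer, ""
        return out


tokens = ["He", "llo", " ", "t", "h", "ere", "!<", "N", "L>m", "y ", "nam", "e",
          " ", "is", " tokfl", "ow.", "<", "N", "L>N", "ice", " to ", "me", "et you."]

replacer = StreamReplacer([("<NL>", "\n")])
result = "".join(replacer.put(t) for t in tokens) + replacer.flush()
print(result)
```

In this sketch, `put("!<")` returns `"!"` and holds back `"<"`, `put("N")` returns an empty string, and `put("L>m")` returns `"\nm"` once the match is complete.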
