Python SDK for The Token Company — compress LLM prompts to reduce costs and latency

The Token Company Python SDK

Compress LLM prompts to reduce costs and latency. 100K tokens compressed in ~85ms.

Install

pip install the-token-company

Quick start

from thetokencompany import TheTokenCompany

client = TheTokenCompany(api_key="ttc-...")
result = client.compress("Your long prompt text here...", model="bear-2")

print(result.output)           # compressed text
print(result.tokens_saved)     # tokens removed
print(result.compression_ratio)  # e.g. 1.8

SDK wrappers

Drop-in wrappers that auto-compress all non-assistant messages before sending to your LLM. Assistant messages pass through unchanged so the provider's KV cache stays warm.
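The selection rule described above can be sketched without the SDK. This is a minimal illustration under stated assumptions, not the wrapper's actual implementation: `compress_non_assistant` and its `compress_fn` parameter are hypothetical names, and the stand-in compressor only collapses whitespace.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def compress_non_assistant(
    messages: List[Message], compress_fn: Callable[[str], str]
) -> List[Message]:
    """Return a copy of messages with every non-assistant content compressed.

    Hypothetical helper: assumes each message is a dict with string
    "role" and "content" keys, as in the chat examples in this README.
    """
    result = []
    for message in messages:
        if message["role"] == "assistant":
            # Assistant turns are copied unchanged.
            result.append(dict(message))
        else:
            result.append({**message, "content": compress_fn(message["content"])})
    return result


def collapse_whitespace(text: str) -> str:
    # Stand-in for a real compressor; it only collapses runs of whitespace.
    return " ".join(text.split())


history = [
    {"role": "system", "content": "You   are a helpful    assistant."},
    {"role": "assistant", "content": "Earlier   reply, kept   verbatim."},
    {"role": "user", "content": "Summarize   these results."},
]
compressed = compress_non_assistant(history, collapse_whitespace)
```

Leaving earlier assistant turns byte-identical is what lets a provider reuse cached prefixes across requests, which appears to be the motivation stated above.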

OpenAI / OpenRouter

from openai import OpenAI
from thetokencompany.openai import with_compression

client = with_compression(OpenAI(), compression_api_key="ttc-...")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant..."},
        {"role": "user", "content": "Summarize these results..."},
    ],
)

Works with AsyncOpenAI too — the wrapper detects async automatically.
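How the wrapper detects an async client is not described here. One common approach, shown as a sketch only, is to check whether the wrapped callable is a coroutine function and return a matching wrapper. `with_preprocessing`, `send_sync`, and `send_async` are hypothetical names and not part of the SDK.

```python
import asyncio
import inspect
from typing import Any, Callable


def with_preprocessing(send: Callable[..., Any], preprocess: Callable[[str], str]):
    """Wrap send so its text argument is preprocessed first.

    Hypothetical helper: returns an async wrapper when send is a coroutine
    function and a plain wrapper otherwise.
    """
    if inspect.iscoroutinefunction(send):
        async def async_wrapper(text: str) -> Any:
            return await send(preprocess(text))
        return async_wrapper

    def sync_wrapper(text: str) -> Any:
        return send(preprocess(text))
    return sync_wrapper


def send_sync(text: str) -> str:
    return f"sent:{text}"


async def send_async(text: str) -> str:
    return f"sent:{text}"


# str.strip stands in for a real compression step.
wrapped_sync = with_preprocessing(send_sync, str.strip)
wrapped_async = with_preprocessing(send_async, str.strip)

sync_result = wrapped_sync("  hello  ")
async_result = asyncio.run(wrapped_async("  hello  "))
```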

Anthropic

from anthropic import Anthropic
from thetokencompany.anthropic import with_compression

client = with_compression(Anthropic(), compression_api_key="ttc-...")

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a helpful assistant...",
    messages=[{"role": "user", "content": "Summarize these results..."}],
)

Both messages and the system parameter are compressed.

Async

import asyncio
from thetokencompany import AsyncTheTokenCompany

async def main():
    # "async with" must run inside a coroutine
    async with AsyncTheTokenCompany(api_key="ttc-...") as client:
        result = await client.compress("Your long prompt text...")

asyncio.run(main())

Models

| Model    | Description         |
|----------|---------------------|
| bear-2   | Latest, recommended |
| bear-1.2 | Previous generation |
| bear-1.1 | Legacy              |
| bear-1   | Legacy              |

Aggressiveness

Control compression intensity with aggressiveness (0.0 – 1.0, default 0.5):

result = client.compress(text, model="bear-2", aggressiveness=0.8)
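This README does not say whether the SDK validates the range on the client side. A small guard such as the following can catch an out-of-range value before a request is made. `validate_aggressiveness` is a hypothetical helper, not an SDK function.

```python
def validate_aggressiveness(value: float = 0.5) -> float:
    """Return value as a float if it lies within 0.0-1.0, else raise ValueError.

    The 0.0-1.0 range and the 0.5 default come from the README; the check
    itself is an illustration and not part of the SDK.
    """
    value = float(value)
    # The chained comparison also rejects NaN, which fails both bounds.
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"aggressiveness must be between 0.0 and 1.0, got {value}")
    return value


default_level = validate_aggressiveness()
high_level = validate_aggressiveness(0.8)
```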

Gzip

Enable gzip compression of request payloads for better performance on large inputs (up to 2.2x faster on 1M+ tokens):

client = TheTokenCompany(api_key="ttc-...", gzip=True)
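To see why gzip helps on large inputs, the sketch below compresses a JSON body with the standard library and compares sizes. The payload shape (`model` and `input` keys) is an assumption made for illustration; the SDK's actual wire format is not documented here.

```python
import gzip
import json

# Hypothetical request body; the SDK's real payload shape may differ.
payload = {"model": "bear-2", "input": "Summarize these results. " * 2000}

raw = json.dumps(payload).encode("utf-8")
compressed = gzip.compress(raw)

print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes")

# Decompressing restores the original body exactly.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))
```

Natural-language prompts tend to compress well, so fewer bytes cross the network. The gain is likely to matter most when upload time dominates the request.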

Protect text from compression

Use protect() to wrap content in <ttc_safe> tags — protected text passes through unchanged:

from thetokencompany import protect

prompt = f"{protect('system:')} You are a helpful assistant.\n{protect('user:')} Hello!"
result = client.compress(prompt, model="bear-2")
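The tagging step can be pictured with a small stand-in. This sketch assumes the tags are a plain `<ttc_safe>` and `</ttc_safe>` pair; the closing-tag form, `protect_sketch`, and `protected_spans` are all assumptions and not the SDK's code.

```python
import re

# Assumed tag syntax: an opening <ttc_safe> paired with a closing </ttc_safe>.
SAFE_SPAN = re.compile(r"<ttc_safe>(.*?)</ttc_safe>", re.DOTALL)


def protect_sketch(text: str) -> str:
    """Hypothetical stand-in for protect(): wrap text in <ttc_safe> tags."""
    return f"<ttc_safe>{text}</ttc_safe>"


def protected_spans(prompt: str) -> list:
    """Return the text of every protected span, in order of appearance."""
    return SAFE_SPAN.findall(prompt)


prompt = (
    f"{protect_sketch('system:')} You are a helpful assistant.\n"
    f"{protect_sketch('user:')} Hello!"
)
spans = protected_spans(prompt)
```

A helper like `protected_spans` could also be used after compression to confirm that each protected string still appears in the output.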

Response

CompressResponse fields:

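| Field             | Type  | Description                    |
|-------------------|-------|--------------------------------|
| output            | str   | Compressed text                |
| output_tokens     | int   | Token count after compression  |
| input_tokens      | int   | Token count before compression |
| tokens_saved      | int   | Tokens removed                 |
| compression_ratio | float | Ratio (e.g. 1.8x)              |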
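The fields above are presumably related by simple arithmetic. The sketch below assumes `tokens_saved = input_tokens - output_tokens` and `compression_ratio = input_tokens / output_tokens`; neither formula is stated in this README, and both `estimate_savings` and the price used are hypothetical.

```python
def estimate_savings(
    input_tokens: int, output_tokens: int, usd_per_million_tokens: float
) -> dict:
    """Derive savings figures from token counts before and after compression.

    Assumed formulas (not confirmed by the README):
      tokens_saved      = input_tokens - output_tokens
      compression_ratio = input_tokens / output_tokens
    """
    if output_tokens <= 0:
        raise ValueError("output_tokens must be positive")
    tokens_saved = input_tokens - output_tokens
    return {
        "tokens_saved": tokens_saved,
        "compression_ratio": input_tokens / output_tokens,
        # Cost avoided at the downstream model's input-token price.
        "usd_saved": tokens_saved * usd_per_million_tokens / 1_000_000,
    }


# Example numbers chosen for illustration only.
example = estimate_savings(
    input_tokens=1800, output_tokens=1000, usd_per_million_tokens=2.5
)
```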

License

MIT
