ptk — Python Token Killer
Minimize LLM tokens from Python objects in one call
Zero dependencies • Auto type detection • 322 tests

CI • Python 3.10+ • mypy strict • MIT License

What is ptk?

ptk is a Python library that minimizes tokens before they reach an LLM. Pass in any Python object — dict, list, code, logs, diffs, text — and get back a compressed string representation.

Inspired by RTK (Rust Token Killer), but designed as a library for programmatic use, not a CLI proxy.

import ptk

ptk.minimize({"users": [{"name": "Alice", "bio": None, "age": 30}]})
# → '{"users":[{"name":"Alice","age":30}]}'

ptk(my_dict)                   # callable shorthand
ptk(my_dict, aggressive=True)  # max compression

Installation

pip install python-token-killer
# or
uv add python-token-killer

Optional: pip install python-token-killer[tiktoken] or uv add python-token-killer[tiktoken] for exact token counting.

Benchmarks

Real token counts via tiktoken (cl100k_base, the GPT-4 tokenizer):

Benchmark                      Original  Default   Saved    Aggressive  Saved
API response (JSON)                1450      792   45.4%         782   46.1%
Python module (code)               2734     2113   22.7%         309   88.7%
Server log (58 lines)              1389     1388    0.1%         231   83.4%
50 user records (list)             2774      922   66.8%         922   66.8%
Verbose paragraph (text)            101       96    5.0%          74   26.7%
                                 ─────────────────────────────────────────────
TOTAL                              8448     5311   37.1%        2318   72.6%

Run the benchmarks yourself: python benchmarks/bench.py

What It Does

ptk auto-detects your input type and routes to the right minimizer:

Input        Strategy                                                                          Typical Savings
dict         Null stripping, key shortening, flattening, compact JSON                          30–60%
list         Dedup, schema-once tabular, sampling                                              40–70%
Code (str)   Comment stripping (pragma-preserving), docstring collapse, signature extraction   25–80%
Logs (str)   Line dedup with counts, error-only filtering, stack trace preservation            60–90%
Diffs (str)  Context folding, noise stripping                                                  50–75%
Text (str)   Word/phrase abbreviation, filler removal, stopword removal                        10–30%

API

ptk.minimize(obj, *, aggressive=False, content_type=None, **kw) → str

Main entry point. Auto-detects type, applies the right strategy, returns a minimized string.

# auto-detect
ptk.minimize({"key": "value"})

# force content type
ptk.minimize(some_string, content_type="code")
ptk.minimize(some_string, content_type="log")

# dict output formats
ptk.minimize(data, format="kv")       # key:value lines
ptk.minimize(data, format="tabular")  # header-once tabular

# code: signatures only (huge savings)
ptk.minimize(code, content_type="code", mode="signatures")

# logs: errors only
ptk.minimize(logs, content_type="log", errors_only=True)

ptk.stats(obj, **kw) → dict

Same compression, but returns statistics:

ptk.stats(big_api_response)
# {
#   "output": "...",
#   "original_len": 4200,
#   "minimized_len": 1800,
#   "savings_pct": 57.1,
#   "content_type": "dict",
#   "original_tokens": 1050,
#   "minimized_tokens": 450,
# }

ptk(obj) — callable module

import ptk
ptk(some_dict)  # equivalent to ptk.minimize(some_dict)

Features by Minimizer

DictMinimizer

  • Strips None, "", [], {} recursively (preserves 0 and False)
  • Key shortening: description → desc, timestamp → ts, configuration → cfg, etc.
  • Single-child flattening: {"a": {"b": val}} → {"a.b": val} (aggressive)
  • Output formats: compact JSON (default), key-value lines, header-once tabular
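
The null-stripping step can be sketched in stdlib-only Python. This is an illustrative reimplementation, not the library's source; `strip_nullish` here is a hypothetical stand-in for the internal helper:

```python
import json

def strip_nullish(obj):
    """Recursively drop None, "", [], {} values, preserving 0 and False."""
    if isinstance(obj, dict):
        cleaned = {k: strip_nullish(v) for k, v in obj.items()}
        return {k: v for k, v in cleaned.items()
                if not (v is None or v == "" or (isinstance(v, (list, dict)) and not v))}
    if isinstance(obj, list):
        return [strip_nullish(v) for v in obj]
    return obj

data = {"name": "Alice", "bio": None, "tags": [], "age": 0, "active": False}
print(json.dumps(strip_nullish(data), separators=(",", ":")))
# → {"name":"Alice","age":0,"active":false}
```

Note the explicit emptiness check instead of plain truthiness, which is what lets 0 and False survive while "", [], and {} are dropped.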

ListMinimizer

  • Uniform list-of-dicts → schema-once tabular: declare fields once, one row per item
  • Primitive dedup with counts: ["a", "a", "a", "b"] → a (x3)\nb
  • Large array sampling with first/last preservation (aggressive, threshold: 50)
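
Both techniques are easy to sketch with the stdlib. This is illustrative only; the exact output format (including the pipe separator in the tabular form) is an assumption, not the library's actual wire format:

```python
from itertools import groupby

def dedup_with_counts(items):
    """Collapse runs of equal primitives into 'value (xN)' lines."""
    lines = []
    for value, run in groupby(items):
        n = sum(1 for _ in run)
        lines.append(f"{value} (x{n})" if n > 1 else f"{value}")
    return "\n".join(lines)

def schema_once(records):
    """Uniform list-of-dicts: declare the field names once, then one row per item."""
    fields = list(records[0])
    header = "|".join(fields)
    rows = ("|".join(str(r[f]) for f in fields) for r in records)
    return "\n".join([header, *rows])

dedup_with_counts(["a", "a", "a", "b"])   # → 'a (x3)\nb'
schema_once([{"name": "Alice", "age": 30},
             {"name": "Bob", "age": 25}])  # → 'name|age\nAlice|30\nBob|25'
```

The schema-once form is where the big list savings come from: repeated keys are the dominant token cost in a uniform list of dicts.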

CodeMinimizer

  • Strips comments while preserving pragmas: # noqa, # type: ignore, # TODO, # FIXME, // eslint-disable
  • Collapses multi-line docstrings to first line only
  • Signature extraction mode: pulls def, class, fn, func across Python, JS, Rust, Go
  • Normalizes blank lines and trailing whitespace
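
A naive version of pragma-preserving comment stripping looks like this (illustrative sketch; it ignores '#' inside string literals, which the real minimizer would have to handle):

```python
PRAGMAS = ("# noqa", "# type: ignore", "# TODO", "# FIXME")

def strip_comments(code: str) -> str:
    """Drop '#' comments unless the line carries a pragma; drop comment-only lines."""
    out = []
    for line in code.splitlines():
        if "#" in line and not any(p in line for p in PRAGMAS):
            line = line.split("#", 1)[0].rstrip()
            if not line:
                continue  # the whole line was a comment
        out.append(line)
    return "\n".join(out)

src = "x = 1  # obvious\ny = 2  # noqa\n# a standalone comment\nz = 3"
strip_comments(src)  # → 'x = 1\ny = 2  # noqa\nz = 3'
```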

LogMinimizer

  • Consecutive duplicate line collapse with (xN) counts
  • Error-only filtering preserving: ERROR, WARN, FATAL, CRITICAL, stack traces, "failed" keyword
  • Timestamp stripping (aggressive)
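
The two core passes can be sketched as follows (illustrative; the marker list and output format are assumptions based on the bullets above):

```python
from itertools import groupby

ERROR_MARKERS = ("ERROR", "WARN", "FATAL", "CRITICAL", "failed")

def collapse_duplicates(log: str) -> str:
    """Collapse consecutive identical lines into 'line (xN)'."""
    out = []
    for line, run in groupby(log.splitlines()):
        n = sum(1 for _ in run)
        out.append(f"{line} (x{n})" if n > 1 else line)
    return "\n".join(out)

def errors_only(log: str) -> str:
    """Keep only lines that look like errors or warnings."""
    return "\n".join(l for l in log.splitlines()
                     if any(m in l for m in ERROR_MARKERS))

log = "GET /health 200\nGET /health 200\nGET /health 200\nERROR db timeout"
collapse_duplicates(log)  # → 'GET /health 200 (x3)\nERROR db timeout'
errors_only(log)          # → 'ERROR db timeout'
```

Repetitive health-check and polling noise is why logs show the widest savings range in the table above: a log that is mostly duplicates collapses almost entirely.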

DiffMinimizer

  • Folds unchanged context lines to ... N lines ...
  • Strips noise: index, old mode, new mode, similarity, Binary files (aggressive)
  • Preserves: +/- lines, @@ hunks, ---/+++ headers, \ No newline at end of file
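
Context folding on a unified diff can be sketched like this (illustrative; real hunk handling is more involved):

```python
def fold_context(diff: str) -> str:
    """Replace runs of unchanged context lines with '... N lines ...'."""
    out, run = [], 0

    def flush():
        nonlocal run
        if run:
            out.append(f"... {run} lines ...")
            run = 0

    for line in diff.splitlines():
        # keep +/- lines, @@ hunk headers, and '\ No newline...' markers
        if line.startswith(("+", "-", "@@", "\\")):
            flush()
            out.append(line)
        else:
            run += 1  # unchanged context line
    flush()
    return "\n".join(out)

diff = "@@ -1,6 +1,6 @@\n ctx1\n ctx2\n-old\n+new\n ctx3\n ctx4"
fold_context(diff)
# → '@@ -1,6 +1,6 @@\n... 2 lines ...\n-old\n+new\n... 2 lines ...'
```

Because "---"/"+++" file headers start with "-" and "+", the same prefix check preserves them for free.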

TextMinimizer

  • Word abbreviation: implementation → impl, configuration → config, production → prod, etc.
  • Phrase abbreviation: in order to → to, due to the fact that → because, etc.
  • Filler removal: strips "Furthermore,", "Moreover,", "In addition,", "Additionally,"
  • Stopword removal (aggressive): strips the, a, is, very, etc.
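
A minimal sketch of the word and phrase passes (illustrative; the actual abbreviation tables are much larger than the two-entry dicts shown here):

```python
import re

PHRASES = {"in order to": "to", "due to the fact that": "because"}
WORDS = {"implementation": "impl", "configuration": "config", "production": "prod"}

def minimize_text(text: str) -> str:
    # phrases first, so "in order to" does not get split by word-level passes
    for phrase, short in PHRASES.items():
        text = re.sub(re.escape(phrase), short, text, flags=re.IGNORECASE)
    for word, short in WORDS.items():
        text = re.sub(rf"\b{word}\b", short, text, flags=re.IGNORECASE)
    return text

minimize_text("In order to ship the implementation to production")
# → 'to ship the impl to prod'
```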

Use Cases

Agent Frameworks (LangGraph / LangChain)

import ptk

def compress_context(state):
    state["context"] = ptk.minimize(state["context"], aggressive=True)
    return state

Claude Code Skills

#!/usr/bin/env python3
import json
import sys

import ptk

with open(sys.argv[1]) as f:
    data = json.load(f)
print(ptk(data))

API Response Cleanup

response = requests.get("https://api.example.com/users").json()
clean = ptk.minimize(response)  # strip nulls, compact JSON

Comparison with Alternatives

Tool            Approach                                       Best For
ptk             Type-detecting Python library, one-liner API   Programmatic use in scripts, agents, frameworks
RTK             Rust CLI proxy for shell commands              Coding agents (Claude Code, OpenCode)
claw-compactor  14-stage pipeline, AST-aware                   Heavy-duty workspace compression
toons           TOON serialization format                      Tabular data in LLM prompts
LLMLingua       Neural prompt compression                      Natural language; requires GPU

Design Principles

  • Zero deps — stdlib only. tiktoken is optional for exact counts.
  • Builtins-first — frozenset for O(1) lookups, precompiled regexes, slots=True frozen dataclasses.
  • DRY — shared strip_nullish(), dedup_lines() reused across minimizers.
  • Type-routed — O(1) detection for dicts/lists, first-2KB heuristic for strings.
  • Safe by default — aggressive mode is opt-in. Default never destroys meaning.
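
The type routing could look roughly like this (a hypothetical sketch of the heuristic, not the library's actual rules):

```python
import re

def detect(obj, probe: int = 2048) -> str:
    """Route by Python type first; fall back to a cheap heuristic on the first ~2 KB."""
    if isinstance(obj, dict):
        return "dict"
    if isinstance(obj, list):
        return "list"
    head = str(obj)[:probe]
    if head.lstrip().startswith(("diff --git", "--- ", "+++ ")):
        return "diff"
    if re.search(r"\b(ERROR|WARN|FATAL|CRITICAL)\b", head):
        return "log"
    if re.search(r"^\s*(def |class |import |from )", head, re.M):
        return "code"
    return "text"

detect({"a": 1})                     # → 'dict'
detect("def f():\n    return 1")     # → 'code'
detect("12:00:01 ERROR db timeout")  # → 'log'
```

Capping the probe at the first few KB keeps detection O(1) in the input size, which matters when the input is a megabyte-scale log dump.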

Development

git clone https://github.com/amahi2001/python-token-killer.git
cd python-token-killer
uv sync          # installs all dev dependencies, creates .venv automatically
make check       # lint + typecheck + 361 tests

License

MIT
