Truncate LLM tool output with head/tail/middle/middle_lines strategies. UTF-8 safe, zero runtime deps.
tool-output-truncate-py
Truncate tool output before adding it to LLM message history.
When an agent runs cat file.log, ripgrep, or a database query, the
result can be megabytes. Naively appending it to the conversation blows
the context window. The standard fix is to keep the head and the tail
and replace the middle with an elision marker. This is that, char-aware
(UTF-8 safe) and line-aware, with zero runtime dependencies.
Sibling to the Rust crate tool-output-truncate.
Install
pip install tool-output-truncate-py
Use
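from tool_output_truncate import truncate_middle
# Read the log with an explicit encoding and close the file when done.
with open("server.log", encoding="utf-8") as f:
    big = f.read()
safe_to_send = truncate_middle(big, 4000)
# "first 2000 chars\n\n[123456 chars truncated]\n\n...last 2000 chars"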
Four strategies:
from tool_output_truncate import (
    truncate_head,
    truncate_tail,
    truncate_middle,
    truncate_middle_lines,
)
truncate_head(text, max_chars)          # keep prefix
truncate_tail(text, max_chars)          # keep suffix
truncate_middle(text, max_chars)        # keep both ends (default for logs)
truncate_middle_lines(text, max_lines)  # line-aware version of middle
All four are no-ops when the input already fits.
Examples
Keep the prefix (use when the tail is noise, e.g. ranked or sorted lists where the first entries matter most):
truncate_head("abcdefghij", 4)
# 'abcd\n\n[6 chars truncated]\n\n'
Keep the suffix (use when the head is preamble, e.g. command output with banner lines):
truncate_tail("abcdefghij", 4)
# '\n\n[6 chars truncated]\n\nghij'
Keep both ends (best default for arbitrary text where head and tail both carry signal):
truncate_middle("0123456789", 4)
# '01\n\n[6 chars truncated]\n\n89'
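To make the arithmetic behind the three character-based strategies concrete, here is a minimal sketch that reproduces the example outputs shown above. The `sketch_*` names are hypothetical and this is not the library's actual source. The marker format is copied from the documented outputs, and giving the head the extra character on odd budgets is an assumption. The sketch also assumes `max_chars` is non-negative.

```python
def _marker(dropped: int) -> str:
    # Marker format copied from the documented example outputs.
    return f"\n\n[{dropped} chars truncated]\n\n"


def sketch_head(text: str, max_chars: int) -> str:
    if len(text) <= max_chars:
        return text  # no-op when the input already fits
    return text[:max_chars] + _marker(len(text) - max_chars)


def sketch_tail(text: str, max_chars: int) -> str:
    if len(text) <= max_chars:
        return text
    # Slice from an explicit start index so max_chars == 0 keeps nothing.
    return _marker(len(text) - max_chars) + text[len(text) - max_chars:]


def sketch_middle(text: str, max_chars: int) -> str:
    if len(text) <= max_chars:
        return text
    head_len = (max_chars + 1) // 2  # assumption: head gets the odd char
    tail_len = max_chars - head_len
    dropped = len(text) - max_chars
    return text[:head_len] + _marker(dropped) + text[len(text) - tail_len:]


print(repr(sketch_middle("0123456789", 4)))
# '01\n\n[6 chars truncated]\n\n89'
```

Note that in the documented outputs the marker is added on top of the kept characters, so the result is slightly longer than `max_chars`.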
Line-aware middle (splits at line boundaries so you do not see half a line of JSON or a partial stack-trace frame):
text = "\n".join(f"line {i}" for i in range(100))
truncate_middle_lines(text, 4)
# 'line 0\nline 1\n[96 lines / ... chars truncated]\nline 98\nline 99'
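The line-aware variant can be sketched the same way, splitting on newlines instead of counting characters. The name `sketch_middle_lines` is hypothetical, the marker here is simplified to report only the dropped line count (the library's marker also reports a character count), and the odd-budget split is an assumption.

```python
def sketch_middle_lines(text: str, max_lines: int) -> str:
    lines = text.split("\n")
    if len(lines) <= max_lines:
        return text  # no-op when the input already fits
    head_n = (max_lines + 1) // 2  # assumption: head gets the odd line
    tail_n = max_lines - head_n
    dropped = len(lines) - max_lines
    # Simplified marker: line count only, unlike the library's marker.
    marker = f"[{dropped} lines truncated]"
    kept_tail = lines[len(lines) - tail_n:]
    return "\n".join(lines[:head_n] + [marker] + kept_tail)


text = "\n".join(f"line {i}" for i in range(100))
print(repr(sketch_middle_lines(text, 4)))
# 'line 0\nline 1\n[96 lines truncated]\nline 98\nline 99'
```

Because whole lines are kept or dropped, a long line is never cut in half, but a single very long line can still exceed a character budget on its own.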
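UTF-8 is handled correctly. A Python str is a sequence of code points,
so slicing never splits a multi-byte character. Single-code-point emoji,
precomposed accented Latin letters, and CJK characters each count as one
character. Sequences built from several code points (combining accents,
ZWJ emoji) count as several characters and can still be split between
code points: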
truncate_head("crab" + "\U0001f980" * 8, 6)
# keeps 6 codepoints from the start, marker reports remaining count
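The difference between slicing code points and slicing bytes can be checked with plain Python, no library needed. This sketch uses only built-in `str` and `bytes` behavior, and the variable names are illustrative.

```python
text = "crab" + "\U0001f980" * 8  # 4 ASCII letters + 8 crab emoji = 12 code points

# Slicing a str works on code points, so the result is always valid text.
prefix = text[:6]
assert prefix == "crab" + "\U0001f980" * 2
prefix.encode("utf-8")  # encodes cleanly

# Slicing the encoded bytes can cut a 4-byte emoji in half.
raw = text.encode("utf-8")  # 4 + 8 * 4 = 36 bytes
try:
    raw[:6].decode("utf-8")
    byte_slice_ok = True
except UnicodeDecodeError:
    byte_slice_ok = False
print(byte_slice_ok)  # False

# Caveat: one visible character can be several code points.
precomposed = "\u00e9"  # e with acute accent as a single code point
decomposed = "e\u0301"  # 'e' followed by a combining acute accent
print(len(precomposed), len(decomposed))  # 1 2
```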
What it does NOT do
- No tokenization. Pass a char cap. As a rough proxy for Anthropic and OpenAI models, treat tokens ≈ chars / 4 (so 4000 chars is about 1k tokens).
- No structured truncation (JSON, YAML, XML). For JSON specifically, parse first and decide which fields to keep.
- No summarization. This is character arithmetic only.
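Two of the gaps above are easy to cover in caller code. The sketch below converts a token budget into a char cap using the rough four-characters-per-token proxy, and trims a JSON object to chosen fields before any character truncation. Both helpers (`char_cap_for_tokens`, `keep_fields`) are hypothetical names for this example, and the ratio is only a heuristic that varies by tokenizer and language.

```python
import json


def char_cap_for_tokens(token_budget: int, chars_per_token: float = 4.0) -> int:
    """Convert a token budget into an approximate character cap."""
    return int(token_budget * chars_per_token)


def keep_fields(json_text: str, fields: list[str]) -> str:
    """Parse a JSON object and keep only the named top-level fields."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object at the top level")
    return json.dumps({key: data[key] for key in fields if key in data})


print(char_cap_for_tokens(1000))  # 4000
print(keep_fields('{"id": 7, "status": "ok", "blob": "xxxx"}', ["id", "status"]))
# {"id": 7, "status": "ok"}
```

The resulting char cap can then be passed as `max_chars` to any of the four strategies.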
Why a focused library
Most agent frameworks ship a custom truncator inline. They reinvent the edge cases each time: UTF-8 boundaries, odd-sized budgets, line-aware splitting that does not show half a line. This is the four-function library you grab instead.
License
MIT
File details
Details for the file tool_output_truncate_py-0.1.0.tar.gz.
File metadata
- Download URL: tool_output_truncate_py-0.1.0.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a4fe8269f32a805d15abd6d439a0060b9ba1acfb405b04d34f74a23f3d1563dd |
| MD5 | 6b9aa5df1a1ada1b63e01c8e408d8056 |
| BLAKE2b-256 | d8f56c8b00ac36f146bc52ccfc5cad9b77f8e0c48d6a8d48aabca6b1e714c0d0 |
File details
Details for the file tool_output_truncate_py-0.1.0-py3-none-any.whl.
File metadata
- Download URL: tool_output_truncate_py-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 06201630c2f3f4c39bf1283aa1dbb04f8099e6ad4b3db4be781e6977218bd660 |
| MD5 | b7b2e89ab1cf17ba62807347ffa3fe3b |
| BLAKE2b-256 | 5ffbfa0f59a85f73128fe4f225766ba820a3ba0735d6ac76c8ef1c3a6cfe8444 |