agentfit-py
Fit your messages into the LLM context window. Token-aware truncation with multiple strategies, pluggable tokenizers. Zero runtime dependencies.
Python port of @mukundakatta/agentfit. The JS sibling has the full design notes; this README sticks to the Python API.
Install
pip install agentfit-py
Usage
from agentfit import count, fit, OverBudgetError

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello there!"},
    {"role": "assistant", "content": "Hi! How can I help?"},
    {"role": "user", "content": "Tell me a long story..."},
]

# Estimate tokens (heuristic; pass `tokenizer=...` to plug in tiktoken etc.)
count(messages, model="claude-sonnet-4-6")  # -> int

# Drop messages until under budget. System message + recent N are preserved.
result = fit(
    messages,
    max_tokens=8000,
    model="claude-sonnet-4-6",
    strategy="drop-oldest",  # or "drop-middle" / "priority"
    preserve_system=True,
    preserve_last_n=2,
)

result.messages  # list[dict] -- survived
result.dropped   # list[dict] -- removed
result.tokens    # {"before": int, "after": int, "budget": int}
result.fit       # True iff under budget
If the budget can't be reached even after dropping every non-protected message, fit() raises OverBudgetError, which carries the partial result. Pass on_over_budget="return-partial" to get that over-budget result back instead, with fit=False.
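To make the over-budget behavior concrete, here is a minimal, self-contained sketch of a drop-oldest loop with both modes. It does not import agentfit and is not the library's implementation. The names sketch_fit, SketchResult, SketchOverBudgetError, the partial attribute, the "raise" default, and the 4-characters-per-token estimate are all assumptions made for illustration. Only the result fields and the "return-partial" value come from this README.

```python
from dataclasses import dataclass


class SketchOverBudgetError(Exception):
    """Hypothetical stand-in for OverBudgetError."""

    def __init__(self, partial):
        super().__init__("budget unreachable without dropping protected messages")
        self.partial = partial  # attribute name is an assumption


@dataclass
class SketchResult:
    messages: list
    dropped: list
    tokens: dict
    fit: bool


def estimate(text):
    # Assumed heuristic: about 4 characters per token.
    return len(text) // 4


def sketch_fit(messages, max_tokens, preserve_last_n=2, on_over_budget="raise"):
    def total(msgs):
        return sum(estimate(m["content"]) for m in msgs)

    def first_droppable(msgs):
        # System messages and the last N messages are protected.
        tail_start = len(msgs) - preserve_last_n
        for i, m in enumerate(msgs):
            if m["role"] != "system" and i < tail_start:
                return i
        return None

    kept, dropped = list(messages), []
    before = total(kept)
    while total(kept) > max_tokens:
        i = first_droppable(kept)
        if i is None:
            break  # only protected messages remain
        dropped.append(kept.pop(i))

    after = total(kept)
    result = SketchResult(
        messages=kept,
        dropped=dropped,
        tokens={"before": before, "after": after, "budget": max_tokens},
        fit=after <= max_tokens,
    )
    if not result.fit and on_over_budget != "return-partial":
        raise SketchOverBudgetError(result)
    return result
```

In this sketch the partial result rides on the exception, so a caller can still inspect what was dropped before deciding whether to summarize, raise the budget, or give up.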
Strategies
| Strategy | Behavior |
|---|---|
| drop-oldest (default) | Drop earliest non-protected message first. |
| drop-middle | Drop messages closest to the center; preserves start + recent tail. |
| priority | Drop messages with the lowest priority field first (default 0). |
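The three strategies differ only in the order in which droppable messages are considered. The sketch below illustrates one plausible reading of the table. The function drop_order is hypothetical, and so are its details: the center is measured over the whole message list, and ties are broken by taking the earlier message first. The library may make different choices.

```python
def drop_order(messages, strategy="drop-oldest", preserve_system=True, preserve_last_n=2):
    """Return indices of droppable messages in the order this sketch would drop them."""
    n = len(messages)
    # Protected messages (system, last N) are never candidates.
    candidates = [
        i
        for i, m in enumerate(messages)
        if not (preserve_system and m.get("role") == "system")
        and i < n - preserve_last_n
    ]
    if strategy == "drop-oldest":
        return candidates  # already in oldest-first order
    if strategy == "drop-middle":
        center = (n - 1) / 2  # assumption: center of the full list
        return sorted(candidates, key=lambda i: (abs(i - center), i))
    if strategy == "priority":
        # A missing priority field counts as 0, as the table states.
        return sorted(candidates, key=lambda i: (messages[i].get("priority", 0), i))
    raise ValueError(f"unknown strategy: {strategy!r}")
```

A fitting loop would walk this list from the front, removing one message at a time until the token count is within budget.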
Custom tokenizer
Pass any Callable[[str], int]. Example with tiktoken:
import tiktoken
enc = tiktoken.get_encoding("cl100k_base")
fit(messages, max_tokens=8000, tokenizer=lambda s: len(enc.encode(s)))
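Since the package has no runtime dependencies, tiktoken may not be installed everywhere the code runs. One option is a small factory that returns a tiktoken-backed counter when it is available and a rough estimate otherwise. This is a sketch: make_tokenizer and the 4-characters-per-token fallback are assumptions for illustration, not part of agentfit.

```python
import math
from typing import Callable


def make_tokenizer(encoding_name: str = "cl100k_base") -> Callable[[str], int]:
    """Return a Callable[[str], int], preferring tiktoken when it can be loaded."""
    try:
        import tiktoken

        # get_encoding may need to fetch encoding data on first use,
        # so any failure here falls through to the heuristic.
        enc = tiktoken.get_encoding(encoding_name)
    except Exception:
        # Assumed fallback: about 4 characters per token.
        return lambda s: math.ceil(len(s) / 4)
    return lambda s: len(enc.encode(s))
```

The returned callable can then be passed as tokenizer=make_tokenizer() to count() or fit().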
API differences from the JS sibling
- count() and fit() use Python keyword args (max_tokens=, preserve_system=, etc.) instead of the JS options object.
- fit() returns a FitResult dataclass instead of a plain object.
- No wrapFetch / monkey-patching equivalents -- not needed in Python.
See the JS sibling's README for the full design notes and broader algorithmic reasoning.