Lossless context compression for LLM agents. Fold it down, fold it back. No data dropped, prompt cache preserved.
Project description
FoldBack
Context compression for LLM agents. Fold it down, fold it back.
FoldBack shrinks what your agent sends to the model — JSON tool outputs, logs, search results — using content-preserving transforms. No row is dropped, no value is discarded, and the provider prompt cache keeps hitting.
from foldback import compress
result = compress(messages, model="gpt-4o")
result.messages # same format, fewer tokens — send these to the model
result.tokens_saved # tokens saved
result.ratio # 0.45 == 55% saved; 1.0 == nothing changed
result.transforms # e.g. ["json:columnar"]
Why it exists
Most "context compression" tools made two expensive mistakes:
- They compressed conversation history, dropping old messages — which busts the provider prompt cache on every call. On Anthropic that's a 90% discount thrown away.
- They dropped data and hoped the model would ask for it back via a retrieval tool. If it doesn't realize data is missing, you get a confidently wrong answer with no error.
FoldBack refuses both:
- Passthrough is sacred. Everything before the last
cache_controlbreakpoint is forwarded as the same objects — never copied, never re-serialized — so caches keep hitting. - Only the live zone is touched (the latest user message / tool result).
- Token-gated. A transform is applied only when it actually reduces tokens. Otherwise the original is returned unchanged.
Guarantees, stated honestly
Two transform categories with different promises:
| Content | Transform | Guarantee |
|---|---|---|
| JSON array of uniform objects | columnar (keys written once, rows as JSON arrays) | Reversible — exact round-trip, proven by property tests. restore_columnar() reconstructs the original. |
| Logs | strip ANSI, run-length-collapse consecutive identical lines to (xN) |
Normalizing — no textual content lost; removes non-semantic bytes. Not byte-reversible. |
| Plain text | trim trailing whitespace, collapse blank-line runs | Normalizing — words and punctuation untouched. |
The columnar transform only fires on uniform-schema arrays, so each row
maps back to its keys unambiguously and "1" (string) never collides with 1
(number). Mixed-schema arrays are left untouched rather than compacted lossily.
from foldback import compress, restore_columnar
# round-trip proof
compressed = compress(messages).messages
# any columnar block is exactly restorable:
# json.loads(restore_columnar(block)) == original_rows
Measured savings
Reproduce with python benchmarks/run.py --model gpt-4o (exact gpt-4o tokens):
| Workload | Before | After | Saved |
|---|---|---|---|
| API response (100 rows) | 2,803 | 1,421 | 49% |
| Build log (200 lines) | 2,729 | 499 | 82% |
| Code search (50 hits) | 1,892 | 1,159 | 39% |
No marketing numbers — these come straight from the benchmark script.
Install
The PyPI package is foldback-ai; the import name is foldback.
pip install foldback-ai # zero dependencies
pip install "foldback-ai[exact]" # + tiktoken for exact token counts
pip install "foldback-ai[anthropic]" # + Anthropic SDK for the wrapper
pip install "foldback-ai[openai]" # + OpenAI SDK for the wrapper
from foldback import compress # import name stays `foldback`
Use it
Inline:
from foldback import compress, CompressConfig
result = compress(messages, model="claude-sonnet-4-5")
# or with options:
cfg = CompressConfig(model="gpt-4o", min_savings=0.2) # require >=20% win
result = compress(messages, config=cfg)
Drop-in SDK wrappers (system prompt / tool defs stay frozen → cache-safe):
from anthropic import Anthropic
from foldback.integrations import with_anthropic
client = with_anthropic(Anthropic())
client.messages.create(model="claude-sonnet-4-5", messages=[...]) # auto-compressed
from openai import OpenAI
from foldback.integrations import with_openai
client = with_openai(OpenAI())
client.chat.completions.create(model="gpt-4o", messages=[...]) # auto-compressed
Develop
pip install -e ".[dev,exact]"
pytest # tests + coverage
ruff check foldback tests # lint
mypy foldback # strict type-check
python benchmarks/run.py # savings table
python examples/demo.py
Deliberately NOT built
A network proxy, SSE streaming parser, Bedrock/Vertex signing, message scoring / relevance, a HuggingFace compression model, lossy row-dropping with retrieval. FoldBack is a library you call before your own SDK call — so it can never corrupt the wire.
Roadmap
- Diff / patch compaction
- CSV / Markdown-table input detection
- Rust core for the columnar path (only if profiling demands it)
Author
Built and maintained by Sudarshan Chaudhari (@SUDARSHANCHAUDHARI, sunny.sudarshan@gmail.com) — SudarshanTechLabs, Bangkok.
If FoldBack saves you tokens, a ⭐ on the repo is appreciated.
Developed with Claude Code (Opus 4.8).
Acknowledgments
FoldBack's design was informed by studying chopratejas/headroom — a more ambitious context-compression project. FoldBack deliberately takes a narrower, library-only path (no proxy, lossless-first, prompt-cache-preserving) to avoid the cache-busting and lossy-retrieval pitfalls that project documented in its own realignment notes. Credit to its authors for mapping the problem space.
License
Apache 2.0 © Sudarshan Chaudhari / SudarshanTechLabs. See LICENSE and NOTICE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file foldback_ai-0.1.1.tar.gz.
File metadata
- Download URL: foldback_ai-0.1.1.tar.gz
- Upload date:
- Size: 47.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4486e9b4bf12047d8ddf9b3ac5d7092c9322b307da7e89819ec47c95fce37045
|
|
| MD5 |
0568678773e0727f4bb8989e5dd48ac1
|
|
| BLAKE2b-256 |
657816d2c5cd2107c23f6e9128931bb3d5377c040c3d5d1abc592a502e5d70d9
|
File details
Details for the file foldback_ai-0.1.1-py3-none-any.whl.
File metadata
- Download URL: foldback_ai-0.1.1-py3-none-any.whl
- Upload date:
- Size: 21.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
925df438b7cfc5f0c53ecde78d5ec95efcf5570b8dba83819a931be3de505200
|
|
| MD5 |
d9d5b420cd5271db489275de41c5eead
|
|
| BLAKE2b-256 |
747f5c62af92039bf4a09c1a5f3c34aae8dfb623b688da0995b2725d1dd3679e
|