Skip to main content

Lossless context compression for LLM agents. Fold it down, fold it back. No data dropped, prompt cache preserved.

Project description

FoldBack

Context compression for LLM agents. Fold it down, fold it back.

FoldBack shrinks what your agent sends to the model — JSON tool outputs, logs, search results — using content-preserving transforms. No row is dropped, no value is discarded, and the provider prompt cache keeps hitting.

from foldback import compress

result = compress(messages, model="gpt-4o")
result.messages       # same format, fewer tokens — send these to the model
result.tokens_saved   # tokens saved
result.ratio          # 0.45 == 55% saved; 1.0 == nothing changed
result.transforms     # e.g. ["json:columnar"]

Why it exists

Most "context compression" tools made two expensive mistakes:

  1. They compressed conversation history, dropping old messages — which busts the provider prompt cache on every call. On Anthropic that's a 90% discount thrown away.
  2. They dropped data and hoped the model would ask for it back via a retrieval tool. If it doesn't realize data is missing, you get a confidently wrong answer with no error.

FoldBack refuses both:

  • Passthrough is sacred. Everything before the last cache_control breakpoint is forwarded as the same objects — never copied, never re-serialized — so caches keep hitting.
  • Only the live zone is touched (the latest user message / tool result).
  • Token-gated. A transform is applied only when it actually reduces tokens. Otherwise the original is returned unchanged.

Guarantees, stated honestly

Two transform categories with different promises:

Content Transform Guarantee
JSON array of uniform objects columnar (keys written once, rows as JSON arrays) Reversible — exact round-trip, proven by property tests. restore_columnar() reconstructs the original.
Logs strip ANSI, run-length-collapse consecutive identical lines to (xN) Normalizing — no textual content lost; removes non-semantic bytes. Not byte-reversible.
Plain text trim trailing whitespace, collapse blank-line runs Normalizing — words and punctuation untouched.

The columnar transform only fires on uniform-schema arrays, so each row maps back to its keys unambiguously and "1" (string) never collides with 1 (number). Mixed-schema arrays are left untouched rather than compacted lossily.

from foldback import compress, restore_columnar
# round-trip proof
compressed = compress(messages).messages
# any columnar block is exactly restorable:
#   json.loads(restore_columnar(block)) == original_rows

Measured savings

Reproduce with python benchmarks/run.py --model gpt-4o (exact gpt-4o tokens):

Workload Before After Saved
API response (100 rows) 2,803 1,421 49%
Build log (200 lines) 2,729 499 82%
Code search (50 hits) 1,892 1,159 39%

No marketing numbers — these come straight from the benchmark script.

Install

The PyPI package is foldback-ai; the import name is foldback.

pip install foldback-ai                 # zero dependencies
pip install "foldback-ai[exact]"        # + tiktoken for exact token counts
pip install "foldback-ai[anthropic]"    # + Anthropic SDK for the wrapper
pip install "foldback-ai[openai]"       # + OpenAI SDK for the wrapper
from foldback import compress           # import name stays `foldback`

Use it

Inline:

from foldback import compress, CompressConfig

result = compress(messages, model="claude-sonnet-4-5")
# or with options:
cfg = CompressConfig(model="gpt-4o", min_savings=0.2)  # require >=20% win
result = compress(messages, config=cfg)

Drop-in SDK wrappers (system prompt / tool defs stay frozen → cache-safe):

from anthropic import Anthropic
from foldback.integrations import with_anthropic

client = with_anthropic(Anthropic())
client.messages.create(model="claude-sonnet-4-5", messages=[...])  # auto-compressed

from openai import OpenAI
from foldback.integrations import with_openai

client = with_openai(OpenAI())
client.chat.completions.create(model="gpt-4o", messages=[...])     # auto-compressed

Develop

pip install -e ".[dev,exact]"
pytest                       # tests + coverage
ruff check foldback tests    # lint
mypy foldback                # strict type-check
python benchmarks/run.py     # savings table
python examples/demo.py

Deliberately NOT built

A network proxy, SSE streaming parser, Bedrock/Vertex signing, message scoring / relevance, a HuggingFace compression model, lossy row-dropping with retrieval. FoldBack is a library you call before your own SDK call — so it can never corrupt the wire.

Roadmap

  • Diff / patch compaction
  • CSV / Markdown-table input detection
  • Rust core for the columnar path (only if profiling demands it)

Author

Built and maintained by Sudarshan Chaudhari (@SUDARSHANCHAUDHARI, sunny.sudarshan@gmail.com) — SudarshanTechLabs, Bangkok.

If FoldBack saves you tokens, a ⭐ on the repo is appreciated.

Developed with Claude Code (Opus 4.8).

Acknowledgments

FoldBack's design was informed by studying chopratejas/headroom — a more ambitious context-compression project. FoldBack deliberately takes a narrower, library-only path (no proxy, lossless-first, prompt-cache-preserving) to avoid the cache-busting and lossy-retrieval pitfalls that project documented in its own realignment notes. Credit to its authors for mapping the problem space.

License

Apache 2.0 © Sudarshan Chaudhari / SudarshanTechLabs. See LICENSE and NOTICE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

foldback_ai-0.1.1.tar.gz (47.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

foldback_ai-0.1.1-py3-none-any.whl (21.4 kB view details)

Uploaded Python 3

File details

Details for the file foldback_ai-0.1.1.tar.gz.

File metadata

  • Download URL: foldback_ai-0.1.1.tar.gz
  • Upload date:
  • Size: 47.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for foldback_ai-0.1.1.tar.gz
Algorithm Hash digest
SHA256 4486e9b4bf12047d8ddf9b3ac5d7092c9322b307da7e89819ec47c95fce37045
MD5 0568678773e0727f4bb8989e5dd48ac1
BLAKE2b-256 657816d2c5cd2107c23f6e9128931bb3d5377c040c3d5d1abc592a502e5d70d9

See more details on using hashes here.

File details

Details for the file foldback_ai-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: foldback_ai-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 21.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for foldback_ai-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 925df438b7cfc5f0c53ecde78d5ec95efcf5570b8dba83819a931be3de505200
MD5 d9d5b420cd5271db489275de41c5eead
BLAKE2b-256 747f5c62af92039bf4a09c1a5f3c34aae8dfb623b688da0995b2725d1dd3679e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page