Skip to main content

Lossless context compression for LLM agents. Fold it down, fold it back. No data dropped, prompt cache preserved.

Project description

FoldBack

Context compression for LLM agents. Fold it down, fold it back.

FoldBack shrinks what your agent sends to the model — JSON tool outputs, logs, search results — using content-preserving transforms. No row is dropped, no value is discarded, and the provider prompt cache keeps hitting.

from foldback import compress

result = compress(messages, model="gpt-4o")
result.messages       # same format, fewer tokens — send these to the model
result.tokens_saved   # tokens saved
result.ratio          # 0.45 == 55% saved; 1.0 == nothing changed
result.transforms     # e.g. ["json:columnar"]

Why it exists

Most "context compression" tools made two expensive mistakes:

  1. They compressed conversation history, dropping old messages — which busts the provider prompt cache on every call. On Anthropic that's a 90% discount thrown away.
  2. They dropped data and hoped the model would ask for it back via a retrieval tool. If it doesn't realize data is missing, you get a confidently wrong answer with no error.

FoldBack refuses both:

  • Passthrough is sacred. Everything before the last cache_control breakpoint is forwarded as the same objects — never copied, never re-serialized — so caches keep hitting.
  • Only the live zone is touched (the latest user message / tool result).
  • Token-gated. A transform is applied only when it actually reduces tokens. Otherwise the original is returned unchanged.

Guarantees, stated honestly

Two transform categories with different promises:

Content Transform Guarantee
JSON array of uniform objects columnar (keys written once, rows as JSON arrays) Reversible — exact round-trip, proven by property tests. restore_columnar() reconstructs the original.
Logs strip ANSI, run-length-collapse consecutive identical lines to (xN) Normalizing — no textual content lost; removes non-semantic bytes. Not byte-reversible.
Plain text trim trailing whitespace, collapse blank-line runs Normalizing — words and punctuation untouched.

The columnar transform only fires on uniform-schema arrays, so each row maps back to its keys unambiguously and "1" (string) never collides with 1 (number). Mixed-schema arrays are left untouched rather than compacted lossily.

from foldback import compress, restore_columnar
# round-trip proof
compressed = compress(messages).messages
# any columnar block is exactly restorable:
#   json.loads(restore_columnar(block)) == original_rows

Measured savings

Reproduce with python benchmarks/run.py --model gpt-4o (exact gpt-4o tokens):

Workload Before After Saved
API response (100 rows) 2,803 1,421 49%
Build log (200 lines) 2,729 499 82%
Code search (50 hits) 1,892 1,159 39%

No marketing numbers — these come straight from the benchmark script.

Install

pip install foldback                 # zero dependencies
pip install "foldback[exact]"        # + tiktoken for exact token counts
pip install "foldback[anthropic]"    # + Anthropic SDK for the wrapper
pip install "foldback[openai]"       # + OpenAI SDK for the wrapper

Use it

Inline:

from foldback import compress, CompressConfig

result = compress(messages, model="claude-sonnet-4-5")
# or with options:
cfg = CompressConfig(model="gpt-4o", min_savings=0.2)  # require >=20% win
result = compress(messages, config=cfg)

Drop-in SDK wrappers (system prompt / tool defs stay frozen → cache-safe):

from anthropic import Anthropic
from foldback.integrations import with_anthropic

client = with_anthropic(Anthropic())
client.messages.create(model="claude-sonnet-4-5", messages=[...])  # auto-compressed

from openai import OpenAI
from foldback.integrations import with_openai

client = with_openai(OpenAI())
client.chat.completions.create(model="gpt-4o", messages=[...])     # auto-compressed

Develop

pip install -e ".[dev,exact]"
pytest                       # tests + coverage
ruff check foldback tests    # lint
mypy foldback                # strict type-check
python benchmarks/run.py     # savings table
python examples/demo.py

Deliberately NOT built

A network proxy, SSE streaming parser, Bedrock/Vertex signing, message scoring / relevance, a HuggingFace compression model, lossy row-dropping with retrieval. FoldBack is a library you call before your own SDK call — so it can never corrupt the wire.

Roadmap

  • Diff / patch compaction
  • CSV / Markdown-table input detection
  • Rust core for the columnar path (only if profiling demands it)

Apache 2.0.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

foldback_ai-0.1.0.tar.gz (46.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

foldback_ai-0.1.0-py3-none-any.whl (20.9 kB view details)

Uploaded Python 3

File details

Details for the file foldback_ai-0.1.0.tar.gz.

File metadata

  • Download URL: foldback_ai-0.1.0.tar.gz
  • Upload date:
  • Size: 46.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for foldback_ai-0.1.0.tar.gz
Algorithm Hash digest
SHA256 55ac843378f6866910e76390477d2e2330d43baacae3033068519093336e4607
MD5 5c3a5b11e6587970ab928ba8023db87c
BLAKE2b-256 50ffafb01badf137129ed4c332fba2cb0b67d8a8c4e89d3740a6b1872f986deb

See more details on using hashes here.

File details

Details for the file foldback_ai-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: foldback_ai-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 20.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for foldback_ai-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 07004c13b0c62d93f2f6e9c7cf05e945fc4859b3430ab7d4228662372f693b51
MD5 a369b0d193c74a2e9375b9b6e7af771a
BLAKE2b-256 b6b87a62bdae08441422d1472dcf510c9680f19ea6e730b2286b7e8d21028b24

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page