Token-aware chat-history compaction — summarize old turns, keep system + recent. Zero dependencies.
Project description
chatcram
Keep a long chat history within a token budget — by summarizing the old middle and keeping the system prompt + recent turns verbatim. Tiny, zero-dependency, framework-agnostic. Bring your own summarizer.
As a conversation grows, you eventually blow past the context window. Dropping
old turns loses information; keeping everything is impossible. chatcram
collapses the older middle into a single summary while preserving what matters
most — the system prompt and the most recent turns.
from chatcram import Compactor
# `summarize` is any callable you provide — usually an LLM call
compactor = Compactor(budget=4000, summarize=my_llm_summarizer, keep_recent=1500)
result = compactor.compact(messages) # list of {"role", "content"} dicts
for m in result.messages:
print(m["role"], "->", m["content"][:60])
print(result.summarized) # True if the middle was collapsed
print(result.used_tokens) # tokens in the compacted history
What you get back:
- System messages — always kept, verbatim, at the front.
- A single summary message — the older middle, collapsed via your summarizer.
- Recent turns — the latest
keep_recenttokens, kept verbatim.
Why
- Zero dependencies. Pure Python. A fast characters-per-token heuristic by
default; plug in
tiktokenor any tokenizer for exact counts. - Bring your own summarizer. Any
str -> strcallable (an LLM call, a local model, anything). No provider lock-in, no hidden API calls. - Framework-agnostic. Works on plain message dicts — not tied to LangChain or LlamaIndex.
- Composes with contextcram. Compact the history, then pack it into a full prompt budget.
Installation
pip install chatcram
# optional: exact token counts via tiktoken
pip install "chatcram[tiktoken]"
How it works
from chatcram import Compactor
def summarize(transcript: str) -> str:
# call your LLM here; return a short summary string
return my_client.complete(f"Summarize this conversation:\n{transcript}")
compactor = Compactor(
budget=4000, # if the history exceeds this, compact it
summarize=summarize,
keep_recent=1500, # tokens of the most recent turns to keep verbatim
)
result = compactor.compact(messages)
messages = result.messages # ready to send to the model
If the history is already under budget, it's returned unchanged
(summarized=False). The most recent turn is always kept, even if it alone
exceeds keep_recent.
Pairs with contextcram
from chatcram import Compactor
from contextcram import Packer
history = Compactor(budget=3000, summarize=summarize).compact(messages).messages
ctx = (
Packer(model="gpt-4o", reserve=1500)
.add(SYSTEM_PROMPT, priority="required")
.add([f"{m['role']}: {m['content']}" for m in history], priority="high", strategy="trim")
.add(retrieved_docs, priority="medium", strategy="drop")
.fit()
)
Alternatives
Summarizing old turns isn't new, but it's almost always bundled into a framework
or a heavyweight memory platform. chatcram is the standalone, dependency-free
building block:
| Library | Approach | When to prefer it over chatcram |
|---|---|---|
LangChain ConversationSummaryBufferMemory |
Summary + buffer memory, inside LangChain | You're already all-in on LangChain |
| mem0 / Zep | Hosted "memory layer" with fact extraction + embeddings | You want long-term, retrieval-based memory |
| tokentrim | Drops messages to fit a token limit | You only need to drop, not summarize |
Choose chatcram when you want a tiny, framework-agnostic helper that
summarizes the old middle of a conversation, with your own summarizer and no
dependencies.
Development
git clone https://github.com/Waelr1985/chatcram.git
cd chatcram
uv sync
uv run pytest
uv run ruff check .
uv run mypy
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file chatcram-0.1.0.tar.gz.
File metadata
- Download URL: chatcram-0.1.0.tar.gz
- Upload date:
- Size: 8.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2f8c9b788b6732bcfe880f6f934542aff13aef7908e76b9a5d44ad4de383d377
|
|
| MD5 |
87f74a1b2c7af8b3d7984cbae6b6d34c
|
|
| BLAKE2b-256 |
b56d9e4f85c02c7ea1612dd7ab2e10fc377f228f301d56d13579fd18f7e56916
|
Provenance
The following attestation bundles were made for chatcram-0.1.0.tar.gz:
Publisher:
publish.yml on Waelr1985/chatcram
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
chatcram-0.1.0.tar.gz -
Subject digest:
2f8c9b788b6732bcfe880f6f934542aff13aef7908e76b9a5d44ad4de383d377 - Sigstore transparency entry: 1840943245
- Sigstore integration time:
-
Permalink:
Waelr1985/chatcram@bd92fe2f45de0a8192e994920f0dc959d2def89c -
Branch / Tag:
refs/tags/v0.1.0 - Owner: https://github.com/Waelr1985
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@bd92fe2f45de0a8192e994920f0dc959d2def89c -
Trigger Event:
release
-
Statement type:
File details
Details for the file chatcram-0.1.0-py3-none-any.whl.
File metadata
- Download URL: chatcram-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0fac5bce8568d07b0c9b6b022fe2137210646401207c82ede1f5e96a0e4c851c
|
|
| MD5 |
744918bac4393787b9de57f1b8a95150
|
|
| BLAKE2b-256 |
945fa458f6e13f4f7b80bc325e205f0024bf51e60dafaef51eb6f94bc18a5433
|
Provenance
The following attestation bundles were made for chatcram-0.1.0-py3-none-any.whl:
Publisher:
publish.yml on Waelr1985/chatcram
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
chatcram-0.1.0-py3-none-any.whl -
Subject digest:
0fac5bce8568d07b0c9b6b022fe2137210646401207c82ede1f5e96a0e4c851c - Sigstore transparency entry: 1840943275
- Sigstore integration time:
-
Permalink:
Waelr1985/chatcram@bd92fe2f45de0a8192e994920f0dc959d2def89c -
Branch / Tag:
refs/tags/v0.1.0 - Owner: https://github.com/Waelr1985
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@bd92fe2f45de0a8192e994920f0dc959d2def89c -
Trigger Event:
release
-
Statement type: