Standalone, dependency-free rolling conversation memory (summary + buffer), inspired by LangChain's ConversationSummaryBufferMemory.

rollmem

Standalone, dependency-free rolling conversation memory for LLM apps — a running summary plus a recent-message buffer, inspired by LangChain's ConversationSummaryBufferMemory, but with no LangChain (or any) dependency.

Useful for conversation memory, context compression, and summarizing long chats: a small LangChain alternative when all you need is the summary-buffer pattern.

Why

ConversationSummaryBufferMemory is a great pattern: keep recent turns verbatim, fold older turns into a running summary so context stays bounded. But pulling in all of LangChain just for that is heavy. rollmem extracts the idea into a tiny, provider-agnostic package. You inject how to summarize and how to count tokens — rollmem stays neutral.

Install

pip install rollmem

Usage

from rollmem import RollingMemory

def summarize(existing_summary, messages):
    # plug in any LLM here; return the new summary string
    folded = " ".join(m.content for m in messages)
    return (existing_summary + " " + folded).strip()

mem = RollingMemory(
    max_tokens=2000,
    summarize_fn=summarize,   # optional; without it, evicted turns are dropped
    # token_counter=...       # optional; defaults to a word-count estimate.
    #                         # In production inject a model-accurate counter, e.g.
    #                         # token_counter=lambda text: len(enc.encode(text))
)

mem.add_user_message("Hi, I'm planning a trip to Korea.")
mem.add_assistant_message("Great! When are you going?")

print(mem.get_context())    # -> str: summary (if any) + recent buffer, joined
print(mem.get_messages())   # -> list[Message]: summary prepended as a system turn

max_tokens is the budget for the verbatim recent-message buffer — not the running summary, and not a model's generation max_tokens (output limit). When the buffer exceeds it, the oldest turns are folded into the summary.

token_counter takes a single message's text (str) and returns an int. The default is a crude word count — fine for demos, but pass a model-accurate counter (such as tiktoken) for real token budgets.
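
As a sketch of the counter contract described above (text in, integer out), the adapter below turns any encoder function into a token_counter. The helper name counter_from_encoder and the whitespace stand-in encoder are illustrative assumptions, not part of rollmem; with a real tokenizer such as tiktoken you would pass the encoding's encode method instead.

```python
from typing import Callable, Sequence


def counter_from_encoder(encode: Callable[[str], Sequence]) -> Callable[[str], int]:
    """Turn an encoder (text -> sequence of tokens) into a counter (text -> int)."""
    def count(text: str) -> int:
        return len(encode(text))
    return count


# Stand-in encoder for illustration only: splits on whitespace.
# A model-accurate setup would pass a tokenizer's encode function here.
word_counter = counter_from_encoder(str.split)

print(word_counter("Hi, I'm planning a trip to Korea."))  # 7
```

The resulting callable has the shape token_counter expects, so it could be passed as token_counter=word_counter when constructing the memory.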

Persistence

to_dict() / from_dict() serialize the memory state (running summary plus buffer) to and from a plain dict — you choose the storage format:

import json

raw = json.dumps(mem.to_dict())   # save anywhere: file, DB column, cache...

mem = RollingMemory.from_dict(
    json.loads(raw),
    max_tokens=2000,
    summarize_fn=summarize,        # callbacks are NOT serialized — re-inject them
    # token_counter=...
)

max_tokens and the callbacks are runtime configuration, not saved state, so you pass them again on restore. The buffer is restored verbatim; the token budget is re-applied on the next added message.
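
Since the storage format is left to you, here is one possible file-based approach: a minimal sketch that writes the state dict to a JSON file and reads it back. The helper names save_state and load_state are hypothetical, and the placeholder dict below only stands in for whatever mem.to_dict() returns; its keys are not rollmem's actual schema.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_state(state: dict, path: str) -> None:
    """Write a state dict to `path` as JSON, replacing any old file atomically."""
    target = Path(path)
    # Write to a temp file in the same directory, then swap it into place,
    # so a crash mid-write never leaves a half-written state file behind.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh, ensure_ascii=False)
        os.replace(tmp_name, target)
    except BaseException:
        os.unlink(tmp_name)
        raise


def load_state(path: str) -> Optional[dict]:
    """Return the saved state dict, or None if nothing has been saved yet."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return None


# Demo in a throwaway directory; the dict is a placeholder for mem.to_dict().
with tempfile.TemporaryDirectory() as tmp_dir:
    state_path = os.path.join(tmp_dir, "memory.json")
    save_state({"placeholder": "state from mem.to_dict()"}, state_path)
    print(load_state(state_path))  # {'placeholder': 'state from mem.to_dict()'}
```

On restore, the loaded dict would go to RollingMemory.from_dict(...) together with max_tokens and the callbacks, as shown above.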

How it works

New turns go into the buffer.
When the buffer exceeds max_tokens, the oldest turns are folded into the summary via summarize_fn (or dropped if none is provided).
get_messages() -> list[Message] returns the buffer with the summary prepended as a system turn. get_context() -> str is the string form of the same thing (prompt-ready), so the two never diverge. Neither adds a language-specific label — relabel the summary in your own prompt assembly if you need to.
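
The steps above can be illustrated with a toy re-implementation. This is a sketch of the pattern only, not rollmem's actual code: the class name ToyRollingMemory, the (role, content) tuples standing in for Message, and the choice to evict until the buffer fits (even if that empties it) are all assumptions.

```python
from typing import Callable, List, Optional, Tuple

Turn = Tuple[str, str]  # (role, content); a stand-in for rollmem's Message


class ToyRollingMemory:
    def __init__(
        self,
        max_tokens: int,
        summarize_fn: Optional[Callable[[str, List[Turn]], str]] = None,
        token_counter: Optional[Callable[[str], int]] = None,
    ) -> None:
        self.max_tokens = max_tokens
        self.summarize_fn = summarize_fn
        # Default mirrors the "crude word count" idea from the README.
        self.token_counter = token_counter or (lambda text: len(text.split()))
        self.summary = ""
        self.buffer: List[Turn] = []

    def add(self, role: str, content: str) -> None:
        self.buffer.append((role, content))  # step 1: new turns go into the buffer
        self._evict()

    def _evict(self) -> None:
        evicted: List[Turn] = []
        # Step 2: pop the oldest turns until the buffer fits the budget.
        while self.buffer and self._buffer_tokens() > self.max_tokens:
            evicted.append(self.buffer.pop(0))
        # Fold evicted turns into the summary, or drop them if no summarizer.
        if evicted and self.summarize_fn is not None:
            self.summary = self.summarize_fn(self.summary, evicted)

    def _buffer_tokens(self) -> int:
        return sum(self.token_counter(content) for _, content in self.buffer)

    def get_messages(self) -> List[Turn]:
        # Step 3: the summary, if any, is prepended as a system turn.
        head = [("system", self.summary)] if self.summary else []
        return head + self.buffer

    def get_context(self) -> str:
        # Built from get_messages() so the two views cannot diverge.
        return "\n".join(f"{role}: {content}" for role, content in self.get_messages())


def concat(summary: str, turns: List[Turn]) -> str:
    folded = " ".join(content for _, content in turns)
    return (summary + " " + folded).strip()


mem = ToyRollingMemory(max_tokens=6, summarize_fn=concat)
mem.add("user", "one two three")
mem.add("assistant", "four five six")
mem.add("user", "seven eight")  # 8 words > 6, so the oldest turn is folded
print(mem.summary)  # one two three
```

Deriving get_context() from get_messages() is one simple way to get the "never diverge" property described above.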

Limitations

Lossy by design. Older turns are folded into the summary repeatedly, so each pass can blur or drop detail (a "telephone game" effect). Keep max_tokens large enough that anything you can't afford to lose stays in the verbatim buffer.
The summary is not bounded for you. max_tokens limits only the verbatim buffer, not the running summary. rollmem hands your summarize_fn the current summary plus the evicted turns and stores whatever it returns — so keeping the summary compact is your summarize_fn's job. If it merely concatenates, the summary (and thus get_context()) grows without limit. Prompt it to compress, or cap the summary length inside the callback.
Only as accurate as your counter. The default token counter is a rough word count; inject a model-accurate one (e.g. tiktoken) for real budgets.
In-memory by default. State lives in memory, but to_dict() / from_dict() let you persist and restore it (see Persistence). Callbacks are not serialized and must be re-injected on restore.
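
To illustrate the "cap the summary length inside the callback" advice, here is a minimal sketch of a wrapper that enforces a hard character limit on whatever a summarize_fn returns. The name make_capped_summarizer and the keep-the-tail truncation policy are assumptions for illustration; a hard cut is only a backstop, and prompting the summarizer to compress is usually the better first line of defense.

```python
from typing import Any, Callable, Sequence

SummarizeFn = Callable[[str, Sequence[Any]], str]


def make_capped_summarizer(summarize_fn: SummarizeFn, max_chars: int) -> SummarizeFn:
    """Wrap summarize_fn so the returned summary never exceeds max_chars."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    def capped(existing_summary: str, messages: Sequence[Any]) -> str:
        summary = summarize_fn(existing_summary, messages)
        if len(summary) <= max_chars:
            return summary
        # Assumption: the end of the summary holds the most recently folded
        # material, so keep the tail and discard the oldest text.
        return summary[-max_chars:].lstrip()

    return capped


# A naive concatenating summarizer; plain strings stand in for message objects.
def naive(existing_summary: str, messages: Sequence[str]) -> str:
    return (existing_summary + " " + " ".join(messages)).strip()


capped = make_capped_summarizer(naive, max_chars=10)
print(capped("", ["abcdefghij", "klmno"]))  # ghij klmno
```

The wrapped function keeps the (existing_summary, messages) signature, so it could be passed as summarize_fn unchanged.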

License

MIT

Release history

0.3.0: Jun 10, 2026
0.0.2: Jun 7, 2026
0.0.1 (this version): Jun 7, 2026

Download files

Source Distribution

rollmem-0.0.1.tar.gz (8.2 kB)

Uploaded Jun 7, 2026 Source

Built Distribution

rollmem-0.0.1-py3-none-any.whl (8.6 kB)

Uploaded Jun 7, 2026 Python 3

File details

Details for the file rollmem-0.0.1.tar.gz.

File metadata

Download URL: rollmem-0.0.1.tar.gz
Upload date: Jun 7, 2026
Size: 8.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for rollmem-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`4754c696b527e0320cc6f224c0920f9cc3b7c84a480c00e9dcfe6faa49af89df`
MD5	`193cc5529ce05d0cde3b5ba88c2ef98c`
BLAKE2b-256	`c1744ff168f8788e24caa3d93f9c26e84e53e4b5ed9fc8106771d6ae1c4798ec`

File details

Details for the file rollmem-0.0.1-py3-none-any.whl.

File metadata

Download URL: rollmem-0.0.1-py3-none-any.whl
Upload date: Jun 7, 2026
Size: 8.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for rollmem-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2920b78453183a57b72ece568ad6dbfdbe110c7466b34faccaa79aa56cde9d3e`
MD5	`5ada5fad1a0623727019b9022c285dc0`
BLAKE2b-256	`f482e20cb7e05c305c8c66da56bb751cdf7d83895219c8a272edccd4a78055ec`
