Framework-agnostic Lossless Context Management SDK for LLM agents
Project description
OpenLCM — Lossless Context Management for LLM Agents
Unbounded memory. Bounded context.
OpenLCM is a framework-agnostic Python SDK that gives your AI agents a permanent, lossless memory — without ever hitting the context limit. Every message is persisted verbatim in SQLite and compressed into a hierarchical DAG of summaries. Nothing is ever lost. Any past moment is recoverable.
pip install openlcm
The problem
LLMs have a hard token limit. As conversations grow, agents either crash or replace old turns with a flat, irreversible summary. Details fall out permanently — decisions, constraints, file paths, tool results.
How LCM works
OpenLCM maintains two layers:
- Immutable message store — every message written verbatim to SQLite with a stable
store_id. FTS5-indexed. Never modified. - Summary DAG — older messages are compressed into D0 leaf nodes → D1 session arcs → D2 durable history. Each node points back to its source messages for exact recovery.
The model always sees: system + highest DAG node + recent D0 nodes + fresh tail (last N raw messages).
Quick start
from openlcm import LCMEngine
engine = LCMEngine(model="anthropic/claude-haiku-4-5-20251001")
engine.bind_session("my-session", context_length=200_000)
# Call before every LLM turn — compresses only when needed
messages = await engine.compress(messages)
Pass any LiteLLM model string: openai/gpt-4o, gemini/gemini-2.0-flash, azure/gpt-4o, bedrock/..., ollama/llama3, etc.
Framework adapters
All adapters are included — one install, no extras needed.
LangGraph
from openlcm.adapters.langgraph import LCMCheckpointer
graph = StateGraph(MyState).compile(
checkpointer=LCMCheckpointer(llm=my_llm)
)
Google ADK
from openlcm.adapters.google_adk import LCMSessionService, lcm_compress_callback
agent = LlmAgent(
name="assistant",
model="gemini-2.0-flash",
tools=[...],
before_model_callback=lcm_compress_callback(engine),
)
runner = Runner(agent=agent, session_service=LCMSessionService(engine))
AutoGen
from openlcm.adapters.autogen import LCMContext
agent = AssistantAgent(
"assistant",
model_client=client,
model_context=LCMContext(llm=client),
)
CrewAI
from openlcm.adapters.crewai import LCMStorage
crew = Crew(
memory=True,
long_term_memory=LongTermMemory(storage=LCMStorage(engine))
)
OpenAI / Groq / Mistral / Ollama
from openlcm.adapters.openai import OpenAIMessages
lcm = OpenAIMessages.to_lcm(messages)
if engine.should_compress_preflight(lcm):
lcm = await engine.compress(lcm)
messages = OpenAIMessages.from_lcm(lcm)
Anthropic
from openlcm.adapters.anthropic import AnthropicMessages
lcm = AnthropicMessages.to_lcm(messages, system=system_prompt)
lcm = await engine.compress(lcm)
system_out, anthropic_msgs = AnthropicMessages.from_lcm(lcm)
LlamaIndex / Haystack / Gemini
from openlcm.adapters.llamaindex import LlamaIndexMessages
from openlcm.adapters.haystack import HaystackMessages
from openlcm.adapters.gemini import GeminiMessages
All follow the same to_lcm() / from_lcm() interface.
Configuration
from openlcm.core.config import LCMConfig
config = LCMConfig.from_env()
config.context_threshold = 0.75 # compress at 75% of context window
config.fresh_tail_count = 64 # protect last 64 messages from compression
config.leaf_chunk_tokens = 20_000 # tokens per D0 leaf summary
config.condensation_fanin = 4 # D0 nodes before a D1 arc is created
engine = LCMEngine(model="...", config=config)
| Env var | Default | Description |
|---|---|---|
LCM_CONTEXT_THRESHOLD |
0.75 |
Compression trigger as fraction of context window |
LCM_FRESH_TAIL_COUNT |
64 |
Messages protected from compression at tail |
LCM_LEAF_CHUNK_TOKENS |
20000 |
Tokens per D0 summary chunk |
LCM_CONDENSATION_FANIN |
4 |
D0 nodes required before D1 arc is created |
Live dashboard
import threading
from openlcm.viz.server import create_app, serve as viz_serve
threading.Thread(
target=lambda: viz_serve(create_app(engine), port=7842, open_browser=True),
daemon=True
).start()
Or from the CLI:
openlcm viz # opens http://localhost:7842
openlcm grep "query" # full-text search across all sessions
openlcm status # session stats
The dashboard shows token pressure, DAG graph, SQLite message store, and a live event log — all updating in real time.
Guarantees
- Lossless — every message persisted with stable
store_id. Recoverable even after 100 compactions. - Deterministic — summarization always terminates. L1 → L2 → L3 escalation with circuit breaker.
- Zero-cost — compression fires only when the threshold is exceeded. Short conversations pay zero overhead.
License
MIT — see LICENSE.
Built on the LCM paper by Ehrlich & Blackman (Voltropy, 2026).
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file openlcm-0.1.1.tar.gz.
File metadata
- Download URL: openlcm-0.1.1.tar.gz
- Upload date:
- Size: 144.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3d5b56e3463aed6753827b91122508b694c6482ce5761b658be052a876796783
|
|
| MD5 |
f6ed78cdfd5b66cb9127176fcae5aef6
|
|
| BLAKE2b-256 |
c34b0b964e232a677918612fbc8a43b02b1266128400a055f8d46aff4fec8027
|
File details
Details for the file openlcm-0.1.1-py3-none-any.whl.
File metadata
- Download URL: openlcm-0.1.1-py3-none-any.whl
- Upload date:
- Size: 172.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6e74ff0de2239c7505722c40204030624c19a607a00f0d6ee243be19495526b8
|
|
| MD5 |
e33385e54d90b4d6e301d0e264e33baa
|
|
| BLAKE2b-256 |
7133e7e1befcbd6be703f95d627e7d20ced07443c9b1fbdeea6804c346bbb9a4
|