Merlin dedup integration for LangChain - strip byte-redundant context before it reaches the LLM.
merlin-langchain
Drop-in MerlinBufferMemory for LangChain. Strips redundant text from your
chat history before it reaches the LLM, so multi-turn agents stop choking
on context-window overflow.
- Real-world demo: a coding agent fed two real lock files (facebook/react/yarn.lock + vercel/next.js/pnpm-lock.yaml, ~2 MB / 1 M tokens per turn) crashes vanilla LangChain on turn 2 with Gemini's 400 INVALID_ARGUMENT "exceeds 1048576". With MerlinBufferMemory the same agent survives 6 turns and the same Gemini call returns 200 OK (receipts in docs/benchmarks/langchain_2026-05-14.pdf).
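The turn-2 failure follows from simple arithmetic on an accumulating history. The sketch below assumes a round 1,000,000 tokens of new content per turn (the figure above is approximate) and uses the 1,048,576 limit quoted in the error message; `first_overflow_turn` is a hypothetical helper, not part of the package.

```python
# Illustrative numbers only: the per-turn figure is an assumed round number,
# the limit is the one quoted in the Gemini error message.
CONTEXT_LIMIT_TOKENS = 1_048_576
TOKENS_ADDED_PER_TURN = 1_000_000


def first_overflow_turn(tokens_per_turn: int, limit: int) -> int:
    """Return the first turn whose accumulated history exceeds the limit.

    Assumes every turn re-sends the full history, so turn n carries
    n * tokens_per_turn tokens.
    """
    if tokens_per_turn <= 0:
        raise ValueError("tokens_per_turn must be positive")
    turn = 1
    while turn * tokens_per_turn <= limit:
        turn += 1
    return turn


print(first_overflow_turn(TOKENS_ADDED_PER_TURN, CONTEXT_LIMIT_TOKENS))  # prints 2
```

Turn 1 fits (1,000,000 ≤ 1,048,576), while turn 2 re-sends both copies and lands at roughly double the limit.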
Quick start (3 minutes)
1 - Install the Python package
pip install merlin-langchain
2 - Get the Merlin binary
The Python package only contains the LangChain glue. The dedup engine itself ships as a small native binary, downloaded once.
- Windows x64: download from the latest GitHub release: https://github.com/corbenicai/merlin-community/releases/latest
- Linux / macOS: native builds are landing soon - see the issues tracker for status. Until then the package falls back to vanilla LangChain behavior on those platforms (see Fallback, below).
Place the binary anywhere you like. Most users put it in ~/.merlin/:
mkdir -p ~/.merlin
mv merlin-lite-windows-x64.exe ~/.merlin/merlin.exe
3 - Tell the package where the binary lives
# Windows PowerShell
$env:MERLIN_BINARY = "$HOME\.merlin\merlin.exe"
# bash / zsh
export MERLIN_BINARY=~/.merlin/merlin
If you skip this step, the package looks in ~/.merlin/merlin[.exe] by
default. If the binary still isn't found, MerlinBufferMemory transparently
falls back to vanilla LangChain - no crash, just no optimization.
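The lookup order described above can be sketched as follows. This is an illustration, not the package's actual code: `resolve_merlin_binary` is a hypothetical name, and the sketch assumes that a set `MERLIN_BINARY` is the only path checked (the README does not say whether the default location is still tried in that case).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_merlin_binary(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Hypothetical helper: return the binary path, or None to signal fallback."""
    env = os.environ if env is None else env
    override = env.get("MERLIN_BINARY")
    if override:
        # Assumption: an explicit override replaces the default search path.
        candidates = [Path(override).expanduser()]
    else:
        # Default location from the README: ~/.merlin/merlin[.exe]
        default_dir = Path.home() / ".merlin"
        candidates = [default_dir / "merlin.exe", default_dir / "merlin"]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # caller would fall back to vanilla LangChain behavior


binary = resolve_merlin_binary()
print("dedup enabled" if binary else "binary not found, falling back")
```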
4 - Use it
from langchain.chains import ConversationChain
from langchain_openai import ChatOpenAI
from merlin_langchain import MerlinBufferMemory

# ConversationChain's default prompt reads the "history" variable,
# so the memory key must match it (it is also the default).
memory = MerlinBufferMemory(memory_key="history")
chain = ConversationChain(llm=ChatOpenAI(model="gpt-4o"), memory=memory)
chain.invoke({"input": "..."})
That's it. Your agent now silently dedupes its rolling chat history before each LLM call. No code changes elsewhere.
What you get
| Component | Drop-in replacement for |
|---|---|
| MerlinBufferMemory | langchain.memory.ConversationBufferMemory |
| merlin_format_log_to_str | langchain.agents.format_scratchpad.format_log_to_str |
Both inherit / mirror the LangChain interfaces, so they pass Pydantic
validation in Chain.memory slots and work in any chain that accepts
BaseMemory.
Async surface (aload_memory_variables, asave_context, aclear) is
implemented for use behind LangServe / FastAPI / await agent.ainvoke().
Limits (community tier)
The community binary processes up to:
| Window | Cap |
|---|---|
| Per call | 50 MB |
| Per day | 200 MB |
| Per month | 2 GB |
A solo developer is unlikely to hit these limits. A heavy commercial pipeline can reach them within 2-3 days; for higher caps see https://corbenic.ai.
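To get a feel for how the three caps interact, here is a rough budgeting sketch. It assumes decimal megabytes and that caps are counted on the bytes sent to the binary; the README states neither, and `exceeded_cap` is a hypothetical helper rather than anything the package exposes.

```python
MB = 1_000_000  # assumption: decimal megabytes (the README does not say MB vs MiB)
GB = 1_000 * MB

# Community-tier caps from the table above.
COMMUNITY_CAPS = {"call": 50 * MB, "day": 200 * MB, "month": 2 * GB}


def exceeded_cap(payload_bytes: int, used_today: int, used_this_month: int):
    """Return the name of the first cap this call would exceed, or None."""
    if payload_bytes > COMMUNITY_CAPS["call"]:
        return "call"
    if used_today + payload_bytes > COMMUNITY_CAPS["day"]:
        return "day"
    if used_this_month + payload_bytes > COMMUNITY_CAPS["month"]:
        return "month"
    return None


# A 2 MB history per turn allows 100 turns before the daily cap is reached.
turns_per_day = COMMUNITY_CAPS["day"] // (2 * MB)
print(turns_per_day)  # prints 100
```

Under these assumptions, a pipeline that uses its full daily allowance every day reaches the monthly cap after 10 days.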
What happens when a cap is reached
MerlinBufferMemory transparently falls back to vanilla LangChain
behavior. Your prompts pass through unchanged - exactly as if the
package weren't installed - and your LLM call proceeds normally.
- You'll see one WARNING in your logs the first time fallback kicks in.
- The package will automatically retry the binary every hour (configurable via the MERLIN_RETRY_AFTER_S environment variable, minimum 60 seconds).
- When the cap rolls over (daily at 00:00 UTC, monthly on the 1st), the next retry succeeds and you'll see INFO: Merlin dedup recovered.
This means you cannot get stuck in a degraded state because of a forgotten reset - long-running web servers self-heal across midnight UTC without restart.
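The fallback-and-retry behavior described above amounts to a small timer-based gate. The following is a minimal sketch of that idea, not the package's implementation: `FallbackGate` and its method names are hypothetical, and the clock is injectable only so the logic can be exercised without waiting.

```python
import time


class FallbackGate:
    """Hypothetical gate: skip dedup for `retry_after_s` seconds after a cap hit."""

    def __init__(self, retry_after_s: float = 3600.0, clock=time.monotonic):
        self.retry_after_s = retry_after_s
        self._clock = clock
        self._skip_until = None  # None means "not in fallback"

    def should_try_binary(self) -> bool:
        """True when not in fallback, or when the retry window has elapsed."""
        return self._skip_until is None or self._clock() >= self._skip_until

    def record_cap_hit(self) -> bool:
        """Start or extend fallback. Returns True only on entry (log one WARNING)."""
        entering = self._skip_until is None
        self._skip_until = self._clock() + self.retry_after_s
        return entering

    def record_success(self) -> bool:
        """Leave fallback. Returns True if we were in it (log the INFO line)."""
        recovered = self._skip_until is not None
        self._skip_until = None
        return recovered


gate = FallbackGate(retry_after_s=3600)
if gate.should_try_binary():
    pass  # call the binary here; on a cap error call gate.record_cap_hit()
```

Because a failed retry simply re-arms the timer, a long-running process keeps probing until the cap rolls over and never needs a manual reset.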
Configuration
| Variable | Default | Purpose |
|---|---|---|
| MERLIN_BINARY | ~/.merlin/merlin[.exe] | Path to the binary |
| MERLIN_RETRY_AFTER_S | 3600 | Seconds to skip dedup after a cap-hit before re-probing. Min 60. |
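As an illustration of the default and the 60-second floor on `MERLIN_RETRY_AFTER_S`, the parsing could look like the sketch below. `read_retry_after_s` is a hypothetical name, and treating an unparsable value as the default is an assumption; the README does not describe that case.

```python
import os

DEFAULT_RETRY_AFTER_S = 3600  # default from the table above
MIN_RETRY_AFTER_S = 60        # documented minimum


def read_retry_after_s(env=None) -> int:
    """Hypothetical helper: read MERLIN_RETRY_AFTER_S with default and floor."""
    env = os.environ if env is None else env
    raw = env.get("MERLIN_RETRY_AFTER_S", "")
    try:
        value = int(raw)
    except ValueError:
        # Assumption: unset or unparsable values fall back to the default.
        return DEFAULT_RETRY_AFTER_S
    return max(MIN_RETRY_AFTER_S, value)


print(read_retry_after_s({"MERLIN_RETRY_AFTER_S": "5"}))  # prints 60
```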
Constructor parameters on MerlinBufferMemory:
| Param | Default | Purpose |
|---|---|---|
| memory_key | "history" | Key under which the rendered string is returned |
| keep_tail_lines | 2 | Trailing lines preserved verbatim (the most-recent context) |
| human_prefix / ai_prefix | "Human" / "AI" | Standard LangChain prefixes |
| return_messages | False | If True, returns the message list instead of a string (no dedup applied; mirrors ConversationBufferMemory behavior) |
| extra_env | None | Optional env-var dict for the binary subprocess (advanced) |
When MerlinBufferMemory helps - and when it doesn't
Helps: multi-turn agents that re-feed tool outputs into the prompt each turn (ReAct, Cline, AutoGPT, Devin-style workflows). Anywhere the chat history accumulates large repeated content (lock files, terminal logs, file dumps, retrieved documents).
Doesn't help: single-shot LLM calls with no rolling history. Tiny prompts under a few KB. Workloads where every turn introduces only fresh unique content.
When it doesn't help, you don't pay for it - the dedup just shrinks the prompt by zero bytes.
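If you are unsure which bucket your workload falls into, a crude estimate of how repetitive a history is can help. The sketch below counts bytes in lines that exactly repeat an earlier line. It is a stand-in heuristic, not the Merlin engine's algorithm (which this README does not describe), and `repeated_line_fraction` is a hypothetical name.

```python
def repeated_line_fraction(text: str) -> float:
    """Fraction of bytes sitting in lines that already appeared earlier.

    Exact line matching only; newline bytes are not counted.
    """
    lines = text.splitlines()
    total = sum(len(line.encode("utf-8")) for line in lines)
    if total == 0:
        return 0.0
    seen = set()
    repeated = 0
    for line in lines:
        if line in seen:
            repeated += len(line.encode("utf-8"))
        else:
            seen.add(line)
    return repeated / total


# A tool output pasted into the history twice: half the bytes are repeats.
lockfile = "left-pad@1.3.0:\n  resolved: registry\n"
print(repeated_line_fraction(lockfile + lockfile))  # prints 0.5
```

A value near zero suggests every turn adds mostly fresh content, which is the case where dedup has little to remove.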
License
MIT. See LICENSE.
Links
- GitHub: https://github.com/corbenicai/merlin-community
- Issues: https://github.com/corbenicai/merlin-community/issues
- Pro tier (no caps, multi-threaded engine, server-side validation): https://corbenic.ai