Merlin dedup integration for LangChain - strip byte-redundant context before it reaches the LLM.

merlin-langchain

Drop-in MerlinBufferMemory for LangChain. Strips redundant text from your chat history before it reaches the LLM, so multi-turn agents stop choking on context-window overflow.

Real-world demo: a coding agent is fed two real lock files (facebook/react/yarn.lock and vercel/next.js/pnpm-lock.yaml, about 2 MB or roughly 1M tokens per turn). With vanilla LangChain the run fails on turn 2, when Gemini rejects the request with 400 INVALID_ARGUMENT "exceeds 1048576". With MerlinBufferMemory the same agent survives 6 turns and the same Gemini call returns 200 OK (receipts in docs/benchmarks/langchain_2026-05-14.pdf).

Quick start (3 minutes)

1 - Install the Python package

pip install merlin-langchain

2 - Get the Merlin binary

The Python package only contains the LangChain glue. The dedup engine itself ships as a small native binary, downloaded once.

Windows x64: download from the latest GitHub release: https://github.com/corbenicai/merlin-community/releases/latest
Linux / macOS: native builds are landing soon - see the issues tracker for status. Until then the package falls back to vanilla LangChain behavior on those platforms (the same fallback described in step 3 below).

Place the binary anywhere you like. Most users put it in ~/.merlin/:

mkdir -p ~/.merlin
mv merlin-lite-windows-x64.exe ~/.merlin/merlin.exe

3 - Tell the package where the binary lives

# Windows PowerShell
$env:MERLIN_BINARY = "$HOME\.merlin\merlin.exe"

# bash / zsh
export MERLIN_BINARY=~/.merlin/merlin

If you skip this step, the package looks in ~/.merlin/merlin[.exe] by default. If the binary still isn't found, MerlinBufferMemory transparently falls back to vanilla LangChain - no crash, just no optimization.
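
The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not the package's actual code: resolve_merlin_binary is a hypothetical helper, and treating an invalid MERLIN_BINARY value as "not found" (rather than trying the default path next) is an assumption.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def resolve_merlin_binary(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the binary path, or None to signal 'fall back to vanilla LangChain'.

    Hypothetical helper: the real package may resolve the path differently.
    """
    env = os.environ if env is None else env
    override = env.get("MERLIN_BINARY")
    if override:
        # An explicit setting wins over the default location.
        candidate = Path(override).expanduser()
    else:
        # Default documented in the README: ~/.merlin/merlin[.exe]
        suffix = ".exe" if sys.platform == "win32" else ""
        candidate = Path.home() / ".merlin" / f"merlin{suffix}"
    # Assumption: a missing file means "no dedup", never an exception.
    return candidate if candidate.is_file() else None
```

Returning None instead of raising mirrors the "no crash, just no optimization" behavior: the caller can check the result once and skip dedup when it is None.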

4 - Use it

from merlin_langchain import MerlinBufferMemory
from langchain.chains import ConversationChain
from langchain_openai import ChatOpenAI

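# ConversationChain's default prompt expects the history under "history",
# which is also MerlinBufferMemory's default memory_key.
memory = MerlinBufferMemory(memory_key="history")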
chain = ConversationChain(llm=ChatOpenAI(model="gpt-4o"), memory=memory)
chain.invoke({"input": "..."})

That's it. Your agent now silently dedupes its rolling chat history before each LLM call. No code changes elsewhere.

What you get

Component	Drop-in replacement for
`MerlinBufferMemory`	`langchain.memory.ConversationBufferMemory`
`merlin_format_log_to_str`	`langchain.agents.format_scratchpad.format_log_to_str`

Both inherit / mirror the LangChain interfaces, so they pass Pydantic validation in Chain.memory slots and work in any chain that accepts BaseMemory.

Async surface (aload_memory_variables, asave_context, aclear) is implemented for use behind LangServe / FastAPI / await agent.ainvoke().

Limits (community tier)

The community binary processes up to:

Window	Cap
Per call	50 MB
Per day	200 MB
Per month	2 GB

A solo developer is unlikely to hit these caps. A heavy commercial pipeline can exhaust them within 2-3 days; for higher caps see https://corbenic.ai.
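
To see how far the caps stretch for your own workload, a rough back-of-the-envelope calculation helps. The sketch below assumes decimal megabytes (1 MB = 1,000,000 bytes, 2 GB = 2,000 MB) and that the caps count the bytes sent to the binary per call; neither is stated above, and calls_until_cap is a hypothetical helper.

```python
MB = 1_000_000  # assumption: decimal megabytes, not MiB

# Community-tier caps from the table above.
CAPS = {
    "per_call": 50 * MB,
    "per_day": 200 * MB,
    "per_month": 2_000 * MB,
}


def calls_until_cap(bytes_per_call: int) -> dict:
    """Estimate how many equally sized calls fit into the daily and monthly caps."""
    if bytes_per_call <= 0:
        raise ValueError("bytes_per_call must be positive")
    if bytes_per_call > CAPS["per_call"]:
        raise ValueError("a single call would exceed the per-call cap")
    return {
        "per_day": CAPS["per_day"] // bytes_per_call,
        "per_month": CAPS["per_month"] // bytes_per_call,
    }


# A payload of about 2 MB per turn, as in the lock-file demo.
print(calls_until_cap(2 * MB))  # {'per_day': 100, 'per_month': 1000}
```

Under these assumptions the monthly cap equals ten full days at the daily cap, so a pipeline that saturates the daily limit every day runs out of monthly budget well before the month ends.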

What happens when a cap is reached

MerlinBufferMemory transparently falls back to vanilla LangChain behavior. Your prompts pass through unchanged - exactly as if the package weren't installed - and your LLM call proceeds normally.

You'll see one WARNING in your logs the first time fallback kicks in.
The package will automatically retry the binary every hour (configurable via the MERLIN_RETRY_AFTER_S environment variable, minimum 60 seconds).
When the cap rolls over (daily at 00:00 UTC, monthly on the 1st), the next retry succeeds and you'll see INFO: Merlin dedup recovered.

This means you cannot get stuck in a degraded state because of a forgotten reset - long-running web servers self-heal across midnight UTC without restart.
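
The fallback-and-retry cycle can be pictured as a small time gate. The sketch below is one possible way to implement it, not the package's internals: RetryGate and its method names are hypothetical, and the clock is injectable only so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Optional


class RetryGate:
    """Skip dedup for retry_after_s seconds after a cap hit, then re-probe."""

    def __init__(
        self,
        retry_after_s: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.retry_after_s = retry_after_s
        self._clock = clock
        self._blocked_until: Optional[float] = None

    def should_try_dedup(self) -> bool:
        # Open when no cap hit is pending or the waiting period has elapsed.
        return self._blocked_until is None or self._clock() >= self._blocked_until

    def record_cap_hit(self) -> None:
        # Called when the binary reports a cap; start (or restart) the wait.
        self._blocked_until = self._clock() + self.retry_after_s

    def record_success(self) -> bool:
        """Return True if this success ends a fallback period ("recovered")."""
        recovered = self._blocked_until is not None
        self._blocked_until = None
        return recovered
```

While the gate is closed the prompt passes through untouched. Once the period elapses the next call probes the binary again: a second cap hit restarts the wait, and a success clears the state, which is the point at which an INFO "recovered" message would be logged.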

Configuration

Variable	Default	Purpose
`MERLIN_BINARY`	`~/.merlin/merlin[.exe]`	Path to the binary
`MERLIN_RETRY_AFTER_S`	`3600`	Seconds to skip dedup after a cap-hit before re-probing. Min 60.
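
As an illustration of the documented default and minimum, MERLIN_RETRY_AFTER_S could be read like this. retry_after_seconds is a hypothetical helper, and falling back to the default on an unparsable value is an assumption, not documented behavior.

```python
import os
from typing import Mapping, Optional

DEFAULT_RETRY_AFTER_S = 3600  # documented default
MIN_RETRY_AFTER_S = 60        # documented minimum


def retry_after_seconds(env: Optional[Mapping[str, str]] = None) -> int:
    """Read MERLIN_RETRY_AFTER_S, applying the default and the 60-second floor."""
    env = os.environ if env is None else env
    raw = env.get("MERLIN_RETRY_AFTER_S")
    if raw is None:
        return DEFAULT_RETRY_AFTER_S
    try:
        value = int(raw)
    except ValueError:
        # Assumption: ignore garbage rather than crash at import time.
        return DEFAULT_RETRY_AFTER_S
    # Values below the minimum are raised to 60 seconds.
    return max(value, MIN_RETRY_AFTER_S)
```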

Constructor parameters on MerlinBufferMemory:

Param	Default	Purpose
`memory_key`	`"history"`	Key under which the rendered string is returned
`keep_tail_lines`	`2`	Trailing lines preserved verbatim (the most-recent context)
`human_prefix` / `ai_prefix`	`"Human"` / `"AI"`	Standard LangChain prefixes
`return_messages`	`False`	If `True`, returns the message list instead of a string (no dedup applied; mirrors `ConversationBufferMemory` behavior)
`extra_env`	`None`	Optional env-var dict for the binary subprocess (advanced)
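
To make keep_tail_lines concrete, here is a minimal sketch of what "trailing lines preserved verbatim" could look like. split_tail is a hypothetical helper written for illustration; how the real engine treats the lines before the tail is not described here.

```python
from typing import List, Tuple


def split_tail(history: str, keep_tail_lines: int = 2) -> Tuple[List[str], List[str]]:
    """Split rendered history into (body, tail); the tail is left untouched."""
    lines = history.splitlines()
    if keep_tail_lines <= 0:
        # Guard: lines[:-0] would be empty, so handle "keep nothing" explicitly.
        return lines, []
    return lines[:-keep_tail_lines], lines[-keep_tail_lines:]


history = "Human: paste lock file\nAI: got it\nHuman: now fix the build\nAI: on it"
body, tail = split_tail(history, keep_tail_lines=2)
print(tail)  # ['Human: now fix the build', 'AI: on it']
```

Only body would be a candidate for dedup in this picture; tail keeps the most recent exchange exactly as written, so the model always sees the latest request verbatim.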

When MerlinBufferMemory helps - and when it doesn't

Helps: multi-turn agents that re-feed tool outputs into the prompt each turn (ReAct, Cline, AutoGPT, Devin-style workflows). Anywhere the chat history accumulates large repeated content (lock files, terminal logs, file dumps, retrieved documents).

Doesn't help: single-shot LLM calls with no rolling history. Tiny prompts under a few KB. Workloads where every turn introduces only fresh unique content.

When it doesn't help, you don't pay for it - the dedup just shrinks the prompt by zero bytes.
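
A quick way to guess which camp a workload falls into is to measure how much of the history repeats. The sketch below counts characters on exact duplicate lines. It is a rough proxy written for this README, not Merlin's algorithm, and duplicate_line_ratio is a hypothetical helper.

```python
def duplicate_line_ratio(history: str) -> float:
    """Fraction of characters that sit on lines already seen earlier."""
    lines = [ln for ln in history.splitlines() if ln.strip()]
    total = sum(len(ln) for ln in lines)
    if total == 0:
        return 0.0
    seen = set()
    repeated = 0
    for ln in lines:
        if ln in seen:
            repeated += len(ln)  # this line repeats earlier content
        else:
            seen.add(ln)
    return repeated / total


# A tool output re-fed on a second turn: half the history is repetition.
print(duplicate_line_ratio("lock A\nlock B\nlock A\nlock B"))  # 0.5
```

A ratio near zero suggests the "doesn't help" case; a high ratio, typical when lock files or logs are re-fed every turn, suggests there is something to strip.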

License

MIT. See LICENSE.

Release history

This version

0.1.0

May 13, 2026

Download files

Source Distribution

merlin_langchain-0.1.0.tar.gz (17.9 kB)

Uploaded May 13, 2026 Source

Built Distribution

merlin_langchain-0.1.0-py3-none-any.whl (12.7 kB)

Uploaded May 13, 2026 Python 3

File details

Details for the file merlin_langchain-0.1.0.tar.gz.

File metadata

Download URL: merlin_langchain-0.1.0.tar.gz
Upload date: May 13, 2026
Size: 17.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for merlin_langchain-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`e53ee0d80913c5d4f2c62f89491b07455fa0aadddbfa77ba89921d3be2af5830`
MD5	`1820e8f44cb409caead182a4bdf3475e`
BLAKE2b-256	`628964a77a87be2ed05653b643753618b7ee7748c8b167d14f3214d9e4eb0c93`

File details

Details for the file merlin_langchain-0.1.0-py3-none-any.whl.

File metadata

Download URL: merlin_langchain-0.1.0-py3-none-any.whl
Upload date: May 13, 2026
Size: 12.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for merlin_langchain-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d473b67c07f6675d362067969b6211db774958a82a15e8e764a9278cbebc7d8a`
MD5	`bfb776aaaef92334a89a059377dbd695`
BLAKE2b-256	`f2458df44cd9b08dc62e9a037c6196510b7b785fca6db35070791d5eb65c5e10`
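
If you download the files manually, the published SHA256 digests can be checked with the standard library. A minimal sketch follows; sha256_of and matches_published_hash are hypothetical helper names, and the file path in the usage note is whatever location you saved the download to.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hex SHA256 of a file, read in chunks so large files stay cheap."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's digest with a published value (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("merlin_langchain-0.1.0.tar.gz", "e53ee0d80913c5d4f2c62f89491b07455fa0aadddbfa77ba89921d3be2af5830") should return True for an untampered source distribution. pip can enforce the same check at install time through its hash-checking mode (--require-hashes with a requirements file).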
