# 🧩 LangMiddle: Production Middleware for LangGraph

Supercharge your LangGraph agents with plug-and-play memory, context management, and chat persistence.

## 🎯 Why LangMiddle?
Building production LangGraph agents? You need:

- 💾 Persistent chat history across sessions
- 🧠 Long-term memory that remembers user preferences and context
- 🔍 Semantic fact retrieval to inject relevant knowledge
- 🗑️ Clean message handling without tool noise

LangMiddle delivers all of this with zero boilerplate: just add middleware to your agent.
## ✨ Key Features

| Feature | Description |
|---|---|
| 🚀 Zero Config Start | Works out of the box with in-memory SQLite, no database setup |
| 🔌 Multi-Backend Storage | Switch between SQLite, PostgreSQL, Supabase, and Firebase with one parameter |
| 🧠 Semantic Memory | Automatic fact extraction, deduplication, and context injection |
| 📝 Smart Summarization | Auto-compress long conversations while preserving context |
| 🔒 Production Ready | JWT auth, RLS support, type-safe APIs, comprehensive logging |
| ⚡ LangGraph Native | Built for the LangChain/LangGraph v1 middleware pattern |
## 📦 Installation

**Core package** (includes SQLite support):

```shell
pip install langmiddle
```

**With optional backends:**

```shell
# SQLite with vector search (sqlite-vec)
pip install langmiddle[sqlite]

# PostgreSQL
pip install langmiddle[postgres]

# Supabase (includes PostgreSQL)
pip install langmiddle[supabase]

# Firebase
pip install langmiddle[firebase]

# All backends + extras
pip install langmiddle[all]
```
## 🚀 Quick Start

### Basic Chat Persistence (SQLite)

Get started in 30 seconds:

```python
from langchain.agents import create_agent
from langmiddle.history import ChatSaver, StorageContext

agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        ChatSaver()  # Uses in-memory SQLite by default
    ],
)

# Chat history is saved automatically
agent.invoke(
    input={"messages": [{"role": "user", "content": "Hello!"}]},
    context=StorageContext(
        thread_id="conversation-123",
        user_id="user-456"
    )
)
```
### Production Setup (Recommended)

For production apps, use `StorageConfig` to share settings between middleware components (such as `ContextEngineer` for memory and `ChatSaver` for history).

```python
from langchain.agents import create_agent
from langmiddle import StorageConfig
from langmiddle.history import ChatSaver, StorageContext
from langmiddle.context import ContextEngineer

# 1. Define shared configuration (e.g., for Supabase)
config = StorageConfig(
    backend="supabase",
    enable_facts=True,
    auto_create_tables=True
)

# 2. Create agent with middleware
agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        # Both components use the same config
        ContextEngineer(
            model="openai:gpt-4o",
            embedder="openai:text-embedding-3-small",
            backend=config
        ),
        ChatSaver(backend=config)
    ]
)

# 3. Invoke with context
agent.invoke(
    input={"messages": [{"role": "user", "content": "I'm vegan."}]},
    context=StorageContext(
        thread_id="thread-1",
        user_id="user-1",
        auth_token="your-jwt-token"
    )
)
```
## 🔄 How It Works

### Message Flow

```text
User Input
    ↓
[ToolRemover] → Cleans tool messages (optional)
    ↓
[ContextEngineer.before_agent] → Injects facts + summary
    ↓
🤖 LangGraph Agent
    ↓
[ContextEngineer.after_agent] → Extracts new facts
    ↓
[ChatSaver] → Persists conversation
    ↓
Response
```
### Fact Lifecycle

```text
Storage:   Conversation → Extraction → Deduplication → Embedding → Storage

Retrieval: Query → Relevance Scoring (recency + access + usage)
                 → Combined Score (70% similarity + 30% relevance)
                 → Context Injection (adaptive detail) → Agent
```
**Phase 3 relevance scoring:**

- **Recency (40%):** newer facts score higher (exponential decay over 365 days)
- **Access Frequency (30%):** facts used more often get boosted
- **Usage Feedback (30%):** facts appearing in agent responses are prioritized
- **Adaptive Formatting:** high-relevance facts (≥ 0.8) get full detail, medium (0.5-0.8) compact, low (0.3-0.5) minimal
## 🛠️ Available Middleware

### ChatSaver - Persist Conversations

Automatically save chat histories to your database of choice.

**Features:**

- ✅ Multi-backend: SQLite, PostgreSQL, Supabase, Firebase
- ✅ Automatic deduplication (skips already-saved messages)
- ✅ Save-interval control (save every N turns)
- ✅ Custom state persistence
**Example:**

```python
from langmiddle.history import ChatSaver
from langmiddle import StorageConfig

# Option 1: Simple string setup
ChatSaver(
    backend="sqlite",
    db_path="./chat.db",
    save_interval=1
)

# Option 2: Shared config object (recommended)
config = StorageConfig(backend="sqlite", db_path="./chat.db")
ChatSaver(backend=config)
```
**Supported backends:**

| Backend | Use Case | Auth Required |
|---|---|---|
| SQLite | Development, local apps | ❌ No |
| PostgreSQL | Self-hosted production | ❌ No |
| Supabase | Managed PostgreSQL + RLS | ✅ JWT |
| Firebase | Mobile, real-time apps | ✅ ID token |
### ToolRemover - Clean Tool Messages

Remove tool-related clutter from conversation history.

**Why?** Tool-call messages and responses bloat chat history and aren't relevant for long-term storage.

**Example:**

```python
from langmiddle.history import ToolRemover

middleware=[
    ToolRemover(when="both"),  # Filter before AND after the agent
    ChatSaver()                # Clean messages are saved
]
```

**Options:**

- `when="before"` - filter before the agent sees messages
- `when="after"` - filter before saving to storage
- `when="both"` - filter in both directions (recommended)
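Conceptually, this kind of filter drops tool results and assistant turns that only issue tool calls. A rough sketch of the idea on plain message dicts (illustrative only, not LangMiddle's actual implementation):

```python
def strip_tool_noise(messages: list[dict]) -> list[dict]:
    """Drop tool results and tool-call-only assistant turns (illustrative)."""
    cleaned = []
    for msg in messages:
        if msg.get("role") == "tool":
            continue  # tool results never reach storage
        if msg.get("role") == "assistant" and msg.get("tool_calls") and not msg.get("content"):
            continue  # assistant turns that only request tool calls
        cleaned.append(msg)
    return cleaned

history = [
    {"role": "user", "content": "Search for LangGraph tutorials"},
    {"role": "assistant", "content": "", "tool_calls": [{"name": "search"}]},
    {"role": "tool", "content": "Search results..."},
    {"role": "assistant", "content": "Here are some tutorials I found."},
]
assert [m["role"] for m in strip_tool_noise(history)] == ["user", "assistant"]
```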
### ContextEngineer - Intelligent Memory & Context

The brain of your agent: automatic fact extraction, semantic search, and context injection.

#### 🧠 What It Does

- **Extracts facts:** identifies user preferences, goals, and key information
- **Stores semantically:** embeds facts for similarity search
- **Retrieves contextually:** injects relevant memories based on user queries
- **Auto-summarizes:** compresses old conversations to save tokens
#### 🔥 Key Features

| Feature | Description |
|---|---|
| Semantic Fact Storage | Vector-based storage with deduplication |
| Smart Extraction | Filters out ephemeral states (e.g., "user understood") |
| Namespace Organization | Hierarchical fact categories (e.g., `["user", "preferences", "food"]`) |
| Automatic Summarization | Configurable token thresholds |
| Atomic Query Breaking | Splits complex queries for better retrieval |
| Relevance Scoring | Dynamic scoring based on recency, access patterns, and usage feedback |
| Adaptive Formatting | Context detail adjusts based on fact relevance |
| Caching | Embeddings and core facts cached for performance |
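Embedding-based deduplication, as listed above, usually amounts to rejecting a new fact whose embedding is too close to one already stored. A minimal sketch of that idea, with a hypothetical 0.9 cosine-similarity threshold (the library's actual threshold and index are not shown here):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_duplicate(new_emb: list[float], stored: list[list[float]],
                 threshold: float = 0.9) -> bool:
    """Reject a fact whose embedding nearly matches an existing one."""
    return any(cosine(new_emb, emb) >= threshold for emb in stored)

stored = [[1.0, 0.0, 0.0]]
assert is_duplicate([0.99, 0.05, 0.0], stored)    # near-identical: skipped
assert not is_duplicate([0.0, 1.0, 0.0], stored)  # orthogonal: stored as new
```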
#### 📝 Example Usage

```python
from langchain.agents import create_agent
from langmiddle import StorageConfig
from langmiddle.context import ContextEngineer
from langmiddle.history import StorageContext

# 1. Define configuration
config = StorageConfig(
    backend="supabase",
    auto_create_tables=True,
    enable_facts=True
)

# 2. Initialize middleware
agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        ContextEngineer(
            model="openai:gpt-4o",
            embedder="openai:text-embedding-3-small",
            backend=config,  # Pass the shared config object
            max_tokens_before_summarization=5000,
            extraction_interval=3
        )
    ],
)

# 3. Use it!
response = agent.invoke(
    input={"messages": [{"role": "user", "content": "I love spicy Thai food"}]},
    context=StorageContext(
        thread_id="conversation-123",
        user_id="user-456",
        auth_token="your-jwt-token"
    )
)

# Later in the conversation...
response = agent.invoke(
    input={"messages": [{"role": "user", "content": "Recommend a restaurant"}]},
    context=StorageContext(
        thread_id="conversation-123",
        user_id="user-456",
        auth_token="your-jwt-token"
    )
)
# The agent remembers "user loves spicy Thai food" and uses it for recommendations.
```
#### ⚙️ Configuration Options

```python
ContextEngineer(
    model="openai:gpt-4o",                     # LLM for extraction & summarization
    embedder="openai:text-embedding-3-small",  # Embedding model for semantic search
    backend="supabase",                        # Storage backend

    # Extraction settings
    extraction_interval=3,                     # Extract facts every N turns
    max_tokens_before_extraction=None,         # Or trigger by token count

    # Summarization settings
    max_tokens_before_summarization=5000,      # Auto-summarize at 5k tokens

    # Context injection
    core_namespaces=[                          # Always-loaded fact categories
        ["user", "personal_info"],
        ["user", "preferences"]
    ],

    # Relevance scoring (Phase 3)
    relevance_threshold=0.3,                   # Minimum relevance score to inject
    similarity_weight=0.7,                     # Weight for vector similarity
    relevance_weight=0.3,                      # Weight for relevance score
    enable_adaptive_formatting=True,           # Adjust detail based on relevance

    # Backend configuration
    backend_kwargs={'enable_facts': True}
)
```
#### 📊 What Gets Stored

**Fact examples:**

```json
[
  {
    "content": "User prefers concise and formal answers",
    "namespace": ["user", "preferences", "communication"],
    "intensity": 1.0,
    "confidence": 0.97,
    "language": "en"
  },
  {
    "content": "User's name is Alex",
    "namespace": ["user", "personal_info"],
    "intensity": 0.9,
    "confidence": 0.98,
    "language": "en"
  }
]
```
**What's NOT stored (filtered out by design):**

- ❌ Transient states: "User understands X", "User is satisfied"
- ❌ Single-use requests: "User wants a code example"
- ❌ Politeness markers: "User says thank you"
- ❌ Momentary emotions: "User feels frustrated right now"
## 💾 Storage Backends

### Comparison Guide

| Backend | Best For | Setup Complexity | Scalability | Vector Search | Auth | Cost |
|---|---|---|---|---|---|---|
| SQLite | Local development, demos, single-user apps | ⭐ Trivial | Single machine | ✅ (sqlite-vec) | None | Free |
| PostgreSQL | Self-hosted production, custom infrastructure, full control | ⭐⭐ Medium | High (with replication) | ✅ (pgvector) | Custom | Infrastructure cost |
| Supabase | Web apps, multi-tenant SaaS, real-time features | ⭐⭐ Easy | High (managed) | ✅ (pgvector) | JWT + RLS | Free tier + usage |
| Firebase | Mobile apps, Google Cloud ecosystem, real-time sync | ⭐⭐ Easy | Global (managed) | ❌ | Firebase Auth | Free tier + usage |
### 🗄️ Backend Configuration

#### SQLite - Zero-config local storage with vector search

```python
from langmiddle.history import ChatSaver
from langmiddle.storage import ChatStorage

# Basic chat storage (file-based)
ChatSaver(backend="sqlite", db_path="./chat.db")

# With semantic memory (requires sqlite-vec)
# Install: pip install langmiddle[sqlite]
store = ChatStorage.create(
    "sqlite",
    db_path="./chat.db",
    auto_create_tables=True,
    enable_facts=True  # Enables vector similarity search
)

# In-memory (testing/dev)
ChatSaver(backend="sqlite", db_path=":memory:")
```

**Features:**

- ✅ Zero configuration required
- ✅ Vector similarity search via sqlite-vec
- ✅ Full Phase 3 relevance scoring
- ✅ Perfect for local development and demos

No environment variables needed!
#### PostgreSQL - Self-hosted database

Environment variables (`.env`):

```shell
POSTGRES_CONNECTION_STRING=postgresql://user:password@localhost:5432/dbname
```

Python code:

```python
from langmiddle.history import ChatSaver
from langmiddle.storage import ChatStorage

# First-time setup: create tables
store = ChatStorage.create(
    "postgres",
    connection_string="postgresql://user:pass@localhost:5432/db",
    auto_create_tables=True
)

# In middleware
ChatSaver(backend="postgres")
```
#### Supabase - Managed PostgreSQL with RLS

Environment variables (`.env`):

```shell
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key
# For table creation (one-time):
SUPABASE_CONNECTION_STRING=postgresql://postgres:[password]@db.[project].supabase.co:5432/postgres
```

Python code:

```python
from langmiddle.context import ContextEngineer
from langmiddle.storage import ChatStorage

# First-time setup: create tables
store = ChatStorage.create(
    "supabase",
    auto_create_tables=True,
    enable_facts=True  # Enable semantic memory tables
)

# In middleware
ContextEngineer(
    model="openai:gpt-4o",
    embedder="openai:text-embedding-3-small",
    backend="supabase",
    backend_kwargs={'enable_facts': True}
)
```

Context requirements:

```python
context=StorageContext(
    thread_id="conversation-123",
    user_id="user-456",
    auth_token="jwt-token-from-supabase-auth"  # Required for RLS
)
```
#### Firebase - Real-time NoSQL database

**Service account setup:**

1. Download a service account JSON file from the Firebase Console
2. Set an environment variable OR pass the path directly

Option 1: Environment variable

```shell
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/firebase-creds.json"
```

Option 2: Direct path

```python
ChatSaver(
    backend="firebase",
    credentials_path="./firebase-creds.json"
)
```

Context requirements:

```python
context=StorageContext(
    thread_id="conversation-123",
    user_id="user-456",
    auth_token="firebase-id-token"  # From Firebase Auth
)
```
## 📚 Complete Examples

### 1. Simple Chat Bot with Persistence

```python
from langchain.agents import create_agent
from langmiddle.history import ChatSaver, StorageContext

agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[ChatSaver(backend="sqlite", db_path="./chatbot.db")]
)

# Conversation 1
agent.invoke(
    {"messages": [{"role": "user", "content": "My name is Alice"}]},
    context=StorageContext(thread_id="thread-1", user_id="alice")
)

# Conversation 2 (same thread, so history is preserved)
agent.invoke(
    {"messages": [{"role": "user", "content": "What's my name?"}]},
    context=StorageContext(thread_id="thread-1", user_id="alice")
)
# Response: "Your name is Alice"
```
### 2. Agent with Tools (Clean History)

```python
from langchain.agents import create_agent
from langmiddle.history import ChatSaver, ToolRemover, StorageContext

def search_tool(query: str) -> str:
    """Search the web for a query."""
    return f"Search results for: {query}"

agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_tool],
    context_schema=StorageContext,
    middleware=[
        ToolRemover(when="both"),  # Remove tool clutter
        ChatSaver(backend="sqlite", db_path="./agent.db")
    ]
)

agent.invoke(
    {"messages": [{"role": "user", "content": "Search for LangGraph tutorials"}]},
    context=StorageContext(thread_id="thread-1", user_id="user-1")
)
# Only user/assistant messages are saved; no tool-call noise
```
### 3. Production Agent with Memory (Supabase)

```python
import os

from langchain.agents import create_agent
from langmiddle import StorageConfig
from langmiddle.context import ContextEngineer
from langmiddle.history import StorageContext
from langmiddle.storage import ChatStorage

# 1. Define shared config
config = StorageConfig(
    backend="supabase",
    enable_facts=True,
    auto_create_tables=True
)

# 2. One-time setup (optional if auto_create_tables is set in the config)
if os.getenv("INIT_TABLES"):
    store = ChatStorage.create(**config.to_kwargs())
    print("✅ Tables created!")
    exit()

# 3. Create agent
agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        ContextEngineer(
            model="openai:gpt-4o",
            embedder="openai:text-embedding-3-small",
            backend=config,
            max_tokens_before_summarization=5000,
            extraction_interval=3
        )
    ]
)

# 4. Use in your app
def chat(user_id: str, thread_id: str, message: str, jwt_token: str):
    response = agent.invoke(
        {"messages": [{"role": "user", "content": message}]},
        context=StorageContext(
            thread_id=thread_id,
            user_id=user_id,
            auth_token=jwt_token
        )
    )
    return response["messages"][-1]["content"]

# Example usage
chat(
    user_id="user-123",
    thread_id="thread-456",
    message="I prefer vegetarian food and hate spicy dishes",
    jwt_token="eyJ..."
)
# Facts extracted: "User prefers vegetarian food", "User dislikes spicy dishes"

chat(
    user_id="user-123",
    thread_id="thread-789",
    message="Recommend a restaurant",
    jwt_token="eyJ..."
)
# The agent uses stored preferences to recommend vegetarian, non-spicy options
```
### 4. Custom Configuration

```python
from langchain.agents import create_agent
from langmiddle.context import (
    ContextEngineer,
    ExtractionConfig,
    SummarizationConfig,
    ContextConfig
)
from langmiddle.history import StorageContext

agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        ContextEngineer(
            model="openai:gpt-4o",
            embedder="openai:text-embedding-3-small",
            backend="supabase",

            # Custom extraction settings
            extraction_config=ExtractionConfig(
                interval=5,               # Extract every 5 turns
                max_tokens=2000,          # Or when 2k tokens have accumulated
                prompt="<custom prompt>"  # Override the extraction prompt
            ),

            # Custom summarization settings
            summarization_config=SummarizationConfig(
                max_tokens=8000,          # Summarize at 8k tokens
                keep_ratio=0.3,           # Keep the last 30% of messages
                prefix="## Summary:\n"    # Custom prefix
            ),

            # Custom context injection
            context_config=ContextConfig(
                core_namespaces=[         # Custom always-loaded categories
                    ["user", "profile"],
                    ["user", "preferences"],
                    ["project", "settings"]
                ]
            ),
            backend_kwargs={'enable_facts': True}
        )
    ]
)
```
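As a rough picture of how `max_tokens` and `keep_ratio` interact: once the history crosses the token threshold, the oldest messages are collapsed into a summary and only the most recent fraction is kept verbatim. The token counting and summarizer below are simplified stand-ins, not LangMiddle's actual implementation:

```python
def compress_history(messages: list[str], max_tokens: int, keep_ratio: float) -> list[str]:
    """If the history exceeds max_tokens, replace the oldest messages with a
    summary and keep only the most recent keep_ratio fraction (illustrative)."""
    count_tokens = lambda m: len(m.split())  # naive whitespace token count
    if sum(count_tokens(m) for m in messages) <= max_tokens:
        return messages  # under the threshold: nothing to do
    keep = max(1, int(len(messages) * keep_ratio))
    summary = "## Summary:\n" + f"({len(messages) - keep} earlier messages condensed)"
    return [summary] + messages[-keep:]

history = [f"message {i} " * 50 for i in range(10)]  # ~100 tokens each
out = compress_history(history, max_tokens=500, keep_ratio=0.3)
assert len(out) == 4 and out[0].startswith("## Summary:")
```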
## 🎨 Architecture Highlights

- 🔌 **Modular design:** mix and match middleware components
- 🎯 **Single responsibility:** each middleware does one thing well
- ⚡ **Performance:** embedding caching, batch operations, efficient queries
- 🛡️ **Type safety:** full Pydantic validation and type hints
- 📊 **Observable:** structured logging with operation IDs and metrics
- 🧪 **Testable:** clean abstractions, dependency injection
## 🤝 Contributing

We welcome contributions! Here's how you can help:

- 🐛 Report bugs via GitHub Issues
- 💡 Request features or improvements
- 🔧 Submit PRs for bug fixes or new features
- 📖 Improve docs with examples or clarifications
- ⭐ Star the repo if LangMiddle helped you!

### Development Setup

```shell
git clone https://github.com/alpha-xone/langmiddle.git
cd langmiddle
pip install -e ".[dev]"
pytest
```
## 📄 License

Apache License 2.0. See LICENSE for details.

## 🌟 Show Your Support

If LangMiddle saves you time or helps your project, please:

- ⭐ Star the repo on GitHub
- 🐦 Share on Twitter/X
- 💬 Tell others in the LangChain community

Built with ❤️ for the LangGraph ecosystem