langgraph-kit
Reusable LangGraph agent toolkit with memory, tools, and orchestration
A batteries-included toolkit for building production-grade LangGraph agents — persistent memory, rich tool capabilities, multi-agent orchestration, context management, slash commands, HITL, and streaming SSE, all composable and all optional.
Status: Alpha (v0.9.0). APIs may still evolve before 1.0. License: AGPL-3.0-or-later. Python: 3.11 – 3.13.
Why langgraph-kit?
LangGraph gives you the graph primitives. langgraph-kit gives you the rest of the stack — the pieces every non-trivial agent ends up re-implementing: typed memory with auto-extraction, a tool registry with risk levels and HITL, a prompt composer tuned for Anthropic prompt caching, context-pressure middleware, a slash-command dispatcher, a ready-made FastAPI router, and a fully-wired reference agent you can clone.
Every subsystem is independent. Use just the memory manager if that's all you need — or wire everything together via the reference deep agent and start overlaying your domain on top.
Features
- Multi-provider LLM factory — OpenAI, Anthropic (Claude), and Google (Gemini) via a single `build_llm()` call; provider auto-detected from model name.
- Persistent memory — typed (`USER`/`FEEDBACK`/`PROJECT`/`REFERENCE`) and scoped (user/assistant/project/team) records stored in LangGraph's `Store`, with agent-callable CRUD tools, semantic search, LLM-powered auto-extraction, and background consolidation.
- Session notebook — per-thread structured scratch-space (current state, task spec, workflow, errors, results) that survives compaction.
- Tool registry with capability metadata — every tool carries a risk level (`READ_ONLY`/`MUTATING`/`DESTRUCTIVE`), profile/worker/tag filters, and optional prompt-guidance fragments that are injected into the system prompt.
- Prompt assembly with cache-aware ordering — `STABLE`/`VOLATILE`/`CONDITIONAL` sections composed stable-first to maximize Anthropic prompt-cache hits, plus pluggable `ContextProvider`s for dynamic runtime context.
- Context pressure management — token estimation, microcompaction, full LLM-driven compaction, and a circuit breaker — applied automatically via middleware.
- Resilience middleware — completion guard (detects premature stops), empty-turn nudger, structured tool-error recovery, and a post-run backstop.
- Slash-command dispatcher — transport-independent `/help`, `/memory`, `/context`, `/compact`, `/status`, `/tools`, `/skills` built in; short-circuits the LLM entirely on command matches.
- Multi-agent orchestration — declarative worker definitions (`researcher`/`implementer`/`verifier`), fire-and-forget async tasks, a per-thread message queue for busy threads, and a read-only coordinator profile.
- Human-in-the-loop — `approve_action` tool and `interrupt_before=True` capabilities that pause the graph via LangGraph interrupts and stream approval requests to the frontend.
- Rich UI events — typed streaming events for artifacts (code, markdown, diagrams, tables), progress updates, suggestion chips, and citation cards, delivered alongside tokens over SSE.
- Skills (progressive disclosure) — agents discover and load `SKILL.md` files on demand instead of bloating the system prompt.
- MCP integration — wrap any Model Context Protocol server's tools as native `ToolCapability` entries.
- Plugin system — drop `.py` files with a `contribute()` function into a plugins directory to extend the registry.
- Ready-made FastAPI router — 11 endpoints covering streaming, invoke, thread state, message queue, HITL resume, and checkpoint branching/forking.
- Persistence out of the box — `AsyncPostgresSaver` + `AsyncPostgresStore` for production, `AsyncSqliteSaver` + `InMemoryStore` for local dev, switched by the `database_url` scheme.
- Observability — first-class Langfuse tracing, run-config builder, and per-thread token budgets.
- CLI scaffolding — `python -m langgraph_kit.cli new <agent_id>` generates a complete agent template following the kit's conventions.
- Evaluation framework — `evals/` module with a runner, reports, and both rule-based and model-graded metrics.
- Fully typed — ships with `py.typed`, type-checked under basedpyright.
The Reference Deep Agent
The toolkit ships with reference-deep-agent — a full-stack general-purpose agent that wires every kit feature together. Clone this agent as the starting point for any new domain-specific agent (see coding_agent.py for the canonical extension pattern).
It is built on the deepagents framework and the shared build_deep_agent skeleton, so you get the entire feature stack below with one call:
```python
from langgraph_kit.graphs.reference_deep_agent import build_reference_deep_agent

graph, dispatcher = build_reference_deep_agent(checkpointer, store, mcp_tools=[])
```
What's wired in
Layered prompt assembly — five core sections registered at build time:
| Section | Stability | Purpose |
|---|---|---|
| `core_identity` | STABLE | Agent identity and operating principles |
| `memory_instructions` | CONDITIONAL (memory) | How to use persistent memory responsibly |
| `orchestration_instructions` | CONDITIONAL (orchestration) | When and how to delegate to workers |
| `continuation_guidance` | STABLE | When to continue vs. stop on no-progress |
| `ui_interaction` | STABLE | How to use `emit_progress` / `suggest_actions` / `add_citation` / `approve_action` |
Stable sections are placed first to maximize Anthropic prompt-cache hits; volatile tool-guidance fragments and three default context providers (Thread, Memory, Tool) are appended per turn.
Full 11-middleware stack — applied in order by build_middleware_stack:
1. `CommandMiddleware` — intercepts `/`-prefixed user messages and short-circuits the LLM on handled commands.
2. `RuntimeStateMiddleware` — populates per-turn runtime state available to other middleware.
3. `QueuedInputMiddleware` — drains the per-thread message queue at the start of each turn and injects buffered messages.
4. `ToolErrorMiddleware` — wraps tool calls, converts exceptions to structured errors the agent can reason about, and retries transient failures.
5. `PressureMiddleware` — estimates tokens and applies the selected mitigation strategy (`MICROCOMPACT`, `SESSION_ASSISTED`, `FULL_COMPACTION`, or `STOP` circuit breaker at 3× compaction failures).
6. `ResultPersistenceMiddleware` — offloads large tool outputs to the store to free up context.
7. `ExtractionMiddleware` — runs post-turn LLM-powered memory extraction, respecting the memory taxonomy (don't memorize what's already in the repo).
8. `EmptyTurnMiddleware` — nudges the model with a concrete instruction when it produces no output.
9. `CompletionGuardMiddleware` — detects premature completion heuristically and challenges the agent to justify stopping.
10. `StopHooksMiddleware` — runs registered stop hooks at graph end.
11. `PostRunBackstopMiddleware` — final safety check after graph execution.
Worker (sub-agent) definitions — pre-composed as GENERAL_WORKERS:
| Worker | Role |
|---|---|
| `researcher` | Finds information, reads docs, searches code |
| `implementer` | Writes code, makes changes, builds features |
| `verifier` | Reviews changes, runs tests, validates output |
The primary agent delegates bounded work via the `task` / `start_async_task` tools; each worker runs on its own thread with its own checkpointed state.
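The fire-and-forget pattern can be sketched without any framework. The sketch below is a guess at the mechanics, not the kit's implementation; real workers run on their own checkpointed threads rather than bare asyncio tasks, and the names mirror the kit's tools only for readability.

```python
import asyncio
import uuid

_tasks: dict[str, asyncio.Task] = {}


async def start_async_task(coro) -> str:
    """Launch background work and return a handle immediately."""
    task_id = str(uuid.uuid4())
    _tasks[task_id] = asyncio.create_task(coro)
    return task_id


def check_async_task(task_id: str) -> str:
    """Poll a previously started task without blocking on it."""
    task = _tasks[task_id]
    if not task.done():
        return "running"
    if task.exception() is not None:
        return f"failed: {task.exception()}"
    return f"done: {task.result()}"


async def main() -> None:
    async def research() -> str:
        await asyncio.sleep(0.01)  # stand-in for a worker run
        return "3 relevant files found"

    tid = await start_async_task(research())
    print(check_async_task(tid))  # "running" (task has not been scheduled yet)
    await asyncio.sleep(0.05)
    print(check_async_task(tid))  # "done: 3 relevant files found"


asyncio.run(main())
```

The design choice worth noting: returning a handle immediately keeps the primary agent's turn short, so the LLM can keep conversing while delegated work completes in the background.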
Standard tool set — registered by register_standard_tools:
- Five memory CRUD tools (`save_memory`, `list_memories`, `search_memories`, `update_memory`, `delete_memory`).
- UI event tools (`emit_progress`, `suggest_actions`, `add_citation`, `create_artifact`).
- HITL `approve_action` for destructive operations.
- Skills discovery (`discover_skills`, `get_skill_guidance`).
- Async-task orchestration (`start_async_task`, `check_async_task`).
- Any MCP tools passed in via `mcp_tools=`.
Seven built-in slash commands — dispatched by CommandDispatcher:
| Command | Effect |
|---|---|
| `/help` | Lists available commands |
| `/memory` | Inspects persistent memory for the current scope |
| `/context` | Shows current context-pressure state and token estimate |
| `/compact` | Forces a microcompaction pass |
| `/status` | Reports agent / thread / pressure status |
| `/tools` | Lists registered tools with risk levels |
| `/skills` | Lists discovered skills |
Plus — composite backend factory (memories + notes + state), Langfuse observability hooks, automatic persistence across checkpointer+store, and conditional section activation for memory, orchestration, deferred_tools, skills, and async_tasks.
See docs/agents/reference-deep-agent.md for the full breakdown.
Installation
```bash
# Core package (from PyPI once published — currently pre-release from GitHub)
uv add "langgraph-kit @ git+https://github.com/allada-homelab/langgraph-kit@v0.9.0"

# The reference deep agent needs the `deepagents` extra plus one LLM provider
uv add "langgraph-kit[deepagents,anthropic] @ git+https://github.com/allada-homelab/langgraph-kit@v0.9.0"

# The full kitchen sink
uv add "langgraph-kit[all] @ git+https://github.com/allada-homelab/langgraph-kit@v0.9.0"
```
Optional extras
| Extra | Installs | Use when... |
|---|---|---|
| `openai` | langchain-openai | using GPT models (default — also covers OpenAI-compatible endpoints) |
| `anthropic` | langchain-anthropic | using `claude-*` models |
| `google` | langchain-google-genai | using `gemini-*` models |
| `postgres` | langgraph-checkpoint-postgres | running against PostgreSQL in production |
| `deepagents` | deepagents | using reference-deep-agent or coding-agent |
| `mcp` | langchain-mcp-adapters | integrating MCP servers as tools |
| `mcp-server` | mcp | exposing your agent as an MCP server |
| `fastapi` | fastapi | using the built-in REST router |
| `agui` | ag-ui-protocol | streaming via the AG-UI protocol |
| `a2a` | a2a-sdk | agent-to-agent protocol support |
| `langfuse` | langfuse | enabling Langfuse tracing |
| `all` | everything above | local development, demos |
Quickstart
1. Configure at startup
```python
from langgraph_kit import AgentConfig, configure

configure(AgentConfig(
    llm_model="claude-sonnet-4-6",          # provider auto-detected from prefix
    llm_api_key="sk-ant-...",
    database_url="sqlite:///checkpoints.db",  # or postgresql://...
))
```
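As a rough illustration of what prefix-based provider detection might look like, here is a standalone sketch. This is a guess at the rule, not the kit's actual `build_llm()` logic, which may differ:

```python
def detect_provider(model: str) -> str:
    """Guess the LLM provider from the model-name prefix."""
    if model.startswith("claude"):
        return "anthropic"
    if model.startswith("gemini"):
        return "google"
    # Default: OpenAI (the README notes this also covers
    # OpenAI-compatible endpoints).
    return "openai"


assert detect_provider("claude-sonnet-4-6") == "anthropic"
assert detect_provider("gemini-2.0-flash") == "google"
assert detect_provider("gpt-4o") == "openai"
```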
2. Register the built-in agents
```python
from langgraph_kit import create_persistence
from langgraph_kit.graphs import register_all

async with create_persistence() as (checkpointer, store):
    register_all(checkpointer, store, mcp_tools=[])
    # echo-agent, basic-deep-agent, reference-deep-agent, coding-agent,
    # and supervisor-agent are all registered.
```
3. Stream a conversation
```python
import uuid

from langgraph_kit import get, stream_agent_events

graph = get("reference-deep-agent")
thread_id = str(uuid.uuid4())
config = {"configurable": {"thread_id": thread_id}}
input_data = {"messages": [{"role": "user", "content": "Hello!"}]}

async for event in stream_agent_events(graph, input_data, config):
    print(event, end="")
```
4. Or expose everything via FastAPI
```python
from contextlib import asynccontextmanager

from fastapi import FastAPI, Depends

from langgraph_kit import AgentConfig, configure, create_persistence
from langgraph_kit.contrib.fastapi import create_agent_router
from langgraph_kit.graphs import register_all


@asynccontextmanager
async def lifespan(app: FastAPI):
    configure(AgentConfig(llm_model="claude-sonnet-4-6", llm_api_key="sk-ant-..."))
    async with create_persistence() as (checkpointer, store):
        register_all(checkpointer, store, mcp_tools=[])
        app.state.store = store
        yield


app = FastAPI(lifespan=lifespan)
app.include_router(
    create_agent_router(get_current_user=your_auth_dependency),
    prefix="/api/v1",
)
```
That's it. You now have `GET /api/v1/agents/`, `POST /api/v1/agents/{id}/stream` (SSE), `POST /api/v1/agents/{id}/invoke`, thread-state endpoints, a message queue, HITL resume, and checkpoint branching — full list in docs/integrations/fastapi.md.
Usage guides
Each subsystem is independent and can be used on its own. The examples below are intentionally minimal — follow the links for full docs.
Memory
Typed, scoped, persistent knowledge that survives across conversations. Five agent-callable CRUD tools, LLM-powered auto-extraction after each turn, and background consolidation to merge near-duplicates and prune stale records.
```python
from langgraph_kit.core.memory.persistent import PersistentMemoryManager
from langgraph_kit.core.memory.models import MemoryRecord, MemoryType, MemoryScope

mgr = PersistentMemoryManager(store)

await mgr.save(MemoryRecord(
    title="User prefers terse responses",
    type=MemoryType.FEEDBACK,
    scope=MemoryScope.USER,
    summary="No trailing summaries; diff is sufficient.",
    body="...why and how to apply...",
))

hits = await mgr.search("response style", scope=MemoryScope.USER)
```
See docs/memory/overview.md, extraction, consolidation, shared-memory, session-notebook.
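To give a feel for what consolidation does, here is a toy illustration of merging near-duplicate records. This is illustrative only: it clusters by string similarity, whereas the kit's consolidation works over the store with semantic search, and `dedupe` is a name invented for this sketch.

```python
from difflib import SequenceMatcher


def dedupe(titles: list[str], threshold: float = 0.85) -> list[str]:
    """Keep the first of each cluster of near-identical titles."""
    kept: list[str] = []
    for title in titles:
        if all(
            SequenceMatcher(None, title.lower(), k.lower()).ratio() < threshold
            for k in kept
        ):
            kept.append(title)
    return kept


records = [
    "User prefers terse responses",
    "User prefers terse responses.",  # near-duplicate, merged away
    "Project uses uv for dependency management",
]
print(dedupe(records))
```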
Tools with capability metadata
Tools aren't just callables — they carry risk levels, profile/worker filters, tags, and prompt guidance. The registry supports filtering and compilation:
```python
from langgraph_kit.core.tools.registry import ToolRegistry
from langgraph_kit.core.tools.capability import ToolCapability, ToolRisk

registry = ToolRegistry()
registry.register(ToolCapability(
    name="delete_branch",
    func=my_delete_branch_impl,
    risk=ToolRisk.DESTRUCTIVE,
    interrupt_before=True,  # triggers HITL approval
    profiles={"coding"},
    worker_types={"implementer"},
    prompt_guidance="Use only after confirming the branch is merged.",
))

tools = registry.compile_tools(max_risk=ToolRisk.MUTATING)  # filtered
```
See docs/tools/overview.md, capability, registry, memory-tools, worktree-tools.
Prompt assembly
Layered sections + context providers, ordered stable-first for prompt-cache efficiency.
```python
from langgraph_kit.core.prompt_assembly.sections import (
    PromptSection, SectionStability, SectionRegistry,
)
from langgraph_kit.core.prompt_assembly.composer import PromptComposer

registry = SectionRegistry()
registry.register(PromptSection(
    id="core_identity", priority=100, stability=SectionStability.STABLE,
    content="You are a helpful coding assistant...",
))

composer = PromptComposer(registry, providers=[...])
system_prompt = composer.compose_sections_only(conditions={"memory", "skills"})
```
See docs/prompt-assembly/overview.md.
Context pressure & compaction
The PressureMonitor runs every turn; the PressureMiddleware automatically applies the chosen mitigation. Thresholds: 70% → microcompact large tool outputs, 85% → LLM-driven full compaction, 3 failures → circuit-break.
See docs/context-management/overview.md.
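The threshold ladder maps to a small decision function. The sketch below uses the numbers quoted above and the strategy names from the middleware list; the function itself is illustrative, not the kit's code:

```python
def choose_mitigation(estimated_tokens: int, budget: int,
                      compaction_failures: int) -> str:
    """Pick a mitigation from the utilization ladder described above."""
    if compaction_failures >= 3:
        return "STOP"               # circuit breaker
    utilization = estimated_tokens / budget
    if utilization >= 0.85:
        return "FULL_COMPACTION"    # LLM-driven rewrite of the history
    if utilization >= 0.70:
        return "MICROCOMPACT"       # trim large tool outputs only
    return "NONE"


assert choose_mitigation(60_000, 100_000, 0) == "NONE"
assert choose_mitigation(75_000, 100_000, 0) == "MICROCOMPACT"
assert choose_mitigation(90_000, 100_000, 0) == "FULL_COMPACTION"
assert choose_mitigation(90_000, 100_000, 3) == "STOP"
```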
Slash commands
```python
from langgraph_kit.core.commands.dispatcher import CommandDispatcher, CommandResult

dispatcher = CommandDispatcher()

async def my_handler(args: str, context) -> CommandResult:
    return CommandResult(output=f"You said: {args}", handled=True)

dispatcher.register("/echo", my_handler)
```
The CommandMiddleware intercepts any /-prefixed user message and short-circuits the LLM on handled commands. Built-ins are registered automatically by build_command_dispatcher.
See docs/commands/overview.md.
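The short-circuit mechanics can be sketched without LangGraph at all. This is a hypothetical minimal dispatcher, not the kit's `CommandDispatcher`: match the command prefix, run the handler, and only fall through to the LLM when nothing handled the message.

```python
import asyncio
from collections.abc import Awaitable, Callable
from dataclasses import dataclass


@dataclass
class CommandResult:
    output: str
    handled: bool


Handler = Callable[[str, object], Awaitable[CommandResult]]


class Dispatcher:
    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._handlers[name] = handler

    async def dispatch(self, message: str, context: object = None) -> CommandResult:
        if message.startswith("/"):
            name, _, args = message.partition(" ")
            if name in self._handlers:
                return await self._handlers[name](args, context)
        # Not a handled command: fall through so the LLM sees the message.
        return CommandResult(output="", handled=False)


async def echo(args: str, context) -> CommandResult:
    return CommandResult(output=f"You said: {args}", handled=True)


d = Dispatcher()
d.register("/echo", echo)
print(asyncio.run(d.dispatch("/echo hi")).output)     # You said: hi
print(asyncio.run(d.dispatch("plain text")).handled)  # False
```

Because `handled=True` short-circuits before any model call, built-ins like `/status` cost zero tokens.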
Multi-agent orchestration
Declarative worker definitions for the deepagents task tool, plus start_async_task / check_async_task for fire-and-forget background work and a store-backed per-thread message queue for busy threads.
```python
from langgraph_kit.core.orchestration.workers import GENERAL_WORKERS, CODING_WORKERS

# or define your own:
MY_WORKERS = [
    {"name": "data-analyst", "description": "...", "system_prompt": "...", "tools": [...]},
]
```
See docs/orchestration/overview.md.
Human-in-the-loop
```python
# In a tool or middleware:
from langgraph_kit.core.hitl.tools import approve_action

await approve_action(
    action="delete_file",
    description="Remove stale config",
    context={"path": "config.yaml"},
)
# → graph pauses via interrupt, SSE emits the approval request,
#   client POSTs /resume with {"responses": [{"type": "accept"}]}.
```
UI events & artifacts
Rich, typed events alongside token stream. Emitted via sentinel-prefixed tool outputs that the streaming layer converts to typed SSE events.
| Tool | SSE key | Use for |
|---|---|---|
| `create_artifact` | `artifact` | Code blocks, markdown, diagrams, tables |
| `emit_progress` | `progress` | Step-by-step progress on multi-step tasks |
| `suggest_actions` | `suggestions` | 2–4 clickable follow-up buttons |
| `add_citation` | `citation` | Collapsible source cards for files, docs, URLs |
See docs/ui-events/overview.md.
Skills (progressive disclosure)
Drop SKILL.md files into a skills directory. Agents discover and load them on demand via discover_skills and get_skill_guidance, keeping the base system prompt small.
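A minimal sketch of discovery, assuming one subdirectory per skill with a `SKILL.md` inside (the kit's actual layout conventions may differ):

```python
import tempfile
from pathlib import Path


def discover_skills(skills_dir: Path) -> list[str]:
    """List skill names: one subdirectory per skill, each holding a SKILL.md."""
    return sorted(p.parent.name for p in skills_dir.glob("*/SKILL.md"))


# Demo against a throwaway directory layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("git-bisect", "sql-tuning"):
        (root / name).mkdir()
        (root / name / "SKILL.md").write_text(f"# {name}\nWhen to use...\n")
    print(discover_skills(root))  # ['git-bisect', 'sql-tuning']
```

Only the names are surfaced up front; the full `SKILL.md` body is loaded on demand, which is what keeps the base system prompt small.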
MCP & plugins
- MCP servers — configure via `AgentConfig.mcp_servers` (JSON string) or pass `mcp_tools=[...]` into the builder. Each tool is wrapped as a native `ToolCapability`.
- Python plugins — drop `.py` files with a `contribute(registry)` function into `AgentConfig.plugins_dir`.
Scaffolding a new agent
```bash
uv run python -m langgraph_kit.cli new my-agent --output-dir ./agents/
```
Generates a complete template with prompt sections, worker definitions, tool registration, middleware stack, backend factory, and a build_graph() that follows the standard contract.
Evaluation
Rule-based and model-graded evaluation with a runner and report module — lives under src/langgraph_kit/evals/.
Architecture
```text
src/langgraph_kit/
├── _config.py               AgentConfig + configure()
├── llm.py                   Multi-provider LLM factory
├── persistence.py           Checkpointer + Store factory
├── registry.py              Agent ID → graph mapping
├── streaming.py             SSE event streaming
├── observability.py         Langfuse integration
├── cli.py                   Agent scaffolding
│
├── core/                    Composable building blocks
│   ├── memory/              Persistent memory, consolidation, shared
│   ├── tools/               Capability model + registry + worktree tools
│   ├── commands/            Slash-command dispatcher
│   ├── context_management/  Pressure monitor, compaction
│   ├── prompt_assembly/     Section-based composer
│   ├── orchestration/       Workers, async tasks, queue
│   ├── resilience/          Completion guard, empty turn, tool error
│   ├── hitl/                Interrupt-based approval
│   ├── skills/              SKILL.md discovery
│   ├── plugins/             MCP + plugin loader
│   └── graph_builder/       Assembly factories
│
├── graphs/                  Agent implementations
│   ├── echo_agent.py
│   ├── basic_deep_agent.py
│   ├── reference_deep_agent.py   ← clone this
│   ├── coding_agent.py           ← canonical extension example
│   └── supervisor_agent.py
│
├── contrib/                 Optional integrations (fastapi, agui, a2a, mcp_server)
└── evals/                   Evaluation framework
```
Full walkthrough: docs/architecture/overview.md.
Extension points
| Extension | Mechanism |
|---|---|
| New agent | Implement `build_graph(checkpointer, store)` and `register(...)` it |
| New tool | `registry.register(ToolCapability(...))` |
| New command | `dispatcher.register("/foo", handler)` |
| New prompt section | `sections.register(PromptSection(...))` |
| New context provider | Implement the `ContextProvider` protocol |
| New middleware | Subclass `_AgentMiddleware` |
| New skill | Add a `SKILL.md` file |
| MCP tools | Configure `AgentConfig.mcp_servers` |
| Python plugins | Drop a `.py` with `contribute()` into `plugins_dir` |
Documentation
Full docs are rendered from docs/ via MkDocs — start at docs/index.md. Highlights:
- Architecture Overview
- Quickstart
- Configuration
- Reference Deep Agent
- Public API
- SSE Event Types
- Store Namespaces
Development
```bash
git clone https://github.com/allada-homelab/langgraph-kit
cd langgraph-kit
uv sync --extra dev

# Standard loop
just test        # pytest
just lint        # ruff check + codespell
just fmt         # ruff format
just typecheck   # basedpyright
just pre-commit  # all of the above
just build       # hatchling sdist + wheel
```
Integration testing with a generated app
The test app is generated from python-template via Copier:
```bash
uv tool install copier
bash scripts/setup-testapp.sh
cd testapp && uv run pytest backend/
```
See CONTRIBUTING.md for the contribution workflow.
License
AGPL-3.0-or-later. If that's a problem for commercial use, open an issue to discuss licensing.