
langgraph-kit

A batteries-included toolkit for building production-grade LangGraph agents — persistent memory, rich tool capabilities, multi-agent orchestration, context management, slash commands, HITL, and streaming SSE, all composable and all optional.

Status: Alpha (v0.9.7). APIs may still evolve before 1.0. License: AGPL-3.0-or-later. Python: 3.11 – 3.13.

Why langgraph-kit?

LangGraph gives you the graph primitives. langgraph-kit gives you the rest of the stack — the pieces every non-trivial agent ends up re-implementing: typed memory with auto-extraction, a tool registry with risk levels and HITL, a prompt composer tuned for Anthropic prompt caching, context-pressure middleware, a slash-command dispatcher, a ready-made FastAPI router, and a fully-wired reference agent you can clone.

Every subsystem is independent. Use just the memory manager if that's all you need — or wire everything together via the reference deep agent and start overlaying your domain on top.

Features

  • Multi-provider LLM factory — OpenAI, Anthropic (Claude), and Google (Gemini) via a single build_llm() call; provider auto-detected from model name.
  • Persistent memory — typed (USER / FEEDBACK / PROJECT / REFERENCE) and scoped (user / assistant / project / team) records stored in LangGraph's Store, with agent-callable CRUD tools, semantic search, LLM-powered auto-extraction, and background consolidation.
  • Session notebook — per-thread structured scratch-space (current state, task spec, workflow, errors, results) that survives compaction.
  • Tool registry with capability metadata — every tool carries a risk level (READ_ONLY / MUTATING / DESTRUCTIVE), profile/worker/tag filters, and optional prompt-guidance fragments that are injected into the system prompt.
  • Prompt assembly with cache-aware ordering — STABLE / VOLATILE / CONDITIONAL sections composed stable-first to maximize Anthropic prompt-cache hits, plus pluggable ContextProviders for dynamic runtime context.
  • Context pressure management — token estimation, microcompaction, full LLM-driven compaction, and a circuit breaker — applied automatically via middleware.
  • Resilience middleware — completion guard (detects premature stops), empty-turn nudger, structured tool-error recovery, and a post-run backstop.
  • Slash-command dispatcher — transport-independent /help, /memory, /context, /compact, /status, /tools, /skills built in; short-circuits the LLM entirely on command matches.
  • Multi-agent orchestration — declarative worker definitions (researcher / implementer / verifier), fire-and-forget async tasks, a per-thread message queue for busy threads, and a read-only coordinator profile.
  • Human-in-the-loop — approve_action tool and interrupt_before=True capabilities that pause the graph via LangGraph interrupts and stream approval requests to the frontend.
  • Rich UI events — typed streaming events for artifacts (code, markdown, diagrams, tables), progress updates, suggestion chips, and citation cards, delivered alongside tokens over SSE.
  • Skills (progressive disclosure) — agents discover and load SKILL.md files on demand instead of bloating the system prompt.
  • MCP integration — wrap any Model Context Protocol server's tools as native ToolCapability entries.
  • Plugin system — drop .py files with a contribute() function into a plugins directory to extend the registry.
  • Ready-made FastAPI router — 11 endpoints covering streaming, invoke, thread state, message queue, HITL resume, and checkpoint branching/forking.
  • Persistence out of the box — AsyncPostgresSaver + AsyncPostgresStore for production, AsyncSqliteSaver + InMemoryStore for local dev, switched by the database_url scheme.
  • Observability — first-class Langfuse tracing, run-config builder, and per-thread token budgets.
  • CLI scaffolding — python -m langgraph_kit.cli new <agent_id> generates a complete agent template following the kit's conventions.
  • Evaluation framework — evals/ module with a runner, reports, and both rule-based and model-graded metrics.
  • Fully typed — ships with py.typed, type-checked under basedpyright.

The Reference Deep Agent

The toolkit ships with reference-deep-agent — a full-stack general-purpose agent that wires every kit feature together. Clone this agent as the starting point for any new domain-specific agent (see coding_agent.py for the canonical extension pattern).

It is built on the deepagents framework and the shared build_deep_agent skeleton, so you get the entire feature stack below with one call:

from langgraph_kit.graphs.reference_deep_agent import build_reference_deep_agent

graph, dispatcher = build_reference_deep_agent(checkpointer, store, mcp_tools=[])

⚠️ Recursion limit defaults to 100 — not LangGraph's native 25. A full-stack deep agent easily burns through 25 supersteps on a single real task (middleware passes, worker round-trips, tool loops), so every deep agent built by this kit binds recursion_limit=100 via .with_config(). Raise it for long autonomous runs — pass recursion_limit=500 to any build_*_deep_agent / build_coding_agent call, or override per-run with config={"recursion_limit": 500} on ainvoke / astream_events. The constant lives at langgraph_kit.graphs.DEFAULT_RECURSION_LIMIT.
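For example, a per-run override — the config shape is standard LangGraph, and the thread ID and prompt are illustrative:

config = {
    "configurable": {"thread_id": "long-autonomous-run"},
    "recursion_limit": 500,  # overrides the kit's default of 100 for this run only
}
result = await graph.ainvoke(
    {"messages": [{"role": "user", "content": "Migrate the test suite."}]},
    config=config,
)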

What's wired in

Layered prompt assembly — five core sections registered at build time:

| Section | Stability | Purpose |
| --- | --- | --- |
| core_identity | STABLE | Agent identity and operating principles |
| memory_instructions | CONDITIONAL (memory) | How to use persistent memory responsibly |
| orchestration_instructions | CONDITIONAL (orchestration) | When and how to delegate to workers |
| continuation_guidance | STABLE | When to continue vs. stop on no-progress |
| ui_interaction | STABLE | How to use emit_progress / suggest_actions / add_citation / approve_action |

Stable sections are placed first to maximize Anthropic prompt-cache hits; volatile tool-guidance fragments and three default context providers (Thread, Memory, Tool) are appended per turn.

Full 11-middleware stack — applied in order by build_middleware_stack:

  1. CommandMiddleware — intercepts /-prefixed user messages and short-circuits the LLM on handled commands.
  2. RuntimeStateMiddleware — populates per-turn runtime state available to other middleware.
  3. QueuedInputMiddleware — drains the per-thread message queue at the start of each turn and injects buffered messages.
  4. ToolErrorMiddleware — wraps tool calls, converts exceptions to structured errors the agent can reason about, and retries transient failures.
  5. PressureMiddleware — estimates tokens and applies the selected mitigation strategy (MICROCOMPACT, SESSION_ASSISTED, FULL_COMPACTION, or STOP circuit breaker at 3× compaction failures).
  6. ResultPersistenceMiddleware — offloads large tool outputs to the store to free up context.
  7. ExtractionMiddleware — runs post-turn LLM-powered memory extraction, respecting the memory taxonomy (don't memorize what's already in the repo).
  8. EmptyTurnMiddleware — nudges the model with a concrete instruction when it produces no output.
  9. CompletionGuardMiddleware — detects premature completion heuristically and challenges the agent to justify stopping.
  10. StopHooksMiddleware — runs registered stop hooks at graph end.
  11. PostRunBackstopMiddleware — final safety check after graph execution.

Worker (sub-agent) definitions — pre-composed as GENERAL_WORKERS:

| Worker | Role |
| --- | --- |
| researcher | Finds information, reads docs, searches code |
| implementer | Writes code, makes changes, builds features |
| verifier | Reviews changes, runs tests, validates output |

The primary agent delegates bounded work via task/start_async_task tools; each worker runs on its own thread with its own checkpointed state.

Standard tool set — registered by register_standard_tools:

  • Five memory CRUD tools (save_memory, list_memories, search_memories, update_memory, delete_memory).
  • UI event tools (emit_progress, suggest_actions, add_citation, create_artifact).
  • HITL approve_action for destructive operations.
  • Skills discovery (discover_skills, get_skill_guidance).
  • Async-task orchestration (start_async_task, check_async_task).
  • Any MCP tools passed in via mcp_tools=.

Seven built-in slash commands — dispatched by CommandDispatcher:

| Command | Effect |
| --- | --- |
| /help | Lists available commands |
| /memory | Inspects persistent memory for the current scope |
| /context | Shows current context-pressure state and token estimate |
| /compact | Forces a microcompaction pass |
| /status | Reports agent / thread / pressure status |
| /tools | Lists registered tools with risk levels |
| /skills | Lists discovered skills |

Plus — composite backend factory (memories + notes + state), Langfuse observability hooks, automatic persistence across checkpointer+store, and conditional section activation for memory, orchestration, deferred_tools, skills, and async_tasks.

See docs/agents/reference-deep-agent.md for the full breakdown.

Installation

# Core package (from PyPI once published — currently pre-release from GitHub)
uv add "langgraph-kit @ git+https://github.com/allada-homelab/langgraph-kit@v0.9.7"

# The reference deep agent needs the `deepagents` extra plus one LLM provider
uv add "langgraph-kit[deepagents,anthropic] @ git+https://github.com/allada-homelab/langgraph-kit@v0.9.7"

# The full kitchen sink
uv add "langgraph-kit[all] @ git+https://github.com/allada-homelab/langgraph-kit@v0.9.7"

Optional extras

| Extra | Installs | Use when... |
| --- | --- | --- |
| openai | langchain-openai | using GPT models (default — also covers OpenAI-compatible endpoints) |
| anthropic | langchain-anthropic | using claude-* models |
| google | langchain-google-genai | using gemini-* models |
| postgres | langgraph-checkpoint-postgres | running against PostgreSQL in production |
| deepagents | deepagents | using reference-deep-agent or coding-agent |
| mcp | langchain-mcp-adapters | integrating MCP servers as tools |
| mcp-server | mcp | exposing your agent as an MCP server |
| fastapi | fastapi | using the built-in REST router |
| agui | ag-ui-protocol | streaming via the AG-UI protocol |
| a2a | a2a-sdk | agent-to-agent protocol support |
| langfuse | langfuse | enabling Langfuse tracing |
| all | everything above | local development, demos |

Quickstart

1. Configure at startup

from langgraph_kit import AgentConfig, configure

configure(AgentConfig(
    llm_model="claude-sonnet-4-6",          # provider auto-detected from prefix
    llm_api_key="sk-ant-...",
    database_url="sqlite:///checkpoints.db",  # or postgresql://...
))

2. Register the built-in agents

from langgraph_kit import create_persistence
from langgraph_kit.graphs import register_all

async with create_persistence() as (checkpointer, store):
    register_all(checkpointer, store, mcp_tools=[])
    # echo-agent, basic-deep-agent, reference-deep-agent, coding-agent,
    # and supervisor-agent are all registered.

3. Stream a conversation

import uuid
from langgraph_kit import get, stream_agent_events

graph = get("reference-deep-agent")
thread_id = str(uuid.uuid4())
config = {"configurable": {"thread_id": thread_id}}
input_data = {"messages": [{"role": "user", "content": "Hello!"}]}

async for event in stream_agent_events(graph, input_data, config):
    print(event, end="")

4. Or expose everything via FastAPI

from contextlib import asynccontextmanager
from fastapi import FastAPI
from langgraph_kit import AgentConfig, configure, create_persistence
from langgraph_kit.contrib.fastapi import create_agent_router
from langgraph_kit.graphs import register_all

async def your_auth_dependency():
    # Stand-in: replace with your real FastAPI authentication dependency.
    return {"user_id": "demo"}

@asynccontextmanager
async def lifespan(app: FastAPI):
    configure(AgentConfig(llm_model="claude-sonnet-4-6", llm_api_key="sk-ant-..."))
    async with create_persistence() as (checkpointer, store):
        register_all(checkpointer, store, mcp_tools=[])
        app.state.store = store
        yield

app = FastAPI(lifespan=lifespan)
app.include_router(
    create_agent_router(get_current_user=your_auth_dependency),
    prefix="/api/v1",
)

That's it. You now have GET /api/v1/agents/, POST /api/v1/agents/{id}/stream (SSE), POST /api/v1/agents/{id}/invoke, thread-state endpoints, a message queue, HITL resume, and checkpoint branching — full list in docs/integrations/fastapi.md.
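A minimal streaming client for the SSE endpoint — a sketch, not a documented client API: the request-body shape, port, and use of httpx are assumptions; only the endpoint path comes from the list above.

import asyncio
import httpx

async def main() -> None:
    # Body shape assumed — see docs/integrations/fastapi.md for the real schema.
    payload = {"messages": [{"role": "user", "content": "Hello!"}]}
    async with httpx.AsyncClient(base_url="http://localhost:8000") as client:
        async with client.stream(
            "POST", "/api/v1/agents/reference-deep-agent/stream", json=payload
        ) as response:
            async for line in response.aiter_lines():
                if line.startswith("data:"):
                    print(line.removeprefix("data:").strip())

asyncio.run(main())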

Usage guides

Each subsystem is independent and can be used on its own. The examples below are intentionally minimal — follow the links for full docs.

Memory

Typed, scoped, persistent knowledge that survives across conversations. Five agent-callable CRUD tools, LLM-powered auto-extraction after each turn, and background consolidation to merge near-duplicates and prune stale records.

from langgraph_kit.core.memory.persistent import PersistentMemoryManager
from langgraph_kit.core.memory.models import MemoryRecord, MemoryType, MemoryScope

mgr = PersistentMemoryManager(store)
await mgr.save(MemoryRecord(
    title="User prefers terse responses",
    type=MemoryType.FEEDBACK,
    scope=MemoryScope.USER,
    summary="No trailing summaries; diff is sufficient.",
    body="...why and how to apply...",
))
hits = await mgr.search("response style", scope=MemoryScope.USER)

See docs/memory/overview.md, extraction, consolidation, shared-memory, session-notebook.

Tools with capability metadata

Tools aren't just callables — they carry risk levels, profile/worker filters, tags, and prompt guidance. The registry supports filtering and compilation:

from langgraph_kit.core.tools.registry import ToolRegistry
from langgraph_kit.core.tools.capability import ToolCapability, ToolRisk

registry = ToolRegistry()
registry.register(ToolCapability(
    name="delete_branch",
    func=my_delete_branch_impl,
    risk=ToolRisk.DESTRUCTIVE,
    interrupt_before=True,              # triggers HITL approval
    profiles={"coding"},
    worker_types={"implementer"},
    prompt_guidance="Use only after confirming the branch is merged.",
))
tools = registry.compile_tools(max_risk=ToolRisk.MUTATING)  # filtered

See docs/tools/overview.md, capability, registry, memory-tools, worktree-tools.

Prompt assembly

Layered sections + context providers, ordered stable-first for prompt-cache efficiency.

from langgraph_kit.core.prompt_assembly.sections import (
    PromptSection, SectionStability, SectionRegistry,
)
from langgraph_kit.core.prompt_assembly.composer import PromptComposer

registry = SectionRegistry()
registry.register(PromptSection(
    id="core_identity", priority=100, stability=SectionStability.STABLE,
    content="You are a helpful coding assistant...",
))
composer = PromptComposer(registry, providers=[...])
system_prompt = composer.compose_sections_only(conditions={"memory", "skills"})

See docs/prompt-assembly/overview.md.

Context pressure & compaction

The PressureMonitor runs every turn; the PressureMiddleware automatically applies the chosen mitigation. Thresholds: 70% → microcompact large tool outputs, 85% → LLM-driven full compaction, 3 failures → circuit-break.
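As illustrative logic only (not the kit's API), the documented thresholds and strategy names map roughly like this:

def pick_strategy(estimated_tokens: int, budget: int, compaction_failures: int) -> str:
    # Thresholds and strategy names taken from the docs above; this function is a sketch.
    usage = estimated_tokens / budget
    if compaction_failures >= 3:
        return "STOP"             # circuit breaker after 3 failed compactions
    if usage >= 0.85:
        return "FULL_COMPACTION"  # LLM-driven full compaction
    if usage >= 0.70:
        return "MICROCOMPACT"     # trim large tool outputs
    return "NONE"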

See docs/context-management/overview.md.

Slash commands

from langgraph_kit.core.commands.dispatcher import CommandDispatcher, CommandResult

dispatcher = CommandDispatcher()

async def my_handler(args: str, context) -> CommandResult:
    return CommandResult(output=f"You said: {args}", handled=True)

dispatcher.register("/echo", my_handler)

The CommandMiddleware intercepts any /-prefixed user message and short-circuits the LLM on handled commands. Built-ins are registered automatically by build_command_dispatcher.
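Because dispatch is transport-independent, you can also invoke it directly — a sketch assuming a dispatch coroutine on CommandDispatcher (the method name and context argument are assumptions):

result = await dispatcher.dispatch("/echo hello", context=None)  # method name assumed
if result.handled:
    print(result.output)  # "You said: hello"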

See docs/commands/overview.md.

Multi-agent orchestration

Declarative worker definitions for the deepagents task tool, plus start_async_task / check_async_task for fire-and-forget background work and a store-backed per-thread message queue for busy threads.

from langgraph_kit.core.orchestration.workers import GENERAL_WORKERS, CODING_WORKERS
# or define your own:
MY_WORKERS = [
    {"name": "data-analyst", "description": "...", "system_prompt": "...", "tools": [...]},
]

See docs/orchestration/overview.md.

Human-in-the-loop

# In a tool or middleware:
from langgraph_kit.core.hitl.tools import approve_action

await approve_action(
    action="delete_file",
    description="Remove stale config",
    context={"path": "config.yaml"},
)
# → graph pauses via interrupt, SSE emits the approval request,
# client POSTs /resume with {"responses": [{"type": "accept"}]}.
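On the wire, resuming is a plain POST — a sketch with a hypothetical path; the real resume endpoint is part of the FastAPI router (see docs/integrations/fastapi.md):

import httpx

thread_id = "demo-thread"
httpx.post(
    # Hypothetical path — take the real one from the router docs.
    f"http://localhost:8000/api/v1/agents/reference-deep-agent/threads/{thread_id}/resume",
    json={"responses": [{"type": "accept"}]},
)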

See docs/hitl/overview.md.

UI events & artifacts

Rich, typed events alongside the token stream, emitted via sentinel-prefixed tool outputs that the streaming layer converts to typed SSE events.

| Tool | SSE key | Use for |
| --- | --- | --- |
| create_artifact | artifact | Code blocks, markdown, diagrams, tables |
| emit_progress | progress | Step-by-step progress on multi-step tasks |
| suggest_actions | suggestions | 2–4 clickable follow-up buttons |
| add_citation | citation | Collapsible source cards for files, docs, URLs |
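Inside a custom tool these read like ordinary async calls — a sketch in which the import path and parameter names are assumptions, not documented signatures:

from langgraph_kit.core.ui_events import emit_progress  # import path assumed

async def run_migration() -> str:
    await emit_progress(step=1, total=3, message="Backing up tables")  # params assumed
    await emit_progress(step=2, total=3, message="Applying schema changes")
    await emit_progress(step=3, total=3, message="Verifying row counts")
    return "migration complete"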

See docs/ui-events/overview.md.

Skills (progressive disclosure)

Drop SKILL.md files into a skills directory. Agents discover and load them on demand via discover_skills and get_skill_guidance, keeping the base system prompt small.

See docs/skills/overview.md.

MCP & plugins

  • MCP servers — configure via AgentConfig.mcp_servers (JSON string) or pass mcp_tools=[...] into the builder. Each tool is wrapped as a native ToolCapability.
  • Python plugins — drop .py files with a contribute(registry) function into AgentConfig.plugins_dir.
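A complete plugin can be a single file — a minimal sketch built on the documented contribute(registry) contract and the ToolCapability model shown earlier (the tool itself is illustrative):

# plugins/hello_plugin.py
from langgraph_kit.core.tools.capability import ToolCapability, ToolRisk

async def say_hello(name: str) -> str:
    """Greet the given name (illustrative tool)."""
    return f"Hello, {name}!"

def contribute(registry) -> None:
    # Called by the plugin loader at startup with the kit's tool registry.
    registry.register(ToolCapability(
        name="say_hello",
        func=say_hello,
        risk=ToolRisk.READ_ONLY,
    ))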

See docs/plugins/overview.md.

Scaffolding a new agent

uv run python -m langgraph_kit.cli new my-agent --output-dir ./agents/

Generates a complete template with prompt sections, worker definitions, tool registration, middleware stack, backend factory, and a build_graph() that follows the standard contract.

See docs/cli/reference.md.

Evaluation

Rule-based and model-graded evaluation with a runner and report module — lives under src/langgraph_kit/evals/.

See docs/evals/overview.md.

Architecture

src/langgraph_kit/
├── _config.py        AgentConfig + configure()
├── llm.py            Multi-provider LLM factory
├── persistence.py    Checkpointer + Store factory
├── registry.py       Agent ID → graph mapping
├── streaming.py      SSE event streaming
├── observability.py  Langfuse integration
├── cli.py            Agent scaffolding
│
├── core/             Composable building blocks
│   ├── memory/       Persistent memory, consolidation, shared
│   ├── tools/        Capability model + registry + worktree tools
│   ├── commands/     Slash-command dispatcher
│   ├── context_management/  Pressure monitor, compaction
│   ├── prompt_assembly/     Section-based composer
│   ├── orchestration/       Workers, async tasks, queue
│   ├── resilience/          Completion guard, empty turn, tool error
│   ├── hitl/                Interrupt-based approval
│   ├── skills/              SKILL.md discovery
│   ├── plugins/             MCP + plugin loader
│   └── graph_builder/       Assembly factories
│
├── graphs/           Agent implementations
│   ├── echo_agent.py
│   ├── basic_deep_agent.py
│   ├── reference_deep_agent.py  ← clone this
│   ├── coding_agent.py          ← canonical extension example
│   └── supervisor_agent.py
│
├── contrib/          Optional integrations (fastapi, agui, a2a, mcp_server)
└── evals/            Evaluation framework

Full walkthrough: docs/architecture/overview.md.

Extension points

| Extension | Mechanism |
| --- | --- |
| New agent | Implement build_graph(checkpointer, store) and register(...) it |
| New tool | registry.register(ToolCapability(...)) |
| New command | dispatcher.register("/foo", handler) |
| New prompt section | sections.register(PromptSection(...)) |
| New context provider | Implement the ContextProvider protocol (see the sketch below) |
| New middleware | Subclass _AgentMiddleware |
| New skill | Add a SKILL.md file |
| MCP tools | Configure AgentConfig.mcp_servers |
| Python plugins | Drop a .py with contribute() into plugins_dir |
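As one example from the table above, a context provider is just an object the composer can query each turn — a sketch in which the protocol's method name and signature are assumptions:

from typing import Protocol

class ContextProvider(Protocol):  # shape assumed for illustration
    async def provide(self, state: dict) -> str: ...

class GitBranchProvider:
    """Illustrative provider that injects the current branch into the per-turn context."""
    async def provide(self, state: dict) -> str:
        return "Current git branch: main"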

Documentation

Full docs are rendered from docs/ via MkDocs — start at docs/index.md.

Development

git clone https://github.com/allada-homelab/langgraph-kit
cd langgraph-kit
uv sync --extra dev

# Standard loop
just test        # pytest
just lint        # ruff check + codespell
just fmt         # ruff format
just typecheck   # basedpyright
just pre-commit  # all of the above
just build       # hatchling sdist + wheel

Integration testing with a generated app

The test app is generated from python-template via Copier:

uv tool install copier
bash scripts/setup-testapp.sh
cd testapp && uv run pytest backend/

See CONTRIBUTING.md for the contribution workflow.

License

AGPL-3.0-or-later. If that's a problem for commercial use, open an issue to discuss licensing.
