Declarative schema migrations for LangGraph state persistence (checkpointers and stores).
LangMigrate
Declarative schema migrations for LangGraph state persistence — Alembic for your checkpointers and stores.
LangGraph persists application state through checkpointers (Postgres, Redis, ...) so graphs
can pause, resume, and survive failures. But as your app evolves, the state schema
(TypedDict / Pydantic) changes — fields get added, removed, renamed, retyped. Old or
interrupted threads resumed on newer code then fail to deserialize or silently corrupt data.
LangMigrate fixes this with declarative, versioned migrations applied either:
- Proactively (batch) — an offline CLI that walks every checkpoint in the database and upgrades it, or
- Lazily (online) — a runtime interceptor that upgrades a thread on the fly the moment it is loaded, via a cascade of transformation functions.
```python
from langmigrate import setup_langmigrate

saver = setup_langmigrate(base_saver, "migrations")  # that's it: pass to your graph
```
Symptoms — do you need this?
You probably landed here after changing a LangGraph state schema and seeing an old or interrupted thread blow up on resume. If any of these look familiar, LangMigrate is for you:
- `pydantic_core._pydantic_core.ValidationError: 1 validation error for AgentState` with `Field required [type=missing, ...]` when a checkpoint saved before you added a required field is loaded back into the new schema. The real traceback looks like this:

  ```
    File ".../langgraph/pregel/_algo.py", line 1386, in _proc_input
      val = proc.mapper(val)
    File ".../langgraph/graph/state.py", line 1732, in _coerce_state
      return schema(**input)
    File ".../pydantic/main.py", line 263, in __init__
      validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
  pydantic_core._pydantic_core.ValidationError: 1 validation error for AgentState
  user_id
    Field required [type=missing, input_value={'messages': ['resume me']}, input_type=dict]
      For further information visit https://errors.pydantic.dev/2.13/v/missing
  Before task with name 'respond' and path '('__pregel_pull', 'respond')'
  ```

  LangGraph rebuilds your Pydantic state from the persisted channels (`_coerce_state -> schema(**input)`); a field added after the checkpoint was written is simply absent, so validation fails on resume.
- `KeyError: '<field>'` raised inside a node that reads a field which was renamed or removed, on a thread persisted under the old schema. With a `TypedDict` state and a renamed field, the resume fails right inside your node:

  ```
    File ".../langgraph/pregel/_retry.py", line 617, in run_with_retry
      return task.proc.invoke(task.input, config)
    File ".../langgraph/_internal/_runnable.py", line 426, in invoke
      ret = self.func(*args, **kwargs)
    File "my_app/nodes.py", line 11, in respond
      last = state["messages"][-1]
  KeyError: 'messages'
  During task with name 'respond' and id '20014471-d5c7-1d58-2709-466e4bba78c2'
  ```

  The old thread persisted the field under its previous name (`msgs`), so `state["messages"]` isn't there on resume.
- `langgraph.errors.InvalidUpdateError` / `EmptyChannelError` after a channel (state key) changed shape or type between deploys.
- Old checkpoints fail to deserialize with `JsonPlusSerializer` / msgpack after a `TypedDict` or Pydantic state model changed (added, dropped, renamed, or retyped fields).
- Long-term memory items (`BaseStore`) break too: `KeyError` / `TypeError` inside a node reading a cross-thread memory item (`store.get(...)` / `store.search(...)`) whose value was saved under an old shape (e.g. flat `{"name": ...}` where the new code expects nested `{"profile": {...}}`):

  ```
    File ".../langgraph/pregel/_retry.py", line 617, in run_with_retry
      return task.proc.invoke(task.input, config)
    File ".../langgraph/_internal/_runnable.py", line 426, in invoke
      ret = self.func(*args, **kwargs)
    File "my_app/nodes.py", line 14, in respond
      name = item.value["profile"]["name"]
  KeyError: 'profile'
  During task with name 'respond' and id '56c4b765-6d5c-021a-5351-ede94b08ecb2'
  ```

  Store items outlive any single thread, so one schema change breaks every thread that reads the shared item, including brand-new ones, which makes it look like a random regression rather than a persistence problem. Checkpoint fixes don't help here; LangMigrate's `MigrationStore` wrapper migrates items on read (and heals them in place on `get()`).
- Resuming an interrupted thread after a graph refactor silently loses work. This is the scariest variant, because there is no exception. A thread paused mid-node (e.g. on a human-in-the-loop `interrupt()`) is resumed on code where that node was renamed or removed; LangGraph can't reattach the pending task, so the in-flight decision is dropped and the resumed run returns stale state. No stack trace, no log line, just `langgraph interrupt resume not working` / silent state corruption after a deploy (topology drift).
- "It worked before the deploy": Postgres/Redis checkpointer threads created on an older schema crash, silently lose data, or corrupt state on the new code.
These are all the same root cause: a LangGraph checkpointer or store persisted state under an old schema, and your new code can't read it. LangMigrate versions and migrates that state the way Alembic does for SQL — see below. Every symptom above is reproducible (and fixable) hands-on in the runnable examples.
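The rename symptom can be reproduced without LangGraph at all. The following is a minimal sketch that uses plain dicts in place of persisted checkpoint state; `respond` mirrors the node from the traceback above, while `rename_msgs_to_messages` is a hypothetical helper written for illustration and is not part of LangMigrate's API.

```python
# Illustrative only: plain dicts stand in for persisted checkpoint state.

def respond(state: dict) -> str:
    """A node written against the NEW schema, which expects 'messages'."""
    return state["messages"][-1]


def rename_msgs_to_messages(state: dict) -> dict:
    """Hypothetical upgrade step: move 'msgs' to 'messages' if still present."""
    migrated = dict(state)  # copy, so the caller's dict is never mutated
    if "msgs" in migrated and "messages" not in migrated:
        migrated["messages"] = migrated.pop("msgs")
    return migrated


# State persisted before the rename, under the old key.
old_checkpoint_state = {"msgs": ["resume me"]}

try:
    respond(old_checkpoint_state)
    failed_with = None
except KeyError as exc:
    failed_with = exc.args[0]  # name of the key the node could not find

# Upgrading the old state first lets the same node run unchanged.
healed = rename_msgs_to_messages(old_checkpoint_state)
reply = respond(healed)
```

Running the upgrade a second time on `healed` leaves it unchanged, which is the property that makes it safe to apply a migration both lazily and in batch.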
How it works
LangMigrate borrows the model that solved this exact problem for SQL databases — Alembic:
- Revisions. Every schema change is a small, pure, idempotent function pair (`upgrade` / `downgrade`) identified by a revision id and chained through `down_revision`, forming a DAG that supports branching and merge revisions.
- A version tag on the persisted state. Each checkpoint carries its revision id in `checkpoint.metadata["langmigrate_rev"]`: metadata, never application state, and queryable at the DB level (`setup()` creates an expression index for it). Store items carry the same tag inside `Item.value`, injected on write and stripped from every read, so neither your code nor your migrations ever see it.
- An engine that closes the gap. When stored tag ≠ code head, the engine resolves a path through the DAG (deterministic topological linearization) and applies the upgrade cascade.
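To make the three pieces concrete, here is a rough sketch of a revision chain, a stored tag, and an upgrade cascade. It is not LangMigrate's engine: `Revision`, `REVISIONS`, `upgrade_path` and `migrate` are hypothetical names, the chain is strictly linear (no branches or merge revisions), and treating an untagged checkpoint as "before the first revision" is an assumption made for this example. Only the `langmigrate_rev` metadata key comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

State = dict


def add_context(state: State) -> State:
    # Idempotent: keep an existing value, otherwise inject the default.
    return {**state, "context": state.get("context", {})}


def rename_msgs(state: State) -> State:
    out = dict(state)
    if "msgs" in out and "messages" not in out:
        out["messages"] = out.pop("msgs")
    return out


@dataclass(frozen=True)
class Revision:
    rev: str
    down_revision: Optional[str]  # None marks the root of the chain
    upgrade: Callable[[State], State]


REVISIONS = {
    "a1c0": Revision("a1c0", None, add_context),
    "b2d1": Revision("b2d1", "a1c0", rename_msgs),
}
HEAD = "b2d1"


def upgrade_path(stored: Optional[str], head: str) -> list[str]:
    """Walk down_revision pointers from head back to the stored tag."""
    path = []
    cursor = head
    while cursor != stored:
        if cursor is None:
            raise LookupError(f"{stored!r} is not an ancestor of {head!r}")
        path.append(cursor)
        cursor = REVISIONS[cursor].down_revision
    path.reverse()  # oldest pending revision first
    return path


def migrate(state: State, metadata: dict) -> tuple[State, dict]:
    """Apply every pending upgrade in order, then stamp the new tag."""
    for rev in upgrade_path(metadata.get("langmigrate_rev"), HEAD):
        state = REVISIONS[rev].upgrade(state)
    return state, {**metadata, "langmigrate_rev": HEAD}


state, meta = migrate({"msgs": ["hi"]}, {})
```

A checkpoint already tagged with the head yields an empty path, so `migrate` returns it untouched apart from re-stamping the same tag.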
That engine runs through two delivery paths — use either or both:
```
ONLINE (lazy)                              OFFLINE (proactive batch)

graph.invoke / resume                      $ langmigrate upgrade head
        │                                            │
        ▼                                            ▼
MigrationInterceptor.get_tuple()           adapter enumerates stale
  ├─ read checkpoint                       checkpoints (indexed query,
  ├─ tag ≠ head? → run upgrade cascade     keyset-paginated)
  ├─ write back healed state                         │
  │   (idempotent: same checkpoint id,               ▼
  │    parent chain intact)                engine migrates each one
  ▼                                        and writes it back
your node sees the new schema
```
The same pair exists for stores: MigrationStore (lazy, heals on get()) and
langmigrate store upgrade (batch). History enumeration (list/search) migrates
in memory only — no write storms, no rewriting of past checkpoints.
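The difference between a healing read and an in-memory-only enumeration can be sketched with a toy store. Everything here is an assumption made for illustration: `LazyStore` is not `MigrationStore`, the dict-backed storage is invented, and `nest_profile` is a hypothetical migration modeled on the flat-to-nested example from the symptoms list.

```python
HEAD = "s2"
TAG = "langmigrate_rev"  # tag key named in the text; the storage layout is assumed


def nest_profile(value: dict) -> dict:
    """Hypothetical store migration: {'name': ...} -> {'profile': {'name': ...}}."""
    if "profile" in value:
        return value  # already in the new shape
    rest = {k: v for k, v in value.items() if k != "name"}
    return {**rest, "profile": {"name": value.get("name")}}


class LazyStore:
    """Toy stand-in for a migrating store wrapper."""

    def __init__(self, raw: dict):
        self.raw = raw   # key -> stored value, tag included
        self.writes = 0  # counts write-backs, to show which reads persist

    def _upgrade(self, stored: dict) -> dict:
        value = {k: v for k, v in stored.items() if k != TAG}  # strip the tag
        if stored.get(TAG) != HEAD:
            value = nest_profile(value)
        return value

    def get(self, key: str) -> dict:
        stored = self.raw[key]
        value = self._upgrade(stored)
        if stored.get(TAG) != HEAD:
            self.raw[key] = {**value, TAG: HEAD}  # heal in place
            self.writes += 1
        return value

    def search(self) -> list[dict]:
        # Enumeration migrates in memory only: nothing is written back.
        return [self._upgrade(stored) for stored in self.raw.values()]


store = LazyStore({"u1": {"name": "Ada"}, "u2": {"name": "Lin"}})
listed = store.search()
writes_after_search = store.writes
first = store.get("u1")
```

After this runs, `u1` is stored in the new shape with its tag, while `u2` is still in the old shape on disk even though `search()` returned it migrated.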
Installation
```shell
pip install langmigrate               # core (CLI + runtime, no DB drivers)
pip install "langmigrate[postgres]"   # + Postgres adapter
pip install "langmigrate[redis]"      # + Redis adapter
pip install "langmigrate[langchain]"  # + SchemaMigrationMiddleware (langchain 1.x agents)
```
Python 3.10–3.13. The core has no database dependencies — drivers are optional extras.
Quickstart
Initialize once, then write a revision per schema change:
```shell
langmigrate init
langmigrate revision -m "add context field"
# or let LangMigrate diff your state schema and fill the body for you:
langmigrate revision -m "add context field" \
    --autogenerate --schema myapp.state:AgentState
```
A revision is a function pair — no subclassing required:
```python
from langmigrate import migration


# Upgrade: add a "context" field, built with the dict factory.
@migration("a1c0", down_revision=None, slug="add_context")
def add_context(state):
    return state.add_field("context", factory=dict)


# Downgrade: drop the field again.
@add_context.reverse
def _(state):
    return state.drop_field("context")
```
(The classic `class Migration(BaseMigration)` style still works and is what
`langmigrate revision` scaffolds. Declarative helpers: `add_field`, `drop_field`,
`rename_field`, `coerce_field`, `require_field`, plus `remap_node` for topology repair.)
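What "pure and idempotent" means for these helpers can be shown on plain dicts. The free functions below only borrow the helper names; in LangMigrate the helpers are called on the state object passed to a migration (as in `state.add_field(...)` above), and their exact semantics may differ. Treat this as a sketch of the property, not of the implementation.

```python
from typing import Any, Callable


def add_field(state: dict, name: str, factory: Callable[[], Any]) -> dict:
    """Add `name` with a fresh default unless it is already present."""
    return state if name in state else {**state, name: factory()}


def drop_field(state: dict, name: str) -> dict:
    return {k: v for k, v in state.items() if k != name}


def rename_field(state: dict, old: str, new: str) -> dict:
    if old not in state:
        return state  # already renamed (or never present): nothing to do
    out = {k: v for k, v in state.items() if k != old}
    out.setdefault(new, state[old])
    return out


def coerce_field(state: dict, name: str, fn: Callable[[Any], Any]) -> dict:
    return {**state, name: fn(state[name])} if name in state else state


def upgrade(state: dict) -> dict:
    state = add_field(state, "context", dict)
    state = rename_field(state, "msgs", "messages")
    return coerce_field(state, "retries", int)


def downgrade(state: dict) -> dict:
    state = coerce_field(state, "retries", str)
    state = rename_field(state, "messages", "msgs")
    return drop_field(state, "context")


old = {"msgs": ["hi"], "retries": "3"}
once = upgrade(old)
twice = upgrade(once)       # applying the upgrade again changes nothing
restored = downgrade(once)  # the downgrade undoes the upgrade
```

Because every helper returns a new dict and checks the current shape before acting, a migration that runs twice (for example once lazily and once in a batch) produces the same result as one that runs once.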
Lazy online migration wraps your existing saver. setup_langmigrate is the
one-liner that builds the registry, engine and interceptor for you:
```python
from langmigrate import setup_langmigrate

saver = setup_langmigrate(base_saver, "migrations")  # write-back on by default
graph = builder.compile(checkpointer=saver)
```
...or wire it by hand for full control
```python
from langmigrate import MigrationInterceptor, MigrationEngine, MigrationRegistry

engine = MigrationEngine(MigrationRegistry.from_path("migrations"))
saver = MigrationInterceptor(base_saver, engine, write_back=True)
```
Cure the whole database proactively before (or instead of) lazy healing:
```shell
langmigrate upgrade head                   # batch-upgrade every stale checkpoint
langmigrate upgrade head --online-dry-run  # validate the full cascade without writing
langmigrate current --db                   # revision distribution across the DB
```
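The batch path is described as enumerating stale checkpoints with a keyset-paginated query. The sketch below shows that pagination technique against an in-memory SQLite table. The table layout, the column names and the inline rename are assumptions for illustration; they are not the schema or SQL used by the Postgres or Redis adapters.

```python
import json
import sqlite3

HEAD = "b2d1"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkpoints (id INTEGER PRIMARY KEY, rev TEXT, state TEXT)")
rows = []
for i in range(1, 8):
    if i % 2:  # odd ids were written under the old revision
        rows.append((i, "a1c0", json.dumps({"msgs": [f"m{i}"]})))
    else:
        rows.append((i, HEAD, json.dumps({"messages": [f"m{i}"]})))
conn.executemany("INSERT INTO checkpoints VALUES (?, ?, ?)", rows)


def stale_pages(conn, head, page_size):
    """Yield pages of stale rows, resuming after the last id seen."""
    last_id = 0
    while True:
        page = conn.execute(
            "SELECT id, state FROM checkpoints "
            "WHERE id > ? AND (rev IS NULL OR rev != ?) ORDER BY id LIMIT ?",
            (last_id, head, page_size),
        ).fetchall()
        if not page:
            return
        yield page
        last_id = page[-1][0]


def upgrade_all(conn, head, page_size=2):
    """Migrate every stale row and stamp it with the head revision."""
    migrated = 0
    for page in stale_pages(conn, head, page_size):
        for row_id, raw in page:
            state = json.loads(raw)
            if "msgs" in state:  # stand-in for the real upgrade cascade
                state["messages"] = state.pop("msgs")
            conn.execute(
                "UPDATE checkpoints SET rev = ?, state = ? WHERE id = ?",
                (head, json.dumps(state), row_id),
            )
            migrated += 1
    return migrated


migrated = upgrade_all(conn, HEAD)
```

Keyset pagination suits this job because migrated rows drop out of the "stale" filter as the run progresses. An `OFFSET`-based loop would skip rows as the result set shrinks, whereas resuming from the last seen id does not.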
CLI at a glance
| Command | What it does |
|---|---|
| `langmigrate init [--with-store]` | Scaffold config + migrations directory (and store migrations) |
| `langmigrate revision -m "..."` | Create a new revision (`--autogenerate --schema mod:Class` diffs your state model) |
| `langmigrate merge -m "..."` | Join branched heads with a merge revision (multi-parent DAG) |
| `langmigrate history` | Show the revision DAG |
| `langmigrate current [--db]` | Show code head, or revision distribution in the database |
| `langmigrate check` | Validate the registry (broken pointers, unreachable heads) |
| `langmigrate upgrade <rev>` | Batch-upgrade stale checkpoints (`--online-dry-run`, `--continue-on-error`) |
| `langmigrate downgrade <rev>` | Batch-downgrade (`--dry-run`; irreversible migrations raise) |
| `langmigrate stamp <rev>` | Tag checkpoints without running migrations |
| `langmigrate store <verb>` | The same verbs (`revision`, `history`, `current`, `check`, `upgrade`, `downgrade`, `stamp`) for store items |
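As a rough idea of what validating a revision registry involves, the sketch below finds parent pointers that reference a missing revision and lists the current heads of the DAG. It is a hypothetical illustration: `check_registry` and the dict-of-tuples layout are invented here, and the checks that `langmigrate check` actually performs are not shown in this document.

```python
def check_registry(revisions: dict[str, tuple[str, ...]]) -> dict:
    """`revisions` maps a revision id to its parent ids (empty tuple for the root)."""
    broken = sorted(
        (rev, parent)
        for rev, parents in revisions.items()
        for parent in parents
        if parent not in revisions
    )
    referenced = {parent for parents in revisions.values() for parent in parents}
    heads = sorted(rev for rev in revisions if rev not in referenced)
    return {"broken_pointers": broken, "heads": heads}


# Two revisions branch off the same parent: two heads, no single target.
branched = {"a1c0": (), "b2d1": ("a1c0",), "c3e2": ("a1c0",)}
branched_report = check_registry(branched)

# A merge revision with two parents joins the branches into one head.
merged = {**branched, "d4f3": ("b2d1", "c3e2")}
merged_report = check_registry(merged)

# A revision whose parent id does not exist anywhere in the registry.
dangling_report = check_registry({"a1c0": (), "e5a4": ("zzzz",)})
```

In this model a merge revision is simply a revision with more than one parent, which is why the branched registry reports two heads and the merged one reports a single head.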
Integration paths
| Your situation | Use | Where |
|---|---|---|
| You own the checkpointer (Postgres/Redis/custom saver) | `MigrationInterceptor` via `setup_langmigrate` | Quickstart |
| Managed platform (LangGraph Server / Cloud / Studio), no saver access | `SchemaMigrationMiddleware` or a `migrate_state_update` node | docs/INTEGRATION.md |
| Cross-thread memory (`BaseStore` items) | `MigrationStore` via `setup_langmigrate_store` | below |
| Pre-release bulk cure of the whole DB | `langmigrate upgrade` / `langmigrate store upgrade` | CLI |
Don't own the checkpointer (e.g. LangGraph Server)? Migrate at the state level with the middleware shim instead:
```python
from langchain.agents import create_agent  # langchain 1.x
from langmigrate.integrations.langchain import SchemaMigrationMiddleware

agent = create_agent(model, middleware=[SchemaMigrationMiddleware("migrations"), ...])
```
Long-term memory (BaseStore) items evolve too. Store migrations live in their own directory and the wrapper is symmetric to the checkpointer one:
```python
from langmigrate import setup_langmigrate_store

store = setup_langmigrate_store(base_store, "store_migrations")
# pass `store` to your compiled LangGraph as the store
```

```shell
langmigrate init --with-store
langmigrate store revision -m "add kind field"
langmigrate store upgrade head  # proactive batch (Postgres)
```
Compatibility matrix
| Change | Safety | Strategy |
|---|---|---|
| Add field with default | Safe | lazy default injection |
| Remove unused field | Safe | payload cleanup |
| Rename field | Unsafe | dynamic key remap |
| Change field type | Unsafe | registered coercion function |
| Add required field (no default) | Unsafe | block with structured error or fallback hook |
| Interrupted thread on deleted/renamed node | Unsafe | NodeRemap helper applied within a migration |
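The "add required field" row is the one case where no safe default exists, so the migration either blocks with an error that names the problem or delegates to a fallback. A minimal sketch of that choice follows. `MissingRequiredField`, the free-function form of `require_field` and its parameters are hypothetical and only illustrate the strategy in the table; they are not LangMigrate's signatures.

```python
from typing import Callable, Optional


class MissingRequiredField(Exception):
    """Structured error: carries the field and revision instead of only a message."""

    def __init__(self, field: str, revision: str):
        super().__init__(f"{field!r} is required as of revision {revision}")
        self.field = field
        self.revision = revision


def require_field(
    state: dict,
    name: str,
    revision: str,
    fallback: Optional[Callable[[dict], object]] = None,
) -> dict:
    if name in state:
        return state  # nothing to do: the field is already there
    if fallback is None:
        raise MissingRequiredField(name, revision)  # block the resume
    return {**state, name: fallback(state)}  # derive a value from the old state


old = {"messages": ["resume me"]}

try:
    require_field(old, "user_id", revision="c3e2")
    blocked = None
except MissingRequiredField as exc:
    blocked = (exc.field, exc.revision)

healed = require_field(old, "user_id", revision="c3e2", fallback=lambda s: "anonymous")
```

Failing inside the migration with the field and revision attached is easier to act on than the generic validation error raised later during state coercion.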
Architecture
Clean Architecture, strictly enforced: migration business logic is pure Python with zero database dependencies — DB drivers only ever appear in adapters, as optional extras.
```
┌────────────┐   ┌────────────┐   ┌────────────┐
│   cli/     │   │  runtime/  │   │  adapters/ │
│ Typer app  │   │ Interceptor│   │ Postgres,  │
│ (batch)    │   │ + Store    │   │ Redis      │
└─────┬──────┘   └─────┬──────┘   └─────┬──────┘
      │                │                │   (DB clients live only here,
      └────────────────┼────────────────┘    imported lazily)
                       ▼
┌─────────────────────────────────────────────┐
│                    core/                    │
│  types · operations · migration · registry  │
│  engine · version · topology                │
│  pure: no I/O, no DB drivers                │
└─────────────────────────────────────────────┘
```
Key design decisions (full rationale in CLAUDE.md):
- Alembic-style revision DAG (`revision` + `down_revision`, tuple parents for merges) with deterministic path resolution.
- Version tag in `checkpoint.metadata` (`langmigrate_rev`), queryable at the DB level (indexed by `setup()`), never polluting application state. Store items keep the tag inside `Item.value`, invisible to application code.
- Idempotent lazy write-back, on by default and disableable: re-persisting a migrated checkpoint never changes `checkpoint["id"]` nor breaks the `parent_config` chain.
- Rollback safety: unknown stored revisions (code rolled back after a lazy migration) are governed by the `on_unknown_revision` policy (`raise` / `warn` / `pass`).
- Every migration is pure and idempotent; every `upgrade` has a `downgrade` (or explicitly raises `IrreversibleMigrationError`).
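The rollback-safety decision can be pictured as a small policy switch. The policy names `raise`, `warn` and `pass` come from the list above; what each one does in this sketch (in particular, that `warn` and `pass` hand the state back untouched) is an assumption, and `handle_stored_revision`, `UnknownRevisionError` and `KNOWN_REVISIONS` are hypothetical names.

```python
import warnings

KNOWN_REVISIONS = {"a1c0", "b2d1"}  # revisions present in the deployed code


class UnknownRevisionError(LookupError):
    """Raised when a stored tag is not in the local registry."""


def handle_stored_revision(stored_rev: str, state: dict, policy: str = "raise") -> dict:
    if stored_rev in KNOWN_REVISIONS:
        return state
    if policy == "raise":
        raise UnknownRevisionError(f"unknown stored revision {stored_rev!r}")
    if policy == "warn":
        warnings.warn(f"unknown stored revision {stored_rev!r}; state left untouched")
        return state
    if policy == "pass":
        return state
    raise ValueError(f"unsupported policy {policy!r}")


# A thread that was lazily migrated to "c9f9" before the code was rolled back.
state = {"messages": ["hi"], "context": {}}

try:
    handle_stored_revision("c9f9", state)
    raised = False
except UnknownRevisionError:
    raised = True

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned = handle_stored_revision("c9f9", state, policy="warn")

passed = handle_stored_revision("c9f9", state, policy="pass")
```

Raising favors safety, since the rolled-back code never touches state it cannot interpret, while the other two policies favor availability after a rollback.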
Runnable examples
The examples/ directory has end-to-end demos of every integration path —
all runnable with the in-memory saver, no Docker required (unless noted) — plus a
decision tree to pick the right one:
| Example | Pattern |
|---|---|
| quickstart | Online lazy in one line (`setup_langmigrate`), `mypy --strict`-clean |
| evolving_agent | Interceptor + write-back baseline: add / rename / coerce |
| middleware_agent | Managed platform: `SchemaMigrationMiddleware`, migrate node |
| multi_tool_agent | 3-revision cascade on a `StateGraph` |
| deep_research_agent | `NodeRemap`, irreversible migrations, staged partial upgrade |
| batch_migration | Offline batch: upgrade / downgrade / dry-run |
| studio | LangGraph Studio: break threads & store items live, then heal them |
If you want to see the failure before fixing it, start with the
LangGraph Studio walkthrough: a real langgraph.json
project where you break checkpointed threads and shared store items live in Studio
(ValidationError on resume after adding a required field, KeyError after a rename,
a store item stuck on the old value shape) and then heal each one with the migrate node,
SchemaMigrationMiddleware, or MigrationStore.
Documentation
- Integration guide — saver-level vs state-level paths, topology repair, authoring migrations, a worked LangGraph Server + deepagents example.
- Docs site — rendered documentation + cookbook.
- CHANGELOG — release notes.
- CLAUDE.md — architecture and contribution conventions.
Status
Stable (1.1). Postgres and Redis adapters are implemented for both the proactive batch
and lazy online paths; 1.1 adds merge revisions (multi-parent DAG), LangGraph store
migrations (MigrationStore + langmigrate store), an async batch path, batch error
tolerance (--continue-on-error), a validating dry-run, and an on_unknown_revision
policy for rollback safety. The CLI, the runtime interceptors, and the state-level
middleware are covered by unit and integration tests on every supported Python version
(3.10–3.13). See SECURITY.md for vulnerability reporting.
Contributing
```shell
git clone https://github.com/scinfu/langmigrate && cd langmigrate
uv sync --extra dev --extra postgres --extra redis --extra langchain
uv run pytest                                         # unit tests
docker compose up -d && uv run pytest -m integration  # integration tests
uv run ruff check . && uv run ruff format .           # lint + format
```
Conventions live in CLAUDE.md and CONTRIBUTING.md. Issues and PRs welcome: https://github.com/scinfu/langmigrate/issues.
License