langgraph-node-deadline
One binding deadline for every inner timeout in a LangGraph node. Clamp inner budgets to the node's cooperative deadline so heavy work salvages a partial result instead of getting hard-killed by the watchdog and discarding everything.
Zero runtime dependencies. ~120 lines. Python 3.9+.
pip install langgraph-node-deadline
The problem
A LangGraph node that does real work has several layers, each re-deriving its
own clock: an outer TimeoutPolicy watchdog, an inner agent/tool budget, a
retry loop, a sub-planner that "wants" 60 seconds. When those clocks disagree,
the inner layers happily dispatch work the outer watchdog is guaranteed to
kill, and the kill is uncooperative: it cancels the node and throws away
everything, including the partial answer you could have returned.
You've seen the symptom: a long run times out into nothing after burning minutes of paid LLM calls, and the user just sees "it failed." The upstream issue is real and open: langchain-ai/langgraph#5672 — Run Cancellation Causes Loss of Streamed State Not Yet Persisted.
The trap, distilled: if your cooperative cancel and the watchdog are pinned to the same number, the watchdog clock starts at node entry — before your code runs — so your cancel loses the race deterministically. Equal timeouts lose.
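The race can be reproduced with nothing but asyncio. The sketch below is not the package's code and does not use LangGraph: an outer asyncio.wait_for stands in for the watchdog, and fake_node, run_node, and the timings are made up for illustration.

```python
import asyncio


async def fake_node(inner_budget: float, startup: float = 0.1) -> str:
    # Time spent before the inner clock starts; the watchdog is already ticking.
    await asyncio.sleep(startup)
    try:
        # Stand-in for slow inner work that would run far past any budget.
        await asyncio.wait_for(asyncio.sleep(10), timeout=inner_budget)
        return "finished"
    except asyncio.TimeoutError:
        # Cooperative path: only reached if the inner timeout fires first.
        return "salvaged"


async def run_node(watchdog: float, inner_budget: float) -> str:
    try:
        # Stand-in for the executor's watchdog; its clock starts at node entry.
        return await asyncio.wait_for(fake_node(inner_budget), timeout=watchdog)
    except asyncio.TimeoutError:
        # The node was cancelled from outside; its except block never ran.
        return "lost"


if __name__ == "__main__":
    print(asyncio.run(run_node(watchdog=0.3, inner_budget=0.3)))  # equal numbers
    print(asyncio.run(run_node(watchdog=0.3, inner_budget=0.1)))  # inner yields early
```

With equal numbers the inner timeout would fire 0.4s after node entry, 0.1s after the watchdog has already cancelled the node, so the first call returns "lost". The second call returns "salvaged" because the inner budget expires at 0.2s.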
The fix
Set one deadline at node entry. Make every inner timeout clamp to it
instead of re-deriving its own. Now inner calls yield at the node boundary, with
a little grace, before the watchdog fires — so your try/except actually
runs and you return a complete-but-shorter answer.
    import asyncio

    from langgraph_node_deadline import node_deadline_in, cooperative_wait_for

    async def my_node(state):
        # this node gets ~1.8s of cooperative runtime (a hair under its watchdog)
        with node_deadline_in(1.8):
            try:
                # the planner asks for 5s, but gets clamped to what's actually left
                result = await cooperative_wait_for(plan_and_write(state), budget_secs=5.0)
                return {"draft": result}
            except asyncio.TimeoutError:
                # runs BEFORE the watchdog can kill us: keep the partial work
                return {"draft": salvage_partial(state)}
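The same discipline applies to a retry loop. What follows is a standard-library sketch, not the package's implementation: retry_under_deadline, never_finishes, and every number in it are hypothetical. The deadline is computed once at entry, and each attempt takes the smaller of what it wants and what is left.

```python
import asyncio
import time


async def never_finishes() -> None:
    # Stand-in for a call that hangs, so every attempt times out.
    await asyncio.sleep(10)


async def retry_under_deadline(
    total_secs: float = 0.4,        # the single node-level budget
    per_try_secs: float = 0.25,     # what each attempt would like to have
    reserve_secs: float = 0.05,     # headroom kept back for returning a result
    min_useful_secs: float = 0.05,  # do not start an attempt shorter than this
) -> list:
    deadline = time.monotonic() + total_secs  # set once, at entry
    budgets = []
    while True:
        remaining = deadline - time.monotonic() - reserve_secs
        if remaining < min_useful_secs:
            break  # stop cooperatively instead of dispatching doomed work
        budget = min(per_try_secs, remaining)  # clamp, never re-derive
        budgets.append(budget)
        try:
            await asyncio.wait_for(never_finishes(), timeout=budget)
        except asyncio.TimeoutError:
            continue  # this attempt used its clamped budget; loop re-checks the clock
    return budgets


if __name__ == "__main__":
    print(asyncio.run(retry_under_deadline()))
```

The first attempt gets its full 0.25s. Any later attempt gets only what remains after the reserve, and the loop exits on its own before the node-level budget runs out.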
See it lose vs. salvage (30 seconds, no LangGraph needed)
python examples/salvage_demo.py
Outer watchdog (LangGraph TimeoutPolicy): 2.0s | inner planner wants ~5s
NAIVE (inner ignores the node deadline)
-> LOST in 2.00s — outer watchdog cancelled the node, salvage code never ran, ALL work discarded
CLAMPED (inner clamps to the node deadline)
-> SALVAGED in 1.80s — kept 3 steps: ['step 1', 'step 2', 'step 3']
Same work, same watchdog. One import decides whether you keep anything.
Wiring it into a real LangGraph node
Set the scope to a hair under whatever cap the executor enforces, then clamp every inner timed call through it:
    from langgraph_node_deadline import node_deadline_in, clamp_to_node_deadline, cooperative_wait_for

    NODE_CAP_SECS = 30.0  # the cap your TimeoutPolicy enforces; grace is subtracted below

    async def research_node(state):
        with node_deadline_in(NODE_CAP_SECS - 1.0):  # leave 1s of grace under the watchdog
            # an inner retry loop, sub-agent, or tool call: all clamp to the same deadline
            per_call = clamp_to_node_deadline(15.0, reserve_secs=2.0)  # reserve finalize headroom
            chunks = await cooperative_wait_for(retrieve(state), budget_secs=per_call)
            return {"chunks": chunks}
Because the deadline lives in a contextvars.ContextVar, and asyncio copies
the ambient context when it creates a task, the scope you open before you
await is visible to the agent task and every subagent task it spawns — no
threading the deadline through call signatures.
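The propagation described here is plain asyncio behavior and can be checked without the package. In this sketch the variable _deadline and the coroutines node, agent, and subagent are hypothetical stand-ins; the package's internal names are not shown in this README.

```python
import asyncio
import contextvars
from typing import Optional

# Hypothetical stand-in for the package's internal context variable.
_deadline: contextvars.ContextVar = contextvars.ContextVar("deadline", default=None)


async def subagent() -> Optional[float]:
    # Reads the ambient value; nothing was passed in as an argument.
    return _deadline.get()


async def agent() -> Optional[float]:
    # A nested task copies the context that is current when it is created.
    return await asyncio.create_task(subagent())


async def node() -> tuple:
    token = _deadline.set(123.0)  # open the scope before spawning anything
    try:
        inside = await asyncio.create_task(agent())
    finally:
        _deadline.reset(token)  # close the scope
    outside = await asyncio.create_task(subagent())  # created after the scope closed
    return inside, outside


if __name__ == "__main__":
    print(asyncio.run(node()))  # (123.0, None)
```

The task created inside the scope sees the value two levels down, while the task created after the reset sees the default. A task captures its context when it is created, so work started before the scope opens will not see the deadline.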
API
| Symbol | What it does |
|---|---|
| node_deadline_in(seconds) | Context manager. Set the binding deadline to now + seconds. Use at node entry. |
| node_deadline_scope(deadline_monotonic) | Context manager. Set the deadline to an absolute time.monotonic() timestamp (or None to clear). node_deadline is an alias. |
| clamp_to_node_deadline(budget_secs, *, reserve_secs=0.0) | The core primitive. Returns min(budget_secs, remaining - reserve_secs), floored at 0. Returns budget_secs unchanged when no scope is active. |
| cooperative_wait_for(awaitable, budget_secs, *, reserve_secs=0.0) | asyncio.wait_for that never outlasts the node deadline. Raises asyncio.TimeoutError on the clamped budget. |
| get_node_deadline_remaining_secs() | Seconds left, or None if no scope. Never negative. |
| node_deadline_exceeded() | True only when a scope is active and its deadline has passed. Safe loop guard. |
Fail-open by design. With no active scope, every function behaves as if it weren't there — so adding it to one node never changes the behavior of the rest of your graph, your tests, or direct invocations.
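To make the clamp arithmetic concrete, here is a pure-function restatement of the rule in the table. It is a sketch, not the package's source: clamp_budget is a hypothetical name, and the deadline and current time are passed in explicitly (the real function reads them from the active scope) so the numbers are reproducible.

```python
from typing import Optional


def clamp_budget(
    budget_secs: float,
    deadline: Optional[float],  # absolute timestamp, or None when no scope is active
    now: float,                 # current reading of the same clock
    reserve_secs: float = 0.0,
) -> float:
    if deadline is None:
        return budget_secs  # fail-open: no scope, budget unchanged
    remaining = max(0.0, deadline - now)  # never negative
    return max(0.0, min(budget_secs, remaining - reserve_secs))


# A 29-second scope opened at t=0, as in research_node above.
print(clamp_budget(15.0, deadline=29.0, now=0.0, reserve_secs=2.0))   # 15.0
print(clamp_budget(15.0, deadline=29.0, now=20.0, reserve_secs=2.0))  # 7.0
print(clamp_budget(15.0, deadline=29.0, now=28.5, reserve_secs=2.0))  # 0.0
print(clamp_budget(15.0, deadline=None, now=28.5, reserve_secs=2.0))  # 15.0
```

Early in the node the requested 15s fits and passes through untouched. At 20s only 9s remain, and holding back the 2s reserve leaves 7s. Once less than the reserve remains, the budget drops to 0 and the call should not be started at all.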
Why a whole package for ~120 lines
Because the lesson is the hard part, not the code. This is the derive-don't-pin
discipline, extracted from a production agent that paid for it: a synthesis pool
that believed it had 43.5 seconds left nine seconds before the watchdog killed
the node, because four inner layers each trusted their own clock and none knew
the one the executor was actually enforcing. One binding deadline fixes the
entire class of bug.
License
MIT © 2026 Fred Becker. See LICENSE.