JavaScript REPL middleware for deepagents
Project description
langchain-quickjs
A deepagents middleware that gives an agent a persistent, sandboxed JavaScript REPL tool, backed by quickjs-rs (QuickJS embedded via PyO3 + rquickjs).
Instead of issuing N serial tool calls, the model can write one block of JavaScript that orchestrates work in-loop — variables and functions defined in one call survive into the next, Promise.all runs concurrent work, and (opt-in) agent tools are callable from inside the REPL as await tools.<name>(...).
from deepagents import create_deep_agent
from langchain_quickjs import REPLMiddleware
agent = create_deep_agent(
    model="claude-sonnet-4-6",
    middleware=[REPLMiddleware()],
)
- Why
- Install
- Quick start
- What the REPL is
- Programmatic tool calling (PTC)
- Skills: importable JS/TS modules
- Configuration reference
- Errors the model can see
- License
Why
Tool calling is fine for small, discrete requests. It falls apart when the model needs to:
- loop over a list and call a tool per item
- run two independent tool calls concurrently
- compute something between calls (aggregate, filter, dedupe, format)
- reuse intermediate state across several turns
With plain tool calling, each of those costs one round-trip to the model per step. With a REPL, all of it happens in one eval call. This enables programmatic tool calling, where the model writes JavaScript that invokes the agent's own tools.
Install
uv add langchain-quickjs
langchain-quickjs depends on quickjs-rs, a PyO3 extension module that ships prebuilt wheels for macOS, Linux, and Windows on CPython 3.11+.
Quick start
from deepagents import create_deep_agent
from langchain_quickjs import REPLMiddleware
agent = create_deep_agent(
    model="claude-sonnet-4-6",
    middleware=[REPLMiddleware()],
)
# Use `ainvoke`: PTC bridges register as async QuickJS host functions,
# and sync `invoke` on a REPL with async bridges raises ConcurrentEvalError.
# Top-level `await` needs an async context (a notebook, or a coroutine run via asyncio.run).
result = await agent.ainvoke({"messages": [{"role": "user", "content": "..."}]})
The middleware:
- registers an eval tool (configurable name) that runs JS in a persistent context;
- appends a short system-prompt snippet explaining the tool's semantics (sandbox, timeout, memory limit);
- gives every LangGraph thread_id its own QuickJS Runtime, so two conversations can't see each other's globals.
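The per-thread isolation in the last bullet can be pictured as a registry keyed by thread_id. The sketch below is illustrative only: `FakeRuntime` and `RuntimeRegistry` are hypothetical stand-ins, not classes from langchain-quickjs or quickjs-rs.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRuntime:
    """Hypothetical stand-in for a QuickJS runtime: just a bag of globals."""

    global_vars: dict = field(default_factory=dict)


class RuntimeRegistry:
    """Hands out one runtime per thread_id, creating it on first use."""

    def __init__(self) -> None:
        self._runtimes: dict[str, FakeRuntime] = {}

    def for_thread(self, thread_id: str) -> FakeRuntime:
        # Create lazily so unused threads cost nothing.
        if thread_id not in self._runtimes:
            self._runtimes[thread_id] = FakeRuntime()
        return self._runtimes[thread_id]


registry = RuntimeRegistry()
registry.for_thread("thread-a").global_vars["x"] = 1

# The same thread sees its own state; a different thread starts empty.
same_thread_x = registry.for_thread("thread-a").global_vars.get("x")
other_thread_x = registry.for_thread("thread-b").global_vars.get("x")
```

Keying on thread_id is what keeps one conversation's globals out of another's; how the real middleware stores and evicts runtimes is not covered here.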
What the REPL is
Persistence
The REPL is module-flavoured: top-level let/const/function persist across eval calls in the same thread. By default (snapshot_between_turns=True), state also persists across turns in the same LangGraph thread_id by snapshotting after each run and restoring before the next.
Set snapshot_between_turns=False to reset REPL state after each turn.
Snapshot payloads are capped by max_snapshot_bytes (defaults to memory_limit); oversized snapshots are dropped instead of persisted.
// call 1
const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
// call 2
fib(10) // 55
Sandbox
The REPL runs in a QuickJS context with no ambient capabilities. There is no filesystem, no network, no fetch, no require, no process, and no import of anything you didn't explicitly install. There is also no trustworthy clock: Date.now() returns whatever QuickJS provides, so don't treat it as wall-clock time for security-sensitive uses.
Escape hatches, if you want them, go through explicit middleware:
- PTC — to call into the agent's own tools (see below).
- Skills: to pre-install JS/TS modules the agent can import.
Console capture
console.log / console.warn / console.error are captured by default and returned as a <stdout> block alongside the result, separately truncated. Disable with capture_console=False if you'd rather the guest see no console at all.
console.log("hi", 2);
1 + 1
<stdout>
hi 2
</stdout>
<result>2</result>
Timeouts and memory
Each call has a per-call wall-clock timeout (default 5 s). Breaching it produces:
<error type="Timeout">...</error>
The runtime has a shared memory limit across every context under it (default 64 MiB). OOM surfaces as:
<error type="OutOfMemory">...</error>
PTC host-function calls are also budgeted per eval call (default 256 tools.* invocations). Exceeding the budget surfaces as:
<error type="PTCCallBudgetExceeded">...</error>
Set max_ptc_calls=None only in trusted environments. Disabling the budget allows unbounded PTC-call loops and increases DoS risk.
Top-level await works on the async path — the promise settles before the call returns. An un-resolvable top-level promise (no host work in flight, no resolver) surfaces as <error type="Deadlock">.
Result formatting
Every eval renders into one wire format consumed by the model:
| Outcome | Rendered as |
|---|---|
| Marshalable value | <result>{json-ish}</result> |
| Function or unmarshalable | <result kind="handle">[Function] arity=2</result> |
| JS-level throw | <error type="TypeError">{message}\n{stack}</error> |
| Timeout / deadlock / OOM | <error type="Timeout">...</error>, or the same block with type "Deadlock" or "OutOfMemory" |
| console.* output | separate <stdout>...</stdout> block |
Results and stdout are independently truncated to max_result_chars (default 4000) before being sent back to the model.
Numeric rendering follows Node's REPL convention — whole-valued floats (42.0) render as integers (42) so the model isn't confused by JS's single numeric type.
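The truncation and whole-float rules can be sketched in a few lines. This is an illustration under assumptions, not the middleware's renderer: `render_result` is a hypothetical name, truncation is a plain cut with no marker, and only top-level numbers are normalised.

```python
import json
from typing import Any


def render_result(value: Any, max_result_chars: int = 4000) -> str:
    """Render a marshalable value as a <result> block."""
    if isinstance(value, float) and value.is_integer():
        # Whole-valued floats render as integers: 42.0 -> 42.
        body = str(int(value))
    else:
        body = json.dumps(value)
    # Assumption: truncation is a plain cut at max_result_chars.
    body = body[:max_result_chars]
    return f"<result>{body}</result>"


whole = render_result(42.0)
fractional = render_result(1.5)
structured = render_result({"a": 1})
truncated = render_result("a" * 10, max_result_chars=5)
```

Because stdout is truncated independently, a chatty console.log cannot crowd the actual result out of the model's view.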
Programmatic tool calling (PTC)
PTC is the reason to use this middleware over a plain code-interpreter tool. When configured, each exposed tool is available inside the REPL as:
async tools.<camelCaseName>(input: {...}): Promise<string>
So an agent with a search_web tool and a summarize tool can do:
const results = await Promise.all([
tools.searchWeb({ query: "deepagents" }),
tools.searchWeb({ query: "quickjs" }),
]);
await tools.summarize({ text: results.join("\n\n") })
...in one eval call — three tool invocations, zero round-trips to the model between them.
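The example maps the Python tool name search_web to tools.searchWeb. A minimal sketch of such a snake_case to camelCase conversion follows; the exact rule the middleware applies (for instance to leading underscores or digits) is an assumption, and `to_camel_case` is a hypothetical helper.

```python
def to_camel_case(tool_name: str) -> str:
    """Convert a snake_case tool name to camelCase."""
    head, *rest = tool_name.split("_")
    # Upper-case only the first letter of each later part, keeping the rest as-is.
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


search_name = to_camel_case("search_web")
single_word = to_camel_case("summarize")
three_parts = to_camel_case("read_file_lines")
```

When a tools.* call fails with a ReferenceError-style message, checking the converted name against the API reference in the system prompt is a quick first step.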
Enabling it
REPLMiddleware() # disabled (default)
REPLMiddleware(ptc=["search_web"]) # explicit allowlist
REPLMiddleware(ptc=[search_tool]) # explicit tool object allowlist
The REPL's own tool is always excluded from PTC; tools.eval("tools.eval(...)") would be pointless recursion, and if the model wants nested code it can just write nested code in one call.
What the model sees
When PTC is on, the system-prompt snippet grows an API Reference — tools namespace section listing every exposed tool as a TypeScript-ish signature derived from the tool's args schema:
/** Search the web for the given query. */
async tools.searchWeb(input: {
/** The query string. */
query: string;
/** Max results. */
limit?: number;
}): Promise<string>
Enums, anyOf unions, nested objects, and arrays are all supported by the schema renderer. Opaque types fall back to Record<string, unknown> — the description is usually enough.
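To make the schema-to-signature step concrete, here is a deliberately small sketch that renders a JSON-schema-style args schema into the TypeScript-ish shape shown above. It is not the package's renderer: `ts_type` and `render_signature` are hypothetical, doc comments and nested objects are omitted, and anything unrecognised falls back to Record<string, unknown>.

```python
import json
from typing import Any

_PRIMITIVES = {
    "string": "string",
    "integer": "number",
    "number": "number",
    "boolean": "boolean",
}


def ts_type(schema: dict[str, Any]) -> str:
    """Map a JSON-schema fragment to a TypeScript-ish type string."""
    if "enum" in schema:
        return " | ".join(json.dumps(v) for v in schema["enum"])
    if "anyOf" in schema:
        return " | ".join(ts_type(s) for s in schema["anyOf"])
    kind = schema.get("type")
    if kind in _PRIMITIVES:
        return _PRIMITIVES[kind]
    if kind == "array":
        return f"Array<{ts_type(schema.get('items', {}))}>"
    return "Record<string, unknown>"  # opaque fallback


def render_signature(name: str, schema: dict[str, Any]) -> str:
    """Render one tool as `async tools.<name>(input: {...}): Promise<string>`."""
    required = set(schema.get("required", []))
    lines = [f"async tools.{name}(input: {{"]
    for prop, sub in schema.get("properties", {}).items():
        mark = "" if prop in required else "?"
        lines.append(f"  {prop}{mark}: {ts_type(sub)};")
    lines.append("}): Promise<string>")
    return "\n".join(lines)


schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
    "required": ["query"],
}
signature = render_signature("searchWeb", schema)
```

Optional fields get a `?` because they are absent from the schema's `required` list, which matches the `limit?: number` line in the example above.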
How it works (so you can debug it)
- Each PTC-exposed tool gets a QuickJS host-function bridge registered under a generated __tools_* global symbol. The bridge is async, so the guest sees tools.x(...) as returning a Promise.
- globalThis.tools is rebuilt every turn from the currently-exposed name set. So if an upstream middleware filters tools on a per-turn basis, the tools namespace follows along.
- When the bridge invokes a tool, it forwards the ToolRuntime captured from the outer eval call, so subagent tools like task see graph state, store, context, and a synthesised child tool_call_id.
- Tool return values are coerced to strings: strings pass through, ToolMessages get unwrapped, a Command has its last-message content extracted, everything else gets json.dumps'd.
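The return-value coercion in the last bullet can be sketched as a small dispatch. `FakeToolMessage` is a hypothetical stand-in for LangChain's ToolMessage so the example runs without dependencies, and the Command case is left out.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeToolMessage:
    """Hypothetical stand-in for a LangChain ToolMessage."""

    content: str


def coerce_to_string(value: Any) -> str:
    """Turn a tool's return value into the string the guest receives."""
    if isinstance(value, str):
        return value  # strings pass through untouched
    if isinstance(value, FakeToolMessage):
        return value.content  # messages are unwrapped to their content
    return json.dumps(value)  # everything else is JSON-encoded


from_string = coerce_to_string("ok")
from_message = coerce_to_string(FakeToolMessage(content="hit"))
from_dict = coerce_to_string({"n": 1})
```

Since structured results arrive as JSON text, guest code that needs fields from them has to call JSON.parse on the resolved string first.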
Skills: importable JS/TS modules
If your agent uses SkillsMiddleware (from deepagents), any skill whose frontmatter includes a module: key becomes dynamically importable inside the REPL:
const helpers = await import("@/skills/my-helpers");
helpers.greet("world")
Under the hood:
- At eval time, the middleware scans the source for literal "@/skills/<name>" specifiers.
- For each referenced skill, it fetches the skill directory through your BackendProtocol, packages every TypeScript file into a module scope, and installs it under the bare specifier.
- Installs are cached per-Runtime: each skill loads at most once, and a broken skill is cached as an error so it doesn't re-hit the backend every eval.
- If a skill referenced in source isn't available or fails to install, the eval call short-circuits with <error type="SkillNotAvailable">...</error>, so the model sees a clean failure instead of a guest-side ReferenceError.
- Skills are isolated: one skill's scope can't bare-import another. Bundle shared code into each skill or re-export through a single skill.
Enable it by passing the same BackendProtocol your SkillsMiddleware uses:
REPLMiddleware(skills_backend=my_backend)
There's a hard cap of 1 MiB per skill bundle. If you hit it, split the skill or prune generated code.
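The first "under the hood" step, scanning source for literal "@/skills/<name>" specifiers, can be approximated with a regular expression. This is a sketch under assumptions: the allowed characters in a skill name are a guess, and `referenced_skills` is a hypothetical helper rather than the middleware's scanner.

```python
import re

# Assumption: skill names use letters, digits, underscores, and hyphens.
SKILL_SPECIFIER = re.compile(r"""["']@/skills/([A-Za-z0-9_-]+)["']""")


def referenced_skills(source: str) -> list[str]:
    """Return skill names referenced by literal specifiers, in first-seen order."""
    seen: dict[str, None] = {}
    for match in SKILL_SPECIFIER.finditer(source):
        seen.setdefault(match.group(1), None)
    return list(seen)


source = """
const a = await import("@/skills/my-helpers");
const b = await import("@/skills/my-helpers");
const c = await import('@/skills/csv_tools');
const d = await import("@/skills/" + name);
"""
skills = referenced_skills(source)
```

The last import in the sample builds its specifier at runtime, so a literal scan like this one does not pick it up; writing specifiers as plain string literals keeps them discoverable.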
Configuration reference
REPLMiddleware(
memory_limit=64 * 1024 * 1024, # bytes, shared across contexts
timeout=5.0, # per-call seconds
max_ptc_calls=256, # per-eval `tools.*` bridge calls, None disables (DoS risk)
tool_name="eval", # what the model calls it
max_result_chars=4000, # result/stdout truncation, each
capture_console=True, # install console.log/warn/error bridge
snapshot_between_turns=True, # snapshot in after_agent, restore in before_agent
max_snapshot_bytes=None, # defaults to `memory_limit`; larger snapshots are dropped
ptc=None, # None | list[str] | list[BaseTool]
skills_backend=None, # BackendProtocol for @/skills/<name> imports
)
Errors the model can see
| Type | Cause |
|---|---|
| SyntaxError, TypeError, ReferenceError, ... | User-code error. Re-surfaces the JS error name verbatim. |
| Timeout | Call exceeded timeout=. |
| OutOfMemory | Runtime hit memory_limit=. |
| PTCCallBudgetExceeded | Uncaught tools.* call-budget overflow in one eval (max_ptc_calls=). |
| Deadlock | Top-level promise never resolved with no async host work in flight. |
| ConcurrentEval | Shouldn't happen under locks; defensive mapping for QuickJS ConcurrentEvalError. |
| SkillNotAvailable | Source referenced @/skills/<name> we couldn't resolve or install. |
asyncio.CancelledError propagates out cleanly when JS declines to catch a HostCancellationError — so LangGraph cancellation semantics work end-to-end.
License
MIT. See LICENSE
File details
Details for the file langchain_quickjs-0.1.0.tar.gz.
File metadata
- Download URL: langchain_quickjs-0.1.0.tar.gz
- Upload date:
- Size: 202.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e3dde78cf1449761422c7d8de05608926789a5b49b30bc3723471249059d6dc0 |
| MD5 | ebee97deb40610318108ea24dd7b8199 |
| BLAKE2b-256 | f7deb4a24d1edb6cf41a9869c5e469882b40e7f82ca25f42e4ef2d3aee88941e |
File details
Details for the file langchain_quickjs-0.1.0-py3-none-any.whl.
File metadata
- Download URL: langchain_quickjs-0.1.0-py3-none-any.whl
- Upload date:
- Size: 34.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a32d1a163dda95567407467c2d5502df520fbbee300f75ab6f489c49eb6ad61 |
| MD5 | f00901f047e28198b5e5d90bc37d29d9 |
| BLAKE2b-256 | 15d46b1eade2183d41631bb851d67e20f167425fb05ba644503603f8ef8a7691 |