agentloop-py-langchain
LangChain integration for AgentLoop — adds memory retrieval before your LLM calls and turn logging after, with a single chain composition step and a single callback.
```
pip install agentloop-py agentloop-py-langchain
```
The install name is `agentloop-py-langchain` but you import from `agentloop_langchain` (the same `pip install beautifulsoup4` / `from bs4 import ...` convention as the rest of the AgentLoop SDK).
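For example:

```python
# Installed as agentloop-py-langchain; imported as agentloop_langchain.
from agentloop_langchain import AgentLoopMemoryInjector, AgentLoopCallbackHandler
```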
What this gives you
A LangChain user dropping AgentLoop into an existing chain gets two things:
- Memory retrieval — before the LLM call, search past corrections for facts relevant to the user's question, and inject them into the prompt automatically. This is the part that makes your agent smarter immediately when reviewers correct it.
- Turn logging — after every LLM call, post the (question, response) pair to the AgentLoop review queue so reviewers can correct anything the agent got wrong.
Both halves work in sync chains (`chain.invoke`) and async chains (`chain.ainvoke`).
Why two pieces and not one?
Other AgentLoop integrations (OpenAI, Anthropic) wrap the SDK call itself, so a single `wrap_openai(client, loop=...)` does both halves invisibly. LangChain doesn't allow that pattern. Its callback system is observation-only: by the time `on_llm_start` fires, the prompt is already finalized and being sent. We can't sneak retrieved facts in.
So we split the work:
- `AgentLoopMemoryInjector` runs upstream in the chain (before the prompt template), where it can actually shape the LLM input.
- `AgentLoopCallbackHandler` runs as a callback on the LLM, observing what was sent and what came back, then posting to AgentLoop.
The two-step setup is more code than the OpenAI wrapper, but it's still fewer than 5 lines on top of an existing LangChain chain. And it's honest about what LangChain's design lets us do — no fragile prompt mutation tricks that would break on minor LangChain version bumps.
The complete pattern
```python
from agentloop import AgentLoop
from agentloop_langchain import (
    AgentLoopMemoryInjector,
    AgentLoopCallbackHandler,
)
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

loop = AgentLoop(api_key="ak_live_...")

# Step 1: a prompt template that has a slot for retrieved facts.
prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a helpful assistant.\n\n"
     "Trusted facts from past corrections:\n{agentloop_memories}"),
    ("user", "{question}"),
])

# Step 2: the chain. Three pieces composed with `|`:
chain = (
    # 2a. Retrieve relevant memories — runs first
    AgentLoopMemoryInjector(loop=loop, query_field="question")
    # 2b. Format the prompt with the retrieved facts
    | prompt
    # 2c. Call the LLM, with the callback that logs the turn afterwards
    | ChatOpenAI(model="gpt-4o").with_config(
        callbacks=[AgentLoopCallbackHandler(loop=loop)],
    )
)

# Step 3: use the chain normally.
result = chain.invoke({"question": "What's the Pix limit at night?"})
```
That's the full pattern. Three changes to a stock LangChain chain:
- One extra step at the start (`AgentLoopMemoryInjector | ...`)
- One extra placeholder in the prompt (`{agentloop_memories}`)
- One callback on the LLM (`callbacks=[AgentLoopCallbackHandler(...)]`)
What actually happens, step-by-step
This is the part worth understanding because it's where the value lands.
Before the call: memory retrieval
When you call `chain.invoke({"question": "What's the Pix limit at night?"})`:

1. The injector receives `{"question": "What's the Pix limit at night?"}`.
2. It calls `loop.search(query="What's the Pix limit at night?", limit=3)`.
3. AgentLoop's backend semantically searches your org's memories (corrections reviewers have approved over time) and returns the most relevant matches. For example:
   - "The Pix nighttime limit is R$1,000 between 8pm and 6am."
   - "Pix limits reset at 6am, not midnight."
4. The injector adds those to the input as a formatted string under the `agentloop_memories` key:

   ```python
   {
       "question": "What's the Pix limit at night?",
       "agentloop_memories": (
           "- The Pix nighttime limit is R$1,000 between 8pm and 6am.\n"
           "- Pix limits reset at 6am, not midnight."
       ),
   }
   ```

5. The prompt template renders, using `{agentloop_memories}` as a slot, producing this final prompt for the LLM:

   ```
   system: You are a helpful assistant.

           Trusted facts from past corrections:
           - The Pix nighttime limit is R$1,000 between 8pm and 6am.
           - Pix limits reset at 6am, not midnight.
   user:   What's the Pix limit at night?
   ```

6. The LLM sees this and answers correctly, even if its training data would have made it answer wrongly without those facts.
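Conceptually, the injector is a small Runnable that merges search results into the chain input. A minimal sketch of equivalent behavior (illustrative only, not the shipped implementation; it assumes `loop.search` returns objects with a `.fact` attribute, as the formatter section below does):

```python
from langchain_core.runnables import RunnableLambda

def _inject(inputs: dict) -> dict:
    # Search past corrections, using the question text as the query.
    memories = loop.search(query=inputs["question"], limit=3)
    # Render hits as a bulleted list, or "(none yet)" when nothing matches.
    formatted = "\n".join(f"- {m.fact}" for m in memories) or "(none yet)"
    # Merge into a new dict; downstream steps see the extra key.
    return {**inputs, "agentloop_memories": formatted}

# Roughly what AgentLoopMemoryInjector(loop=loop, query_field="question") does.
inject_sketch = RunnableLambda(_inject)
```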
This is the magic. Every correction your reviewers make in the AgentLoop dashboard becomes a future-tense fact that gets injected into the next relevant prompt automatically. No fine-tuning, no retraining, no manual prompt engineering — it just works the next time someone asks a similar question.
After the call: turn logging
After the LLM responds:
1. The callback handler (registered via `with_config(callbacks=[...])`) receives the response in its `on_llm_end` hook.
2. It pulls the question (captured at `on_llm_start`) and the response text out of the `LLMResult`.
3. It calls `loop.log_turn(question=..., agent_response=...)`, which posts the turn to the AgentLoop review queue.
4. Reviewers see the turn in the dashboard. If the LLM got it wrong, they correct it. The correction becomes a memory. The next time a similar question comes through, that memory gets retrieved during the pre-call flow above. The loop closes.
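This is standard LangChain callback machinery. A stripped-down sketch of the idea (illustrative; the shipped handler also carries per-call metadata, handles chat-model events, and swallows errors):

```python
from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult

class TurnLoggerSketch(BaseCallbackHandler):
    """Illustrative only; use AgentLoopCallbackHandler in real code."""

    def __init__(self, loop):
        self.loop = loop
        self._last_prompt = None

    def on_llm_start(self, serialized, prompts, **kwargs):
        # Capture the prompt that was actually sent to the model.
        self._last_prompt = prompts[0]

    def on_llm_end(self, response: LLMResult, **kwargs):
        # Pull the response text out of the LLMResult and post the turn.
        text = response.generations[0][0].text
        self.loop.log_turn(question=self._last_prompt, agent_response=text)
```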
Per-call options
Pass user-specific or call-specific data via LangChain's metadata:
```python
chain.invoke(
    {"question": "What's the Pix limit at night?"},
    config={"metadata": {"agentloop": {
        "user_id": "u_42",
        "session_id": "sess_xyz",
        "signals": {"thumbs_down": True},
        "tags": ["pix", "limits"],
    }}},
)
```
Recognized keys: `user_id`, `session_id`, `signals` (dict), `tags` (list), `metadata` (dict, free-form), and `skip` (bool: skip logging this turn entirely).
You can also set defaults at construction:
```python
AgentLoopCallbackHandler(loop=loop, user_id="u_42", tags=["beta"])
```
Per-call values override construction-time defaults.
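For example, to keep a single turn out of the review queue (say, a synthetic health-check call), use the documented `skip` key:

```python
chain.invoke(
    {"question": "ping"},
    config={"metadata": {"agentloop": {"skip": True}}},
)
```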
Async chains
Use `AsyncAgentLoopCallbackHandler` and pass an `AsyncAgentLoop` (from `agentloop.aio`) to the injector:
```python
from agentloop.aio import AsyncAgentLoop
from agentloop_langchain import (
    AgentLoopMemoryInjector,
    AsyncAgentLoopCallbackHandler,
)

async with AsyncAgentLoop(api_key="ak_...") as loop:
    chain = (
        AgentLoopMemoryInjector(loop=loop, query_field="question")
        | prompt
        | ChatOpenAI(model="gpt-4o").with_config(
            callbacks=[AsyncAgentLoopCallbackHandler(loop=loop)],
        )
    )
    result = await chain.ainvoke({"question": "..."})
```
If you accidentally pass a sync `AgentLoop` to either async component, they'll detect that and dispatch the network calls to a thread-pool executor so the event loop isn't blocked. It works either way; passing an `AsyncAgentLoop` is just slightly more efficient.
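The fallback boils down to pushing the blocking SDK call off the event loop. A rough sketch of the idea, using a sync `loop.search` call (illustrative, not the shipped code):

```python
import asyncio

async def search_off_loop(sync_loop, query: str):
    # Sync client detected: run the blocking call in the default
    # thread-pool executor so the asyncio event loop stays responsive.
    return await asyncio.to_thread(sync_loop.search, query=query, limit=3)
```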
Customization
Where memories appear in the prompt
The injector writes to `input["agentloop_memories"]` by default. Use that key in your prompt template wherever you want the retrieved facts to appear. To use a different key:
```python
AgentLoopMemoryInjector(loop=loop, query_field="user_question",
                        output_field="known_facts")
```
How memories are formatted
The default formatter renders memories as a bulleted list of `.fact` strings, or `(none yet)` when the search returns nothing. Override it:
```python
def my_formatter(memories):
    if not memories:
        return ""
    return "Reference material:\n" + "\n".join(
        f" • {m.fact}" for m in memories
    )

AgentLoopMemoryInjector(loop=loop, query_field="q",
                        format_memories=my_formatter)
```
Filtering search by tags / user
```python
AgentLoopMemoryInjector(
    loop=loop,
    query_field="question",
    tags=["pix"],             # only memories tagged "pix"
    user_id_field="user_id",  # search per-user when input has user_id
    limit=5,                  # retrieve top 5 instead of the default 3
)
```
Imperative usage (no LCEL)
If your code doesn't compose with `|`:
```python
from agentloop_langchain import inject_memories

enriched = inject_memories({"question": q}, loop=loop)
# enriched["agentloop_memories"] is the formatted string
prompt_text = template.format(**enriched)
response = llm.invoke(prompt_text)
loop.log_turn(question=q, agent_response=response.content)
```
Failure mode
If AgentLoop is unreachable or returns an error:
- The injector returns the input with `agentloop_memories` set to `(none yet)` and everything else unchanged. The chain continues; the LLM just doesn't get retrieved facts for that call.
- The callback handler swallows the error and logs a warning. The chain's response goes back to the user normally; that turn just doesn't end up in the review queue.
Both behaviors are deliberate. AgentLoop is a value-add layer; if it's having a bad day, your agent should still respond.
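In code terms, both components follow the same pattern. A hedged sketch of the injector side (the broad `except Exception` and logger name are illustrative, not the shipped code):

```python
import logging

logger = logging.getLogger("agentloop_langchain")

def _inject_safe(inputs: dict) -> dict:
    try:
        memories = loop.search(query=inputs["question"], limit=3)
        formatted = "\n".join(f"- {m.fact}" for m in memories) or "(none yet)"
    except Exception:
        # AgentLoop is unreachable: degrade gracefully, keep the chain alive.
        logger.warning("AgentLoop search failed; continuing without memories")
        formatted = "(none yet)"
    return {**inputs, "agentloop_memories": formatted}
```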
Compatibility
- Python 3.9+
- `langchain-core` 0.1.0+
- Works with any LLM that has a LangChain integration (`langchain-openai`, `langchain-anthropic`, `langchain-google-genai`, `langchain-cohere`, Ollama, vLLM, etc.); we depend on the abstract callback interface, not any specific provider.
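For instance, moving the same chain to Anthropic only touches the model step (the model name here is illustrative):

```python
from langchain_anthropic import ChatAnthropic

chain = (
    AgentLoopMemoryInjector(loop=loop, query_field="question")
    | prompt
    | ChatAnthropic(model="claude-3-5-sonnet-latest").with_config(
        callbacks=[AgentLoopCallbackHandler(loop=loop)],
    )
)
```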
License
MIT