Scratchpad compression middleware for Autourgos agents — prevents token overflow by auto-summarizing long reasoning chains.
autourgos-summarizer
Automatically summarizes long reasoning chains so your agent stays within the model's context window, even on tasks with dozens of iterations.
Why use this?
LLM agents accumulate a scratchpad — a growing log of thoughts, tool calls, and observations. On long tasks this scratchpad can:
- Hit the LLM's context window limit and crash
- Slow down responses (more tokens = more cost and latency)
- Confuse the LLM with too much irrelevant history
AutoSummarizeMiddleware solves this by compressing the scratchpad in the background every N iterations (or when it exceeds a size limit), keeping only what matters: key findings, tool results, and current status.
Install
pip install autourgos-summarizer
Zero dependencies. Works with any Autourgos agent.
Quick Start
from autourgos_summarizer import AutoSummarizeMiddleware
from autourgos_react_agent import ReactAgent

summarizer = AutoSummarizeMiddleware(
    summarize_every=5,           # compress every 5 iterations
    max_scratchpad_chars=15000,  # also compress if scratchpad exceeds 15k chars
)

# my_llm is whatever LLM object your agent already uses
agent = ReactAgent(llm=my_llm, middleware=[summarizer])
result = agent.invoke("Research the latest breakthroughs in quantum computing")
print(result)
How it works
AutoSummarizeMiddleware hooks into on_iteration_start. Before each iteration it checks:
- Is the iteration number a multiple of summarize_every? (e.g. iteration 5, 10, 15…)
- Is the scratchpad longer than max_scratchpad_chars?
If either is true, it calls the LLM (middleware's own llm if set, otherwise agent.llm) with a compression prompt that asks it to distill the scratchpad into:
[Summary of steps 1-N]
Key findings: ...
Tool results: ...
Current status: ...
This summary replaces the full scratchpad before the next LLM call. The agent continues from where it left off, but with a much smaller context.
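The trigger check and the replacement step described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual source: the names `should_summarize`, `compress_scratchpad`, and `EchoLLM`, the prompt wording, and the choice to ignore iteration 0 are all assumptions. Only the two parameters, the summary format, and the `.invoke(prompt)` call come from this README.

```python
from typing import Optional


def should_summarize(iteration: int, scratchpad: str,
                     summarize_every: Optional[int] = 5,
                     max_scratchpad_chars: int = 15000) -> bool:
    """Return True if either trigger fires (hypothetical helper)."""
    # Iteration-based trigger. None disables it; treating 0 the same way
    # and skipping iteration 0 are assumptions of this sketch.
    on_schedule = bool(summarize_every) and iteration > 0 \
        and iteration % summarize_every == 0
    # Size-based trigger.
    too_long = len(scratchpad) > max_scratchpad_chars
    return on_schedule or too_long


def compress_scratchpad(llm, query: str, scratchpad: str, steps_done: int) -> str:
    """Ask the LLM for a compact summary (prompt wording is illustrative)."""
    prompt = (
        f"Task: {query}\n\n"
        f"Scratchpad so far:\n{scratchpad}\n\n"
        "Distill the scratchpad into this format:\n"
        f"[Summary of steps 1-{steps_done}]\n"
        "Key findings: ...\nTool results: ...\nCurrent status: ..."
    )
    return str(llm.invoke(prompt))


class EchoLLM:
    """Stand-in model so the sketch runs without network access."""

    def invoke(self, prompt: str) -> str:
        return ("[Summary of steps 1-5]\nKey findings: none yet\n"
                "Tool results: none\nCurrent status: searching")


scratchpad = "Thought: ...\nAction: search\nObservation: ...\n" * 10
if should_summarize(iteration=5, scratchpad=scratchpad):
    # The summary replaces the full scratchpad before the next LLM call.
    scratchpad = compress_scratchpad(EchoLLM(), "demo task", scratchpad, steps_done=5)
print(scratchpad)
```

With the defaults, iteration 5 fires the schedule trigger even though the scratchpad is short, so the printed scratchpad is the four-line summary.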
Summarization runs synchronously (the agent loop is paused at this point), and a threading.Lock prevents duplicate summarizations when parallel tool callbacks fire.
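One way a threading.Lock can prevent duplicate summarizations is a non-blocking acquire: if a summarization is already running, a second trigger is simply skipped. The sketch below assumes that behaviour. The README does not say whether the real middleware skips or waits, and `SummarizeGuard` and `run_once` are hypothetical names.

```python
import threading


class SummarizeGuard:
    """Hypothetical wrapper: lets only one summarization run at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.runs = 0

    def run_once(self, summarize) -> bool:
        # Non-blocking acquire: if another callback already holds the lock,
        # skip instead of queuing a second summarization.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            summarize()
            self.runs += 1
            return True
        finally:
            self._lock.release()


guard = SummarizeGuard()
started = threading.Event()
finish = threading.Event()


def slow_summarize():
    started.set()           # signal that summarization is in progress
    finish.wait(timeout=5)  # hold the lock until the main thread releases us


worker = threading.Thread(target=guard.run_once, args=(slow_summarize,))
worker.start()
started.wait(timeout=5)

# A second trigger arriving mid-summarization is skipped.
second_ran = guard.run_once(lambda: None)
finish.set()
worker.join()
print(second_ran, guard.runs)  # False 1
```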
Use a dedicated LLM for summarization
You can pass a separate llm to AutoSummarizeMiddleware. This LLM is used only for compressing the scratchpad — great for keeping costs low by using a cheaper/faster model just for this job.
from autourgos_summarizer import AutoSummarizeMiddleware
from autourgos_react_agent import ReactAgent
from autourgos_openaichat import OpenAIChatModel

# Main agent uses a powerful model
main_llm = OpenAIChatModel(model="gpt-4o")

# Summarizer uses a cheap fast model
cheap_llm = OpenAIChatModel(model="gpt-4o-mini")

summarizer = AutoSummarizeMiddleware(
    summarize_every=5,
    llm=cheap_llm,  # overrides agent.llm for summarization
)

agent = ReactAgent(llm=main_llm, middleware=[summarizer])
result = agent.invoke("Research the latest breakthroughs in quantum computing")
print(result)
If llm is not provided, it falls back to agent.llm automatically.
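The fallback rule amounts to a one-line choice. A minimal sketch, assuming the middleware can see the agent object; `resolve_summarizer_llm` is a hypothetical helper name, and plain strings stand in for real model objects:

```python
from types import SimpleNamespace


def resolve_summarizer_llm(middleware_llm, agent):
    """Use the middleware's own llm if set, otherwise the agent's."""
    return middleware_llm if middleware_llm is not None else agent.llm


agent = SimpleNamespace(llm="main-model")
print(resolve_summarizer_llm(None, agent))           # main-model
print(resolve_summarizer_llm("cheap-model", agent))  # cheap-model
```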
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| summarize_every | int \| None | 5 | Summarize every N iterations. None = disable iteration-based trigger. |
| max_scratchpad_chars | int | 15000 | Also trigger if scratchpad exceeds this many characters. |
| llm | any | None | LLM for summarization. Needs .invoke(prompt). Falls back to agent.llm if not set. |
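Because the llm parameter only needs an .invoke(prompt) method, a thin adapter should be enough to plug in a model that is exposed as a plain function. The sketch below is illustrative: `CallableLLM` and `truncate` are hypothetical names, and the README does not state what type .invoke must return, so returning a string is an assumption.

```python
from typing import Callable


class CallableLLM:
    """Hypothetical adapter: exposes .invoke(prompt) around a plain function."""

    def __init__(self, fn: Callable[[str], str]):
        self._fn = fn

    def invoke(self, prompt: str) -> str:
        return self._fn(prompt)


def truncate(prompt: str) -> str:
    # Toy "model" that just keeps the first 40 characters of the prompt.
    return prompt[:40]


llm = CallableLLM(truncate)
print(len(llm.invoke("x" * 100)))  # 40
```

An object shaped like this could then be passed as llm=... when constructing AutoSummarizeMiddleware.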
Combine with other middleware
from autourgos_summarizer import AutoSummarizeMiddleware
from autourgos_history import AgentHistoryMiddleware
from autourgos_react_agent import ReactAgent

history = AgentHistoryMiddleware()
summarizer = AutoSummarizeMiddleware(summarize_every=5)

# my_llm is whatever LLM object your agent already uses
agent = ReactAgent(llm=my_llm, middleware=[summarizer, history])
Requirements
- Python 3.9+
- Any Autourgos agent that exposes agent.scratchpad, agent.llm, and agent.query
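If you are wiring the middleware into a custom agent, a quick attribute check can catch a missing field early. This is a sketch under the assumption that plain attribute access is all that is required; `missing_agent_attrs` is a hypothetical helper, not part of the package.

```python
from types import SimpleNamespace

REQUIRED_ATTRS = ("scratchpad", "llm", "query")


def missing_agent_attrs(agent) -> list:
    """Return the required attribute names the agent does not expose."""
    return [name for name in REQUIRED_ATTRS if not hasattr(agent, name)]


ok_agent = SimpleNamespace(scratchpad="", llm=object(), query="demo")
bad_agent = SimpleNamespace(llm=object())
print(missing_agent_attrs(ok_agent))   # []
print(missing_agent_attrs(bad_agent))  # ['scratchpad', 'query']
```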
License
MIT — see LICENSE