Scratchpad compression middleware for Autourgos agents — prevents token overflow by auto-summarizing long reasoning chains.

autourgos-summarizer

Scratchpad compression middleware for Autourgos agents.

Automatically summarizes long reasoning chains so your agent stays within the model's context window, even on tasks with dozens of iterations.


Why use this?

LLM agents accumulate a scratchpad — a growing log of thoughts, tool calls, and observations. On long tasks this scratchpad can:

  • Hit the LLM's context window limit and crash
  • Slow down responses (more tokens = more cost and latency)
  • Confuse the LLM with too much irrelevant history

AutoSummarizeMiddleware solves this by automatically compressing the scratchpad every N iterations (or whenever it exceeds a size limit), keeping only what matters: key findings, tool results, and current status.


Install

pip install autourgos-summarizer

Zero dependencies. Works with any Autourgos agent.


Quick Start

from autourgos_summarizer import AutoSummarizeMiddleware
from autourgos_react_agent import ReactAgent

summarizer = AutoSummarizeMiddleware(
    summarize_every=5,          # compress every 5 iterations
    max_scratchpad_chars=15000, # also compress if scratchpad exceeds 15k chars
)

# my_llm is a placeholder for an LLM instance you have already created
agent = ReactAgent(llm=my_llm, middleware=[summarizer])
result = agent.invoke("Research the latest breakthroughs in quantum computing")
print(result)

How it works

AutoSummarizeMiddleware hooks into on_iteration_start. Before each iteration it checks:

  1. Is the iteration number a multiple of summarize_every? (e.g. iteration 5, 10, 15…)
  2. Is the scratchpad longer than max_scratchpad_chars?

If either is true, it calls the LLM (middleware's own llm if set, otherwise agent.llm) with a compression prompt that asks it to distill the scratchpad into:

[Summary of steps 1-N]
Key findings: ...
Tool results: ...
Current status: ...

This summary replaces the full scratchpad before the next LLM call. The agent continues from where it left off, but with a much smaller context.
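The two checks and the replace step described above can be sketched roughly as follows. This is an illustration, not the package's source: `should_summarize` and `compress_if_needed` are hypothetical names, the `summarize` callable stands in for the LLM call, and skipping iteration 0 is an assumption.

```python
from types import SimpleNamespace
from typing import Callable, Optional


def should_summarize(iteration: int, scratchpad: str,
                     summarize_every: Optional[int] = 5,
                     max_scratchpad_chars: int = 15000) -> bool:
    """Return True when either trigger fires (hypothetical helper)."""
    # Trigger 1: iteration is a multiple of summarize_every.
    # Assumption: iteration 0 never triggers, since nothing has accumulated yet.
    on_schedule = (
        summarize_every is not None
        and iteration > 0
        and iteration % summarize_every == 0
    )
    # Trigger 2: the scratchpad has grown past the character limit.
    too_long = len(scratchpad) > max_scratchpad_chars
    return on_schedule or too_long


def compress_if_needed(agent, iteration: int,
                       summarize: Callable[[str], str], **limits) -> bool:
    """Replace agent.scratchpad with its summary when a trigger fires."""
    if should_summarize(iteration, agent.scratchpad, **limits):
        agent.scratchpad = summarize(agent.scratchpad)
        return True
    return False


# Usage with a stand-in agent and a stand-in summarizer.
demo_agent = SimpleNamespace(scratchpad="Thought: ...\n" * 10)
compressed = compress_if_needed(
    demo_agent, iteration=5,
    summarize=lambda text: "[Summary of steps 1-N]",
)
```

Passing `summarize_every=None` leaves only the size-based trigger active, which matches the parameter description.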

Summarization runs synchronously (the agent loop is paused at this point), and a threading.Lock prevents duplicate summarizations when parallel tool callbacks fire.
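One way a lock can prevent duplicate summarizations is a non-blocking acquire: a caller that finds the lock already held skips the work instead of repeating it. The sketch below assumes that skip-rather-than-wait behaviour, which the description above does not confirm; `SummarizeGuard` is a hypothetical name.

```python
import threading


class SummarizeGuard:
    """Hypothetical helper: lets only one caller summarize at a time."""

    def __init__(self):
        self._lock = threading.Lock()

    def run_once(self, summarize) -> bool:
        # Non-blocking acquire: if another caller already holds the lock,
        # return immediately rather than queueing a second summarization.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            summarize()
            return True
        finally:
            # Always release so later iterations can summarize again.
            self._lock.release()


guard = SummarizeGuard()
nested_results = []
# The inner call happens while the outer call holds the lock, so it is skipped.
outer_ran = guard.run_once(
    lambda: nested_results.append(guard.run_once(lambda: None))
)
```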


Use a dedicated LLM for summarization

You can pass a separate llm to AutoSummarizeMiddleware. This LLM is used only for compressing the scratchpad, which keeps costs down if you pick a cheaper, faster model for the job.

from autourgos_summarizer import AutoSummarizeMiddleware
from autourgos_openaichat import OpenAIChatModel
from autourgos_react_agent import ReactAgent

# Main agent uses a powerful model
main_llm = OpenAIChatModel(model="gpt-4o")

# Summarizer uses a cheap fast model
cheap_llm = OpenAIChatModel(model="gpt-4o-mini")

summarizer = AutoSummarizeMiddleware(
    summarize_every=5,
    llm=cheap_llm,  # overrides agent.llm for summarization
)

agent = ReactAgent(llm=main_llm, middleware=[summarizer])
result = agent.invoke("Research the latest breakthroughs in quantum computing")
print(result)

If llm is not provided, the middleware falls back to agent.llm automatically.
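The fallback rule amounts to a one-line choice. The helper below is a hypothetical illustration of that rule, not the middleware's actual code; the agents are stand-ins built with `SimpleNamespace`.

```python
from types import SimpleNamespace


def resolve_summarizer_llm(middleware_llm, agent):
    """Prefer the middleware's own LLM; otherwise use the agent's."""
    # Compare against None explicitly so a falsy but valid LLM object
    # is not silently replaced by agent.llm.
    return middleware_llm if middleware_llm is not None else agent.llm


stand_in_agent = SimpleNamespace(llm="main-llm")
with_override = resolve_summarizer_llm("cheap-llm", stand_in_agent)
without_override = resolve_summarizer_llm(None, stand_in_agent)
```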


Parameters

Parameter             Type        Default  Description
summarize_every       int | None  5        Summarize every N iterations. None disables the iteration-based trigger.
max_scratchpad_chars  int         15000    Also trigger if the scratchpad exceeds this many characters.
llm                   any         None     LLM for summarization. Needs .invoke(prompt). Falls back to agent.llm if not set.
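Since the llm parameter only needs an .invoke(prompt) method, a small offline stand-in can be handy for tests. The class below is a hypothetical sketch; it assumes .invoke returns a plain string, which the table does not specify.

```python
class CannedSummaryLLM:
    """Hypothetical offline stand-in that satisfies .invoke(prompt)."""

    def __init__(self):
        self.prompts = []  # records every prompt received, for inspection

    def invoke(self, prompt: str) -> str:
        self.prompts.append(prompt)
        # Assumption: the caller expects a plain string in the summary format.
        return (
            "[Summary of steps 1-N]\n"
            "Key findings: ...\n"
            "Tool results: ...\n"
            "Current status: ..."
        )


stub = CannedSummaryLLM()
summary = stub.invoke("Compress this scratchpad: ...")
```

An object like this could then be passed as llm= when exercising the middleware without network access, provided the string-return assumption holds.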

Combine with other middleware

from autourgos_summarizer import AutoSummarizeMiddleware
from autourgos_history import AgentHistoryMiddleware
from autourgos_react_agent import ReactAgent

history    = AgentHistoryMiddleware()
summarizer = AutoSummarizeMiddleware(summarize_every=5)

agent = ReactAgent(llm=my_llm, middleware=[summarizer, history])

Requirements

  • Python 3.9+
  • Any Autourgos agent that exposes agent.scratchpad, agent.llm, and agent.query
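To check whether a custom agent exposes the three required attributes, a runtime-checkable protocol is one option. This is a sketch only: `SummarizableAgent` is a hypothetical name, and the attribute types (str for scratchpad and query) are assumptions.

```python
from types import SimpleNamespace
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SummarizableAgent(Protocol):
    """Hypothetical protocol listing the attributes the middleware relies on."""

    scratchpad: str  # assumed to be a string
    llm: Any
    query: str       # assumed to be a string


# isinstance() only verifies that the attributes exist, not their types.
complete = SimpleNamespace(scratchpad="", llm=None, query="q")
incomplete = SimpleNamespace(llm=None)

complete_ok = isinstance(complete, SummarizableAgent)
incomplete_ok = isinstance(incomplete, SummarizableAgent)
```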

License

MIT — see LICENSE


