Automatic conversation summarization for pydantic-ai agents
summarization-pydantic-ai
Automatic conversation summarization and context management for pydantic-ai agents.
Looking for a complete agent framework? Check out pydantic-deep - a full-featured deep agent framework with planning, subagents, and a skills system.
Need file operations? Check out pydantic-ai-backend - file storage and sandbox backends for AI agents.
Documentation
Full Documentation - Installation, concepts, examples, and API reference.
Installation
pip install summarization-pydantic-ai
# With tiktoken for accurate token counting
pip install "summarization-pydantic-ai[tiktoken]"
Available Processors
This library provides two history processors for managing conversation context:
| Processor | LLM Cost | Latency | Context Preservation | Use Case |
|---|---|---|---|---|
| SummarizationProcessor | High | High | Intelligent summary | When context quality matters |
| SlidingWindowProcessor | Zero | ~0ms | Discards old messages | When speed/cost matters |
Quick Start
SummarizationProcessor - Intelligent Summarization
Uses an LLM to create intelligent summaries of older messages:
from pydantic_ai import Agent
from pydantic_ai_summarization import create_summarization_processor

# Create a processor that triggers at 100k tokens and keeps 20 messages
processor = create_summarization_processor(
    trigger=("tokens", 100000),
    keep=("messages", 20),
)

agent = Agent(
    "openai:gpt-4.1",
    history_processors=[processor],
)

# The processor will automatically summarize older messages
# when the conversation grows too long
result = await agent.run("Hello!")
SlidingWindowProcessor - Zero-Cost Trimming
Simply discards old messages without LLM calls - fastest and cheapest option:
from pydantic_ai import Agent
from pydantic_ai_summarization import create_sliding_window_processor

# Keep last 50 messages when reaching 100
processor = create_sliding_window_processor(
    trigger=("messages", 100),
    keep=("messages", 50),
)

agent = Agent(
    "openai:gpt-4.1",
    history_processors=[processor],
)

# Old messages are simply discarded - no LLM cost
result = await agent.run("Hello!")
Multiple Triggers
Both processors support triggering based on multiple conditions:
from pydantic_ai_summarization import SummarizationProcessor, SlidingWindowProcessor

# Summarization with multiple triggers (any one of them fires the processor)
processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    trigger=[
        ("messages", 50),    # 50+ messages
        ("tokens", 100000),  # OR 100k+ tokens
    ],
    keep=("messages", 10),
)

# Sliding window with multiple triggers
processor = SlidingWindowProcessor(
    trigger=[
        ("messages", 100),
        ("tokens", 50000),
    ],
    keep=("messages", 30),
)
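The "any condition" behaviour of a trigger list can be sketched without the library. This is only an illustration of OR semantics, not the library's implementation: `should_trigger` and the `Trigger` alias are hypothetical names, and the sketch assumes a trigger fires once the count reaches the threshold (the library may use a strict comparison instead).

```python
from __future__ import annotations

# Hypothetical alias for illustration; the library's ContextSize type may differ.
Trigger = tuple[str, float]


def should_trigger(
    triggers: list[Trigger],
    message_count: int,
    token_count: int,
    max_input_tokens: int | None = None,
) -> bool:
    """Return True if ANY trigger condition is met (OR semantics)."""
    for kind, threshold in triggers:
        if kind == "messages" and message_count >= threshold:
            return True
        if kind == "tokens" and token_count >= threshold:
            return True
        if kind == "fraction":
            if max_input_tokens is None:
                raise ValueError("fraction triggers need max_input_tokens")
            if token_count >= threshold * max_input_tokens:
                return True
    return False


triggers = [("messages", 50), ("tokens", 100_000)]

# Few messages, but they are long: the token trigger fires.
few_but_long = should_trigger(triggers, message_count=12, token_count=120_000)
# Many short messages: the message trigger fires.
many_but_short = should_trigger(triggers, message_count=60, token_count=8_000)
# Neither threshold reached: nothing happens.
neither = should_trigger(triggers, message_count=12, token_count=8_000)
print(few_but_long, many_but_short, neither)
```

Combining a message trigger with a token trigger in this way guards against both many short turns and a few very large ones.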
Fraction-Based Configuration
Trigger when reaching a percentage of the model's context window:
from pydantic_ai_summarization import SummarizationProcessor, SlidingWindowProcessor

# Summarization at 80% of context
processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    trigger=("fraction", 0.8),  # 80% of context window
    keep=("fraction", 0.2),     # Keep last 20%
    max_input_tokens=128000,    # Context window size the fractions are measured against
)

# Sliding window at 80% of context
processor = SlidingWindowProcessor(
    trigger=("fraction", 0.8),
    keep=("fraction", 0.3),
    max_input_tokens=128000,
)
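To see what these fractions mean in absolute terms, the arithmetic can be worked through directly. This sketch assumes a fraction is simply multiplied by `max_input_tokens`; `fraction_to_tokens` is a hypothetical helper, and the library's own rounding may differ.

```python
def fraction_to_tokens(fraction: float, max_input_tokens: int) -> int:
    """Convert a fraction of the context window into an absolute token budget."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # round() avoids off-by-one results caused by floating-point error.
    return round(fraction * max_input_tokens)


max_input_tokens = 128_000
trigger_at = fraction_to_tokens(0.8, max_input_tokens)   # when processing starts
keep_budget = fraction_to_tokens(0.2, max_input_tokens)  # how much history survives
print(trigger_at, keep_budget)
```

Under this assumption, a 128,000-token window with a 0.8 trigger starts processing at 102,400 tokens, and a 0.2 keep setting retains roughly the last 25,600 tokens.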
Custom Token Counter
Use a custom token counting function with either processor:
from pydantic_ai_summarization import (
    create_summarization_processor,
    create_sliding_window_processor,
)

def my_token_counter(messages):
    # Your custom token counting logic
    return sum(len(str(msg)) for msg in messages) // 4

# With summarization
processor = create_summarization_processor(
    token_counter=my_token_counter,
)

# With sliding window
processor = create_sliding_window_processor(
    token_counter=my_token_counter,
)
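The counter above estimates roughly four characters per token. If that ratio does not suit your model or language, one option is a small factory that builds counters with a configurable ratio. This is a sketch only: `make_ratio_counter` is a hypothetical helper, and plain strings stand in for real message objects (whose `str()` output is longer than their text content).

```python
from typing import Callable, Sequence


def make_ratio_counter(chars_per_token: float = 4.0) -> Callable[[Sequence[object]], int]:
    """Build a counter that estimates tokens from total character length."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")

    def count(messages: Sequence[object]) -> int:
        total_chars = sum(len(str(msg)) for msg in messages)
        return int(total_chars / chars_per_token)

    return count


# Plain strings stand in for real message objects in this sketch.
history = [
    "What is the capital of France?",
    "The capital of France is Paris.",
]

default_estimate = make_ratio_counter()(history)    # 4 characters per token
dense_estimate = make_ratio_counter(3.0)(history)   # 3 characters per token
print(default_estimate, dense_estimate)
```

A lower ratio yields a higher estimate, which makes token-based triggers fire earlier and errs on the side of staying within the context window.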
Custom Summary Prompt
Customize how summaries are generated (SummarizationProcessor only):
from pydantic_ai_summarization import create_summarization_processor

processor = create_summarization_processor(
    summary_prompt="""
Extract the key information from this conversation.
Focus on: decisions made, code written, and pending tasks.
Conversation:
{messages}
""",
)
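The `{messages}` placeholder suggests that the older messages are substituted into the template before the summary request is sent. The sketch below shows one plausible way such a template could be filled, using `str.format`. The `render_summary_prompt` helper and the `role: text` transcript layout are assumptions for illustration, not the library's actual rendering.

```python
SUMMARY_PROMPT = """Extract the key information from this conversation.
Focus on: decisions made, code written, and pending tasks.

Conversation:
{messages}
"""


def render_summary_prompt(template: str, messages: list[tuple[str, str]]) -> str:
    """Fill the {messages} placeholder with one 'role: text' line per message."""
    transcript = "\n".join(f"{role}: {text}" for role, text in messages)
    return template.format(messages=transcript)


prompt = render_summary_prompt(
    SUMMARY_PROMPT,
    [
        ("user", "Rename the config loader."),
        ("assistant", "Done, see load_config()."),
    ],
)
print(prompt)
```

If substitution does work this way, any other literal braces in a custom prompt would need to be doubled (`{{` and `}}`), so it is worth keeping the template free of stray braces.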
Trigger Types
| Type | Example | Description |
|---|---|---|
| messages | ("messages", 50) | Trigger when message count exceeds threshold |
| tokens | ("tokens", 100000) | Trigger when token count exceeds threshold |
| fraction | ("fraction", 0.8) | Trigger at percentage of max_input_tokens |
Keep Types
| Type | Example | Description |
|---|---|---|
| messages | ("messages", 20) | Keep last N messages after processing |
| tokens | ("tokens", 10000) | Keep last N tokens worth of messages |
| fraction | ("fraction", 0.2) | Keep last N% of max_input_tokens |
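Keeping the last N messages is a plain slice, but a token-based keep setting needs a budget walk from the newest message backwards. The following is a minimal sketch of that idea, assuming whole messages are kept or dropped and never truncated. `keep_last_by_tokens` and `count_tokens` are hypothetical names, and strings stand in for message objects.

```python
from typing import Callable


def keep_last_by_tokens(
    messages: list[str],
    budget: int,
    count_tokens: Callable[[list[str]], int],
) -> list[str]:
    """Return the longest suffix of messages whose token total fits the budget."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):  # newest first
        cost = count_tokens([msg])
        if used + cost > budget:
            break  # adding this message would exceed the budget
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def count_tokens(msgs: list[str]) -> int:
    # Rough estimate: four characters per token.
    return sum(len(m) for m in msgs) // 4


# Estimated costs, oldest to newest: 10, 20, 10, 5 tokens.
history = ["a" * 40, "b" * 80, "c" * 40, "d" * 20]
kept = keep_last_by_tokens(history, budget=20, count_tokens=count_tokens)
print(len(kept))
```

With a budget of 20 tokens, the two newest messages (15 tokens together) fit, while the 20-token message before them would push the total to 35 and is dropped along with everything older.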
How It Works
SummarizationProcessor
- Monitoring: Tracks message and token counts on every call
- Trigger Check: When any trigger condition is met, summarization begins
- Safe Cutoff: Finds a safe point to cut that doesn't split tool call pairs
- Summarization: Uses an LLM to generate a summary of older messages
- Replacement: Older messages are replaced with a summary message
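The steps above can be sketched as a single function. This is a simplified illustration rather than the library's code: it uses a message-count trigger only, skips the safe-cutoff step, represents messages as strings, and replaces the LLM call with a stub. All names here (`summarize_history`, `fake_summarizer`) are hypothetical.

```python
from typing import Callable


def summarize_history(
    messages: list[str],
    trigger_messages: int,
    keep_messages: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Replace everything except the newest keep_messages entries with one summary."""
    if len(messages) < trigger_messages:
        return messages  # below the threshold: pass history through unchanged
    cutoff = max(len(messages) - keep_messages, 0)
    older, recent = messages[:cutoff], messages[cutoff:]
    if not older:
        return messages  # nothing old enough to summarize
    summary = summarize(older)
    return [f"[summary] {summary}"] + recent


def fake_summarizer(older: list[str]) -> str:
    # Stand-in for the LLM call; a real summarizer would send `older` to a model.
    return f"{len(older)} earlier messages condensed"


history = [f"message {i}" for i in range(1, 9)]  # 8 messages
compacted = summarize_history(
    history, trigger_messages=6, keep_messages=3, summarize=fake_summarizer
)
print(compacted)
```

Here eight messages shrink to four: one summary entry standing in for the five oldest messages, followed by the three most recent ones unchanged.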
SlidingWindowProcessor
- Monitoring: Tracks message/token count on every call
- Trigger Check: When any trigger condition is met, trimming begins
- Safe Cutoff: Finds a safe point to cut that doesn't split tool call pairs
- Trimming: Older messages are simply discarded (no LLM call)
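The safe-cutoff step matters because a tool result without its originating tool call is usually rejected by model APIs. One way to picture it is to move the cutoff earlier until no kept tool return has lost its call. The sketch below assumes a simplified message model (`Msg`, with a `call_id` linking each call to its return), which is hypothetical and much smaller than pydantic-ai's real message types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    kind: str                      # "user", "assistant", "tool_call", or "tool_return"
    call_id: Optional[str] = None  # links a tool_return to its tool_call


def find_safe_cutoff(messages: list[Msg], desired: int) -> int:
    """Move the cutoff earlier until no kept tool_return has lost its tool_call."""
    cutoff = max(0, min(desired, len(messages)))
    while cutoff > 0:
        dropped_calls = {m.call_id for m in messages[:cutoff] if m.kind == "tool_call"}
        orphaned = any(
            m.kind == "tool_return" and m.call_id in dropped_calls
            for m in messages[cutoff:]
        )
        if not orphaned:
            break
        cutoff -= 1  # keep one more message and check again
    return cutoff


history = [
    Msg("user"),
    Msg("tool_call", "c1"),
    Msg("tool_return", "c1"),
    Msg("assistant"),
    Msg("user"),
]

# Keeping only the last 3 messages would start the window at the tool_return.
cutoff = find_safe_cutoff(history, desired=len(history) - 3)
kept = history[cutoff:]
print(cutoff, [m.kind for m in kept])
```

In this example the desired cutoff of 2 would orphan the `c1` return, so the cutoff moves back to 1 and the window keeps four messages instead of three. The trade-off is that the kept history may be slightly larger than the configured keep size.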
API Reference
SummarizationProcessor
@dataclass
class SummarizationProcessor:
    model: str                                       # Model for generating summaries
    trigger: ContextSize | list[ContextSize] | None  # When to trigger
    keep: ContextSize                                # How much to keep
    token_counter: TokenCounter                      # Token counting function
    summary_prompt: str                              # Prompt template
    max_input_tokens: int | None                     # Required for fraction-based trigger/keep
    trim_tokens_to_summarize: int | None             # Limit summary input
SlidingWindowProcessor
@dataclass
class SlidingWindowProcessor:
    trigger: ContextSize | list[ContextSize] | None  # When to trigger
    keep: ContextSize                                # How much to keep
    token_counter: TokenCounter                      # Token counting function
    max_input_tokens: int | None                     # Required for fraction-based trigger/keep
Factory Functions
# Summarization with defaults
def create_summarization_processor(
    model: str = "openai:gpt-4.1",
    trigger: ContextSize | list[ContextSize] | None = ("tokens", 170000),
    keep: ContextSize = ("messages", 20),
    max_input_tokens: int | None = None,
    token_counter: TokenCounter | None = None,
    summary_prompt: str | None = None,
) -> SummarizationProcessor: ...

# Sliding window with defaults
def create_sliding_window_processor(
    trigger: ContextSize | list[ContextSize] | None = ("messages", 100),
    keep: ContextSize = ("messages", 50),
    max_input_tokens: int | None = None,
    token_counter: TokenCounter | None = None,
) -> SlidingWindowProcessor: ...
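The signatures above take `ContextSize` values, which the examples in this README always write as `(kind, value)` tuples. As a way to catch configuration mistakes early in your own code, a small pre-flight check could look like the sketch below. It is inferred from the documented examples only: `validate_context_size` is a hypothetical helper, and the library may validate differently or not at all.

```python
from __future__ import annotations

VALID_KINDS = {"messages", "tokens", "fraction"}


def validate_context_size(
    size: tuple[str, float],
    max_input_tokens: int | None = None,
) -> None:
    """Raise ValueError if a (kind, value) pair is not usable as a trigger or keep setting."""
    kind, value = size
    if kind not in VALID_KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    if kind == "fraction":
        if not 0 < value <= 1:
            raise ValueError("fraction must be in (0, 1]")
        if max_input_tokens is None:
            raise ValueError("fraction settings need max_input_tokens")
    elif value <= 0:
        raise ValueError(f"{kind} must be positive")


validate_context_size(("tokens", 170_000))         # passes silently
validate_context_size(("fraction", 0.8), 128_000)  # passes silently

try:
    validate_context_size(("fraction", 0.8))       # no max_input_tokens given
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

The last call fails because a fraction has no meaning without a context window size to measure it against, which matches the "required for fraction-based" note on `max_input_tokens`.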
Choosing a Processor
Use SummarizationProcessor when:
- Context quality is important
- You need to preserve key information from long conversations
- LLM cost is acceptable
Use SlidingWindowProcessor when:
- Speed and cost are priorities
- Recent context is most important
- You're running many parallel conversations
- You want deterministic, predictable behavior
Development
git clone https://github.com/vstorm-co/summarization-pydantic-ai.git
cd summarization-pydantic-ai
make install
make test
Related Projects
- pydantic-ai - Agent framework by Pydantic
- pydantic-deep - Full agent framework (uses this library)
- pydantic-ai-backend - File storage and sandbox backends
- pydantic-ai-todo - Task planning toolset
License
MIT License - see LICENSE for details.