Automatic Conversation Summarization and History Management for Pydantic AI

Maintainers

kacperwlodarczyk-vstorm

Context Management for Pydantic AI

Automatic Conversation Summarization and History Management

Intelligent Summarization: LLM-powered context compression
Sliding Window: zero-cost message trimming
Limit Warnings: finish-soon guidance before hard caps
Context Manager: real-time token tracking and tool truncation
Safe Cutoff: preserves tool call pairs

Context Management for Pydantic AI helps your Pydantic AI agents handle long conversations without exceeding model context limits. Choose between intelligent LLM summarization or fast sliding window trimming.

Full framework? Check out Pydantic Deep Agents — complete agent framework with planning, filesystem, subagents, and skills.

Use Cases

What You Want to Build	How This Library Helps
Long-Running Agent	Automatically compress history when context fills up
Customer Support Bot	Preserve key details while discarding routine exchanges
Code Assistant	Keep recent code context, summarize older discussions
High-Throughput App	Zero-cost sliding window for maximum speed
Cost-Sensitive App	Choose quality (summarization) or zero LLM cost (sliding window)

Installation

pip install summarization-pydantic-ai

Or with uv:

uv add summarization-pydantic-ai

For accurate token counting:

pip install "summarization-pydantic-ai[tiktoken]"

Quick Start — Capabilities (Recommended)

The recommended way to add context management is via pydantic-ai's native Capabilities API:

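import asyncio

from pydantic_ai import Agent
from pydantic_ai_summarization import ContextManagerCapability

agent = Agent(
    "anthropic:claude-sonnet-4-6",
    capabilities=[ContextManagerCapability(max_tokens=100_000)],
)

async def main() -> None:
    # `await` needs a coroutine; in a notebook you can await agent.run directly
    result = await agent.run("Hello!")
    print(result.output)

asyncio.run(main())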

That's it. Your agent now:

Tracks token usage on every turn
Auto-compresses when approaching the limit (90% by default)
Truncates large tool outputs
Auto-detects context window size from the model
Preserves tool call/response pairs (never breaks them)
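
The last point matters because most model APIs reject a history in which a tool result appears without the call that produced it. Here is a minimal sketch of the idea, using plain dicts rather than pydantic-ai message objects. The `kind` field and the `safe_cutoff` helper are hypothetical and do not reflect the library's internals.

```python
def safe_cutoff(messages: list[dict], cutoff: int) -> int:
    """Move a proposed cutoff index earlier until it no longer splits a tool pair.

    Messages before the returned index are dropped or summarized; messages
    from the index onward are kept.
    """
    cutoff = max(0, min(cutoff, len(messages)))
    # If the first kept message is a tool result, its call sits just before
    # the cutoff, so step back until the kept slice starts cleanly.
    while 0 < cutoff < len(messages) and messages[cutoff]["kind"] == "tool_return":
        cutoff -= 1
    return cutoff


history = [
    {"kind": "user", "text": "What is the weather in Paris?"},
    {"kind": "tool_call", "text": "get_weather(city='Paris')"},
    {"kind": "tool_return", "text": "18 C, cloudy"},
    {"kind": "assistant", "text": "It is 18 C and cloudy."},
]

# A naive cutoff at index 2 would keep the tool result but drop its call.
print(safe_cutoff(history, 2))  # 1
```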

Agent-Triggered Compression

Let the agent decide when to compress by enabling the compact_conversation tool:

from pydantic_ai import Agent
from pydantic_ai_summarization import ContextManagerCapability

agent = Agent(
    "anthropic:claude-sonnet-4-6",
    capabilities=[ContextManagerCapability(
        include_compact_tool=True,  # Adds compact_conversation(focus?) tool
    )],
)

The agent can call compact_conversation(focus="preserve API design decisions") to trigger compression with a focus topic. Compression is deferred to the next model request.
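
One way to picture deferred compression: the tool call only records a request, and the actual work happens just before the next model request is built. The sketch below is illustrative only. `DeferredCompactor` and its methods are hypothetical names, and the summary line is a placeholder, not the library's summarizer.

```python
from typing import Optional


class DeferredCompactor:
    """Records a compaction request and applies it before the next request."""

    def __init__(self) -> None:
        self.requested = False
        self.pending_focus: Optional[str] = None

    def compact_conversation(self, focus: Optional[str] = None) -> str:
        # Called by the agent as a tool: only set a flag, do no work yet.
        self.requested = True
        self.pending_focus = focus
        return "Compaction scheduled for the next model request."

    def before_model_request(self, messages: list[str]) -> list[str]:
        # Called when the next request is assembled: apply any pending request.
        if not self.requested or not messages:
            return messages
        focus = self.pending_focus or "general"
        self.requested = False
        self.pending_focus = None
        # Placeholder for an LLM summary of the older messages.
        summary = f"[summary of {len(messages) - 1} messages, focus: {focus}]"
        return [summary, messages[-1]]


compactor = DeferredCompactor()
history = ["msg 1", "msg 2", "msg 3"]
compactor.compact_conversation(focus="preserve API design decisions")
print(compactor.before_model_request(history))
```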

Combine with Limit Warnings

from pydantic_ai import Agent
from pydantic_ai_summarization import ContextManagerCapability, LimitWarnerCapability

agent = Agent(
    "openai:gpt-4.1",
    capabilities=[
        LimitWarnerCapability(max_iterations=40, max_context_tokens=100_000),
        ContextManagerCapability(max_tokens=100_000),
    ],
)

Alternative: Processor API

For standalone use without capabilities:

from pydantic_ai import Agent
from pydantic_ai_summarization import create_summarization_processor

processor = create_summarization_processor(
    trigger=("tokens", 100000),
    keep=("messages", 20),
)

agent = Agent("openai:gpt-4.1", history_processors=[processor])

Available Processors

Processor	LLM Cost	Latency	Context Preservation
`ContextManagerCapability`	Per compression	Low tracking	Intelligent summary + tool truncation
`SummarizationProcessor`	High	High	Intelligent summary
`SlidingWindowProcessor`	Zero	~0ms	Discards old messages
`LimitWarnerProcessor`	Zero	~0ms	Full history + warning injection

Intelligent Summarization

Uses an LLM to create summaries of older messages:

from pydantic_ai_summarization import create_summarization_processor

processor = create_summarization_processor(
    trigger=("tokens", 100000),  # When to summarize
    keep=("messages", 20),       # What to keep
)

Zero-Cost Sliding Window

Simply discards old messages — no LLM calls:

from pydantic_ai_summarization import create_sliding_window_processor

processor = create_sliding_window_processor(
    trigger=("messages", 100),  # When to trim
    keep=("messages", 50),      # What to keep
)
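
To illustrate the trimming rule itself, here is a minimal sketch on a plain list of strings. The `sliding_window` function is a hypothetical stand-in, not the library's processor, and it leaves out the safe-cutoff handling for tool call pairs.

```python
def sliding_window(messages: list[str], trigger: int, keep: int) -> list[str]:
    """Trim to the last `keep` messages once the count exceeds `trigger`."""
    if len(messages) <= trigger:
        return messages  # under the threshold: leave history untouched
    return messages[-keep:] if keep > 0 else []


history = [f"message {i}" for i in range(1, 121)]  # 120 messages
trimmed = sliding_window(history, trigger=100, keep=50)
print(len(trimmed), trimmed[0])  # 50 message 71
```

Because nothing is summarized, anything older than the kept window is lost, which is the trade-off against the summarization processor.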

Limit Warnings

Warn the agent before the request count, context usage, or total token usage hits a cap:

from pydantic_ai_summarization import create_limit_warner_processor

processor = create_limit_warner_processor(
    max_iterations=40,
    max_context_tokens=100000,
    max_total_tokens=200000,
)
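
Conceptually, a limit warner compares running counters against their caps and injects a short notice once a counter gets close. The sketch below assumes a warning threshold of 80% of each cap. That fraction, the `limit_warnings` helper, and the message wording are all illustrative assumptions rather than the library's behavior.

```python
def limit_warnings(usage: dict[str, int], caps: dict[str, int],
                   warn_at: float = 0.8) -> list[str]:
    """Return one warning per counter that has reached `warn_at` of its cap."""
    warnings = []
    for name, cap in caps.items():
        used = usage.get(name, 0)
        if used >= warn_at * cap:
            warnings.append(f"{name}: {used}/{cap} used, finish soon")
    return warnings


caps = {"iterations": 40, "context_tokens": 100_000, "total_tokens": 200_000}
usage = {"iterations": 33, "context_tokens": 60_000, "total_tokens": 150_000}
print(limit_warnings(usage, caps))
# ['iterations: 33/40 used, finish soon']
```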

Context Manager Capability

Full context management with token tracking, auto-compression, and tool output truncation:

from pydantic_ai import Agent
from pydantic_ai_summarization import ContextManagerCapability

agent = Agent(
    "anthropic:claude-sonnet-4-6",
    capabilities=[ContextManagerCapability(
        max_tokens=100_000,
        compress_threshold=0.9,
        max_tool_output_tokens=5000,
        include_compact_tool=True,  # Agent gets a compact_conversation tool
    )],
)
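
With the values above, compression would start at 0.9 * 100,000 = 90,000 tokens, and any single tool output is capped at 5,000 tokens. The sketch below illustrates both checks. The helper names are hypothetical, and the 4-characters-per-token estimate is a rough assumption used only to keep the example dependency-free.

```python
def should_compress(used_tokens: int, max_tokens: int, threshold: float = 0.9) -> bool:
    """True once usage reaches the configured fraction of the context window."""
    return used_tokens >= threshold * max_tokens


def truncate_tool_output(text: str, max_tokens: int, chars_per_token: int = 4) -> str:
    """Cut an oversized tool output and mark that it was shortened."""
    max_chars = max_tokens * chars_per_token
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[output truncated]"


print(should_compress(89_999, 100_000))  # False
print(should_compress(90_000, 100_000))  # True
print(len(truncate_tool_output("x" * 50_000, max_tokens=5000)))  # 20019
```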

Trigger Types

Type	Example	Description
`messages`	`("messages", 50)`	Trigger when message count exceeds threshold
`tokens`	`("tokens", 100000)`	Trigger when token count exceeds threshold
`fraction`	`("fraction", 0.8)`	Trigger when token count exceeds this fraction of max_input_tokens
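
A sketch of how such `(kind, value)` tuples could be evaluated. `is_triggered` and `any_triggered` are hypothetical helpers, and the strict `>` comparison is an assumption based on the word "exceeds" in the table.

```python
from typing import Optional, Union

Trigger = tuple[str, Union[int, float]]


def is_triggered(trigger: Trigger, message_count: int, token_count: int,
                 max_input_tokens: Optional[int] = None) -> bool:
    """Evaluate a single (kind, value) trigger against the current history size."""
    kind, value = trigger
    if kind == "messages":
        return message_count > value
    if kind == "tokens":
        return token_count > value
    if kind == "fraction":
        if max_input_tokens is None:
            raise ValueError("fraction triggers need max_input_tokens")
        return token_count > value * max_input_tokens
    raise ValueError(f"unknown trigger kind: {kind!r}")


def any_triggered(triggers: list[Trigger], **sizes) -> bool:
    """Several triggers combine with OR: one that fires is enough."""
    return any(is_triggered(t, **sizes) for t in triggers)


print(is_triggered(("messages", 50), message_count=51, token_count=2_000))    # True
print(is_triggered(("fraction", 0.8), 10, 90_000, max_input_tokens=128_000))  # False
print(any_triggered([("messages", 50), ("tokens", 100_000)],
                    message_count=12, token_count=120_000))                    # True
```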

Keep Types

Type	Example	Description
`messages`	`("messages", 20)`	Keep last N messages
`tokens`	`("tokens", 10000)`	Keep last N tokens worth
`fraction`	`("fraction", 0.2)`	Keep the most recent messages that fit in this fraction of max_input_tokens
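
The keep setting determines where the retained tail of the history begins. Here is a sketch under stated assumptions: `keep_start` is a hypothetical helper, per-message token counts are supplied by the caller, and the real library additionally adjusts the cutoff to avoid splitting tool call pairs.

```python
from typing import Optional


def keep_start(keep: tuple, token_counts: list[int],
               max_input_tokens: Optional[int] = None) -> int:
    """Return the index of the first message to keep; earlier ones are dropped.

    `token_counts[i]` is the (estimated) token count of message i.
    """
    kind, value = keep
    if kind == "messages":
        return max(0, len(token_counts) - int(value))
    if kind == "fraction":
        if max_input_tokens is None:
            raise ValueError("fraction keep needs max_input_tokens")
        budget = value * max_input_tokens
    elif kind == "tokens":
        budget = value
    else:
        raise ValueError(f"unknown keep kind: {kind!r}")
    # Walk backwards from the newest message while the budget allows.
    start, total = len(token_counts), 0
    while start > 0 and total + token_counts[start - 1] <= budget:
        start -= 1
        total += token_counts[start]
    return start


counts = [400, 300, 200, 100, 50]           # oldest first
print(keep_start(("messages", 2), counts))  # 3
print(keep_start(("tokens", 350), counts))  # 2
print(keep_start(("fraction", 0.2), counts, max_input_tokens=1000))  # 3
```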

Advanced Configuration

Multiple Triggers

from pydantic_ai_summarization import SummarizationProcessor

processor = SummarizationProcessor(
    model="openai:gpt-4o",
    trigger=[
        ("messages", 50),    # more than 50 messages...
        ("tokens", 100000),  # ...OR more than 100k tokens
    ],
    keep=("messages", 10),
)

Fraction-Based

from pydantic_ai_summarization import SummarizationProcessor

processor = SummarizationProcessor(
    model="openai:gpt-4o",
    trigger=("fraction", 0.8),  # 80% of context window
    keep=("fraction", 0.2),     # Keep last 20%
    max_input_tokens=128000,    # GPT-4o's context window (128k tokens)
)
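
To make the fractions concrete: with `max_input_tokens=128000`, a trigger of 0.8 corresponds to 102,400 tokens and a keep of 0.2 to 25,600 tokens. The arithmetic as a quick check, in plain Python and independent of the library:

```python
max_input_tokens = 128_000

trigger_tokens = round(0.8 * max_input_tokens)  # summarize above this many tokens
keep_tokens = round(0.2 * max_input_tokens)     # roughly this many tokens are kept

print(trigger_tokens, keep_tokens)  # 102400 25600
```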

Custom Token Counter

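from pydantic_ai_summarization import create_summarization_processor

def my_token_counter(messages):
    # Rough heuristic: about 4 characters per token
    return sum(len(str(msg)) for msg in messages) // 4

processor = create_summarization_processor(
    token_counter=my_token_counter,
)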
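
The heuristic above estimates roughly one token per four characters of each message's string form. Since the feature list mentions both sync and async counters, here is one way a caller could accept either kind. The `count_tokens` helper is a hypothetical illustration, not the library's code.

```python
import asyncio
import inspect


def chars_div_4(messages) -> int:
    """Rough estimate: about four characters per token."""
    return sum(len(str(msg)) for msg in messages) // 4


async def async_counter(messages) -> int:
    """Stand-in for a counter that calls a remote tokenizer service."""
    await asyncio.sleep(0)
    return chars_div_4(messages)


async def count_tokens(counter, messages) -> int:
    """Call `counter` and await the result only if it is awaitable."""
    result = counter(messages)
    if inspect.isawaitable(result):
        result = await result
    return result


messages = ["Hello, how are you?", "Fine, thanks!"]  # 19 + 13 = 32 characters
print(asyncio.run(count_tokens(chars_div_4, messages)))    # 8
print(asyncio.run(count_tokens(async_counter, messages)))  # 8
```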

Custom Model (e.g., Azure OpenAI)

from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai_summarization import create_summarization_processor

azure_model = OpenAIModel(
    "gpt-4o",
    provider=OpenAIProvider(
        base_url="https://my-resource.openai.azure.com/openai/deployments/gpt-4o",
        api_key="your-azure-api-key",
    ),
)

processor = create_summarization_processor(
    model=azure_model,
    trigger=("tokens", 100000),
    keep=("messages", 20),
)

Custom Summary Prompt

from pydantic_ai_summarization import create_summarization_processor

processor = create_summarization_processor(
    summary_prompt="""
    Extract key information from this conversation.
    Focus on: decisions made, code written, pending tasks.

    Conversation:
    {messages}
    """,
)
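
The `{messages}` placeholder marks where the conversation text is substituted. Assuming the template is filled with Python's `str.format` (an assumption, since this README does not say which mechanism is used), rendering would look like the sketch below. Under that assumption, any literal braces in a custom prompt would need doubling (`{{` and `}}`).

```python
summary_prompt = """
Extract key information from this conversation.
Focus on: decisions made, code written, pending tasks.

Conversation:
{messages}
"""

# Hypothetical transcript; the library's own message formatting may differ.
transcript = "user: Use PostgreSQL.\nassistant: Agreed, I will add the schema."

rendered = summary_prompt.format(messages=transcript)
print(rendered)
```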

Why Choose This Library?

Feature	Description
Two Strategies	Intelligent summarization or fast sliding window
Flexible Triggers	Message count, token count, or fraction-based
Safe Cutoff	Never breaks tool call/response pairs
Auto max_tokens	Auto-detect context window from genai-prices
Message Persistence	Save all messages to JSON for session resume
Guided Compaction	Focus summaries on specific topics
Callbacks	on_before/after_compress with instruction re-injection
Async Token Counting	Sync or async token counter support
Token Tracking	Real-time usage monitoring with callbacks
Tool Truncation	Automatic truncation of large tool outputs
Custom Models	Use any pydantic-ai Model (Azure, custom providers)
Lightweight	Only requires pydantic-ai-slim (no extra model SDKs)

Related Projects

Package	Description
Pydantic Deep Agents	Full agent framework (uses this library)
pydantic-ai-backend	File storage and Docker sandbox
pydantic-ai-todo	Task planning toolset
subagents-pydantic-ai	Multi-agent orchestration
pydantic-ai	The foundation — agent framework by Pydantic

Contributing

git clone https://github.com/vstorm-co/summarization-pydantic-ai.git
cd summarization-pydantic-ai
make install
make test  # 100% coverage required

License

MIT — see LICENSE

Need help implementing this in your company?

We're Vstorm, an Applied Agentic AI Engineering Consultancy with 30+ production AI agent implementations.

Made with ❤️ by Vstorm

Release history

Version	Release date
0.1.10 (this version)	Jun 25, 2026
0.1.9	Jun 22, 2026
0.1.8	Jun 17, 2026
0.1.7	Jun 3, 2026
0.1.6	Jun 1, 2026
0.1.5	May 24, 2026
0.1.4	Apr 9, 2026
0.1.3	Apr 2, 2026
0.1.2	Mar 31, 2026
0.1.1	Mar 28, 2026
0.1.0	Mar 28, 2026
0.0.5	Mar 21, 2026
0.0.4	Feb 25, 2026
0.0.3	Feb 15, 2026
0.0.2	Jan 22, 2026
0.0.1	Jan 20, 2026

Download files

Source Distribution

summarization_pydantic_ai-0.1.10.tar.gz (173.0 kB)

Uploaded Jun 25, 2026 Source

Built Distribution

summarization_pydantic_ai-0.1.10-py3-none-any.whl (29.6 kB)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file summarization_pydantic_ai-0.1.10.tar.gz.

File metadata

Download URL: summarization_pydantic_ai-0.1.10.tar.gz
Upload date: Jun 25, 2026
Size: 173.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for summarization_pydantic_ai-0.1.10.tar.gz
Algorithm	Hash digest
SHA256	`a4ff914bdd93bba126a954a29a3c98907a94b2c49a26cc99e4e2d50b47bee924`
MD5	`d6b48b13f95cd3a6a271d546e73fed65`
BLAKE2b-256	`d33a22a3cd65f7cd17463e895d71de443629804b0aed085ced12f40bdc06ff2b`

Provenance

The following attestation bundles were made for summarization_pydantic_ai-0.1.10.tar.gz:

Publisher: publish.yml on vstorm-co/summarization-pydantic-ai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: summarization_pydantic_ai-0.1.10.tar.gz
- Subject digest: a4ff914bdd93bba126a954a29a3c98907a94b2c49a26cc99e4e2d50b47bee924
- Sigstore transparency entry: 1958331062
- Sigstore integration time: Jun 25, 2026
Source repository:
- Permalink: vstorm-co/summarization-pydantic-ai@22b4dfe1e873ef6e3daff84401303bfa448b8912
- Branch / Tag: refs/tags/0.1.10
- Owner: https://github.com/vstorm-co
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@22b4dfe1e873ef6e3daff84401303bfa448b8912
- Trigger Event: release

File details

Details for the file summarization_pydantic_ai-0.1.10-py3-none-any.whl.

File metadata

Download URL: summarization_pydantic_ai-0.1.10-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 29.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for summarization_pydantic_ai-0.1.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d9e123595bae90e721c2637d873947ecfe13e0c281132bcc003333e49792b250`
MD5	`c136541669fcfb4387d68b8db1190878`
BLAKE2b-256	`e0239443585a57dedfe1db912ff5e86c8653f2f7409f9ec2da8ebcaa2d56301f`

Provenance

The following attestation bundles were made for summarization_pydantic_ai-0.1.10-py3-none-any.whl:

Publisher: publish.yml on vstorm-co/summarization-pydantic-ai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: summarization_pydantic_ai-0.1.10-py3-none-any.whl
- Subject digest: d9e123595bae90e721c2637d873947ecfe13e0c281132bcc003333e49792b250
- Sigstore transparency entry: 1958331156
- Sigstore integration time: Jun 25, 2026
Source repository:
- Permalink: vstorm-co/summarization-pydantic-ai@22b4dfe1e873ef6e3daff84401303bfa448b8912
- Branch / Tag: refs/tags/0.1.10
- Owner: https://github.com/vstorm-co
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@22b4dfe1e873ef6e3daff84401303bfa448b8912
- Trigger Event: release
