Governance middleware for LangChain 1.0 agents — powered by axor-core compression engines

These details have not been verified by PyPI

Project description

axor-langchain

Production middleware for LangChain agents that cuts context growth without rewriting your graph.

Axor compresses stale tool-heavy history, keeps fresh tool outputs intact, enforces token budgets, filters tools, and optionally caches deterministic tool results. Add it to create_agent() as middleware and keep your existing tools, models, prompts, and LangGraph topology.

Validated on live hard-agent benchmarks:

- OpenAI aggressive: 77.0% aggregate cost savings, 0.91 average judge score (3-run average)
- OpenAI cautious: 69.9% aggregate cost savings, 0.92 average judge score (3-run average)
- Anthropic aggressive: 35.3% aggregate cost savings, 0.94 average judge score (3-run average)
- Anthropic cautious: 30.0% aggregate cost savings, 0.96 average judge score (3-run average)

Why Axor

Long-running agents do not usually fail because one prompt is large. They fail because every turn carries yesterday's logs, traces, search results, and intermediate reasoning into tomorrow's model call.

Axor focuses on that production path:

- reduce repeated input tokens in multi-tool and multi-step agents
- preserve the newest tool results so agents do not re-query unnecessarily
- make cost controls explicit with soft and hard token limits
- restrict dangerous or irrelevant tools per agent
- cache read-only tool outputs across LangGraph invocations
- validate savings with a live benchmark and LLM-as-judge quality check

Install

pip install axor-langchain

Provider packages are optional:

pip install "axor-langchain[anthropic]"
pip install "axor-langchain[openai]"
pip install "axor-langchain[providers]"

Quick Start

import asyncio

from langchain.agents import create_agent
from axor_langchain import AxorMiddleware

tools = []  # placeholder: replace with your own LangChain tools

axor = AxorMiddleware(
    optimization_profile="cautious",
    soft_token_limit=80_000,
    hard_token_limit=120_000,
)

agent = create_agent(
    "anthropic:claude-sonnet-4-5",
    tools=tools,
    middleware=[axor],
)


async def main() -> None:
    # ainvoke is a coroutine, so it has to run inside an event loop.
    result = await agent.ainvoke({
        "messages": [("user", "Investigate the checkout latency incident.")]
    })
    print(f"tokens spent: {axor.total_tokens_spent}")


asyncio.run(main())

Core Features

Feature	Production Behavior
Context compression	Uses axor-core `ContextCompressor` with pinned, knowledge, working, and ephemeral fragment semantics
Fresh-tool protection	Keeps the most recent tool outputs verbatim to avoid retry loops after compression
Policy selection	Uses axor-core task classification unless an explicit compression mode is set
Tool governance	Applies allowlists and denylists before tools reach the model
Budget guardrails	Estimates input before the call, then records provider-reported usage after the call
Tool cache	Opt-in cache for deterministic read-only tools, persisted in LangGraph state
Telemetry	Off by default; local or remote anonymized telemetry is explicit opt-in
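
To make the fresh-tool protection row concrete, here is a minimal sketch of the idea, not axor-core's implementation. The `Msg` class, the `protect_recent_tools` helper, and the placeholder text are all hypothetical; only the "keep the newest N tool outputs verbatim" behavior comes from the table above.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    """Simplified stand-in for a chat message (not a LangChain class)."""
    role: str      # "user", "ai", or "tool"
    content: str


def protect_recent_tools(messages: list[Msg], recent_tools_window: int) -> list[Msg]:
    """Keep the newest `recent_tools_window` tool outputs verbatim and
    swap older tool outputs for a short placeholder (assumed format)."""
    tool_positions = [i for i, m in enumerate(messages) if m.role == "tool"]
    # Guard against window == 0, because list[-0:] would select everything.
    keep = set(tool_positions[-recent_tools_window:]) if recent_tools_window > 0 else set()
    out = []
    for i, m in enumerate(messages):
        if m.role == "tool" and i not in keep:
            out.append(Msg("tool", f"[compressed tool output, {len(m.content)} chars]"))
        else:
            out.append(m)
    return out


history = [
    Msg("user", "Investigate the checkout latency incident."),
    Msg("tool", "log line " * 500),
    Msg("tool", "trace span " * 500),
    Msg("tool", "p99 latency = 2.4s"),
]
compacted = protect_recent_tools(history, recent_tools_window=2)
for m in compacted:
    print(m.role, "|", m.content[:45])
```

With a window of 2, only the oldest tool output is replaced; the two newest stay byte-for-byte identical, which is what lets the agent answer from them instead of calling the tool again.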

Optimization Profiles

Use profiles first; override individual knobs only after measuring.

Profile	Use When	Settings	Expected Tradeoff
`cautious`	Initial rollout, regulated workflows, quality-sensitive agents	policy-selected compression, last 2 tool results kept verbatim, all tools available to the model	lower savings, wider quality margin
`aggressive`	High-volume hard agents with large tool outputs	aggressive compression, last 1 tool result kept verbatim, top-K=8 task-relevant tools, deduplicates repeated tool calls in old turns	highest measured savings, requires quality validation

quality_first = AxorMiddleware(optimization_profile="cautious")
cost_first = AxorMiddleware(optimization_profile="aggressive")

custom = AxorMiddleware(
    optimization_profile="aggressive",
    recent_tools_window=2,
    compression_mode="balanced",
)

Explicit recent_tools_window and compression_mode values override the profile defaults.
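
The override rule can be pictured as a dictionary merge. This is a sketch under assumptions: the `PROFILE_DEFAULTS` table below is one reading of the Settings column above (with "policy-selected compression" written as `"auto"`), and `resolve_settings` is a hypothetical helper, not part of axor-langchain.

```python
# Assumed defaults, transcribed from the Optimization Profiles table.
PROFILE_DEFAULTS = {
    "cautious": {"compression_mode": "auto", "recent_tools_window": 2, "tool_top_k": None},
    "aggressive": {"compression_mode": "aggressive", "recent_tools_window": 1, "tool_top_k": 8},
}


def resolve_settings(profile: str, **overrides) -> dict:
    """Start from the profile defaults, then let explicit (non-None) values win."""
    settings = dict(PROFILE_DEFAULTS[profile])
    settings.update({k: v for k, v in overrides.items() if v is not None})
    return settings


custom = resolve_settings(
    "aggressive",
    recent_tools_window=2,
    compression_mode="balanced",
)
print(custom)
```

In this sketch the `custom` configuration keeps the aggressive tool cap of 8 while taking the wider window and the balanced mode from the explicit arguments.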

Tool Governance

Run different agents with different tool surfaces:

research_axor = AxorMiddleware(
    allowed_tools=["search", "read_file", "lookup_doc"],
)

review_axor = AxorMiddleware(
    denied_tools=["bash", "write_file", "delete_file"],
)
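
As a rough illustration of what an allowlist and a denylist do to the tool surface, consider the sketch below. `filter_tools` is a hypothetical helper, and the precedence when both lists are supplied (deny wins) is an assumption, since the text above does not state it.

```python
def filter_tools(tool_names, allowed_tools=None, denied_tools=None):
    """Return the tool names that would still be offered to the model.

    allowed_tools=None means "no allowlist"; an empty list means "allow nothing".
    Assumption: a tool on the denylist is removed even if it is also allowed.
    """
    allowed = set(allowed_tools) if allowed_tools is not None else None
    denied = set(denied_tools or [])
    return [
        name for name in tool_names
        if (allowed is None or name in allowed) and name not in denied
    ]


all_tools = ["search", "read_file", "lookup_doc", "bash", "write_file", "delete_file"]

research = filter_tools(all_tools, allowed_tools=["search", "read_file", "lookup_doc"])
review = filter_tools(all_tools, denied_tools=["bash", "write_file", "delete_file"])
print(research)
print(review)
```

For this particular tool set the two configurations expose the same three read-only tools; they diverge as soon as a new tool is registered, because an allowlist hides it by default and a denylist does not.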

Budget Controls

axor = AxorMiddleware(
    soft_token_limit=80_000,
    hard_token_limit=120_000,
    verbose=True,
)

The hard limit is a pre-call gate. Axor estimates the next input size and raises BudgetExceededError before sending an over-budget request. Actual accounting uses the provider's usage_metadata after each model call.
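
The pre-call gate can be sketched in a few lines. Everything here is illustrative: `BudgetExceeded` is a local stand-in for the library's `BudgetExceededError`, the four-characters-per-token estimate is a crude heuristic, and comparing the limits against cumulative spend plus the estimated next input is an assumption about how the limits are applied. The soft limit is modeled as a warning signal only.

```python
class BudgetExceeded(RuntimeError):
    """Local stand-in for the library's BudgetExceededError."""


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token); real tokenizers differ.
    return max(1, len(text) // 4)


def precall_gate(spent: int, next_input: str, soft_limit=None, hard_limit=None) -> str:
    """Check the projected spend before the request is sent."""
    projected = spent + estimate_tokens(next_input)
    if hard_limit is not None and projected > hard_limit:
        raise BudgetExceeded(
            f"projected {projected} tokens exceeds hard limit {hard_limit}"
        )
    if soft_limit is not None and projected > soft_limit:
        return "soft_limit_exceeded"
    return "ok"


prompt = "x" * 20_000  # estimated at 5,000 tokens by the heuristic above
print(precall_gate(70_000, prompt, soft_limit=80_000, hard_limit=120_000))
print(precall_gate(78_000, prompt, soft_limit=80_000, hard_limit=120_000))
try:
    precall_gate(118_000, prompt, soft_limit=80_000, hard_limit=120_000)
except BudgetExceeded as exc:
    print(f"blocked before the call: {exc}")
```

The point of gating on an estimate is that the over-budget request is never sent; the provider-reported usage is then used afterwards to correct the running total.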

Tool Result Cache

Tool caching is intentionally opt-in. Use it only for deterministic read-only tools:

axor = AxorMiddleware(
    cache_tools=["read_file", "lookup_doc"],
    max_tool_cache_entries=100,
)

Cache entries are keyed by tool name and arguments. The cache lives in AxorState.tool_result_cache, so LangGraph checkpointing can preserve it across invocations under the same thread_id.
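
A cache keyed by tool name and arguments might look like the sketch below. The key format, the `BoundedToolCache` class, and the least-recently-used eviction policy are assumptions for illustration; the text above only states what the key is built from and that the number of entries is capped.

```python
import json
from collections import OrderedDict


def cache_key(tool_name: str, args: dict) -> str:
    """Build a stable key: argument order must not change the key."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return f"{tool_name}:{canonical}"


class BoundedToolCache:
    """Small LRU cache for deterministic, read-only tool results."""

    def __init__(self, max_entries: int = 100):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, tool_name: str, args: dict):
        key = cache_key(tool_name, args)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, tool_name: str, args: dict, result: str) -> None:
        key = cache_key(tool_name, args)
        self._entries[key] = result
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used


cache = BoundedToolCache(max_entries=2)
cache.put("read_file", {"path": "a.txt", "encoding": "utf-8"}, "contents of a")
# Same arguments in a different order hit the same entry.
print(cache.get("read_file", {"encoding": "utf-8", "path": "a.txt"}))
```

Canonicalizing the arguments matters because models do not emit tool arguments in a stable key order; without it, identical calls would miss the cache.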

Anthropic Prompt Caching

Axor does not write provider-specific cache_control markers. Compose it with LangChain's Anthropic middleware when you want prompt caching:

from langchain.agents import create_agent
from langchain_anthropic.middleware import AnthropicPromptCachingMiddleware
from axor_langchain import AxorMiddleware

agent = create_agent(
    "anthropic:claude-sonnet-4-5",
    tools=tools,
    middleware=[
        AxorMiddleware(optimization_profile="cautious"),
        AnthropicPromptCachingMiddleware(),
    ],
)

Order matters: list AxorMiddleware before AnthropicPromptCachingMiddleware so compression runs first and the Anthropic middleware places cache_control markers on the final, compressed message set. The reverse order can stamp markers onto messages that Axor then rewrites, dropping the cache hit.

Telemetry

Telemetry is off by default.

local = AxorMiddleware(telemetry="local")
remote = AxorMiddleware(telemetry="remote")

Remote telemetry sends anonymized policy and token metadata only. It does not send raw prompts, tool arguments, file contents, secrets, user IDs, or session IDs.

Configuration

AxorMiddleware(
    soft_token_limit=None,
    hard_token_limit=None,
    allowed_tools=None,
    denied_tools=None,
    personality=None,
    memory_provider=None,
    memory_namespace="axor",
    tool_error_handler=None,
    tool_max_retries=0,
    tool_retry_delay=0.0,
    track_tool_stats=False,
    cache_tools=None,
    max_tool_cache_entries=100,
    optimization_profile=None,  # None | "cautious" | "aggressive"
    token_cost_rates=None,      # optional axor_core.budget.TokenCostRates
    recent_tools_window=None,
    compression_mode=None,      # None/"auto" | "aggressive" | "balanced" | "light"
    tool_selection=None,        # None | "relevance"
    tool_top_k=None,            # cap on tools shown to the model when relevance is on
    tool_min_keep=3,            # floor so relevance never strips below this
    tool_sticky_lookback=4,     # AI turns whose tools are anchored even on low score
    tool_dedup_old_results=None,# replace old duplicate tool results with a pointer
    tool_selection_stable=True, # cache the relevance selection within a user turn
    verbose=False,
    telemetry=None,             # None | "off" | "local" | "remote"
)

The tool_selection, tool_top_k, and tool_dedup_old_results knobs are turned on automatically by optimization_profile="aggressive". Set them explicitly only after you have a reason to override the profile.
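
How a cap, a floor, and sticky tools can interact is easier to see in code. The sketch below is a guess at the general shape, not the library's algorithm: the relevance scores are supplied by the caller, `min_score` is a hypothetical threshold, and letting sticky tools exceed the cap is an assumption.

```python
def select_tools(scores, top_k, min_keep=3, sticky=(), min_score=0.1):
    """Pick which tools to show the model.

    scores: mapping of tool name -> relevance score (higher is more relevant).
    top_k: cap on relevance-selected tools.
    min_keep: floor, so selection never drops below this many tools.
    sticky: recently used tools that stay visible even with a low score.
    """
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    chosen = [name for name in ranked if scores[name] >= min_score][:top_k]
    # Floor: top up with the next-best tools until min_keep is reached.
    for name in ranked:
        if len(chosen) >= min_keep:
            break
        if name not in chosen:
            chosen.append(name)
    # Sticky: anchor recently used tools regardless of score.
    for name in sticky:
        if name in scores and name not in chosen:
            chosen.append(name)
    return chosen


scores = {
    "search": 0.9,
    "lookup_doc": 0.4,
    "read_file": 0.05,
    "write_file": 0.02,
    "bash": 0.0,
}
print(select_tools(scores, top_k=8, min_keep=3, sticky=["bash"]))
```

Here only two tools clear the threshold, the floor pulls in `read_file` as the third, and `bash` survives because it was used recently. Keeping recently used tools visible avoids the model losing a tool in the middle of a multi-step plan.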

Benchmark

The supported benchmark is a live hard-agent suite:

cd axor-langchain
export ANTHROPIC_API_KEY=sk-ant-...

python benchmark/live_hard_agent.py \
  --provider anthropic \
  --task all \
  --prior-turns 10 \
  --tool-kb 10 \
  --axor-profile aggressive \
  --judge \
  --json

Run the cautious rollout profile with the same harness:

python benchmark/live_hard_agent.py \
  --provider anthropic \
  --task all \
  --prior-turns 10 \
  --tool-kb 10 \
  --axor-profile cautious \
  --judge \
  --json

It runs each task twice:

- baseline: LangChain create_agent() without Axor
- governed: the same agent with AxorMiddleware

The benchmark uses real provider calls, realistic prior history, large deterministic tool outputs, and optional LLM-as-judge quality scoring.
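
The three savings columns reported in the results can be read as the same ratio applied to different quantities. The sketch below shows that arithmetic with made-up token counts and hypothetical per-token rates; it is not the benchmark's code, and the actual harness may weight things differently (for example, cached-input pricing).

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


def savings_pct(baseline: float, governed: float) -> float:
    """Percentage reduction of the governed run relative to the baseline run."""
    return 100.0 * (1.0 - governed / baseline)


def cost(usage: Usage, input_rate: float, output_rate: float) -> float:
    return usage.input_tokens * input_rate + usage.output_tokens * output_rate


# Hypothetical numbers: compression halves the input, output is unchanged.
baseline = Usage(input_tokens=100_000, output_tokens=5_000)
governed = Usage(input_tokens=50_000, output_tokens=5_000)
INPUT_RATE, OUTPUT_RATE = 3.0, 15.0  # assumed relative prices, output 5x input

input_savings = savings_pct(baseline.input_tokens, governed.input_tokens)
total_savings = savings_pct(
    baseline.input_tokens + baseline.output_tokens,
    governed.input_tokens + governed.output_tokens,
)
cost_savings = savings_pct(
    cost(baseline, INPUT_RATE, OUTPUT_RATE),
    cost(governed, INPUT_RATE, OUTPUT_RATE),
)
print(f"input {input_savings:.1f}%  total {total_savings:.1f}%  cost {cost_savings:.1f}%")
```

Under these assumptions the cost figure is the smallest of the three, because output tokens are not compressed and are priced higher than input tokens. That is one plausible explanation for the same ordering in the result tables.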

Tasks

Task	Measures
`incident_rca`	incident timeline, root cause, blast radius, mitigations
`security_migration`	OAuth migration planning, vulnerable paths, rollout, backout
`cost_optimization`	model-spend diagnosis for tool-heavy agent workflows

Validated Results

Anthropic aggressive profile, task=all, --judge (model=claude-sonnet-4-6, prior-turns=10, tool-kb=10; auto-fit narrowed prior-kb to 2 and tool-kb to 3 to fit Anthropic input TPM). Averaged over 3 independent runs:

Task	Judge Score	Verdict	Input Savings	Total Savings	Cost Savings
`incident_rca`	0.96	equivalent	55.4%	51.9%	42.0%
`security_migration`	0.92	equivalent	34.4%	32.8%	29.0%
`cost_optimization`	0.94	equivalent	45.1%	42.0%	32.7%
Aggregate	0.94 avg	equivalent	47.2%	44.0%	35.3%

Per-task numbers carry ±10pp run-to-run variance on this profile because the baseline tool-call count and depth are non-deterministic; the aggregate is stable across runs. security_migration consistently shows lower compression opportunity because the baseline tends to converge on a tighter scope here.

Anthropic cautious profile, task=all, --judge (model=claude-sonnet-4-6, prior-turns=10, tool-kb=10; auto-fit narrowed prior-kb to 2 and tool-kb to 3 to fit Anthropic input TPM). Averaged over 3 independent runs:

Task	Judge Score	Verdict	Input Savings	Total Savings	Cost Savings
`incident_rca`	0.98	equivalent	45.0%	41.9%	32.8%
`security_migration`	0.95	equivalent	30.4%	28.4%	22.6%
`cost_optimization`	0.95	equivalent	44.1%	41.3%	33.1%
Aggregate	0.96 avg	equivalent	41.9%	38.8%	30.0%

security_migration is the lowest-savings task on this profile because its baseline tends to converge on a tight scope; on a 27K-token baseline the governance overhead can briefly exceed the compression gain. On larger payloads (incident_rca, cost_optimization) the profile holds ~32–33% cost reduction.

OpenAI aggressive profile, task=all, --judge (model=gpt-4.1-mini, prior-turns=10, tool-kb=10). Averaged over 3 independent runs:

Task	Judge Score	Verdict	Input Savings	Total Savings	Cost Savings
`incident_rca`	0.92	equivalent	80.9%	79.6%	76.1%
`security_migration`	0.92	equivalent	81.1%	80.2%	77.6%
`cost_optimization`	0.88	mixed	80.7%	79.8%	77.2%
Aggregate	0.91 avg	mostly equivalent	80.9%	79.9%	77.0%

cost_optimization lands on minor_drift in 2 of 3 runs under aggressive compression — the governed response trims concrete actions (rollback steps, deadline propagation, circuit-breaker callouts) while preserving the diagnosis. If you need the action list intact for this task class, use cautious instead.

OpenAI cautious profile, task=all, --judge (model=gpt-4.1-mini, prior-turns=10, tool-kb=10). Averaged over 3 independent runs:

Task	Judge Score	Verdict	Input Savings	Total Savings	Cost Savings
`incident_rca`	0.93	equivalent	73.3%	72.2%	69.2%
`security_migration`	0.92	equivalent	73.9%	73.1%	70.6%
`cost_optimization`	0.92	equivalent	72.6%	71.9%	70.0%
Aggregate	0.92 avg	equivalent	73.3%	72.4%	69.9%

OpenAI showed the strongest aggregate cost savings on this benchmark. The minor_drift verdicts for cost_optimization under the OpenAI aggressive profile reflect a real cost-versus-quality tradeoff for that task class, not run-to-run noise. The Anthropic cautious profile produced the highest average judge score (0.96) with all tasks equivalent, at the cost of a smaller percentage cost reduction.

Treat profiles as deployment presets, not universal quality rankings; validate with --judge on your own workload.

Profile decision guide:

Profile	Primary Goal	Use It For	Measured Result (3-run aggregate)
`cautious`	preserve quality first	staging rollout, first production cohort, sensitive agents	OpenAI: 69.9% cost savings; Anthropic: 30.0%; all tasks equivalent
`aggressive`	maximize savings with judge guardrails	high-volume production agents after validation	OpenAI: 77.0% cost savings; Anthropic: 35.3%; all tasks equivalent except OpenAI `cost_optimization` (mixed)

Recommended rollout path:

1. Start with optimization_profile="cautious" in staging.
2. Run the benchmark with --judge on representative tasks.
3. Move high-volume agents to optimization_profile="aggressive" when quality scores stay acceptable.

Requirements

Python 3.11+
langchain >= 1.0.0
langgraph >= 1.0.0
axor-core

License

MIT

Project details

Release history

- 0.6.0: May 25, 2026
- 0.5.0: May 4, 2026
- 0.4.0: May 4, 2026 (this version)
- 0.3.1: Apr 24, 2026
- 0.3.0: Apr 24, 2026
- 0.2.0: Apr 23, 2026

Download files

Source Distribution

axor_langchain-0.4.0.tar.gz (69.5 kB), uploaded May 4, 2026, Source

Built Distribution

axor_langchain-0.4.0-py3-none-any.whl (32.2 kB), uploaded May 4, 2026, Python 3

File details

Details for the file axor_langchain-0.4.0.tar.gz.

File metadata

Download URL: axor_langchain-0.4.0.tar.gz
Upload date: May 4, 2026
Size: 69.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for axor_langchain-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`ddaa73ad1992d896e4f13dcabd32ef8069468845b40470189596696901d04f90`
MD5	`d065c57f2b9c47d02e2057aea856c5f7`
BLAKE2b-256	`2c892ff5bf3479d28841890b344346ed2ea80a646fa263718da65affe0eda97a`

See more details on using hashes here.
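
One way to check a downloaded file against the SHA256 digest listed above is with Python's standard `hashlib`. This is a minimal sketch; the helper names are illustrative, and the file name assumes the sdist was saved to the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for axor_langchain-0.4.0.tar.gz
EXPECTED_SHA256 = "ddaa73ad1992d896e4f13dcabd32ef8069468845b40470189596696901d04f90"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex: str) -> bool:
    return sha256_of(path) == expected_hex.strip().lower()


sdist = Path("axor_langchain-0.4.0.tar.gz")
if sdist.exists():
    print("OK" if matches(sdist, EXPECTED_SHA256) else "MISMATCH: do not install")
else:
    print(f"{sdist} not found; download it first")
```

pip can enforce the same check at install time when a requirements file pins hashes and is installed with `--require-hashes`.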

Provenance

The following attestation bundles were made for axor_langchain-0.4.0.tar.gz:

Publisher: ci.yml on Bucha11/axor-langchain

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: axor_langchain-0.4.0.tar.gz
- Subject digest: ddaa73ad1992d896e4f13dcabd32ef8069468845b40470189596696901d04f90
- Sigstore transparency entry: 1437203005
- Sigstore integration time: May 4, 2026
Source repository:
- Permalink: Bucha11/axor-langchain@8d6eef40ddb412604bd7773a50b45662cd79cf20
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/Bucha11
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@8d6eef40ddb412604bd7773a50b45662cd79cf20
- Trigger Event: push

File details

Details for the file axor_langchain-0.4.0-py3-none-any.whl.

File metadata

Download URL: axor_langchain-0.4.0-py3-none-any.whl
Upload date: May 4, 2026
Size: 32.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for axor_langchain-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3269c19ad06d051a057ddcc7a9b3ca20c8c649283610e92fd4c66deade88cffc`
MD5	`08e6e1a475333cb68a6751258ccbb3fc`
BLAKE2b-256	`9600b90a4931411ce468017ef769e374920c0d54573b358c2299004370bdfb1a`

See more details on using hashes here.

Provenance

The following attestation bundles were made for axor_langchain-0.4.0-py3-none-any.whl:

Publisher: ci.yml on Bucha11/axor-langchain

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: axor_langchain-0.4.0-py3-none-any.whl
- Subject digest: 3269c19ad06d051a057ddcc7a9b3ca20c8c649283610e92fd4c66deade88cffc
- Sigstore transparency entry: 1437203009
- Sigstore integration time: May 4, 2026
Source repository:
- Permalink: Bucha11/axor-langchain@8d6eef40ddb412604bd7773a50b45662cd79cf20
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/Bucha11
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@8d6eef40ddb412604bd7773a50b45662cd79cf20
- Trigger Event: push
