
Beskar

CI · Coverage ≥90% · License: MIT · Node 18+ · Python 3.9+

Claude-native token optimization for agentic pipelines.

Beskar wraps the Anthropic SDK to automatically cut token costs in production agentic loops — through intelligent prompt caching, context pruning, tool result compression, and a metrics layer that proves what you're saving.


The Problem

Agentic pipelines burn tokens in predictable, fixable ways:

  • The same system prompt and tool definitions are re-tokenized on every turn
  • Long conversation histories accumulate with no pruning strategy
  • Tool results carry full verbose payloads even after they're no longer useful
  • Extended thinking mode can spike costs unpredictably
  • Sonnet runs tasks that Haiku could handle at a fraction of the price

In a high-volume pipeline — bug bounty tooling, research agents, code generation loops — this waste compounds fast. Beskar fixes it at the SDK layer, before your application code sees it.


V1 Features

Prompt Caching Auto-Structurer

Automatically places cache_control breakpoints at the optimal positions in your messages array. Respects Claude's minimum token thresholds (1024 tokens for Sonnet/Opus, 2048 for Haiku), honors the 4-breakpoint limit, and prioritizes stable content (system prompt, tool definitions, long context documents) over dynamic content. Cache hits cut input costs by ~90% on the cached portion.
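Beskar's actual placement heuristics aren't shown here, but the underlying API mechanism is Anthropic's `cache_control` marker on content blocks. A minimal sketch of the idea, assuming a hypothetical helper `add_cache_breakpoint` (not Beskar's API), which marks the end of the stable system prompt as cacheable:

```python
def add_cache_breakpoint(params: dict) -> dict:
    """Mark the last system block as cacheable by attaching a
    cache_control marker (Anthropic's "ephemeral" cache type)."""
    system = params.get("system")
    # Normalize a plain-string system prompt into block form,
    # since cache_control attaches to individual content blocks.
    if isinstance(system, str):
        system = [{"type": "text", "text": system}]
    if system:
        # Everything up to and including this block is cached.
        system[-1]["cache_control"] = {"type": "ephemeral"}
    return {**params, "system": system}


params = add_cache_breakpoint({
    "model": "claude-sonnet-4-6",
    "system": "You are a security researcher...",
})
```

Beskar additionally checks the minimum cacheable length per model and the 4-breakpoint cap before emitting a marker like this.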

Context Window Pruner

Manages rolling context for long agentic loops. Configurable strategies: sliding window (drop oldest turns), summarization (collapse old turns into a summary message), or importance scoring. Preserves tool call integrity — never drops a tool_use turn without its corresponding tool_result.
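The tool-call-integrity constraint is the subtle part: a `tool_result` block must follow the `tool_use` turn that produced it, so a naive window cut can orphan a result. A minimal sketch of the sliding-window strategy under that constraint (the function name and shapes are illustrative, not Beskar internals):

```python
def prune_sliding_window(messages: list, max_turns: int) -> list:
    """Keep the most recent max_turns messages, but never let the
    window start on a tool_result whose tool_use was dropped."""
    if len(messages) <= max_turns:
        return messages
    window = messages[-max_turns:]

    def has_tool_result(msg):
        content = msg.get("content")
        return isinstance(content, list) and any(
            block.get("type") == "tool_result" for block in content
        )

    # Drop orphaned tool_result turns at the window's leading edge.
    while window and has_tool_result(window[0]):
        window = window[1:]
    return window


history = [
    {"role": "user", "content": "scan example.com"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_1", "name": "scan", "input": {}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_1",
         "content": "open ports: 80, 443"}]},
    {"role": "assistant", "content": "Ports 80 and 443 are open."},
    {"role": "user", "content": "anything else?"},
]
pruned = prune_sliding_window(history, max_turns=3)
# The tool_result at the window edge lost its tool_use, so it is
# dropped too, leaving a valid 2-message tail.
```

The summarization and importance-scoring strategies apply the same pairing rule; only the selection of which turns survive differs.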

Tool Result Compressor

Intercepts tool results before they're appended to context. Strips non-essential fields, truncates oversized payloads, and collapses completed tool call chains into summarized form once they're no longer needed for active reasoning. Preserves tool_use_id linkage throughout.
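The invariant here is that truncation must never break the `tool_use_id` the API uses to pair a result with its call. A minimal sketch, using a character budget as a stand-in for Beskar's token-based `maxToolResultTokens` limit (the helper name is illustrative):

```python
def compress_tool_result(block: dict, max_chars: int = 500) -> dict:
    """Truncate an oversized tool_result payload while keeping the
    tool_use_id linkage the Messages API requires."""
    content = block.get("content", "")
    if isinstance(content, str) and len(content) > max_chars:
        content = content[:max_chars] + "... [truncated by compressor]"
    return {
        "type": "tool_result",
        "tool_use_id": block["tool_use_id"],  # linkage must survive
        "content": content,
    }


compressed = compress_tool_result(
    {"type": "tool_result", "tool_use_id": "toolu_1", "content": "A" * 2000},
    max_chars=500,
)
```

Field stripping and chain collapsing follow the same pattern: rewrite the payload, leave the identifiers untouched.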

Token Metrics Layer

Wraps every API call to capture the usage object from Claude's response. Tracks input tokens, output tokens, cache creation tokens, and cache read tokens. Derives cache hit rate, estimated cost (by model), and tokens saved vs. an uncached baseline. Makes optimization measurable.
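The derivation is simple arithmetic over the `usage` fields Claude returns (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`). A sketch of the math, using Anthropic's published cache multipliers (reads billed at ~10% of base input price, writes at ~125%); the $3/MTok default is an assumed Sonnet input price, and the function is illustrative rather than Beskar's metrics API:

```python
def summarize_usage(usage: dict, price_per_mtok: float = 3.0) -> dict:
    """Derive cache hit rate, estimated cost, and savings vs. an
    uncached baseline from a Claude response's usage object."""
    base = usage.get("input_tokens", 0)
    write = usage.get("cache_creation_input_tokens", 0)
    read = usage.get("cache_read_input_tokens", 0)
    total_input = base + write + read
    hit_rate = read / total_input if total_input else 0.0
    # Cache writes cost 1.25x base input; cache reads cost 0.10x.
    cost = (base * 1.00 + write * 1.25 + read * 0.10) * price_per_mtok / 1e6
    uncached = total_input * price_per_mtok / 1e6
    return {
        "cache_hit_rate": round(hit_rate, 2),
        "estimated_cost_usd": round(cost, 6),
        "estimated_savings_usd": round(uncached - cost, 6),
    }


summary = summarize_usage({
    "input_tokens": 1000,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 9000,
})
# 9,000 of 10,000 input tokens served from cache at a tenth of the
# base price: hit rate 0.90, cost $0.0057 vs. $0.03 uncached.
```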


Installation

TypeScript / Node.js

npm install beskar

Requires Node 18+. The anthropic SDK is a peer dependency — install it alongside:

npm install beskar @anthropic-ai/sdk

Python

Note: The Python package is not yet published to PyPI. Install locally from source:

git clone https://github.com/Vligai/beskar.git
cd beskar/python
pip install -e .

Requires Python 3.9+. The anthropic SDK is pulled in automatically as a dependency.

For development (tests + type checking):

pip install -e ".[dev]"

Usage

Beskar is a drop-in replacement for the Anthropic SDK's messages.create() — construct a BeskarClient instead of an Anthropic client and keep the same call signature.

TypeScript

import { BeskarClient } from 'beskar';

const client = new BeskarClient({
  apiKey: process.env.ANTHROPIC_API_KEY,
  cache: { enabled: true },
  pruner: { strategy: 'sliding-window', maxTurns: 20 },
  compressor: { enabled: true, maxToolResultTokens: 500 },
  metrics: { enabled: true },
});

const response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  system: 'You are a security researcher...',
  messages: conversationHistory,
});

console.log(client.metrics.summary());
// { cacheHitRate: 0.87, estimatedCost: '$0.0023', tokensSaved: 14200 }

Python

import os
from beskar import BeskarClient
from beskar.types import BeskarConfig, CacheConfig, PrunerConfig, CompressorConfig, MetricsConfig

client = BeskarClient(BeskarConfig(
    api_key=os.environ["ANTHROPIC_API_KEY"],
    cache=CacheConfig(),
    pruner=PrunerConfig(strategy="sliding-window", max_turns=20),
    compressor=CompressorConfig(max_tool_result_tokens=500),
    metrics=MetricsConfig(),
))

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a security researcher...",
    messages=conversation_history,
)

print(client.metrics.summary())
# MetricsSummary(cache_hit_rate=0.87, estimated_cost_usd=0.0023, estimated_savings_usd=0.012, ...)

Roadmap

V2 — After V1 ships with real usage data:

  • Extended thinking budget control — Per-call thinking budgets with task-complexity detection to toggle thinking contextually rather than globally
  • Model routing — Automatically route subtasks to Haiku vs. Sonnet based on complexity signals (extraction vs. reasoning, output length, tool depth)
  • Output filler cleanup — Post-processor trained on Claude's characteristic filler patterns ("Certainly!", disclaimer stacking) for reliable cleanup
  • System prompt auditor — Scores and rewrites system prompts for token efficiency without degrading behavior
  • Model-agnostic support — Abstract the Claude-specific layer to support other providers

License

MIT © 2026 Vlad Ligai
