Small, opinionated AWS Bedrock client wrapper: adaptive throttle, cache-aware cost tracking, and structured-output parse-and-repair. Single-cloud, single-purpose.

bedrock-kit

Small, opinionated AWS Bedrock client wrapper.

Teams running Bedrock in production tend to rebuild the same three things: an adaptive throttle for ThrottlingException, per-call cost tracking that accounts for cache-read tokens, and structured-output parsing with repair. bedrock-kit ships those, and nothing else. Single-cloud, single-purpose. No proxy server. It wraps the official boto3 client, and you can inject a fake for testing.

Install

pip install "bedrock-kit[boto]"
# optional
pip install "bedrock-kit[boto,pandas]"  # adds CostLedger.to_pandas()

Quickstart

from bedrock_kit import BedrockClient, AdaptiveThrottle, CostLedger

ledger = CostLedger()
client = BedrockClient(
    region="us-east-1",
    retry=AdaptiveThrottle(max_attempts=8, base_delay=1.0, max_delay=30.0),
    cost_ledger=ledger,
)

resp = client.invoke(
    model_id="anthropic.claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Hello"}],
    system="You are concise.",
    max_tokens=512,
)

resp.text                       # the model's reply
resp.usage.input_tokens         # 12
resp.usage.cache_read_tokens    # 0
resp.cost_usd                   # 0.000176
resp.cache_hit                  # False
ledger.total_usd                # accumulates across all calls
ledger.by_model                 # {"anthropic.claude-sonnet-4-5": 0.000176}

Structured output with repair

from pydantic import BaseModel
from bedrock_kit import JsonSchema

class Sentiment(BaseModel):
    label: str
    confidence: float

resp = client.invoke(
    model_id="anthropic.claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Classify: 'this is great!'"}],
    response_schema=JsonSchema(Sentiment, max_repair_attempts=2),
)
resp.parsed                  # Sentiment(label="positive", confidence=0.95)

If the model returns invalid JSON, bedrock-kit first does a light local repair pass (strip markdown fences, trailing commas, surrounding prose). If that still fails, it asks the model to fix its own output, up to max_repair_attempts times.
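To make the local repair pass concrete, here is a minimal sketch of the three cleanups named above (fences, surrounding prose, trailing commas). It is not bedrock-kit's actual implementation: `light_repair` and `parse_with_local_repair` are hypothetical names, and the regex-based approach is an assumption chosen for brevity.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built this way to avoid nesting fences here
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)
_TRAILING_COMMA = re.compile(r",\s*([}\]])")


def light_repair(raw: str) -> str:
    """Best-effort cleanup of near-JSON model output (illustrative only)."""
    text = raw.strip()

    # 1. If the payload sits inside a markdown fence, keep only the fenced body.
    match = _FENCED.search(text)
    if match:
        text = match.group(1)

    # 2. Drop surrounding prose: keep from the first "{" or "[" to the last "}" or "]".
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    end = max(text.rfind("}"), text.rfind("]"))
    if starts and end > min(starts):
        text = text[min(starts):end + 1]

    # 3. Remove trailing commas before a closing brace or bracket.
    #    Caveat: this naive regex would also touch ",}" inside a string value.
    return _TRAILING_COMMA.sub(r"\1", text)


def parse_with_local_repair(raw: str):
    """Try strict parsing first; fall back to the repaired text."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return json.loads(light_repair(raw))
```

If the second `json.loads` still raises, that is the point at which a model-side repair request would take over.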

Adaptive throttle

throttle = AdaptiveThrottle(
    max_attempts=8,    # total attempts (incl. first)
    base_delay=1.0,    # seconds
    max_delay=30.0,
    jitter=True,       # full-jitter (uniform(0, capped_delay))
)

Retries on Bedrock throttle codes: ThrottlingException, TooManyRequestsException, ServiceUnavailableException, ProvisionedThroughputExceededException, ModelTimeoutException. It does not retry validation, auth, or model-not-found errors: those are bugs for you to fix, not transient failures.
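The retry policy can be sketched in a few lines. This is an illustration, not the library's code: the helper names are hypothetical, and the exponential growth factor of 2 is an assumption (the README only states that the jittered delay is drawn from uniform(0, capped_delay)).

```python
import random

# Error codes the README lists as retryable.
RETRYABLE_CODES = frozenset({
    "ThrottlingException",
    "TooManyRequestsException",
    "ServiceUnavailableException",
    "ProvisionedThroughputExceededException",
    "ModelTimeoutException",
})


def is_retryable(error_code: str) -> bool:
    """True only for transient throttle-style errors."""
    return error_code in RETRYABLE_CODES


def capped_delay(retry_number: int, base_delay: float = 1.0, max_delay: float = 30.0) -> float:
    """Upper bound for the sleep before retry `retry_number` (1-based).

    Assumption: the delay doubles on each retry before being capped.
    """
    return min(max_delay, base_delay * 2 ** (retry_number - 1))


def full_jitter_delay(retry_number: int, base_delay: float = 1.0, max_delay: float = 30.0) -> float:
    """Full jitter: sleep a uniform random time in [0, capped_delay]."""
    return random.uniform(0.0, capped_delay(retry_number, base_delay, max_delay))


def delay_caps(max_attempts: int = 8, base_delay: float = 1.0, max_delay: float = 30.0) -> list:
    """Caps for every sleep in a run; max_attempts includes the first call,
    so there are at most max_attempts - 1 sleeps."""
    return [capped_delay(n, base_delay, max_delay) for n in range(1, max_attempts)]
```

Under these assumptions, the Quickstart settings (max_attempts=8, base_delay=1.0, max_delay=30.0) give sleep caps of 1, 2, 4, 8, 16, 30, and 30 seconds, with each actual sleep drawn uniformly below its cap.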

Cost tracking

CostLedger ships pricing for popular Anthropic Bedrock models. Override or extend with pricing=:

from bedrock_kit import CostLedger, Pricing

ledger = CostLedger(
    pricing={
        "amazon.nova-pro-v1:0": Pricing(
            input=0.8, output=3.2, cache_read=0.2, cache_write=1.0
        ),
    },
    strict=True,  # raise PricingNotFoundError on unknown models
)

Default pricing is best-effort and dated; verify against aws.amazon.com/bedrock/pricing before using these numbers for billing.
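As a rough illustration of cache-aware cost arithmetic, the sketch below computes the cost of a single call from four token counts. It rests on two assumptions that the README does not state: prices are in USD per one million tokens, and the input token count excludes cached tokens. `PriceSketch` and `call_cost_usd` are hypothetical names, not part of bedrock-kit.

```python
from dataclasses import dataclass

TOKENS_PER_PRICE_UNIT = 1_000_000  # assumption: prices are quoted per 1M tokens


@dataclass(frozen=True)
class PriceSketch:
    """Stand-in for a pricing entry; all fields in USD per 1M tokens (assumed)."""
    input: float
    output: float
    cache_read: float = 0.0
    cache_write: float = 0.0


def call_cost_usd(
    price: PriceSketch,
    input_tokens: int,
    output_tokens: int,
    cache_read_tokens: int = 0,
    cache_write_tokens: int = 0,
) -> float:
    """Sum each token class at its own rate.

    Assumption: input_tokens does not include cache reads or writes,
    so nothing is billed twice.
    """
    total = (
        input_tokens * price.input
        + output_tokens * price.output
        + cache_read_tokens * price.cache_read
        + cache_write_tokens * price.cache_write
    )
    return total / TOKENS_PER_PRICE_UNIT


# Same numbers as the override example above.
nova = PriceSketch(input=0.8, output=3.2, cache_read=0.2, cache_write=1.0)
cost = call_cost_usd(nova, input_tokens=1000, output_tokens=500, cache_read_tokens=2000)
print(f"{cost:.6f}")  # 0.002800
```

Here the 2,000 cache-read tokens cost a quarter of what they would at the full input rate, which is why a ledger that ignores them misstates spend on cache-heavy workloads.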

Why not LiteLLM?

LiteLLM is great if you need cross-provider routing. bedrock-kit is for the case where you've already decided on Bedrock, you don't want a 46k-LOC multi-provider abstraction, and you want a small surface a security team can audit. We don't proxy, don't include a server, and don't ship a UI. The whole library is under 1,000 lines of Python.

What it explicitly does NOT do

  • No multi-provider routing
  • No proxy server, no UI
  • No prompt management
  • No agent loop
  • No image generation
  • No SageMaker, Bedrock Agents, or Knowledge Bases SDK wrapping
  • No streaming or cancellation yet (planned for v0.2)
  • No OpenTelemetry emission yet (planned for v0.2)

Testing without AWS

By default, BedrockClient creates a real boto3 client. For tests, inject a fake that quacks like one, and pass a no-op sleep_fn to AdaptiveThrottle so retries don't actually wait:

from bedrock_kit import BedrockClient, AdaptiveThrottle

class FakeClient:
    def converse(self, **kwargs):
        return {"output": {"message": {"content": [{"text": "stub"}]}},
                "stopReason": "end_turn",
                "usage": {"inputTokens": 1, "outputTokens": 1}}

client = BedrockClient(client=FakeClient(), retry=AdaptiveThrottle(sleep_fn=lambda _: None))

Status

v0.1 - alpha. Public API may change before v1.0. Issues and PRs welcome.
