
AWS Bedrock Wrapper

A modern, async Python wrapper for AWS Bedrock LLMs with built-in caching, structured outputs, and streaming support.

Features

  • 🚀 Async/await support - Built on aioboto3 for high-performance async operations
  • 💾 Smart caching - Automatic response caching for deterministic requests (temperature=0)
  • 📊 Structured outputs - Type-safe responses using Pydantic models
  • 🌊 Streaming - Real-time token streaming for both text and conversations
  • 🎯 Multi-model support - Works with Claude, Llama, Mistral, and other Bedrock models
  • 🔄 Automatic retries - Exponential backoff for transient failures
  • 🔧 Simple configuration - Environment variables or explicit config
  • 📝 Colored logging - Beautiful, informative console output

Installation

pip install aws-bedrock-wrapper

Quick Start

import asyncio
from aws_bedrock_wrapper import BedrockLLMClient, TextRequest

async def main():
    async with BedrockLLMClient() as client:
        response = await client.generate_text(TextRequest(
            prompt="Explain quantum computing in one sentence",
            model="anthropic.claude-3-sonnet-20240229-v1:0"
        ))
        print(response.text)

asyncio.run(main())

Configuration

Environment Variables

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"
export BEDROCK_MODEL="anthropic.claude-3-sonnet-20240229-v1:0"
export BEDROCK_MAX_RETRIES="3"
export BEDROCK_RETRY_DELAY="1.0"
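The two retry variables control exponential backoff: the delay starts at `BEDROCK_RETRY_DELAY` seconds and doubles on each attempt, capped at `max_retry_delay`. As an illustrative sketch of that schedule (an assumption about the semantics described here, not the wrapper's actual code):

```python
def retry_delay_for_attempt(attempt: int,
                            base_delay: float = 1.0,
                            max_delay: float = 60.0) -> float:
    """Illustrative exponential backoff: base_delay doubles per attempt,
    capped at max_delay. Mirrors the BEDROCK_RETRY_DELAY / max_retry_delay
    settings above (an assumption, not the library's internals)."""
    return min(base_delay * (2 ** attempt), max_delay)

# With the defaults above, the first four attempts would wait:
delays = [retry_delay_for_attempt(n) for n in range(4)]
print(delays)  # → [1.0, 2.0, 4.0, 8.0]
```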

Explicit Configuration

from aws_bedrock_wrapper import BedrockConfig, BedrockLLMClient

config = BedrockConfig(
    aws_access_key_id="your-key",
    aws_secret_access_key="your-secret",
    aws_region="us-east-1",
    default_model="anthropic.claude-3-sonnet-20240229-v1:0",
    temperature=0.7,
    max_tokens=2048,
    max_retries=3,
    retry_delay=1.0,
    max_retry_delay=60.0
)

async with BedrockLLMClient(config=config) as client:
    # Use client...
    pass

Usage Examples

Basic Text Generation

from aws_bedrock_wrapper import BedrockLLMClient, TextRequest

async with BedrockLLMClient() as client:
    response = await client.generate_text(TextRequest(
        prompt="Write a haiku about Python",
        temperature=0.7,
        max_tokens=100
    ))
    print(response.text)
    print(f"Tokens: {response.input_tokens} in / {response.output_tokens} out")

Structured Outputs with Pydantic

from pydantic import BaseModel, Field
from aws_bedrock_wrapper import BedrockLLMClient, TextRequest

class Recipe(BaseModel):
    """A cooking recipe"""
    name: str = Field(description="Recipe name")
    ingredients: list[str] = Field(description="List of ingredients")
    steps: list[str] = Field(description="Cooking steps")
    prep_time_minutes: int = Field(description="Preparation time")

async with BedrockLLMClient() as client:
    response = await client.generate_text(TextRequest(
        prompt="Give me a simple pasta recipe",
        response_format=Recipe
    ))
    
    recipe = response.structured_data
    print(f"Recipe: {recipe.name}")
    print(f"Ingredients: {', '.join(recipe.ingredients)}")
    print(f"Prep time: {recipe.prep_time_minutes} minutes")

Streaming Responses

from aws_bedrock_wrapper import BedrockLLMClient, TextRequest

async with BedrockLLMClient() as client:
    async for chunk in client.generate_text_stream(TextRequest(
        prompt="Write a short story about a robot",
        temperature=0.8
    )):
        print(chunk.text, end="", flush=True)

Multi-turn Conversations

from aws_bedrock_wrapper import BedrockLLMClient, MessageRequest, Message

async with BedrockLLMClient() as client:
    response = await client.send_message(MessageRequest(
        messages=[
            Message(role="user", content="What is Python?"),
            Message(role="assistant", content="Python is a programming language."),
            Message(role="user", content="What are its main features?")
        ],
        system_prompt="You are a helpful programming tutor."
    ))
    print(response.text)

Caching

Responses are automatically cached when temperature=0 (deterministic):

# First call - hits API
response1 = await client.generate_text(TextRequest(
    prompt="What is 2+2?",
    temperature=0  # Enables caching
))

# Second call - instant cache hit!
response2 = await client.generate_text(TextRequest(
    prompt="What is 2+2?",
    temperature=0
))

# Clear cache for specific request
response3 = await client.generate_text(TextRequest(
    prompt="What is 2+2?",
    temperature=0,
    clear_cache=True  # Clears and regenerates
))

# Bypass cache
response4 = await client.generate_text(TextRequest(
    prompt="What is 2+2?",
    temperature=0,
    use_cache=False  # Skip cache lookup
))
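One plausible way such a cache could key its entries (a hypothetical scheme for illustration, not necessarily the wrapper's real implementation) is a hash over the request fields that determine the output, which is only safe to reuse when `temperature=0`:

```python
import hashlib
import json

def cache_key(prompt: str, model: str, temperature: float,
              max_tokens: int) -> str:
    """Hypothetical cache key: SHA-256 over the request fields that
    affect the output. Reusable only when temperature == 0."""
    payload = json.dumps(
        {"prompt": prompt, "model": model,
         "temperature": temperature, "max_tokens": max_tokens},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

k1 = cache_key("What is 2+2?", "anthropic.claude-3-sonnet-20240229-v1:0", 0, 2048)
k2 = cache_key("What is 2+2?", "anthropic.claude-3-sonnet-20240229-v1:0", 0, 2048)
assert k1 == k2  # identical deterministic requests map to one cache entry
```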

List Available Models

from aws_bedrock_wrapper import get_available_model_ids

model_ids = await get_available_model_ids()
print(f"Available models: {model_ids}")

API Reference

BedrockLLMClient

Main client for interacting with AWS Bedrock.

Methods:

  • generate_text(request: TextRequest) -> TextResponse - Generate text from prompt
  • generate_text_stream(request: TextRequest) -> AsyncIterator[StreamChunk] - Stream text generation
  • send_message(request: MessageRequest) -> TextResponse - Multi-turn conversation
  • send_message_stream(request: MessageRequest) -> AsyncIterator[StreamChunk] - Stream conversation
  • list_available_models() -> List[Dict] - List available Bedrock models

TextRequest

Request parameters for text generation.

Fields:

  • prompt: str - Input prompt
  • model: Optional[str] - Model ID (uses default if not specified)
  • temperature: Optional[float] - Sampling temperature (0.0-1.0)
  • max_tokens: Optional[int] - Maximum tokens to generate
  • top_p: Optional[float] - Nucleus sampling parameter
  • top_k: Optional[int] - Top-k sampling parameter
  • system_prompt: Optional[str] - System prompt for Claude models
  • stream: bool - Enable streaming (default: False)
  • response_format: Optional[Type[BaseModel]] - Pydantic model for structured output
  • use_cache: bool - Use cache if available (default: True)
  • clear_cache: bool - Clear cache before request (default: False)

TextResponse

Response returned from the model.

Fields:

  • text: str - Generated text
  • model: str - Model used
  • stop_reason: str - Why generation stopped
  • input_tokens: int - Input token count
  • output_tokens: int - Output token count
  • metadata: Dict[str, Any] - Additional metadata
  • structured_data: Optional[BaseModel] - Parsed structured output

BedrockConfig

Configuration class for AWS Bedrock client.

Parameters:

  • aws_access_key_id: Optional[str] - AWS access key
  • aws_secret_access_key: Optional[str] - AWS secret key
  • aws_session_token: Optional[str] - AWS session token
  • aws_region: Optional[str] - AWS region (default: us-east-1)
  • default_model: Optional[str] - Default model ID
  • temperature: Optional[float] - Default temperature (default: 0)
  • max_tokens: Optional[int] - Default max tokens (default: 2048)
  • top_p: Optional[float] - Default top_p (default: 0.9)
  • top_k: Optional[int] - Default top_k (default: 250)
  • max_retries: Optional[int] - Max retry attempts (default: 3)
  • retry_delay: Optional[float] - Initial retry delay in seconds (default: 1.0)
  • max_retry_delay: Optional[float] - Max retry delay in seconds (default: 60.0)

Supported Models

  • Anthropic Claude - Claude 3 (Opus, Sonnet, Haiku), Claude 2
  • Meta Llama - Llama 2, Llama 3
  • Mistral AI - Mistral 7B, Mixtral
  • Amazon Titan - Titan Text models

Development

Setup

git clone https://github.com/yourusername/aws-bedrock-wrapper.git
cd aws-bedrock-wrapper
pip install -e ".[dev]"

Run Tests

python examples/test_wrapper.py
python examples/test_structured.py
python examples/test_cache.py

License

MIT License - see LICENSE file for details.

Contributing

Contributions welcome! Please open an issue or PR.
