# AWS Bedrock Wrapper
A modern, async Python wrapper for AWS Bedrock LLMs with built-in caching, structured outputs, and streaming support.
## Features

- 🚀 **Async/await support** - Built on aioboto3 for high-performance async operations
- 💾 **Smart caching** - Automatic response caching for deterministic requests (temperature=0)
- 📊 **Structured outputs** - Type-safe responses using Pydantic models
- 🌊 **Streaming** - Real-time token streaming for both text and conversations
- 🎯 **Multi-model support** - Works with Claude, Llama, Mistral, and other Bedrock models
- 🔄 **Automatic retries** - Exponential backoff for transient failures
- 🔧 **Simple configuration** - Environment variables or explicit config
- 📝 **Colored logging** - Beautiful, informative console output
## Installation

```shell
pip install aws-bedrock-wrapper
```
## Quick Start

```python
import asyncio

from aws_bedrock_wrapper import BedrockLLMClient, TextRequest

async def main():
    async with BedrockLLMClient() as client:
        response = await client.generate_text(TextRequest(
            prompt="Explain quantum computing in one sentence",
            model="anthropic.claude-3-sonnet-20240229-v1:0"
        ))
        print(response.text)

asyncio.run(main())
```
## Configuration

### Environment Variables

```shell
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"
export BEDROCK_MODEL="anthropic.claude-3-sonnet-20240229-v1:0"
export BEDROCK_MAX_RETRIES="3"
export BEDROCK_RETRY_DELAY="1.0"
```
### Explicit Configuration

```python
from aws_bedrock_wrapper import BedrockConfig, BedrockLLMClient

config = BedrockConfig(
    aws_access_key_id="your-key",
    aws_secret_access_key="your-secret",
    aws_region="us-east-1",
    default_model="anthropic.claude-3-sonnet-20240229-v1:0",
    temperature=0.7,
    max_tokens=2048,
    max_retries=3,
    retry_delay=1.0,
    max_retry_delay=60.0
)

async with BedrockLLMClient(config=config) as client:
    # Use client...
    pass
```
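The retry parameters map onto a standard exponential-backoff schedule: the wait doubles after each failed attempt, starting at `retry_delay` and capped at `max_retry_delay`. A minimal sketch of that schedule (the exact formula is an assumption about the library's internals, shown here only to explain what the three knobs control):

```python
def backoff_delay(attempt: int, retry_delay: float = 1.0, max_retry_delay: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): doubles each time, capped at the max."""
    return min(retry_delay * (2 ** attempt), max_retry_delay)

# With the defaults above, successive retries wait 1s, 2s, 4s, ... up to 60s.
delays = [backoff_delay(a) for a in range(8)]
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```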
## Usage Examples

### Basic Text Generation

```python
from aws_bedrock_wrapper import BedrockLLMClient, TextRequest

async with BedrockLLMClient() as client:
    response = await client.generate_text(TextRequest(
        prompt="Write a haiku about Python",
        temperature=0.7,
        max_tokens=100
    ))
    print(response.text)
    print(f"Tokens: {response.input_tokens} in / {response.output_tokens} out")
```
### Structured Outputs with Pydantic

```python
from pydantic import BaseModel, Field

from aws_bedrock_wrapper import BedrockLLMClient, TextRequest

class Recipe(BaseModel):
    """A cooking recipe"""
    name: str = Field(description="Recipe name")
    ingredients: list[str] = Field(description="List of ingredients")
    steps: list[str] = Field(description="Cooking steps")
    prep_time_minutes: int = Field(description="Preparation time")

async with BedrockLLMClient() as client:
    response = await client.generate_text(TextRequest(
        prompt="Give me a simple pasta recipe",
        response_format=Recipe
    ))
    recipe = response.structured_data
    print(f"Recipe: {recipe.name}")
    print(f"Ingredients: {', '.join(recipe.ingredients)}")
    print(f"Prep time: {recipe.prep_time_minutes} minutes")
```
### Streaming Responses

```python
from aws_bedrock_wrapper import BedrockLLMClient, TextRequest

async with BedrockLLMClient() as client:
    async for chunk in client.generate_text_stream(TextRequest(
        prompt="Write a short story about a robot",
        temperature=0.8
    )):
        print(chunk.text, end="", flush=True)
```
### Multi-turn Conversations

```python
from aws_bedrock_wrapper import BedrockLLMClient, MessageRequest, Message

async with BedrockLLMClient() as client:
    response = await client.send_message(MessageRequest(
        messages=[
            Message(role="user", content="What is Python?"),
            Message(role="assistant", content="Python is a programming language."),
            Message(role="user", content="What are its main features?")
        ],
        system_prompt="You are a helpful programming tutor."
    ))
    print(response.text)
```
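Conversation state is just the message list you pass in: to continue a dialogue, append the assistant's reply and the next user turn before the following call. A sketch of that bookkeeping with plain dicts (the `Message` objects above carry the same `role`/`content` fields):

```python
# Conversation history as a list of role/content turns.
history = [{"role": "user", "content": "What is Python?"}]

# After each call, record the assistant's reply so the model sees it next turn.
assistant_reply = "Python is a programming language."  # in practice: response.text
history.append({"role": "assistant", "content": assistant_reply})

# Add the next user turn, then send the whole history again.
history.append({"role": "user", "content": "What are its main features?"})
print(len(history))  # 3 turns so far
```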
## Caching

Responses are automatically cached when `temperature=0` (deterministic):

```python
# First call - hits the API
response1 = await client.generate_text(TextRequest(
    prompt="What is 2+2?",
    temperature=0  # Enables caching
))

# Second call - instant cache hit!
response2 = await client.generate_text(TextRequest(
    prompt="What is 2+2?",
    temperature=0
))

# Clear the cache for this specific request
response3 = await client.generate_text(TextRequest(
    prompt="What is 2+2?",
    temperature=0,
    clear_cache=True  # Clears and regenerates
))

# Bypass the cache
response4 = await client.generate_text(TextRequest(
    prompt="What is 2+2?",
    temperature=0,
    use_cache=False  # Skip cache lookup
))
```
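Cache hits require the request to match exactly, so it helps to know what would go into a cache key. A plausible sketch of deriving a deterministic key from the request fields (the exact key scheme is an assumption, not the library's documented internals):

```python
import hashlib
import json

def cache_key(prompt: str, model: str, temperature: float, max_tokens: int) -> str:
    """Hash the request fields that affect the output into a stable key."""
    payload = json.dumps(
        {"prompt": prompt, "model": model, "temperature": temperature, "max_tokens": max_tokens},
        sort_keys=True,  # canonical ordering so equivalent requests hash identically
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

k1 = cache_key("What is 2+2?", "anthropic.claude-3-sonnet-20240229-v1:0", 0, 2048)
k2 = cache_key("What is 2+2?", "anthropic.claude-3-sonnet-20240229-v1:0", 0, 2048)
assert k1 == k2  # identical requests share a cache entry
```

Under a scheme like this, changing any field (even `max_tokens`) produces a different key and therefore a cache miss.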
## List Available Models

```python
from aws_bedrock_wrapper import get_available_model_ids

model_ids = await get_available_model_ids()
print(f"Available models: {model_ids}")
```
## API Reference

### BedrockLLMClient

Main client for interacting with AWS Bedrock.

Methods:

- `generate_text(request: TextRequest) -> TextResponse` - Generate text from a prompt
- `generate_text_stream(request: TextRequest) -> AsyncIterator[StreamChunk]` - Stream text generation
- `send_message(request: MessageRequest) -> TextResponse` - Multi-turn conversation
- `send_message_stream(request: MessageRequest) -> AsyncIterator[StreamChunk]` - Stream a conversation
- `list_available_models() -> List[Dict]` - List available Bedrock models
### TextRequest

Request parameters for text generation.

Fields:

- `prompt: str` - Input prompt
- `model: Optional[str]` - Model ID (uses the default if not specified)
- `temperature: Optional[float]` - Sampling temperature (0.0-1.0)
- `max_tokens: Optional[int]` - Maximum tokens to generate
- `top_p: Optional[float]` - Nucleus sampling parameter
- `top_k: Optional[int]` - Top-k sampling parameter
- `system_prompt: Optional[str]` - System prompt for Claude models
- `stream: bool` - Enable streaming (default: False)
- `response_format: Optional[Type[BaseModel]]` - Pydantic model for structured output
- `use_cache: bool` - Use cache if available (default: True)
- `clear_cache: bool` - Clear cache before request (default: False)
### TextResponse

Response from the LLM.

Fields:

- `text: str` - Generated text
- `model: str` - Model used
- `stop_reason: str` - Why generation stopped
- `input_tokens: int` - Input token count
- `output_tokens: int` - Output token count
- `metadata: Dict[str, Any]` - Additional metadata
- `structured_data: Optional[BaseModel]` - Parsed structured output
### BedrockConfig

Configuration class for the AWS Bedrock client.

Parameters:

- `aws_access_key_id: Optional[str]` - AWS access key
- `aws_secret_access_key: Optional[str]` - AWS secret key
- `aws_session_token: Optional[str]` - AWS session token
- `aws_region: Optional[str]` - AWS region (default: us-east-1)
- `default_model: Optional[str]` - Default model ID
- `temperature: Optional[float]` - Default temperature (default: 0)
- `max_tokens: Optional[int]` - Default max tokens (default: 2048)
- `top_p: Optional[float]` - Default top_p (default: 0.9)
- `top_k: Optional[int]` - Default top_k (default: 250)
- `max_retries: Optional[int]` - Max retry attempts (default: 3)
- `retry_delay: Optional[float]` - Initial retry delay in seconds (default: 1.0)
- `max_retry_delay: Optional[float]` - Max retry delay in seconds (default: 60.0)
## Supported Models

- **Anthropic Claude** - Claude 3 (Opus, Sonnet, Haiku), Claude 2
- **Meta Llama** - Llama 2, Llama 3
- **Mistral AI** - Mistral 7B, Mixtral
- **Amazon Titan** - Titan Text models
## Development

### Setup

```shell
git clone https://github.com/yourusername/aws-bedrock-wrapper.git
cd aws-bedrock-wrapper
pip install -e ".[dev]"
```

### Run Tests

```shell
python examples/test_wrapper.py
python examples/test_structured.py
python examples/test_cache.py
```
## License

MIT License - see the LICENSE file for details.

## Contributing

Contributions welcome! Please open an issue or PR.