aws-simple

A clean, simple Python wrapper around AWS services (S3, Textract, Bedrock).

Features

Simple API: Clean, intuitive interface without exposing Boto3 complexity
Environment-based configuration: No credentials or config in code
Structured Textract output: Transforms AWS Blocks into clean, serializable JSON
Typed: Full type hints throughout; requires Python 3.10 or later
Production-ready: Works with IAM roles, Docker, CI/CD pipelines

Installation

pip install aws-simple

Or install from source, from the root of a local checkout:

pip install -e .

Configuration

All configuration is done via environment variables:

# Required
export AWS_REGION=us-east-1
export AWS_S3_BUCKET=my-bucket-name

# Optional
export AWS_PROFILE=my-profile  # For local development
export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
export AWS_TEXTRACT_REGION=us-east-1
export AWS_BEDROCK_REGION=us-east-1

Or use a .env file (see .env.example).
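
Because a missing variable would otherwise only surface on the first AWS call, it can help to check the environment at startup. The following is a minimal sketch, not the library's own config.py: the helper name check_environment and the use of RuntimeError are assumptions made for illustration, while the variable names are the ones listed above.

```python
import os
from typing import Mapping

# Variable names taken from the Configuration section above.
REQUIRED_VARS = ("AWS_REGION", "AWS_S3_BUCKET")
OPTIONAL_VARS = (
    "AWS_PROFILE",
    "AWS_BEDROCK_MODEL_ID",
    "AWS_TEXTRACT_REGION",
    "AWS_BEDROCK_REGION",
)


def check_environment(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the settings found in env; raise if a required one is missing or empty.

    Hypothetical helper: aws-simple may validate its configuration differently.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Optional variables are included only when they are set and non-empty.
    settings.update({name: env[name] for name in OPTIONAL_VARS if env.get(name)})
    return settings
```

Calling check_environment() with no arguments reads the real process environment; passing a plain dict makes the check easy to unit test.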

AWS Credentials

AWS credentials should be configured separately via:

IAM Role (recommended for production/EC2/ECS/Lambda)
~/.aws/credentials file (for local development)
Environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (not recommended)

Usage

S3 Operations

from aws_simple import s3

# Upload file
s3.upload_file("document.pdf", "docs/document.pdf")

# Download file
s3.download_file("docs/document.pdf", "/tmp/document.pdf")

# Read object as bytes
content = s3.read_object("docs/document.pdf")

# List objects
files = s3.list_objects(prefix="docs/")

# Check if object exists
exists = s3.object_exists("docs/document.pdf")
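
The calls above compose naturally. As a sketch, the helper below skips an upload when the key is already present. It takes the s3 module as a parameter so it can be exercised without AWS access. The helper name upload_if_missing is hypothetical, and it assumes object_exists returns a bool, which the README implies but does not state.

```python
from typing import Protocol


class S3Like(Protocol):
    """The two aws_simple.s3 functions this sketch relies on (signatures assumed)."""

    def object_exists(self, key: str) -> bool: ...
    def upload_file(self, local_path: str, key: str) -> object: ...


def upload_if_missing(s3: S3Like, local_path: str, key: str) -> bool:
    """Upload local_path to key unless the key already exists.

    Returns True if an upload happened, False if it was skipped.
    """
    if s3.object_exists(key):
        return False
    s3.upload_file(local_path, key)
    return True
```

With the real module this would be called as upload_if_missing(s3, "document.pdf", "docs/document.pdf") after from aws_simple import s3. The check and the upload are two separate requests, so this is not safe against concurrent writers to the same key.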

Textract - Document Extraction

from aws_simple import textract
import json

# Extract from local file (with tables)
doc = textract.extract_text_from_file("invoice.pdf")

# Extract from S3 (with tables)
doc = textract.extract_text_from_s3("docs/invoice.pdf")

# Access structured data
print(doc.full_text)  # All text concatenated
print(f"Pages: {len(doc.pages)}")

# Access page details
page = doc.pages[0]
print(f"Lines: {len(page.lines)}")
print(f"Tables: {len(page.tables)}")

# Access lines
for line in page.lines:
    print(f"{line.text} (confidence: {line.confidence})")

# Access tables
for table in page.tables:
    print(f"Table: {table.rows}x{table.columns}")
    print(table.cells)  # 2D matrix of cell values

# Serialize to JSON
doc_json = doc.to_dict()
with open("result.json", "w") as f:
    json.dump(doc_json, f, indent=2)

# Simple text extraction (faster, no tables)
text = textract.extract_text_simple_from_file("document.pdf")
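
Per-line confidence scores make it possible to flag text that may need manual review. The sketch below relies only on the attributes used above (doc.pages, page.lines, line.text, line.confidence). The function name and the default threshold of 90.0 are illustrative assumptions, and the demo document is a stand-in built with SimpleNamespace rather than a real TextractDocument.

```python
from types import SimpleNamespace


def low_confidence_lines(doc, threshold: float = 90.0) -> list[tuple[int, str, float]]:
    """Return (page index, text, confidence) for every line scoring below threshold."""
    flagged = []
    for index, page in enumerate(doc.pages):
        for line in page.lines:
            if line.confidence < threshold:
                flagged.append((index, line.text, line.confidence))
    return flagged


# Stand-in document using the same attribute names as the example above.
demo = SimpleNamespace(pages=[SimpleNamespace(lines=[
    SimpleNamespace(text="Invoice #12345", confidence=99.5),
    SimpleNamespace(text="T0tal: $3O", confidence=71.2),
])])
print(low_confidence_lines(demo))  # [(0, 'T0tal: $3O', 71.2)]
```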

Textract Output Format

The library transforms AWS Textract Blocks into a clean JSON structure:

{
  "pages": [
    {
      "page_number": 1,
      "width": 1.0,
      "height": 1.0,
      "lines": [
        {
          "text": "Invoice #12345",
          "confidence": 99.5,
          "bounding_box": {"top": 0.1, "left": 0.1, "width": 0.2, "height": 0.05}
        }
      ],
      "tables": [
        {
          "rows": 3,
          "columns": 2,
          "cells": [
            ["Item", "Price"],
            ["Product A", "$10"],
            ["Product B", "$20"]
          ],
          "confidence": 98.7
        }
      ],
      "raw_text": "Invoice #12345\n..."
    }
  ],
  "full_text": "All text from all pages concatenated...",
  "metadata": {
    "document_metadata": {...},
    "total_pages": 1
  }
}
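
Since each table's cells value is a plain 2D list, it can be reshaped with the standard library alone. The sketch below assumes the first row holds column headers, which is true of the sample above but not guaranteed for every document. The function name table_to_records is hypothetical.

```python
import json


def table_to_records(table: dict) -> list[dict[str, str]]:
    """Treat the first row of table["cells"] as headers; map each later row to a dict."""
    cells = table["cells"]
    if not cells:
        return []
    header, *rows = cells
    return [dict(zip(header, row)) for row in rows]


# The table from the sample output above.
sample = {
    "rows": 3,
    "columns": 2,
    "cells": [["Item", "Price"], ["Product A", "$10"], ["Product B", "$20"]],
    "confidence": 98.7,
}
print(json.dumps(table_to_records(sample), indent=2))
```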

Bedrock - LLM Operations

from aws_simple import bedrock

# Simple text generation
response = bedrock.invoke("Explain AWS Lambda in one sentence")
print(response)

# With system prompt and parameters
response = bedrock.invoke(
    prompt="What are the benefits of serverless?",
    system_prompt="You are an AWS solutions architect.",
    temperature=0.7,
    max_tokens=500
)

# Request JSON output
prompt = """
List 3 AWS services with their use cases.
Format: {"services": [{"name": "...", "use_case": "..."}]}
"""
data = bedrock.invoke_json(prompt)
print(data["services"])

# Use different model
response = bedrock.invoke(
    "Summarize this text...",
    model_id="anthropic.claude-3-5-sonnet-20241022-v2:0"
)
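
When several prompts need the same JSON shape, the format line can be generated from a Python dict instead of being typed by hand. This is only a sketch: build_json_prompt is a hypothetical helper, not part of aws-simple, and the exact wording of the instruction is an assumption.

```python
import json


def build_json_prompt(task: str, example: dict) -> str:
    """Combine a task description with a JSON example of the desired output shape."""
    return (
        f"{task.strip()}\n"
        "Respond with JSON only, in this format:\n"
        f"{json.dumps(example)}"
    )


prompt = build_json_prompt(
    "List 3 AWS services with their use cases.",
    {"services": [{"name": "...", "use_case": "..."}]},
)
print(prompt)
```

The resulting string could then be passed to bedrock.invoke_json(prompt) as in the example above.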

Combined Workflow

from aws_simple import s3, textract, bedrock
import json

# 1. Upload document
s3.upload_file("invoice.pdf", "invoices/2024/inv_001.pdf")

# 2. Extract content
doc = textract.extract_text_from_s3("invoices/2024/inv_001.pdf")

# 3. Analyze with LLM
prompt = f"""
Extract key information from this invoice:

{doc.full_text}

Return JSON with: invoice_number, date, total, vendor
"""

invoice_data = bedrock.invoke_json(prompt)
print(json.dumps(invoice_data, indent=2))
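
Model output is not guaranteed to contain every requested field, so a check before downstream use may be worthwhile. The sketch below tests for the four fields named in the prompt above. The helper name and the rule that None and empty strings count as missing are assumptions.

```python
# Field names taken from the prompt in the workflow above.
REQUIRED_FIELDS = ("invoice_number", "date", "total", "vendor")


def missing_invoice_fields(data: object) -> list[str]:
    """Return the required fields that are absent, None, or empty in data."""
    if not isinstance(data, dict):
        # Anything other than a JSON object is treated as missing every field.
        return list(REQUIRED_FIELDS)
    return [name for name in REQUIRED_FIELDS if data.get(name) in (None, "")]


print(missing_invoice_fields({"invoice_number": "12345", "total": ""}))
# ['date', 'total', 'vendor']
```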

Architecture

aws-simple/
├── config.py           # Environment variable configuration
├── exceptions.py       # Custom exceptions
├── _clients.py         # AWS client factory (internal)
├── s3.py               # S3 operations
├── textract.py         # Textract operations
├── bedrock.py          # Bedrock operations
├── models/             # Data models
│   └── textract.py     # TextractDocument, TextractPage, etc.
└── _parsers/           # Internal parsers
    └── textract_parser.py  # Transforms Blocks → JSON

Design Principles

No Boto3 in public API: AWS implementation details are hidden
Environment-based config: All configuration via env vars
Clean output formats: No raw AWS responses exposed
Type safety: Full type hints for better IDE support
Simple error handling: Custom exceptions for each service
Production-ready: Compatible with Docker, IAM roles, CI/CD

Exceptions

from aws_simple import (
    AWSSimpleError,             # Base exception
    ConfigurationError,         # Missing/invalid configuration
    S3Error,                    # S3 operation failures
    TextractError,              # Textract operation failures
    BedrockError,               # Bedrock operation failures
    ClientInitializationError,  # AWS client init failures
    textract,                   # Module used in the example below
)

try:
    doc = textract.extract_text_from_s3("missing.pdf")
except TextractError as e:
    print(f"Extraction failed: {e}")
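
A shared base exception allows handlers to be ordered from specific to general. The sketch below uses local stand-in classes so it runs without the library installed. It assumes the service exceptions subclass AWSSimpleError, which the "Base exception" comment suggests but the README does not show. The function describe_failure is hypothetical.

```python
class AWSSimpleError(Exception):
    """Stand-in for the library's base exception."""


class S3Error(AWSSimpleError):
    """Stand-in for S3 operation failures."""


class TextractError(AWSSimpleError):
    """Stand-in for Textract operation failures."""


def describe_failure(exc: Exception) -> str:
    """Map an exception to a short message, most specific handler first."""
    try:
        raise exc
    except TextractError as e:
        return f"textract: {e}"
    except AWSSimpleError as e:
        # Catches S3Error and any other subclass of the base exception.
        return f"aws-simple: {e}"
    except Exception as e:
        return f"unexpected: {e}"


print(describe_failure(S3Error("access denied")))  # aws-simple: access denied
```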

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy src/

# Linting
ruff check src/

Requirements

Python ≥ 3.10
boto3 ≥ 1.34.0
python-dotenv ≥ 1.0.0

License

MIT

Support

For issues and feature requests, please visit the GitHub repository.

Release history

0.1.1b2 (pre-release): Dec 18, 2025
0.1.1b1 (pre-release): Dec 17, 2025
0.1.1b0 (pre-release, this version): Dec 17, 2025
0.1.0b0 (pre-release): Dec 17, 2025
