
A simple boto3/Pydantic implementation of DynamoDB Single Table Design and related utilities.

Simple Single Table

Latest Version: 16.5.0

Project Overview

simplesingletable is a Python library that provides an abstraction layer over AWS DynamoDB operations, designed specifically for single-table design patterns. The library uses Pydantic for model definitions and includes "batteries-included" functionality for common DynamoDB use cases.

Target Use Cases

  • Small to medium scale applications
  • Single-table DynamoDB design patterns
  • Applications requiring versioned resources with automatic history tracking
  • Fast, consistent, and cost-effective storage solutions

Key Features

  1. Single Table Design: Store different object types in a single DynamoDB table
  2. Automatic ID Generation: Uses lexicographically sortable IDs via ulid-py for chronological ordering
  3. Resource Versioning: Automatic versioning with complete history and optimistic concurrency control
  4. Secondary Access Patterns: Support for GSI-based queries and filtering
  5. Pydantic Integration: Type-safe models with validation
  6. Compression Support: Optional gzip compression for large data

Architecture

Core Components

Models (src/simplesingletable/models.py)

  • BaseDynamoDbResource: Abstract base class for all resources
  • DynamoDbResource: Non-versioned resources (simpler, lighter)
  • DynamoDbVersionedResource: Versioned resources with automatic history tracking
  • PaginatedList: Enhanced list for paginated query results

Memory Layer (src/simplesingletable/dynamodb_memory.py)

  • DynamoDbMemory: Main interface for DynamoDB operations (CRUD, queries, filtering)

Utilities (src/simplesingletable/utils.py)

  • ID generation, pagination helpers, DynamoDB type marshalling

Resource Types

Non-Versioned Resources (DynamoDbResource):

  • Lighter weight, direct updates
  • No version history
  • Fields: resource_id, created_at, updated_at

Versioned Resources (DynamoDbVersionedResource):

  • Complete version history
  • Optimistic concurrency control
  • Version limit enforcement (configurable via max_versions)
  • Compressed data storage by default
  • Fields: resource_id, version, created_at, updated_at
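
Optimistic concurrency control on versioned resources can be pictured with a small in-memory model: an update is accepted only when it is based on the latest stored version. This is a sketch of the idea under that assumption, not the library's implementation; `InMemoryVersionedStore` and `StaleVersionError` are hypothetical names. In DynamoDB itself this kind of check is typically expressed as a condition on the write.

```python
from dataclasses import dataclass, field


class StaleVersionError(Exception):
    """Raised when an update is based on a version that is no longer the latest."""


@dataclass
class InMemoryVersionedStore:
    """Hypothetical stand-in for a table: resource_id -> list of stored versions."""

    history: dict = field(default_factory=dict)

    def create(self, resource_id: str, data: dict) -> dict:
        item = {**data, "resource_id": resource_id, "version": 1}
        self.history[resource_id] = [item]
        return item

    def update(self, based_on: dict, changes: dict) -> dict:
        versions = self.history[based_on["resource_id"]]
        latest = versions[-1]
        # Accept the write only if the caller started from the latest version.
        if based_on["version"] != latest["version"]:
            raise StaleVersionError(
                f"based on v{based_on['version']}, latest is v{latest['version']}"
            )
        new_item = {**latest, **changes, "version": latest["version"] + 1}
        versions.append(new_item)  # earlier versions stay available as history
        return new_item


store = InMemoryVersionedStore()
v1 = store.create("abc", {"status": "active"})
v2 = store.update(v1, {"status": "inactive"})

try:
    store.update(v1, {"status": "archived"})  # v1 is stale after the first update
    conflict = False
except StaleVersionError:
    conflict = True
```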

GSI (Global Secondary Index) Configuration

The library supports both static and dynamic GSI configuration:

from typing import ClassVar, Dict

# DynamoDbVersionedResource and IndexFieldConfig are provided by simplesingletable.


class MyResource(DynamoDbVersionedResource):
    status: str
    type: str

    # Static configuration
    gsi_config: ClassVar[Dict[str, IndexFieldConfig]] = {
        "status": {
            "pk": lambda self: f"status#{self.status}",
            "sk": lambda self: self.resource_id,  # ULID-based ID, so this sorts by creation time
        }
    }

    # Or dynamic configuration via get_gsi_config()
    @classmethod
    def get_gsi_config(cls) -> Dict[str, IndexFieldConfig]:
        return {"dynamic_index": {"pk": lambda self: f"type#{self.type}"}}
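
To make the callable-based configuration more concrete, here is a rough sketch of how a mapping of index name to `pk`/`sk` callables could be evaluated into key attributes for one item. The helper `build_index_attributes`, the `Task` class, and the attribute naming (index name plus `pk`/`sk`) are assumptions made for illustration; the library's internal handling may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Task:
    resource_id: str
    status: str


KeyFn = Callable[[Task], Optional[str]]

# Same shape as the gsi_config example: index name -> {"pk": callable, "sk": callable}
GSI_CONFIG: Dict[str, Dict[str, KeyFn]] = {
    "gsi1": {
        "pk": lambda r: f"status#{r.status}",
        "sk": lambda r: r.resource_id,
    }
}


def build_index_attributes(resource: Task, config: Dict[str, Dict[str, KeyFn]]) -> Dict[str, str]:
    """Hypothetical helper: call each configured function and collect the results."""
    attrs: Dict[str, str] = {}
    for index_name, fields in config.items():
        for key_part, fn in fields.items():
            value = fn(resource)
            if value is not None:  # a None result leaves the attribute unset
                attrs[f"{index_name}{key_part}"] = value
    return attrs


attrs = build_index_attributes(Task("01HXYZ", "active"), GSI_CONFIG)
```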

Extras Module (src/simplesingletable/extras/)

Additional patterns and utilities:

  • Repository Pattern: ResourceRepository for higher-level operations
  • Versioned Repository: VersionedResourceRepository with version management
  • Singleton Pattern: For configuration resources
  • Form Data: Streamlit integration helpers
  • Habit Tracker: Example application

Development Setup

Dependencies

Core Dependencies:

  • boto3 - AWS SDK
  • pydantic>2 - Data validation and serialization
  • ulid-py - Unique ID generation
  • humanize - Human-readable formatting

Development Dependencies:

  • pytest + pytest-cov + pytest-docker - Testing framework
  • black + isort + ruff - Code formatting and linting
  • invoke - Task automation
  • bumpver - Version management

Build and Test Commands

The project uses invoke for task automation (tasks.py):

# Dependency management
inv compile-requirements          # Compile requirements.txt from pyproject.toml
inv compile-requirements --upgrade # Update dependencies

# Development
inv lint                         # Format and lint code (black, isort, ruff)
inv launch-dynamodb-local       # Start local DynamoDB for testing
inv halt-dynamodb-local         # Stop local DynamoDB

# Testing
pytest                          # Run test suite with coverage
pytest tests/test_specific.py   # Run specific tests

# Release management
inv bumpver --patch             # Bump version (or --minor / --major)
inv build                       # Build distribution packages
inv fullrelease --patch         # Complete release cycle (lint, test, bump, build, publish)

Testing Architecture

  • Docker-based: Uses DynamoDB Local via Docker Compose
  • Pytest fixtures: conftest.py provides database setup/teardown
  • Comprehensive coverage: Tests for CRUD, versioning, GSI, filtering
  • Test files:
    • test_simplesingletable.py - Core functionality
    • test_versioned_repository.py - Version management
    • test_repository.py - Repository pattern
    • test_filter_expressions.py - Query filtering

Important Patterns and Conventions

Resource Definition Pattern

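from typing import ClassVar, Optional

from simplesingletable import DynamoDbVersionedResource
from simplesingletable.models import ResourceConfig  # assumed import path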


class MyResource(DynamoDbVersionedResource):
    name: str
    status: str
    metadata: Optional[dict] = None
    
    # the following are defined on the base class
    # resource_id: str
    # version: int
    # created_at: datetime
    # updated_at: datetime

    # Optional: Configure compression and version limits
    resource_config: ClassVar[ResourceConfig] = ResourceConfig(
        compress_data=True,  # Default for versioned
        max_versions=10  # Keep only 10 versions
    )

    # Optional: GSI configuration for secondary access patterns
    def db_get_gsi1pk(self) -> str | None:
        return f"status#{self.status}"

CRUD Operations Pattern

import logging

from simplesingletable import DynamoDbMemory

logger = logging.getLogger(__name__)
memory = DynamoDbMemory(logger=logger, table_name="my-table")

# Create
resource = memory.create_new(MyResource, {"name": "test", "status": "active"})

# Read
retrieved = memory.read_existing(resource.resource_id, MyResource)

# Update (versioned resources automatically increment version)
updated = memory.update_existing(retrieved, {"status": "inactive"})

# List with filtering
resources = memory.list_resources(
    MyResource,
    filter_fn=lambda r: r.status == "active",
    limit=50
)

Paginated Queries

The library provides powerful paginated query capabilities for efficient data retrieval from DynamoDB, supporting both primary key and GSI-based queries with filtering.

Basic Paginated Query

from boto3.dynamodb.conditions import Key

from simplesingletable import exhaust_pagination

# Direct paginated query with GSI
results = memory.paginated_dynamodb_query(
    resource_class=MyResource,
    index_name="gsi1",
    key_condition=Key("gsi1pk").eq("status#active"),
    results_limit=100,  # Items per page
    pagination_key=None  # For first page
)

# Get all results (automatically handles pagination)
all_results = []
for page in exhaust_pagination(
    lambda pk=None: memory.paginated_dynamodb_query(
        resource_class=MyResource,
        index_name="gsi1", 
        key_condition=Key("gsi1pk").eq("status#active"),
        results_limit=100,
        pagination_key=pk
    )
):
    all_results.extend(page)
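
The loop above relies on a simple contract: call a page function with the previous pagination key until no further key is returned. The following sketch illustrates that contract with a fake in-memory data source. It is not the library's `exhaust_pagination`; `fetch_page` and `iterate_pages` are hypothetical names.

```python
from typing import Callable, Iterator, List, Optional, Tuple

DATA = list(range(1, 8))  # seven fake items standing in for query results

Page = Tuple[List[int], Optional[int]]


def fetch_page(pagination_key: Optional[int] = None, limit: int = 3) -> Page:
    """Hypothetical page fetcher: returns (items, next_key); next_key is None on the last page."""
    start = pagination_key or 0
    items = DATA[start:start + limit]
    next_key = start + limit if start + limit < len(DATA) else None
    return items, next_key


def iterate_pages(fetch: Callable[[Optional[int]], Page]) -> Iterator[List[int]]:
    """Yield pages until the fetcher stops returning a pagination key."""
    key: Optional[int] = None
    while True:
        items, key = fetch(key)
        yield items
        if key is None:
            break


pages = list(iterate_pages(fetch_page))
all_items = [item for page in pages for item in page]
```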

Query Patterns with GSI

Define query methods on your resource classes for reusable access patterns:

class MyResource(DynamoDbVersionedResource):
    status: str
    category: str
    
    def db_get_gsi1pk(self) -> str | None:
        # Sparse GSI - only index active items
        if self.status == "active":
            return f"{self.get_unique_key_prefix()}#{self.category}"
        return None
    
    @classmethod
    def query_by_category_kwargs(cls, category: str):
        """Build query kwargs for category-based queries."""
        return {
            "index_name": "gsi1",
            "key_condition": Key("gsi1pk").eq(f"{cls.get_unique_key_prefix()}#{category}")
        }

# Use the query pattern
active_in_category = memory.paginated_dynamodb_query(
    resource_class=MyResource,
    **MyResource.query_by_category_kwargs("important"),
    results_limit=50
)
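
The sparse GSI in this pattern works because DynamoDB only writes an item to an index when the item has that index's key attribute. A small in-memory sketch of the effect, with hypothetical names (`Item`, `build_sparse_index`) and a hard-coded `Item#` prefix standing in for `get_unique_key_prefix()`:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Item:
    resource_id: str
    status: str
    category: str

    def gsi1pk(self) -> Optional[str]:
        # Only active items get an index key; everything else stays out of the index.
        if self.status == "active":
            return f"Item#{self.category}"
        return None


def build_sparse_index(items: List[Item]) -> Dict[str, List[str]]:
    """Hypothetical model of a GSI: partition key -> resource IDs."""
    index: Dict[str, List[str]] = defaultdict(list)
    for item in items:
        key = item.gsi1pk()
        if key is not None:
            index[key].append(item.resource_id)
    return dict(index)


items = [
    Item("a1", "active", "important"),
    Item("a2", "inactive", "important"),
    Item("a3", "active", "routine"),
]
index = build_sparse_index(items)
```

Querying `Item#important` in this model returns only `a1`, so the "active" condition is enforced by the index itself rather than by a filter.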

Advanced Filtering

Important Note on Versioned Resources: Versioned resources use compression by default, so the model's fields are not stored as individual DynamoDB attributes. DynamoDB filter expressions can therefore only see the key attributes (pk, sk, gsi1pk, gsi1sk, etc.). For compressed resources, attribute-based filtering must be done client-side. Non-versioned resources don't have this limitation.

from boto3.dynamodb.conditions import Attr

# For NON-VERSIONED resources - DynamoDB-side filtering works
filter_expression = Attr("status").eq("active") & Attr("priority").gt(5)

# For VERSIONED resources - use client-side filtering or GSI design
def custom_filter(resource: MyVersionedResource) -> bool:
    return resource.status == "active" and resource.priority > 5

# Client-side filtering applied to a versioned resource query
results = memory.paginated_dynamodb_query(
    resource_class=MyVersionedResource,
    filter_fn=custom_filter,  # Client-side only for versioned
    results_limit=100
)
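
Why compression hides attributes from filter expressions is easier to see with a concrete item. The sketch below assumes a hypothetical item layout in which key attributes stay top-level and the rest of the model is stored as a single gzip blob; the attribute names and key formats are illustrative, not the library's actual schema.

```python
import gzip
import json

resource = {"resource_id": "01HXYZ", "version": 3, "status": "active", "priority": 7}


def to_compressed_item(data: dict) -> dict:
    """Hypothetical layout: keys stay readable, the body becomes one binary attribute."""
    body = gzip.compress(json.dumps(data).encode("utf-8"))
    return {
        "pk": f"MyResource#{data['resource_id']}",
        "sk": f"v{data['version']}",
        "gsi1pk": f"status#{data['status']}",
        "data": body,
    }


def from_compressed_item(item: dict) -> dict:
    """Reverse the compression on the client side."""
    return json.loads(gzip.decompress(item["data"]).decode("utf-8"))


item = to_compressed_item(resource)
restored = from_compressed_item(item)
```

In this layout there is no top-level `priority` attribute for a server-side filter to inspect, so a condition on it can only be evaluated after the client decompresses the body.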

Best Practices for Efficient Filtering:

  1. Design GSIs carefully - For versioned resources, encode filterable attributes in GSI keys
  2. Consider data volume - Client-side filtering is acceptable for small result sets (< 1000 items)
  3. Use list_type_by_updated_at - Efficient for time-based queries with limited client-side filtering
  4. Non-versioned for high-volume filtering - Consider using non-versioned resources when you need extensive DynamoDB-side filtering
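
For the first practice, one way to encode filterable attributes in a GSI key is a composite string with zero-padded numbers, so that string order matches numeric order and a range condition on the sort key can replace a filter. The key format and the helper name `status_priority_key` below are hypothetical.

```python
def status_priority_key(status: str, priority: int) -> str:
    """Hypothetical key format; zero-padding keeps string order equal to numeric order."""
    return f"status#{status}#priority#{priority:03d}"


keys = sorted(status_priority_key("active", p) for p in (12, 5, 7, 100))

# Equivalent of "priority > 5", expressed as a comparison on the key string.
threshold = status_priority_key("active", 5)
high_priority = [k for k in keys if k > threshold]
```

With boto3, a comparison like this would typically be written as a key condition on the sort key (for example with `Key(...).gt(...)`), which DynamoDB evaluates without reading the compressed body.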

Pagination Handling

For UI-driven pagination or batch processing:

# First page
page1 = memory.paginated_dynamodb_query(
    resource_class=MyResource,
    results_limit=20
)

# Get pagination key for next page
if page1.has_more:
    next_key = page1.pagination_key
    
    # Fetch next page
    page2 = memory.paginated_dynamodb_query(
        resource_class=MyResource,
        results_limit=20,
        pagination_key=next_key
    )

Query by Updated Time

Special helper for time-based queries:

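# List resources by most recently updated.
# filter_expression is evaluated by DynamoDB, so (per the note under Advanced Filtering)
# it only sees model fields such as "status" when the resource is not compressed.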
recent = memory.list_type_by_updated_at(
    MyResource,
    results_limit=50,
    filter_expression=Attr("status").eq("pending")
)

Version Management

# Get all versions of a resource
versions = memory.get_resource_versions(resource_id, MyResource)

# Get specific version
v2 = memory.read_existing_version(resource_id, MyResource, version=2)

# Version limits automatically enforced during updates
# (configure via resource_config['max_versions'])
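
As a rough picture of what enforcing `max_versions` involves, the sketch below keeps the newest N version numbers and reports the rest for deletion. Keeping the newest versions is an assumption about the policy, and `prune_versions` is a hypothetical helper, not part of the library.

```python
from typing import List, Tuple


def prune_versions(versions: List[int], max_versions: int) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: return (kept, deleted), keeping the newest max_versions."""
    if max_versions < 1:
        raise ValueError("max_versions must be at least 1")
    ordered = sorted(versions)
    if len(ordered) <= max_versions:
        return ordered, []
    return ordered[-max_versions:], ordered[:-max_versions]


# Twelve stored versions with a limit of ten: the two oldest are dropped.
kept, deleted = prune_versions(list(range(1, 13)), max_versions=10)
```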

Code Quality Standards

  • Line length: 120 characters
  • Python version: ≥3.10
  • Type hints: Required for all public APIs
  • Documentation: Docstrings for classes and complex methods
  • Testing: High coverage requirements (see pyproject.toml)

Key Files and Their Purposes

  • src/simplesingletable/__init__.py - Public API exports
  • src/simplesingletable/models.py - Core resource classes and types
  • src/simplesingletable/dynamodb_memory.py - Main DynamoDB interface
  • src/simplesingletable/utils.py - Utility functions
  • src/simplesingletable/extras/ - Additional patterns and examples
  • tests/ - Comprehensive test suite
  • tasks.py - Development automation scripts
  • pyproject.toml - Project configuration and dependencies

Common Gotchas and Considerations

  1. Versioned vs Non-versioned: Choose based on whether you need history tracking
  2. Compression: Versioned resources use compression by default; configure via resource_config
  3. GSI Limits: DynamoDB caps the number of GSIs per table (20 by default); design access patterns carefully
  4. Version Limits: Set max_versions to prevent unbounded growth
  5. Pagination: Use exhaust_pagination() for complete result sets
  6. Concurrency: Versioned resources reject concurrent updates made from the same version; only the first one succeeds
