Skip to main content

Python DynamoDB interface, specialized in single-table design.

Project description

DynamoDB SingleTable

https://pypi.org/project/ddb-single/

Python DynamoDB interface, specialized in single-table design. DynamoDB is high-performance serverless NoSQL, but difficult to disign tables.

Single-table design needs only single table, and few GSIs (Global Secondary Indexes). It makes effective and easy to manage your whole data models for single service.

Getting Started

Install

pip install ddb-single

Start DynamoDB Local

docker run -d --rm -p 8000:8000 amazon/dynamodb-local

Init Table

from ddb_single import Table

table = Table(
    table_name="sample",
    endpoint_url="http://localhost:8000",
)
table.init()

Data Models

Each model has al least 3 keys

  • primary_key ... Hash key for single item. default: pk: {__model_name__}_{uuid}
  • seconday_key ... Range key for item. default: sk: {__model_name__}_item
  • unique_key ... key to identify the item is the same. Mainly used to update item.

And you can set serch_key to enable search via GSI

from ddb_single import BaseModel, DBField, FieldType

class User(BaseModel):
    __table__=table
    __model_name__ = "user"
    name = DBField(unique_key=True)
    email = DBField(search_key=True)
    age = DBField(type=FieldType.NUMBER, search_key=True)
    description=DBField()

Usage

need "Qurey" object for CRUD

  • query.model(foo).create
  • query.model(foo).get
  • query.model(foo).search
  • query.model(foo).update
  • query.model(foo).delete
from ddb_single import Query
query = Query(table)

Create Item

If the item with same value of unique_key already exist, exist item is updated.

user = User(name="John", email="john@example.com", description="test")
query.model(user).create()

Then, multible items added.

pk sk data name email description
user_xxxx user_item John john@example.com test
user_xxxx search_user_name John
user_xxxx search_user_email new-john@example.com

In addition to main item (sk=user_item), multiple item (sk=search_{__model_name__}_{field_name}) added to table. Those "search items" are used to search

The GSI DataSearchIndex is used to get "search items" to extract target's pk. Then, batch_get items by pk.

sk = hash data = range pk
search_user_name John user_xxxx
search_user_email new-john@example.com user_xxxx

Large search-key values

When the search_key field value is extremely large (for example, thousands of characters), the library stores the search metadata as multiple chunks to avoid DynamoDB index size limits. The search item keeps a hashed digest in the main search record and splits the raw value into chunk records:

pk sk data chunked chunk_count chunk_index
user_xxxx search_user_email <sha256 digest> True 3 None
user_xxxx search_user_email#chunk#0000 <first 900-byte chunk>
user_xxxx search_user_email#chunk#0001 <second 900-byte chunk>
user_xxxx search_user_email#chunk#0002 <rest of the value>

You can still search with the original value (User.email.eq(<very-long-string>)), and the library will resolve the digest internally while keeping the full value across the chunk records.

Search Items

user = query.model(User).search(User.name.eq("John"))
print(user)
# -> [{"pk":"user_xxxx", "sk":"user_item", "name":"John", "email":"john@example.com"}]

pk_only=True to extract pk without batch_get

user_pks = query.model(User).search(User.name.eq("John"), pk_only=True)
print(user_pks)
# -> ["user_xxxx"]

Get single item

get(pk) to get single item.

user = query.model(User).get("user_xxxx")
print(user)
# -> {"pk":"user_xxxx", "sk":"user_item", "name":"John", "email":"john@example.com"}

get_by_unique to get item by unique_key

user = query.model(User).get_by_unique("John")
print(user)
# -> {"pk":"user_xxxx", "sk":"user_item", "name":"John", "email":"john@example.com"}

pk_only=True option in get_by_unique to get primary key without get_item

pk = query.model(User).get_by_unique("John", pk_only=True)
print(pk)
# -> "user_xxxx"

Update Item

user = query.model(User).search(User.email.eq("john@example.com"))
new_user = User(**user[0])
new_user.email = "new-john@example.com"
query.model(new_user).update()

Or use unique value to detect exist item.

new_user = User(name="John", email="new-john@example.com")
query.model(new_user).update()

Then, tha value of "main item" and "seach item" changed

pk sk data name email description
user_xxxx user_item John new-john@example.com test
user_xxxx search_user_name John
user_xxxx search_user_email new-john@example.com

Delete Item

user = query.model(User).search(User.email.eq("new-john@example.com"))
query.model(user[0]).delete()

primary key to detect exist item.

query.model(User).delete_by_pk("user_xxxx")

or unique key

query.model(User).delete_by_unique("John")

Batch Writer

table.batch_writer() to create/update/delete multible items

  • query.model(foo).create(batch=batch)
  • query.model(foo).update(batch=batch)
  • query.model(foo).delete(batch=batch)

Batch Create

with table.batch_writer() as batch:
    for i in range(3):
        user = User(name=f"test{i}", age=i+10)
        query.model(user).create(batch=batch)
res = query.model(User).search(User.name.begins_with("test"))
print([(r["name"], r["age"]) for r in res])
# -> [("test0", 10), ("test1", 11), ("test2", 12)]

Batch Update

with table.batch_writer() as batch:
    for i in range(3):
        user = User(name=f"test{i}", age=i+20)
        query.model(user).update(batch=batch)
res = query.model(User).search(User.name.begins_with("test"))
print([(r["name"], r["age"]) for r in res])
# -> [("test0", 20), ("test1", 21), ("test2", 22)]

Batch Delete

pks = query.model(User).search(User.name.begins_with("test"), pk_only=True)
with table.batch_writer() as batch:
    for pk in pks:
        query.model(user).delete_by_pk(pk, batch=batch)
res = query.model(User).search(User.name.begins_with("test"))
print(res)
# -> []

Relationship

Create Model

You can sat relationns to other models relation=BaseModel to set relation.

class BlogPost(BaseModel):
    __model_name__ = "blogpost"
    __table__=table
    name = DBField(unique_key=True)
    content = DBField()
    author = DBField(reletion=User)

Create Item

blogpost = BlogPost(
    name="Hello",
    content="Hello world",
    author=self.user
)
query.model(blogpost).create()

Then, tha value "reletion item" added

pk sk data name author content
user_xxxx user_item John
user_xxxx search_user_name John
blogpost_xxxx blogpost_item Hello John Hello world
blogpost_xxxx search_blogpost_title Hello
blogpost_xxxx rel_user_xxxx author

In addition to main item (sk=blogpost_item), relation item (sk=rel_{primary_key}) added to table. The GSI DataSearchIndex is used to get "relation items" to extract target's pk. Then, batch_get items by pk.

sk = hash data = range pk
rel_user_xxxx author blogpost_xxxx

Search Relations

get_relation(model=Basemodel) to search relations

blogpost = query.model(BlogPost).get_by_unique("Hello")
blogpost = BlogPost(**blogpost)

user = query.model(blogpost).get_relation(model=User)
print(user)
# -> [{"pk":"user_xxxx", "sk":"user_item", "name":"John"}]

Also get_relation(field=DBField) to specify field

user = query.model(blogpost).get_relation(field=BlogPost.author)
print(user)
# -> [{"pk":"user_xxxx", "sk":"user_item", "name":"John"}]

Search Reference

In this library, "reference" is antonym to relation

get_reference(model=Basemodel) to search items related to the item

user = query.model(User).get_by_unique("John")
user = User(**blogpost)

blogpost = query.model(blogpost).get_reference(model=BlogPost)
print(blogpost)
# -> [{"pk":"blogpost_xxxx", "sk":"blogpost_item", "name":"Hello"}]

Also get_reference(field=DBField) to specify field

blogpost = query.model(user).get_reference(field=BlogPost.author)
print(blogpost)
# -> [{"pk":"blogpost_xxxx", "sk":"blogpost_item", "name":"Hello"}]

Update Relation

If relation key's value changed, relationship also changed.

new_user = User(name="Michael")
blogpost = query.model(BlogPost).get_by_unique("Hello")
blogpost["author"] = new_user
blogpost = BlogPost(**blogpost)

query.model(blogpost).update()

Then, "reletion item" changed

pk sk data name author content
user_xxxx user_item John
user_xxxx search_user_name John
user_yyyy user_item Michael
user_yyyy search_user_name Michael
blogpost_xxxx blogpost_item Hello Michael Hello world
blogpost_xxxx search_blogpost_title Hello
blogpost_xxxx rel_user_yyyy author

Delete Relation

If related item deleted, relationship also deleted

query.model(user).delete_by_unique("Michael")

Then, "reletion item" deleted. But main item's value is not chenged.

pk sk data name author content
user_xxxx user_item John
user_xxxx search_user_name John
blogpost_xxxx blogpost_item Hello Michael Hello world
blogpost_xxxx search_blogpost_title Hello

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ddb_single-0.6.1.tar.gz (18.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ddb_single-0.6.1-py3-none-any.whl (21.8 kB view details)

Uploaded Python 3

File details

Details for the file ddb_single-0.6.1.tar.gz.

File metadata

  • Download URL: ddb_single-0.6.1.tar.gz
  • Upload date:
  • Size: 18.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.2.1 CPython/3.12.12 Linux/6.11.0-1018-azure

File hashes

Hashes for ddb_single-0.6.1.tar.gz
Algorithm Hash digest
SHA256 457fc8d6d0b094cbff828200e0c8fd04458f3946c86fadf56e0d358460f3c551
MD5 e8d19d4fc5249ed0b127b9857b7279ad
BLAKE2b-256 fd87d2c2dca9c6c5d51dce1edb8f79c865768f505c0848412fa47d2bb9996e05

See more details on using hashes here.

File details

Details for the file ddb_single-0.6.1-py3-none-any.whl.

File metadata

  • Download URL: ddb_single-0.6.1-py3-none-any.whl
  • Upload date:
  • Size: 21.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.2.1 CPython/3.12.12 Linux/6.11.0-1018-azure

File hashes

Hashes for ddb_single-0.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 e6c4ccb9215a419a236b48fe22149ec7ccd29514d2e46eb48b6c8b4b653cafaa
MD5 e3f4da19d4722b7a84a260bc5501d502
BLAKE2b-256 b8dff4d27ec0c2c1f7414bcbf86d0d2b9747a67d70ae456cb45a32c804b4af95

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page