
dynamodb-dev-importer

Easily load data from CSV to test out your DynamoDB table design.

When working with DynamoDB, it is common practice to minimise the number of tables used, ideally down to just one.

Techniques such as sparse indexes and GSI overloading allow a lot of flexibility and efficiency.

Designing a good schema that supports your query patterns can be challenging. Often it is nice to try things out with a small amount of data. I personally find it convenient to enter data into a spreadsheet and play around with it there.

This utility eases populating a DynamoDB table from a CSV file exported from a spreadsheet, provided the file follows a format common to DynamoDB modelling patterns.
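Under the hood, a loader like this boils down to reading CSV rows and batch-writing items with boto3. Here is a minimal sketch of that loop, assuming boto3 credentials and a default region are configured; names like load_csv are illustrative, not ddbimp's actual internals:

import csv
import boto3

def load_csv(path, table_name, skip=1):
    """Read rows from a CSV file and batch-write them as DynamoDB items."""
    table = boto3.resource("dynamodb").Table(table_name)
    with open(path, newline="") as f, table.batch_writer() as batch:
        rows = csv.reader(f)
        for _ in range(skip):               # skip header row(s)
            next(rows)
        for pk, sk, data, *extra in rows:
            item = {"pk": pk, "sk": sk, "data": data}
            for cell in extra:              # cells like "jan: 12012"
                name, _, value = cell.partition(":")
                item[name.strip()] = value.strip()
            batch.put_item(Item=item)

load_csv("example.csv", "people")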

Install

You can install it with

$ pip3 install ddbimp

You can find the code on GitHub too.

Run

Assuming a table people(pk: S, sk: S) is provisioned in your default region:

$ ddbimp --table people --skip 1 example.csv
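If the table doesn't exist yet, here is a minimal boto3 sketch for creating it (on-demand billing is an assumption here; provision however suits you):

import boto3

boto3.client("dynamodb").create_table(
    TableName="people",
    AttributeDefinitions=[
        {"AttributeName": "pk", "AttributeType": "S"},
        {"AttributeName": "sk", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "pk", "KeyType": "HASH"},   # partition key
        {"AttributeName": "sk", "KeyType": "RANGE"},  # sort key
    ],
    BillingMode="PAY_PER_REQUEST",
)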

Expected input format

pk         sk              data
PERSON-1   sales-Q1-2019   Alex   jan: 12012   feb: 1927

Your spreadsheet (and exported CSV) should contain columns for:

  • pk
  • sk
  • data (optional)
  • anything after those three can contain arbitrary attributes of the form attribute_name: value, e.g. city: Edinburgh

Example row:

PERSON-1,sales-Q1-2019,Alex,jan: 12012,feb: 1927

This row will yield the following item:

{
    pk: 'PERSON-1',
    sk: 'sales-Q1-2019',
    data: 'Alex',
    jan: 12012,
    feb: 1927
}
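For illustration, here is roughly how such a row maps to an item, with numeric-looking cells coerced to numbers as in the example above (a sketch only; ddbimp's exact coercion rules may differ):

from decimal import Decimal

def parse_row(row):
    pk, sk, data, *extra = row
    item = {"pk": pk, "sk": sk, "data": data}
    for cell in extra:                      # e.g. "jan: 12012"
        name, _, value = cell.partition(":")
        value = value.strip()
        try:
            value = Decimal(value)          # boto3 stores DynamoDB numbers as Decimal
        except ArithmeticError:
            pass                            # not a number, keep as string
        item[name.strip()] = value
    return item

parse_row(["PERSON-1", "sales-Q1-2019", "Alex", "jan: 12012", "feb: 1927"])
# -> {'pk': 'PERSON-1', 'sk': 'sales-Q1-2019', 'data': 'Alex',
#     'jan': Decimal('12012'), 'feb': Decimal('1927')}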

For a full example CSV, take a look at example.csv.

Key ideas

  • The table consists of partition key pk: S and sort key sk: S; their meaning varies depending on the item
  • A secondary index swaps the keys, so the partition key is sk: S and the sort key is pk: S
  • A final secondary index uses sk: S and data: S, where data is an arbitrary value you might want to search for; like the keys, its meaning depends on the item it is part of (see the query sketch after this list)
  • Group items through a shared partition key, and store sub-items under distinct sort keys
    • e.g. pk: PERSON-1, sk: sales-Q1-2019, jan: 12012, feb: 1927
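These keys make the common lookups cheap. A sketch of the resulting query patterns with boto3; the index names sk-pk-index and sk-data-index are assumptions, so substitute whatever your GSIs are called:

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("people")

# Everything about one person, straight from the table
person = table.query(KeyConditionExpression=Key("pk").eq("PERSON-1"))

# Everyone with a Q1 2019 sales record, via the inverted index
q1 = table.query(
    IndexName="sk-pk-index",
    KeyConditionExpression=Key("sk").eq("sales-Q1-2019"),
)

# Search on the overloaded data attribute, e.g. find a person by name
alex = table.query(
    IndexName="sk-data-index",
    KeyConditionExpression=Key("sk").eq("sales-Q1-2019") & Key("data").eq("Alex"),
)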

AWS recently released a preview build of a tool called NoSQL Workbench. It builds on the above ideas. I've tried it out and it seems pretty good, but I am a luddite and am faster working in a spreadsheet right now. I'd certainly recommend giving it a try.


Caveats, TODO

  • Uses your default AWS profile
  • Region needs to be set
  • Make it work directly with Google Sheets via the Sheets API
