Send JSON datasets to various AWS services.

aws-json-dataset

A lightweight Python package for quickly sending batches of JSON data to various AWS services.

Description

The idea behind this library is to provide a quick, easy way to send JSON data to the following AWS services:

  • SQS
  • SNS
  • Kinesis Firehose
  • Kinesis Data Streams (coming soon)

JSON is an extremely common format, and each AWS service has its own API with different requirements for how to send data.

This library includes functionality for:

  • Automatically handling batch API calls to SNS, SQS and Kinesis Firehose
  • Managing which services are available based on record size
  • Converting records to Base64 for Kinesis streams
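
The three features above can be sketched in a few lines of plain Python. This is a minimal illustration and not the library's implementation: the helper names (batches, record_size, fits_limit, to_base64) and both limit constants are hypothetical, and the real per-service quotas should be taken from the AWS documentation.

```python
import base64
import json
from typing import Iterator

# Assumed limits, for illustration only. Check the current AWS quotas
# for the service you are targeting.
ASSUMED_MAX_BATCH_SIZE = 10              # records per batch API call
ASSUMED_MAX_RECORD_BYTES = 256 * 1024    # bytes per serialized record


def record_size(record: dict) -> int:
    """Return the size in bytes of a record serialized as UTF-8 JSON."""
    return len(json.dumps(record).encode("utf-8"))


def batches(records: list[dict],
            batch_size: int = ASSUMED_MAX_BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def fits_limit(records: list[dict],
               max_record_bytes: int = ASSUMED_MAX_RECORD_BYTES) -> bool:
    """Return True if every record is within the given size limit."""
    return all(record_size(r) <= max_record_bytes for r in records)


def to_base64(record: dict) -> str:
    """Serialize a record to JSON and encode it as Base64 text."""
    return base64.b64encode(json.dumps(record).encode("utf-8")).decode("ascii")


data = [{"id": idx, "data": "<data>"} for idx in range(25)]
for batch in batches(data):
    # Each batch would be passed to a single batch API call.
    print(len(batch), fits_limit(batch))
```

Keeping the limits as parameters rather than hard-coded values makes it straightforward to apply a different batch size or record size per service.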

Roadmap

  • Support for Kinesis Data Streams
  • Support for DynamoDB inserts, updates and deletes
  • Support for S3, including gzip compression and JSON lines format
  • Support for FIFO SQS queues
  • Support for SNS topics

Quickstart

Set up your AWS credentials, then export your profile and region as environment variables.

export AWS_PROFILE=<profile>
export AWS_REGION=<region>

Install the library using pip.

pip install -i https://test.pypi.org/simple/ aws-json-dataset

Send JSON data to various AWS services.

from awsjsondataset import AwsJsonDataset

# create a list of JSON objects
data = [ {"id": idx, "data": "<data>"} for idx in range(100) ]

# Wrap using AwsJsonDataset
dataset = AwsJsonDataset(data=data)

# Send to SQS queue
dataset.sqs("<sqs_queue_url>").send_messages()

# Send to SNS topic
dataset.sns("<sns_topic_arn>").publish_messages()

# Send to Kinesis Firehose stream
dataset.firehose("<delivery_stream_name>").put_records()

Local Development

Follow the steps below to set up the development environment.

Prerequisites

  • AWS credentials
  • Python 3.10

Creating a Python Virtual Environment

When developing locally, create a Python virtual environment to manage dependencies:

python3.10 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install ".[dev,test]"

Environment Variables

Create a .env file in the project root.

AWS_REGION=<region>

Important: Always keep sensitive values such as credentials and API keys in a .env file, AWS SSM Parameter Store, or Secrets Manager. Never hard-code them, even during development. AWS will quarantine an account if its credentials are accidentally exposed, which will cause problems. Make sure that .env is listed in .gitignore.
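
As a rough sketch of how a .env file like the one above could be read during local development, the snippet below parses KEY=VALUE lines and copies them into the process environment. The function names parse_env and load_env_file are hypothetical and are not part of this library; a dedicated package for loading .env files would work equally well.

```python
import os


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> None:
    """Copy values from a .env file into os.environ without overwriting."""
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env(handle.read()).items():
            os.environ.setdefault(key, value)
```

Using setdefault means a variable already exported in the shell takes precedence over the value in the file.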

AWS Credentials

Valid AWS credentials must be available to the AWS CLI and SAM CLI. The easiest way to provide them is to run aws configure, or to add them to ~/.aws/credentials and export the AWS_PROFILE variable to the environment.

For more information visit the documentation page: Configuration and credential file settings

Unit Tests

Follow the steps above to create a Python virtual environment. Run tests with the following command.

coverage run -m pytest

Troubleshooting

  • Check your AWS credentials in ~/.aws/credentials
  • Check that the environment variables are available to the services that need them
  • Check that the correct environment or interpreter is being used for Python

Authors

Primary Contact: Gregory Christopher Lindsey (@chrisammon3000)

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Download files

Source Distribution

aws-json-dataset-0.1.0.tar.gz (50.4 kB)

Uploaded Source

Built Distribution

aws_json_dataset-0.1.0-py3-none-any.whl (12.8 kB)

Uploaded Python 3

File details

Details for the file aws-json-dataset-0.1.0.tar.gz.

File metadata

  • Download URL: aws-json-dataset-0.1.0.tar.gz
  • Size: 50.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for aws-json-dataset-0.1.0.tar.gz
Algorithm Hash digest
SHA256 17826833609c0b978a6101be1682c17115992400f36cebd2c6d0365a1918492b
MD5 bbe8d7d8e04044e0aec2bec49e4eaeb2
BLAKE2b-256 ee9aec3b20e8429b5a52f0b54829318b040a27512f04aabf5203152843b79efd

See more details on using hashes here.
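
One way to check a downloaded file against the SHA256 digest listed above is sketched below. The helper names sha256_of and matches_digest are hypothetical, and the file path is assumed to point at wherever the archive was saved locally.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_digest("aws-json-dataset-0.1.0.tar.gz", "17826833609c0b978a6101be1682c17115992400f36cebd2c6d0365a1918492b") should return True for an unmodified copy of the source distribution.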

File details

Details for the file aws_json_dataset-0.1.0-py3-none-any.whl.

File hashes

Hashes for aws_json_dataset-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5c7eb20585b2122485b675e2a845130efbd843b4ad2114a192bc4110b59e0c9b
MD5 2ac8459e1ec35f9c8b55cd0543bee2f6
BLAKE2b-256 efcb8fcd2319790c01947f947d98fc6714af9faace2dc453844281c042c9c13b
