
Send JSON datasets to various AWS services.


aws-json-dataset


Lightweight and simple Python package to quickly send batches of JSON data to various AWS services.

Description

The idea behind this library is to provide a quick, easy way to send JSON data to the following AWS services:

  • SQS
  • SNS
  • Kinesis Firehose
  • Kinesis Data Streams (coming soon)

JSON is an extremely common format, and each AWS service has its own API with different requirements for how to send data.

This library includes functionality for:

  • Automatically handling batch API calls to SNS, SQS and Kinesis Firehose
  • Managing available services based on record size
  • Converting records to Base64 for Kinesis streams
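As a rough illustration of the three features above, the sketch below shows one way batching, size-based service selection and Base64 conversion could be written in plain Python. It is not the library's implementation: the helper names (`record_size`, `available_services`, `batches`, `to_base64`) and the limit tables are assumptions made for this example, and the limits should be checked against the current AWS quotas.

```python
from __future__ import annotations

import base64
import json
from typing import Any, Iterator

# Illustrative limits only (assumptions for this sketch, not taken from the
# library). Verify them against the current AWS service quotas.
MAX_BATCH_ENTRIES = {"sqs": 10, "sns": 10, "firehose": 500}
MAX_RECORD_BYTES = {"sqs": 256 * 1024, "sns": 256 * 1024, "firehose": 1000 * 1024}


def record_size(record: dict[str, Any]) -> int:
    """Return the size in bytes of a record serialized as UTF-8 JSON."""
    return len(json.dumps(record).encode("utf-8"))


def available_services(records: list[dict[str, Any]]) -> list[str]:
    """Return the services whose per-record limit fits the largest record."""
    largest = max(record_size(record) for record in records)
    return [name for name, limit in MAX_RECORD_BYTES.items() if largest <= limit]


def batches(records: list[dict[str, Any]], size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_base64(record: dict[str, Any]) -> str:
    """Serialize a record to JSON and encode it as a Base64 string."""
    return base64.b64encode(json.dumps(record).encode("utf-8")).decode("ascii")
```

With 25 small records, `batches(data, MAX_BATCH_ENTRIES["sqs"])` would yield three batches, and a single record larger than the SQS and SNS limits in the table would leave only Firehose in `available_services`.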

Roadmap

  • Support for Kinesis Data Streams
  • Support for DynamoDB inserts, updates and deletes
  • Support for S3, including gzip compression and JSON lines format
  • Support for FIFO SQS queues
  • Support for SNS topics

Quickstart

Set up your AWS credentials, then export your profile and region as environment variables.

export AWS_PROFILE=<profile>
export AWS_REGION=<region>

Install the library using pip.

pip install -i https://test.pypi.org/simple/ aws-json-dataset

Send JSON data to various AWS services.

from awsjsondataset import AwsJsonDataset

# create a list of JSON objects
data = [ {"id": idx, "data": "<data>"} for idx in range(100) ]

# Wrap using AwsJsonDataset
dataset = AwsJsonDataset(data=data)

# Send to SQS queue
dataset.sqs("<sqs_queue_url>").send_messages()

# Send to SNS topic
dataset.sns("<sns_topic_arn>").publish_messages()

# Send to Kinesis Firehose stream
dataset.firehose("<delivery_stream_name>").put_records()
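For comparison, here is a minimal sketch of what sending the same records to SQS could look like without the library, using the `send_message_batch` operation of a boto3 SQS client. The function name `send_in_batches` and its parameters are hypothetical, and nothing here is meant to describe how `AwsJsonDataset` works internally. The client is passed in, so the sketch itself does not import boto3.

```python
from __future__ import annotations

import json
from typing import Any


def send_in_batches(
    client: Any,
    queue_url: str,
    records: list[dict[str, Any]],
    batch_size: int = 10,  # assumed SendMessageBatch entry limit
) -> list[dict[str, Any]]:
    """Send records in batches and return the entries SQS reported as failed."""
    failed: list[dict[str, Any]] = []
    for start in range(0, len(records), batch_size):
        chunk = records[start:start + batch_size]
        # Each entry needs an Id that is unique within its batch.
        entries = [
            {"Id": str(start + offset), "MessageBody": json.dumps(record)}
            for offset, record in enumerate(chunk)
        ]
        response = client.send_message_batch(QueueUrl=queue_url, Entries=entries)
        failed.extend(response.get("Failed", []))
    return failed


# Possible usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   failed = send_in_batches(boto3.client("sqs"), "<sqs_queue_url>", data)
```

Returning the failed entries instead of raising leaves the retry policy to the caller, which is one design choice among several.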

Local Development

Follow the steps below to set up the local development environment.

Prerequisites

  • AWS credentials
  • Python 3.10

Creating a Python Virtual Environment

When developing locally, create a Python virtual environment to manage dependencies:

python3.10 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install ".[dev,test]"

Environment Variables

Create a .env file in the project root.

AWS_REGION=<region>

Important: Always use a .env file, AWS SSM Parameter Store, or AWS Secrets Manager for sensitive variables such as credentials and API keys. Never hard-code them, even during development. AWS will quarantine an account if any credentials are accidentally exposed, and this will cause problems. Make sure that .env is listed in .gitignore.
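If you want to read the .env file from Python during local development, a dependency-free loader might look like the sketch below. The function `load_env_file` is a hypothetical helper written for this example, not part of the library, and it only handles simple KEY=VALUE lines.

```python
from __future__ import annotations

import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines and export them without overriding existing variables."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded
```

Because it uses `os.environ.setdefault`, values already exported in the shell take precedence over the file.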

AWS Credentials

Valid AWS credentials must be available to the AWS CLI and SAM CLI. The easiest way to provide them is to run aws configure, or to add them to ~/.aws/credentials and export the AWS_PROFILE variable to the environment.

For more information, see the AWS CLI documentation page: Configuration and credential file settings

Unit Tests

Follow the steps above to create a Python virtual environment. Run tests with the following command.

coverage run -m pytest

Troubleshooting

  • Check your AWS credentials in ~/.aws/credentials
  • Check that the environment variables are available to the services that need them
  • Check that the correct environment or interpreter is being used for Python
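The checks above can be gathered into a small script. The following is a sketch only: `environment_report` is a hypothetical helper, and it reports whether the credentials file exists, which of the two variables from the Quickstart are set, and which interpreter is running, without validating the credentials themselves.

```python
from __future__ import annotations

import os
import sys
from pathlib import Path
from typing import Any, Mapping, Optional


def environment_report(
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Optional[str] = None,
) -> dict[str, Any]:
    """Collect basic facts about credentials, variables and the interpreter."""
    env = os.environ if env is None else env
    path = Path(credentials_path) if credentials_path else Path.home() / ".aws" / "credentials"
    return {
        "credentials_file_exists": path.is_file(),
        "AWS_PROFILE": env.get("AWS_PROFILE"),
        "AWS_REGION": env.get("AWS_REGION"),
        "python_executable": sys.executable,
        # True when running inside a virtual environment created with venv.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for name, value in environment_report().items():
        print(f"{name}: {value}")
```

A value of None for AWS_PROFILE or AWS_REGION suggests the variable was not exported in the current shell.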

Authors

Primary Contact: Gregory Christopher Lindsey (@chrisammon3000)

License

This library is licensed under the MIT-0 License. See the LICENSE file.
