
Tool for creating AWS data pipeline to export data from any DynamoDB table to an S3 bucket

Project description

A tool for creating AWS data pipelines. Currently it only supports exporting data from any DynamoDB table to an existing S3 bucket, storing the pipeline definition as a backup. New features are going to be added (check for updates).

Just because terraform can’t do it yet..

Prerequisites

This tool supports only Python 3, because Python 2.7 will not be maintained past 2020.

aws-datapipe uses the AWS CLI (awscli) to configure pipelines in your AWS account. Installing aws-datapipe also installs all of the prerequisites by default, so you can skip this phase and move on to the Installation step. But in case you want to do it manually:

The easiest way to install aws-cli is to use pip in a virtualenv:

$ pip install awscli

or, if you are not installing in a virtualenv, to install globally:

$ sudo pip install awscli

or for your user:

$ pip install --user awscli

If you already have aws-cli installed and want to upgrade to the latest version, you can run:

$ pip install --upgrade awscli

Because aws-datapipe is built on top of aws-cli, you need to configure your AWS credentials with aws-cli before using aws-datapipe. You can do this in several ways (a short environment-variable example follows the list):

  • Environment variables

  • Shared credentials file

  • Config file

  • IAM Role
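For instance, the environment-variable approach is just a matter of exporting the standard AWS variables in your shell. A minimal sketch, using the same placeholder values as the aws configure example below:

$ export AWS_ACCESS_KEY_ID=foo
$ export AWS_SECRET_ACCESS_KEY=bar
$ export AWS_DEFAULT_REGION=us-west-2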

The quickest way to get started is to run the aws configure command:

$ aws configure
AWS Access Key ID: foo
AWS Secret Access Key: bar
Default region name [us-west-2]: us-west-2
Default output format [None]: json
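Behind the scenes, aws configure stores these values in the shared credentials and config files. With the answers above, the files end up looking roughly like this:

~/.aws/credentials:

[default]
aws_access_key_id = foo
aws_secret_access_key = bar

~/.aws/config:

[default]
region = us-west-2
output = json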

For additional information and other ways to configure aws-cli, see the AWS CLI documentation.

Installing

The easiest way to install aws-datapipe is to use pip in a virtualenv:

$ pip install aws-datapipe

or, if you are not installing in a virtualenv, to install globally:

$ sudo pip install aws-datapipe

or for your user:

$ pip install --user aws-datapipe

If you already have aws-datapipe installed and want to upgrade to the latest version, you can run:

$ pip install --upgrade aws-datapipe


Getting Started

You can use AWS Datapipe in two different ways:

  • interactive

  • passing arguments

To list all available arguments and additional information, check the --help output.

For interactive mode, just start the tool with aws-datapipe and it will prompt you for the information required to build a pipeline.
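A typical session therefore starts like this (the interactive prompts will ask for details such as the DynamoDB table and the target S3 bucket, per the project description):

$ aws-datapipe --help   # list all available arguments
$ aws-datapipe          # start interactive mode and answer the prompts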

After creating the data pipeline in AWS, aws-datapipe saves a pipeline definition file locally, named after your pipeline, which can be used as a backup. You can then activate the pipeline directly in the script at the next step:

Would you like to activate the pipeline now? [y/n]

You can skip activation here and manually activate the pipeline later, whenever you want, from the AWS Console.
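If you ever need to recreate the pipeline from the saved definition file, the plain AWS CLI can do that too. A rough sketch, assuming the backup file is named my-pipeline.json (the pipeline name and IDs below are illustrative placeholders):

# create an empty pipeline and note the pipelineId it returns
$ aws datapipeline create-pipeline --name my-pipeline --unique-id my-pipeline
# upload the backed-up definition into it
$ aws datapipeline put-pipeline-definition --pipeline-id <pipeline-id> --pipeline-definition file://my-pipeline.json
# activate it
$ aws datapipeline activate-pipeline --pipeline-id <pipeline-id>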

Currently aws-datapipe creates a pipeline that runs every 14 days after activation.

Schedule configuration is coming soon; check for updates.

Versioning

I use SemVer for versioning. For the versions available, see the tags on this repository.

Author

  • Maxim Tacu - Site Reliability Engineer - OLX Group.

License

This project is licensed under the MIT License - see the LICENSE file for details

How to Contribute

Contributions are very welcome. The easiest way is to fork this repo, and then make a pull request from your fork.
