Skip to main content

AWS Data Pipeline Wrapper

Project description

# pline

AWS Pipeline Wrapper for boto. Construct a Data Pipeline using Python objects.

Last updated: `0.0.3`

## Installation

```
pip install pline
```

## Usage

Create a **pipeline** object

```python

pipeline = pline.Pipeline(
name = 'MyPipeline',
unique_id = 'MyPipeline1',
desc = 'An example',
region = 'us-west-2' )
```

The pipeline will connect to AWS automatically if you have set your AWS credentials at the
environmental level. If you want to connect using a specific configuration:

```python
pipeline.connect(
aws_access_key_id = 'my_access_key',
aws_secret_access_key_id = 'my_secret_key' )
```

Create a **schedule** object

```python
schedule = pline.Schedule(
name = 'Schedule',
id = 'Schedule1',
period = '1 day',
occurrences = 1 )
```

Create the **default pipeline definition**. The pipeline object has a helper-method to
create this object with sensible defaults:

```python
definition = pipeline.definition( schedule,
pipelineLogUri = "s3://bucket/pipeline/log" )
```

Create an EC2 **resource** on which the pipeline will run

```python
resource = pline.Ec2Resource(
name = 'Resource',
id = 'Resource1',
role = 'DataPipelineDefaultRole',
resourceRole = 'DataPipelineDefaultResourceRole',
schedule = schedule )
```

Create an **activity** to run

```python
activity = pline.ShellCommandActivity(
name = 'MyActivity',
id = 'MyActivity1',
runsOn = resource,
schedule = schedule,
command = 'echo hello world' )
```

Add the **schedule**, **definition**, **resource**, and **activity** to the pipeline

```python
pipeline.add(schedule, definition, resource, activity)
```

Create the pipeline in AWS

```python
pipeline.create()
```

Create a new object, asscociate it with an existing object and add it to the pipeline

```python
sns_alarm = pline.SnsAlarm(
name = 'SnsAlarm',
id = 'SnsAlarm1',
topicArn = 'arn:aws:sns:us-east-1:12345678abcd:my-arn',
role = 'DataPipelineDefaultRole' )
activity.onFailure = sns_alarm
pipeline.add(sns_alarm)
```

Update the pipeline on AWS and activate it

```python
pipeline.update()
pipeline.activate()
```

## Advanced ShellCommandActivity use

The class `ShellCommand` can be used to compose chained commands

```python
cmd = pline.ShellCommand(
'docker start registry',
'sleep 3',
'docker pull localhost:5000/my_docker',
'docker stop registry' )
# => docker start registry;\
# sleep 3;\
# docker pull localhost:5000/my_docker;\
# docker stop registry

cmd.append('echo all done')
# => docker start registry;\
# sleep 3;\
# docker pull localhost:5000/my_docker;\
# docker stop registry;\
# echo all done

activity.command = cmd
```

## Defaults

Defaults are applied as class-level attributes and are merged down the MRO line.

For example, the `S3DataNode` object inherits from: `DataNode`, and `RunnableObject`.
Each of these classes can define its own default attributes, but inherits any
higher-level attributes as well.

```python
pline.base.RunnableObject.defaults()
{ 'maximumRetries' : 2,
'retryDelay' : '10 minutes' }

pline.data_nodes.DataNode.defaults()
{ 'maximumRetries' : 2,
'retryDelay' : '10 minutes',
'scheduleType' : 'timeseries' }

pline.S3DataNode.defaults()
{ 'maximumRetries' : 2,
'retryDelay' : '10 minutes',
's3EncryptionType' : 'SERVER_SIDE_ENCRYPTION',
'scheduleType' : 'timeseries' }
```

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pline-0.0.4.tar.gz (5.6 kB view details)

Uploaded Source

File details

Details for the file pline-0.0.4.tar.gz.

File metadata

  • Download URL: pline-0.0.4.tar.gz
  • Upload date:
  • Size: 5.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for pline-0.0.4.tar.gz
Algorithm Hash digest
SHA256 56fa0b9e2e00abbdf49c0883c31f533184e990913bf186becb97caa2c17b1f81
MD5 e5c496e352b6dadfddfd46f96dc53ea3
BLAKE2b-256 5da05a2e24adfd2da6fa7a6b17bd6bd8387091ff855bf57053ecefa26ea08bf5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page