cdk-databrew-cicd


This construct creates a CodePipeline pipeline. Users push a DataBrew recipe into the CodeCommit repository, and the pipeline automatically deploys the recipe first to a pre-production AWS account and then to a production AWS account.

Serverless Architecture

[Image: architecture diagram. Credit: Romi B. and Gaurav W., 2021]

Introduction

The architecture was introduced by Romi Boimer and Gaurav Wadhawan and was posted on the AWS Blog as Set up CI/CD pipelines for AWS Glue DataBrew using AWS Developer Tools. I converted the architecture into a CDK construct available in 4 programming languages. Before applying the construct, make sure you have set up a proper IAM role in both the pre-production and the production AWS accounts. You can create the roles either manually or with the custom construct in this library.

# Example automatically generated without compilation. See https://github.com/aws/jsii/issues/826
from cdk_databrew_cicd import IamRole

IamRole(self, "AccountIamRole",
    environment="preproduction", # or 'production'
    account_id="ACCOUNT_ID"
)

Example

Typescript

You could also refer to here.

$ cdk init --language typescript
$ yarn add cdk-databrew-cicd
import * as cdk from '@aws-cdk/core';
import { DataBrewCodePipeline } from 'cdk-databrew-cicd';

class TypescriptStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const preproductionAccountId = 'PREPRODUCTION_ACCOUNT_ID';
    const productionAccountId = 'PRODUCTION_ACCOUNT_ID';

    // Each ARN points at the role created in the corresponding target account.
    const dataBrewPipeline = new DataBrewCodePipeline(this, 'DataBrewCicdPipeline', {
      preproductionIamRoleArn: `arn:${cdk.Aws.PARTITION}:iam::${preproductionAccountId}:role/preproduction-Databrew-Cicd-Role`,
      productionIamRoleArn: `arn:${cdk.Aws.PARTITION}:iam::${productionAccountId}:role/production-Databrew-Cicd-Role`,
    });

    new cdk.CfnOutput(this, 'OPreproductionLambdaArn', { value: dataBrewPipeline.preproductionFunctionArn });
    new cdk.CfnOutput(this, 'OProductionLambdaArn', { value: dataBrewPipeline.productionFunctionArn });
    new cdk.CfnOutput(this, 'OCodeCommitRepoArn', { value: dataBrewPipeline.codeCommitRepoArn });
    new cdk.CfnOutput(this, 'OCodePipelineArn', { value: dataBrewPipeline.codePipelineArn });
  }
}

const app = new cdk.App();
new TypescriptStack(app, 'TypescriptStack', {
  stackName: 'DataBrew-CICD',
});

Python

You could also refer to here.

# upgrading related Python packages
$ python -m ensurepip --upgrade
$ python -m pip install --upgrade pip
$ python -m pip install --upgrade virtualenv
# initialize a CDK Python project
$ cdk init --language python
# make packages installed locally instead of globally
$ source .venv/bin/activate
$ cat <<EOL > requirements.txt
aws-cdk.core
cdk-databrew-cicd
EOL
$ python -m pip install -r requirements.txt

from aws_cdk import core as cdk
from cdk_databrew_cicd import DataBrewCodePipeline

class PythonStack(cdk.Stack):

    def __init__(self, scope: cdk.Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        preproduction_account_id = "PREPRODUCTION_ACCOUNT_ID"
        production_account_id = "PRODUCTION_ACCOUNT_ID"

        # Each ARN points at the role created in the corresponding target account.
        databrew_pipeline = DataBrewCodePipeline(
            self,
            "DataBrewCicdPipeline",
            preproduction_iam_role_arn=f"arn:{cdk.Aws.PARTITION}:iam::{preproduction_account_id}:role/preproduction-Databrew-Cicd-Role",
            production_iam_role_arn=f"arn:{cdk.Aws.PARTITION}:iam::{production_account_id}:role/production-Databrew-Cicd-Role",
            # bucket_name="OPTIONAL",
            # repo_name="OPTIONAL",
            # branch_name="OPTIONAL",
            # pipeline_name="OPTIONAL"
        )

        cdk.CfnOutput(self, 'OPreproductionLambdaArn', value=databrew_pipeline.preproduction_function_arn)
        cdk.CfnOutput(self, 'OProductionLambdaArn', value=databrew_pipeline.production_function_arn)
        cdk.CfnOutput(self, 'OCodeCommitRepoArn', value=databrew_pipeline.code_commit_repo_arn)
        cdk.CfnOutput(self, 'OCodePipelineArn', value=databrew_pipeline.code_pipeline_arn)

$ deactivate
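
Both role ARNs in the examples follow the same pattern; only the account ID and the environment prefix change. The following is a minimal sketch of a helper that builds them, assuming the role names `preproduction-Databrew-Cicd-Role` and `production-Databrew-Cicd-Role` used in the examples. The function name `databrew_cicd_role_arn` is hypothetical and not part of this library.

```python
def databrew_cicd_role_arn(account_id: str, environment: str, partition: str = "aws") -> str:
    """Build the ARN of the cross-account role for one environment.

    Hypothetical helper, not part of cdk-databrew-cicd. Inside a CDK stack
    you would pass cdk.Aws.PARTITION as `partition` instead of the literal "aws".
    """
    if environment not in ("preproduction", "production"):
        raise ValueError(f"unknown environment: {environment!r}")
    # The role name is prefixed with the environment it belongs to.
    return f"arn:{partition}:iam::{account_id}:role/{environment}-Databrew-Cicd-Role"


if __name__ == "__main__":
    print(databrew_cicd_role_arn("111111111111", "preproduction"))
    print(databrew_cicd_role_arn("222222222222", "production"))
```

Deriving both ARNs from one function makes it harder to pass the pre-production role name for the production account by mistake.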

Java

You could also refer to here.

$ cdk init --language java
$ mvn package
.
.
<properties>
      <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
      <cdk.version>1.107.0</cdk.version>
      <construct.version>0.1.4</construct.version>
      <junit.version>5.7.1</junit.version>
</properties>
.
.
<dependencies>
      <!-- AWS Cloud Development Kit -->
      <dependency>
            <groupId>software.amazon.awscdk</groupId>
            <artifactId>core</artifactId>
            <version>${cdk.version}</version>
      </dependency>
      <dependency>
            <groupId>io.github.hsiehshujeng</groupId>
            <artifactId>cdk-databrew-cicd</artifactId>
            <version>${construct.version}</version>
      </dependency>
      .
      .
      .
</dependencies>

package com.myorg;

import software.amazon.awscdk.core.Aws;
import software.amazon.awscdk.core.CfnOutput;
import software.amazon.awscdk.core.CfnOutputProps;
import software.amazon.awscdk.core.Construct;
import software.amazon.awscdk.core.Stack;
import software.amazon.awscdk.core.StackProps;
import io.github.hsiehshujeng.cdk.databrew.cicd.DataBrewCodePipeline;
import io.github.hsiehshujeng.cdk.databrew.cicd.DataBrewCodePipelineProps;

public class JavaStack extends Stack {
    public JavaStack(final Construct scope, final String id) {
        this(scope, id, null);
    }

    public JavaStack(final Construct scope, final String id, final StackProps props) {
        super(scope, id, props);
        String preproductionAccountId = "PREPRODUCTION_ACCOUNT_ID";
        String productionAccountId = "PRODUCTION_ACCOUNT_ID";
        // The props expect full role ARNs, not bare account IDs.
        String preproductionIamRoleArn = String.format(
                "arn:%s:iam::%s:role/preproduction-Databrew-Cicd-Role", Aws.PARTITION, preproductionAccountId);
        String productionIamRoleArn = String.format(
                "arn:%s:iam::%s:role/production-Databrew-Cicd-Role", Aws.PARTITION, productionAccountId);
        DataBrewCodePipeline databrewPipeline = new DataBrewCodePipeline(this, "DataBrewCicdPipeline",
                DataBrewCodePipelineProps.builder().preproductionIamRoleArn(preproductionIamRoleArn)
                        .productionIamRoleArn(productionIamRoleArn)
                        // .bucketName("OPTIONAL")
                        // .repoName("OPTIONAL")
                        // .branchName("OPTIONAL")
                        // .pipelineName("OPTIONAL")
                        .build());

        new CfnOutput(this, "OPreproductionLambdaArn",
                CfnOutputProps.builder()
                    .value(databrewPipeline.getPreproductionFunctionArn())
                    .build());
        new CfnOutput(this, "OProductionLambdaArn",
                CfnOutputProps.builder()
                    .value(databrewPipeline.getProductionFunctionArn())
                    .build());
        new CfnOutput(this, "OCodeCommitRepoArn",
                CfnOutputProps.builder()
                    .value(databrewPipeline.getCodeCommitRepoArn())
                    .build());
        new CfnOutput(this, "OCodePipelineArn",
                CfnOutputProps.builder()
                    .value(databrewPipeline.getCodePipelineArn())
                    .build());
    }
}

C#

You could also refer to here.

$ cdk init --language csharp
$ dotnet add src/Csharp package Databrew.Cicd --version 0.1.4
using Amazon.CDK;
using ScottHsieh.Cdk;

namespace Csharp
{
    public class CsharpStack : Stack
    {
        internal CsharpStack(Construct scope, string id, IStackProps props = null) : base(scope, id, props)
        {
            var preproductionAccountId = "PREPRODUCTION_ACCOUNT_ID";
            var productionAccountId = "PRODUCTION_ACCOUNT_ID";

            var dataBrewPipeline = new DataBrewCodePipeline(this, "DataBrewCicdPipeline", new DataBrewCodePipelineProps
            {
                PreproductionIamRoleArn = $"arn:{Aws.PARTITION}:iam::{preproductionAccountId}:role/preproduction-Databrew-Cicd-Role",
                ProductionIamRoleArn = $"arn:{Aws.PARTITION}:iam::{productionAccountId}:role/production-Databrew-Cicd-Role",
                // BucketName = "OPTIONAL",
                // RepoName = "OPTIONAL",
                // BranchName = "OPTIONAL",
                // PipelineName = "OPTIONAL"
            });
            new CfnOutput(this, "OPreproductionLambdaArn", new CfnOutputProps
            {
                Value = dataBrewPipeline.PreproductionFunctionArn
            });
            new CfnOutput(this, "OProductionLambdaArn", new CfnOutputProps
            {
                Value = dataBrewPipeline.ProductionFunctionArn
            });
            new CfnOutput(this, "OCodeCommitRepoArn", new CfnOutputProps
            {
                Value = dataBrewPipeline.CodeCommitRepoArn
            });
            new CfnOutput(this, "OCodePipelineArn", new CfnOutputProps
            {
                Value = dataBrewPipeline.CodePipelineArn
            });
        }
    }
}

Steps after Stack Creation

CodeCommit

  1. Create HTTPS Git credentials for AWS CodeCommit for the IAM user that you are going to use.

  2. After the stack has been deployed via CDK, follow the steps in the README.md of the CodeCommit repository. A successful clone should look like the following (assuming you have installed git-remote-codecommit):

    $ git clone codecommit://scott.codecommit@DataBrew-Recipes-Repo
    Cloning into 'DataBrew-Recipes-Repo'...
    remote: Counting objects: 6, done.
    Unpacking objects: 100% (6/6), 2.03 KiB | 138.00 KiB/s, done.
    
  3. Add a DataBrew recipe to the local repository (directory) and commit the change, either directly on the main branch or by merging another branch into the main branch.
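
The clone address in step 2 follows the URL format understood by git-remote-codecommit: an optional AWS CLI profile name, then the repository name, with an optional region in the scheme. The following is a minimal sketch that assembles such an address; the function name `codecommit_clone_url` is hypothetical, and the profile and repository names are the ones from the sample output above.

```python
from typing import Optional


def codecommit_clone_url(repo: str, profile: Optional[str] = None, region: Optional[str] = None) -> str:
    """Assemble a git-remote-codecommit address (hypothetical helper)."""
    # A region, when given, is embedded in the scheme: codecommit::<region>://
    scheme = f"codecommit::{region}" if region else "codecommit"
    # A profile, when given, precedes the repository name: <profile>@<repo>
    target = f"{profile}@{repo}" if profile else repo
    return f"{scheme}://{target}"


if __name__ == "__main__":
    print(codecommit_clone_url("DataBrew-Recipes-Repo", profile="scott.codecommit"))
```

The printed address can be passed directly to `git clone`.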

Glue DataBrew

  1. Download a recipe as a JSON file, either one generated by following Getting started with AWS Glue DataBrew or one you made yourself.

  2. Move the recipe from the download directory to the local directory for the CodeCommit repository.

    $ mv ${DOWNLOAD_DIRECTORY}/chess-project-recipe.json ${CODECOMMIT_LOCAL_DIRECTORY}/
    
  3. Commit the change to a branch with a name you prefer.

    $ cd ${CODECOMMIT_LOCAL_DIRECTORY}
    $ git checkout -b add-recipe main
    $ git add .
    $ git commit -m "first recipe"
    $ git push --set-upstream origin add-recipe
    
  4. Merge the branch into the main branch in the AWS CodeCommit web console. The process is the same as merging on GitHub, only with a different UI.
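
Before committing a recipe, it can help to confirm that the downloaded file is well-formed. The following is a minimal sketch, under the assumption that the recipe file is a JSON array of steps and that each step carries an `Action` object with an `Operation` name. The helper `check_recipe` is hypothetical and not part of this library, and the sample step is illustrative only.

```python
import json
from typing import Any, List


def check_recipe(text: str) -> List[str]:
    """Return the problems found in a recipe document (empty list if none)."""
    try:
        steps: Any = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(steps, list):
        return ["top-level value is not a list of steps"]
    problems = []
    for index, step in enumerate(steps):
        # Assumed shape of one step: {"Action": {"Operation": "...", "Parameters": {...}}}
        action = step.get("Action") if isinstance(step, dict) else None
        if not isinstance(action, dict) or not isinstance(action.get("Operation"), str):
            problems.append(f"step {index}: missing Action.Operation")
    return problems


if __name__ == "__main__":
    sample = '[{"Action": {"Operation": "UPPER_CASE", "Parameters": {"sourceColumn": "name"}}}]'
    for problem in check_recipe(sample):
        print(problem)
```

An empty result only means the file has the assumed outline; it does not guarantee that DataBrew accepts the recipe.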

What Successful Commits Look Like

  1. In the infrastructure account, the status of the DataBrew pipeline in CodePipeline should show every stage as succeeded.
  2. In the pre-production account, in the same region where the CI/CD pipeline is deployed in the infrastructure account, you'll see the recipe.
  3. In the production account, in the same region where the CI/CD pipeline is deployed in the infrastructure account, you'll see the recipe.
