
Kedro plugin with Snowflake / Snowpark support


Kedro Snowflake Pipelines plugin



About

This plugin lets you run full Kedro pipelines in Snowflake. It currently supports:

  • a Kedro starter, to get you up to speed fast
  • automatically creating Snowflake Stored Procedures from Kedro nodes (using the Snowpark SDK)
  • translating a Kedro pipeline into a Snowflake task graph
  • running a Kedro pipeline fully within Snowflake, without any external system
  • using Kedro's official SnowparkTableDataSet
  • automatically storing intermediate data as Transient Tables (if Snowpark DataFrames are used)
  • (New!) MLflow integration with Snowflake, with example usage in the Snowflights Kedro starter
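
To make the "pipeline to task graph" idea concrete, here is a minimal, standard-library-only sketch of how node dependencies can be derived from dataset names and turned into a task ordering. It is an illustration of the concept, not the plugin's implementation: the node names, dataset names, and both helper functions are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical node specs: name -> (input dataset names, output dataset names).
# These names are illustrative; they do not come from the plugin or the starter.
NODES = {
    "preprocess": (["raw_flights"], ["clean_flights"]),
    "make_features": (["clean_flights"], ["features"]),
    "train_model": (["features"], ["model"]),
    "evaluate": (["model", "features"], ["metrics"]),
}


def task_predecessors(nodes):
    """Map each node to the set of nodes that produce any of its inputs."""
    producer = {}
    for name, (_, outputs) in nodes.items():
        for dataset in outputs:
            producer[dataset] = name
    # Inputs with no producer (e.g. raw source tables) add no dependency.
    return {
        name: {producer[d] for d in inputs if d in producer}
        for name, (inputs, _) in nodes.items()
    }


def task_order(nodes):
    """Return one valid execution order for the derived task graph."""
    return list(TopologicalSorter(task_predecessors(nodes)).static_order())


if __name__ == "__main__":
    for task, preds in task_predecessors(NODES).items():
        print(task, "runs after", sorted(preds) or "nothing (root task)")
    print("order:", task_order(NODES))
```

In a task graph each task declares the tasks it must run after, which is the same predecessor relation computed here; how the plugin itself names and wires its tasks is described in the documentation linked below.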

Documentation

For detailed documentation refer to https://kedro-snowflake.readthedocs.io/

Usage

With starter

  1. Install the plugin

    pip install "kedro-snowflake>=0.1.0" 
    
  2. Create a new project with our Kedro starter ❄️ Snowflights 🚀:

    kedro new --starter=snowflights --checkout=master
    
    Then answer the interactive prompts:
    Project Name
    ============
    Please enter a human readable name for your new project.
    Spaces, hyphens, and underscores are allowed.
     [Snowflights]: 
    
    Snowflake Account
    =================
    Please enter the name of your Snowflake account.
    This is the part of the URL before .snowflakecomputing.com
     []: abc-123
    
    Snowflake User
    ==============
    Please enter the name of your Snowflake user.
     []: user2137
    
    Snowflake Warehouse
    ===================
    Please enter the name of your Snowflake warehouse.
     []: compute-wh
    
    Snowflake Database
    ==================
    Please enter the name of your Snowflake database.
     [DEMO]: 
    
    Snowflake Schema
    ================
    Please enter the name of your Snowflake schema.
     [DEMO]: 
    
    Snowflake Password Environment Variable
    =======================================
    Please enter the name of the environment variable that contains your Snowflake password.
    Alternatively, you can re-configure the plugin later to use Kedro's credentials.yml
     [SNOWFLAKE_PASSWORD]:
    
    Pipeline Name Used As A Snowflake Task Prefix
    =============================================
    
     [default]:
    
    Enable Mlflow Integration (See Documentation For The Configuration Instructions)
    ================================================================================
    
     [False]: 
    
    The project name 'Snowflights' has been applied to: 
    - The project title in /tmp/snowflights/README.md
    - The folder created for your project in /tmp/snowflights
    - The project's python package in /tmp/snowflights/src/snowflights
    
  3. Run the project

    cd snowflights
    kedro snowflake run --wait-for-completion
    

In an existing Kedro project

  1. Install the plugin
    pip install "kedro-snowflake>=0.1.0" 
    
  2. Initialize the plugin
    kedro snowflake init <ACCOUNT> <USER> <PASSWORD_FROM_ENV> <DATABASE> <SCHEMA> <WAREHOUSE>
    
  3. Run the project
    kedro snowflake run --wait-for-completion
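
As a hedged illustration of the steps above, the following sketch assembles the `kedro snowflake init` arguments in the order shown and checks that the password environment variable is set before anything is run. It assumes that `<PASSWORD_FROM_ENV>` is the name of an environment variable (as the starter prompt suggests) rather than the password itself. The example values are the ones from the starter prompts, and both helper functions are hypothetical, not part of the plugin.

```python
import os
import shlex


def init_command(account, user, password_env, database, schema, warehouse):
    """Build the `kedro snowflake init` argument list in the documented order."""
    return [
        "kedro", "snowflake", "init",
        account, user, password_env, database, schema, warehouse,
    ]


def missing_password_env(password_env, environ=os.environ):
    """Return True when the named environment variable is unset or empty."""
    return not environ.get(password_env)


if __name__ == "__main__":
    # Assumption: the password lives in the SNOWFLAKE_PASSWORD variable.
    cmd = init_command(
        "abc-123", "user2137", "SNOWFLAKE_PASSWORD", "DEMO", "DEMO", "compute-wh"
    )
    if missing_password_env("SNOWFLAKE_PASSWORD"):
        print("Set SNOWFLAKE_PASSWORD before running the pipeline.")
    # Print the command instead of executing it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

Passing only the variable name keeps the password out of shell history and project files.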
    

Kedro pipeline in Snowflake Tasks

[Image: Kedro Snowflake Plugin]

Execution:

[Image: Kedro Snowflake Plugin CLI]

