Migrates data pipelines between various frameworks

Databricks Workflows Migrator (wkmigrate)

Project Description

wkmigrate is a Python library for migrating data pipelines to Databricks workflows from various frameworks. Users can programmatically create or migrate workflows with a simple set of commands.

Pipeline definitions are read from a user-specified source system, translated for compatibility with Databricks workflows, and then either created directly in the target workspace or stored as JSON or YAML files.
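
As a rough illustration of the "stored as files" path, the sketch below writes a definition to a JSON file and reads it back using only the Python standard library. The dictionary layout, the write_definition and read_definition helpers, and the file naming are assumptions made for this example; they are not taken from wkmigrate, and the structure the library actually emits may differ.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical translated definition. The field names are illustrative only
# and are not the schema that wkmigrate produces.
translated_definition = {
    "name": "example_pipeline",
    "tasks": [
        {"task_key": "ingest", "depends_on": []},
        {"task_key": "transform", "depends_on": ["ingest"]},
    ],
}


def write_definition(definition: dict, directory: Path) -> Path:
    """Write a definition to <directory>/<name>.json and return the file path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{definition['name']}.json"
    path.write_text(json.dumps(definition, indent=2), encoding="utf-8")
    return path


def read_definition(path: Path) -> dict:
    """Load a definition previously written by write_definition."""
    return json.loads(path.read_text(encoding="utf-8"))


# Round trip through a temporary directory to show the file is self-describing.
output_dir = Path(tempfile.mkdtemp())
saved_path = write_definition(translated_definition, output_dir)
round_tripped = read_definition(saved_path)
```

Keeping definitions on disk in this way makes it possible to review or version-control them before anything is created in a workspace.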

Installation

Run pip install wkmigrate to install the package from PyPI.
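
One way to confirm that the install succeeded is to ask the standard library for the installed version. This is a minimal sketch; the installed_version helper is a name introduced here for illustration and is not part of wkmigrate.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


wkmigrate_version = installed_version("wkmigrate")
if wkmigrate_version is None:
    print("wkmigrate is not installed in this environment")
else:
    print(f"wkmigrate {wkmigrate_version} is installed")
```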

Compatibility

wkmigrate is a standalone project. Using some features (e.g. serverless jobs compute options) may require a premium-tier Databricks workspace.

Using the Workflow Migrator

To use wkmigrate, install the library with %pip install wkmigrate in a notebook, or install the Python wheel directly in your environment.

Once the library has been installed, create source and target definition stores for the migration.

from wkmigrate.definition_store_builder import build_definition_store

# Create the source definition store (an Azure Data Factory instance):
factory_options = {
    "tenant_id": "<TENANT_ID>",
    "client_id": "<CLIENT_ID>",
    "client_secret": "<CLIENT_SECRET>",
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group_name": "<RESOURCE_GROUP_NAME>",
    "factory_name": "<FACTORY_NAME>"
}
factory_store = build_definition_store(
    "factory_definition_store",
    factory_options
)

# Create the target definition store (a Databricks workspace):
workspace_options = {
    "authentication_type": "pat",
    "host_name": "<DATABRICKS_HOST_URL>",
    "pat": "<DATABRICKS_PERSONAL_ACCESS_TOKEN>",
}
workspace_store = build_definition_store(
    "workspace_definition_store",
    workspace_options
)
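
The option dictionaries above contain secrets, so it may be preferable to assemble them from environment variables instead of writing the values into a script. The sketch below shows one way to do that. The environment variable names and the options_from_env helper are assumptions made for this example; only the option keys come from the snippet above.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical environment variable names; pick whatever suits your setup.
FACTORY_ENV_VARS = {
    "tenant_id": "AZURE_TENANT_ID",
    "client_id": "AZURE_CLIENT_ID",
    "client_secret": "AZURE_CLIENT_SECRET",
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "ADF_RESOURCE_GROUP_NAME",
    "factory_name": "ADF_FACTORY_NAME",
}


def options_from_env(
    mapping: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build an options dict by reading each option from its environment variable.

    Raises KeyError listing every variable that is unset or empty, so that a
    misconfigured environment fails before any store is created.
    """
    environ = os.environ if environ is None else environ
    missing = sorted(var for var in mapping.values() if not environ.get(var))
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {option: environ[var] for option, var in mapping.items()}
```

With the variables set, factory_options = options_from_env(FACTORY_ENV_VARS) yields a dictionary with the same keys as the hand-written one, which can then be passed to build_definition_store exactly as shown above.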

Use the load method to get definitions from a source.

pipeline = factory_store.load(pipeline_name="<PIPELINE_NAME>")

Use pipeline_translator.translate() to translate definitions for compatibility with Databricks workflows.

from wkmigrate import pipeline_translator
translated_pipeline = pipeline_translator.translate(pipeline)

Use the dump method to sync workflows into a target.

workspace_store.dump(translated_pipeline)
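
The load, translate, and dump steps can be combined into a single loop when several pipelines need to be migrated. The following is a sketch under stated assumptions: migrate_pipelines and the two in-memory classes are hypothetical names introduced here so the example runs without credentials. It relies only on the calls shown above, namely load(pipeline_name=...) on the source store and dump(...) on the target store.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def migrate_pipelines(
    source_store: Any,
    target_store: Any,
    translate: Callable[[Any], Any],
    pipeline_names: Iterable[str],
) -> Tuple[List[str], Dict[str, Exception]]:
    """Load, translate, and dump each pipeline; collect failures instead of stopping."""
    migrated: List[str] = []
    failed: Dict[str, Exception] = {}
    for name in pipeline_names:
        try:
            pipeline = source_store.load(pipeline_name=name)
            target_store.dump(translate(pipeline))
        except Exception as error:  # record the error and continue with the rest
            failed[name] = error
        else:
            migrated.append(name)
    return migrated, failed


class InMemorySource:
    """Stand-in for a source definition store, used only for this demonstration."""

    def __init__(self, pipelines: Dict[str, dict]) -> None:
        self._pipelines = pipelines

    def load(self, pipeline_name: str) -> dict:
        return self._pipelines[pipeline_name]


class InMemoryTarget:
    """Stand-in for a target definition store; it simply records what it receives."""

    def __init__(self) -> None:
        self.dumped: List[dict] = []

    def dump(self, pipeline: dict) -> None:
        self.dumped.append(pipeline)


source = InMemorySource({"daily_load": {"name": "daily_load"}})
target = InMemoryTarget()
migrated, failed = migrate_pipelines(
    source,
    target,
    lambda pipeline: {**pipeline, "translated": True},  # placeholder translator
    ["daily_load", "missing_pipeline"],
)
```

With the real stores, the equivalent call would presumably be migrate_pipelines(factory_store, workspace_store, pipeline_translator.translate, ["<PIPELINE_NAME>"]). Collecting failures per pipeline is a design choice for this sketch; stopping at the first error may be more appropriate for some migrations.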

Download files

Source Distribution

wkmigrate-0.0.1.tar.gz (31.3 kB)

Uploaded Source

Built Distribution

wkmigrate-0.0.1-py3-none-any.whl (46.2 kB)

Uploaded Python 3

File details

Details for the file wkmigrate-0.0.1.tar.gz.

File metadata

  • Download URL: wkmigrate-0.0.1.tar.gz
  • Upload date:
  • Size: 31.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/24.6.0

File hashes

Hashes for wkmigrate-0.0.1.tar.gz
Algorithm Hash digest
SHA256 34c43b2899b3952468415932daaba50aae8355a5938d7b6f24c5a3592b90f169
MD5 ca37b3b76e21b97cedde2ee5042be6ed
BLAKE2b-256 c58253c53e03f7503ea4d67495fdbabb085bf9d95ffb387b38d3e3dd5a531dfc

File details

Details for the file wkmigrate-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: wkmigrate-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 46.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/24.6.0

File hashes

Hashes for wkmigrate-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 310caefaa40b4d0a8159f733f5b7fed08011530ad7bfcf1a3975a509fcae9d26
MD5 98e695d712f7f94e3e7f299cfb4b00fe
BLAKE2b-256 da9fa4142d1be99339219cdab89c003cbd58be4cc0a303a06557e6e3ba35d0dd
