Migrates data pipelines between various frameworks

Databricks Workflows Migrator (wkmigrate)

Project Description

wkmigrate is a Python library for migrating data pipelines to Databricks workflows from various frameworks. Users can programmatically create or migrate workflows with a simple set of commands.

Pipeline definitions are read from a user-specified source system, translated for compatibility with Databricks workflows, and then either created directly in the target or stored in JSON or YAML files.
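To make the "stored in files" option concrete, here is a minimal sketch that writes a definition to a JSON file and reads it back using only the Python standard library. It assumes a translated definition can be represented as a plain dictionary, which this page does not confirm; write_definition, read_definition, and the example keys are hypothetical names for illustration, not part of wkmigrate.

```python
import json
from pathlib import Path
from typing import Any, Dict


def write_definition(definition: Dict[str, Any], path: str) -> Path:
    """Write a dictionary-shaped definition to a JSON file (hypothetical helper)."""
    target = Path(path)
    # Create the parent directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as handle:
        json.dump(definition, handle, indent=2, sort_keys=True)
    return target


def read_definition(path: str) -> Dict[str, Any]:
    """Read a definition back from a JSON file (hypothetical helper)."""
    with Path(path).open("r", encoding="utf-8") as handle:
        return json.load(handle)
```

Keeping definitions in files like this makes it possible to review or version-control them before anything is created in a workspace.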

Installation

Use pip install wkmigrate to install the package from PyPI.
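After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version through the standard library. This is a small sketch; installed_version is a hypothetical helper name, not something wkmigrate provides.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("wkmigrate")
print(version if version else "wkmigrate is not installed in this environment")
```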

Compatibility

wkmigrate is a standalone project. Using some features (e.g. serverless jobs compute options) may require a premium-tier Databricks workspace.

Using the Workflow Migrator

To use wkmigrate, install the library with %pip install wkmigrate in a notebook, or install the Python wheel directly in your environment.

Once the library has been installed, create source and target definition stores for the migration.

from wkmigrate.definition_store_builder import build_definition_store

# Create the source definition store (an ADF instance):
factory_options = {
    "tenant_id": "<TENANT_ID>",
    "client_id": "<CLIENT_ID>",
    "client_secret": "<CLIENT_SECRET>",
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group_name": "<RESOURCE_GROUP_NAME>",
    "factory_name": "<FACTORY_NAME>"
}
factory_store = build_definition_store(
    "factory_definition_store",
    factory_options
)

# Create the target definition store (a Databricks workspace):
workspace_options = {
    "authentication_type": "pat",
    "host_name": "<DATABRICKS_HOST_URL>",
    "pat": "<DATABRICKS_PERSONAL_ACCESS_TOKEN>",
}
workspace_store = build_definition_store(
    "workspace_definition_store",
    workspace_options
)
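The option dictionaries above contain secrets such as client_secret and pat. A common alternative to writing these values into code is to read them from environment variables. The sketch below shows one way to build the same dictionaries that way; the environment variable names and the options_from_env helper are assumptions made for illustration and are not defined by wkmigrate.

```python
import os
from typing import Dict, Mapping

# Hypothetical environment variable names; pick whatever fits your setup.
FACTORY_ENV_VARS = {
    "tenant_id": "AZURE_TENANT_ID",
    "client_id": "AZURE_CLIENT_ID",
    "client_secret": "AZURE_CLIENT_SECRET",
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "ADF_RESOURCE_GROUP_NAME",
    "factory_name": "ADF_FACTORY_NAME",
}


def options_from_env(
    env_vars: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Build an options dictionary by reading each option from an environment variable."""
    # Fail early with one message listing everything that is missing or empty.
    missing = sorted(var for var in env_vars.values() if not environ.get(var))
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {option: environ[var] for option, var in env_vars.items()}
```

The resulting dictionary has the same keys as factory_options, so it can be passed to build_definition_store in its place. The same pattern applies to workspace_options.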

Use the load method of the source store to get a pipeline definition by name.

pipeline = factory_store.load(pipeline_name="<PIPELINE_NAME>")

Use pipeline_translator.translate() to translate definitions for compatibility with Databricks workflows.

from wkmigrate import pipeline_translator
translated_pipeline = pipeline_translator.translate(pipeline)

Use the dump method of the target store to sync the translated workflow into the target.

workspace_store.dump(translated_pipeline)
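The load, translate, and dump steps can be combined into a small loop when several pipelines need to be migrated. The following is a sketch under the assumption that the stores and the translator behave as in the snippets above; migrate_pipelines is a hypothetical helper, and it relies only on load(pipeline_name=...), a translate callable, and dump(...).

```python
from typing import Any, Callable, Iterable, List


def migrate_pipelines(
    source_store: Any,
    target_store: Any,
    translate: Callable[[Any], Any],
    pipeline_names: Iterable[str],
) -> List[str]:
    """Load, translate, and dump each named pipeline; return the names that were processed."""
    migrated: List[str] = []
    for name in pipeline_names:
        pipeline = source_store.load(pipeline_name=name)  # read from the source
        translated = translate(pipeline)                  # convert the definition
        target_store.dump(translated)                     # write to the target
        migrated.append(name)
    return migrated
```

With the objects created earlier, a call might look like migrate_pipelines(factory_store, workspace_store, pipeline_translator.translate, ["<PIPELINE_NAME>"]). Passing the translator as a callable keeps the helper easy to exercise with stand-in stores before pointing it at real systems.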

Download files

Source Distribution

wkmigrate-0.0.1.post1.tar.gz (31.5 kB)

Uploaded Source

Built Distribution

wkmigrate-0.0.1.post1-py3-none-any.whl (46.5 kB)

Uploaded Python 3

File details

Details for the file wkmigrate-0.0.1.post1.tar.gz.

File metadata

  • Download URL: wkmigrate-0.0.1.post1.tar.gz
  • Upload date:
  • Size: 31.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/24.6.0

File hashes

Hashes for wkmigrate-0.0.1.post1.tar.gz
Algorithm Hash digest
SHA256 91e8a29b723a3ad2eeb35082a16d816cff60e283789e83cb4e506d95bd901426
MD5 3b47e90b488ff2f2f7cffe936cb29376
BLAKE2b-256 b9675093a2d9085357bfb634b32aeb1619a6730e32a3383cd7ee16969f8fb400

File details

Details for the file wkmigrate-0.0.1.post1-py3-none-any.whl.

File metadata

  • Download URL: wkmigrate-0.0.1.post1-py3-none-any.whl
  • Upload date:
  • Size: 46.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/24.6.0

File hashes

Hashes for wkmigrate-0.0.1.post1-py3-none-any.whl
Algorithm Hash digest
SHA256 9b8392ff928314c379167674d8348817363575a5a3e2666a28eeb76d351dadb2
MD5 0e0fa0380715ff5c573d5a3c2944042e
BLAKE2b-256 3aae28fd351606249b7753fa96399dcab2c32f60f2ab57edb5f26e8e1052cc04
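A downloaded file can be checked against the SHA256 digests listed above with the standard hashlib module. This is a generic sketch rather than anything specific to wkmigrate; sha256_of_file and matches_sha256 are hypothetical helper names, and the local file path is whatever location you downloaded the file to.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with an expected digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_sha256("wkmigrate-0.0.1.post1.tar.gz", "<SHA256 from the table above>") should return True for an unmodified download of the source distribution.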
