wkmigrate is a Python library for converting orchestration definitions between various systems

Databricks Workflows Migrator (wkmigrate)

Project Description

wkmigrate is a Python library for migrating data pipelines to Databricks workflows from various frameworks. Users can programmatically create or migrate workflows with a simple set of commands.

Pipeline definitions are read from a user-specified source system, translated for compatibility with Databricks workflows, and then either created directly in the target workspace or stored in JSON or YAML files.

Installation

Run pip install wkmigrate to install the package from PyPI.
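After installing, it can be useful to confirm that the package is visible to the interpreter you intend to run. The following is a minimal sketch using only the standard library; the helper name installed_version is an illustration and is not part of wkmigrate.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("wkmigrate")
if version is None:
    print("wkmigrate is not installed in this environment")
else:
    print(f"wkmigrate {version} is installed")
```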

Compatibility

wkmigrate is a standalone project. Using some features (e.g. serverless jobs compute options) may require a premium-tier Databricks workspace.

Using the Workflow Migrator

To use wkmigrate, install the library with %pip install wkmigrate (for example, from a notebook) or install the Python wheel directly in your environment.

Once the library has been installed, create source and target definition stores for the migration.

from wkmigrate.definition_store_builder import build_definition_store

# Create the source definition store (an ADF instance):
factory_options = {
    "tenant_id": "<TENANT_ID>",
    "client_id": "<CLIENT_ID>",
    "client_secret": "<CLIENT_SECRET>",
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group_name": "<RESOURCE_GROUP_NAME>",
    "factory_name": "<FACTORY_NAME>"
}
factory_store = build_definition_store(
    "factory_definition_store", 
    factory_options
)

# Create the target definition store (a Databricks workspace):
workspace_options = {
    "authentication_type": "pat",
    "host_name": "<DATABRICKS_HOST_URL>",
    "pat": "<DATABRICKS_PERSONAL_ACCESS_TOKEN>",
}
workspace_store = build_definition_store(
    "workspace_definition_store", 
    workspace_options
)                        
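The option dictionaries above contain credentials, so you may prefer not to hard-code them. One possible approach, sketched below, is to read the values from environment variables. The environment variable names and the options_from_env helper are assumptions made for this illustration; only the option keys (tenant_id, client_id, and so on) come from the example above.

```python
import os

# Hypothetical environment variable names; choose whatever fits your setup.
FACTORY_ENV_VARS = {
    "tenant_id": "AZURE_TENANT_ID",
    "client_id": "AZURE_CLIENT_ID",
    "client_secret": "AZURE_CLIENT_SECRET",
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "ADF_RESOURCE_GROUP_NAME",
    "factory_name": "ADF_FACTORY_NAME",
}


def options_from_env(mapping, environ=None):
    """Build an options dict from environment variables, failing early if any are missing."""
    environ = os.environ if environ is None else environ
    missing = [var for var in mapping.values() if var not in environ]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(sorted(missing)))
    return {key: environ[var] for key, var in mapping.items()}


# Demonstration with a stand-in environment so the sketch runs anywhere.
demo_environ = {var: f"<{var}>" for var in FACTORY_ENV_VARS.values()}
demo_options = options_from_env(FACTORY_ENV_VARS, demo_environ)
print(sorted(demo_options))  # print only the keys, never the secret values
```

In real use you would call options_from_env(FACTORY_ENV_VARS) without the second argument and pass the result to build_definition_store in place of the literal factory_options dictionary.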

Use the load method to get definitions from a source.

pipeline = factory_store.load(pipeline_name="<PIPELINE_NAME>")                      
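If you need to load several pipelines, the same load call can be wrapped in a loop. The sketch below assumes only that the store exposes load(pipeline_name=...) as shown above; the load_pipelines helper and the FakeStore stand-in are hypothetical, and the broad exception handling is there because this document does not say which exceptions the library raises.

```python
def load_pipelines(store, pipeline_names):
    """Load each named pipeline, collecting results and failures separately."""
    loaded, failed = {}, {}
    for name in pipeline_names:
        try:
            loaded[name] = store.load(pipeline_name=name)
        except Exception as exc:  # broad on purpose: the exception types are unknown
            failed[name] = exc
    return loaded, failed


class FakeStore:
    """Stand-in for a source definition store so the sketch runs without credentials."""

    def load(self, pipeline_name):
        if pipeline_name == "missing":
            raise LookupError(f"No pipeline named {pipeline_name!r}")
        return {"name": pipeline_name}


loaded, failed = load_pipelines(FakeStore(), ["ingest", "missing", "report"])
print(sorted(loaded), sorted(failed))
```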

Use pipeline_translator.translate() to translate definitions for compatibility with Databricks workflows.

from wkmigrate import pipeline_translator
translated_pipeline = pipeline_translator.translate(pipeline)

Use the dump method to sync workflows into a target.

workspace_store.dump(translated_pipeline)
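The three steps above (load, translate, dump) can be combined into a single helper. This is a sketch rather than part of the library: migrate_pipeline is a hypothetical name, and the fake stores and translator exist only so the example runs without Azure or Databricks credentials.

```python
def migrate_pipeline(source_store, target_store, translate, pipeline_name):
    """Load one pipeline from the source, translate it, and dump it to the target."""
    pipeline = source_store.load(pipeline_name=pipeline_name)
    translated = translate(pipeline)
    target_store.dump(translated)
    return translated


class FakeSourceStore:
    """Stand-in for the source definition store."""

    def load(self, pipeline_name):
        return {"name": pipeline_name, "activities": []}


class FakeTargetStore:
    """Stand-in for the target definition store; records what was dumped."""

    def __init__(self):
        self.dumped = []

    def dump(self, definition):
        self.dumped.append(definition)


def fake_translate(pipeline):
    """Stand-in for pipeline_translator.translate."""
    return {**pipeline, "translated": True}


target = FakeTargetStore()
result = migrate_pipeline(FakeSourceStore(), target, fake_translate, "demo_pipeline")
print(result["name"], len(target.dumped))
```

With the real objects from the earlier steps, the call would look like migrate_pipeline(factory_store, workspace_store, pipeline_translator.translate, "<PIPELINE_NAME>").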
