wkmigrate is a Python library for converting orchestration definitions between various systems

Databricks Workflows Migrator (wkmigrate)

Project Description

wkmigrate is a Python library for migrating data pipelines from other orchestration frameworks to Databricks workflows. Users can create or migrate workflows programmatically with a small set of commands.

Pipeline definitions are read from a user-specified source system, translated for compatibility with Databricks workflows, and then either created directly or stored in JSON or YAML files.

Installation

Use pip install wkmigrate to install the package from PyPI.
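
To confirm that the install succeeded, the standard library's importlib.metadata can report the installed version. This is a minimal sketch; the helper name installed_version is illustrative and is not part of wkmigrate.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("wkmigrate")
    print(f"wkmigrate version: {found}" if found else "wkmigrate is not installed")
```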

Compatibility

wkmigrate is a standalone project. Using some features (e.g. serverless jobs compute options) may require a premium-tier Databricks workspace.

Using the Workflow Migrator

To use wkmigrate, install the library with %pip install wkmigrate (in a notebook) or install the Python wheel directly in your environment.

Once the library has been installed, create source and target definition stores for the migration.

from wkmigrate.definition_store_builder import build_definition_store

# Create the source definition store (an Azure Data Factory instance):
factory_options = {
    "tenant_id": "<TENANT_ID>",
    "client_id": "<CLIENT_ID>",
    "client_secret": "<CLIENT_SECRET>",
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group_name": "<RESOURCE_GROUP_NAME>",
    "factory_name": "<FACTORY_NAME>"
}
factory_store = build_definition_store(
    "factory_definition_store",
    factory_options
)

# Create the target definition store (a Databricks workspace):
workspace_options = {
    "authentication_type": "pat",
    "host_name": "<DATABRICKS_HOST_URL>",
    "pat": "<DATABRICKS_PERSONAL_ACCESS_TOKEN>",
}
workspace_store = build_definition_store(
    "workspace_definition_store",
    workspace_options
)
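
Hard-coding secrets in the options dictionaries is easy to get wrong, so one option is to read them from environment variables instead. The sketch below reuses the option keys from the example above; the environment variable names and the options_from_env helper are assumptions made for illustration and are not part of wkmigrate.

```python
import os
from typing import Mapping

# Option keys are the ones used in the factory example above; the
# environment variable names are assumptions chosen for this sketch.
FACTORY_ENV_VARS = {
    "tenant_id": "AZURE_TENANT_ID",
    "client_id": "AZURE_CLIENT_ID",
    "client_secret": "AZURE_CLIENT_SECRET",
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "ADF_RESOURCE_GROUP_NAME",
    "factory_name": "ADF_FACTORY_NAME",
}


def options_from_env(
    env_vars: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Build an options dict from environment variables, failing fast on gaps."""
    # Treat unset and empty variables the same way: both count as missing.
    missing = [var for var in env_vars.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(sorted(missing))}")
    return {option: env[var] for option, var in env_vars.items()}
```

With the variables set, factory_options = options_from_env(FACTORY_ENV_VARS) yields a dictionary that can be passed to build_definition_store as before. The same helper works for the workspace options given a second mapping of option keys to variable names.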

Use the load method to get definitions from a source.

pipeline = factory_store.load(pipeline_name="<PIPELINE_NAME>")

Use pipeline_translator.translate() to translate definitions for compatibility with Databricks workflows.

from wkmigrate import pipeline_translator
translated_pipeline = pipeline_translator.translate(pipeline)
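
Before syncing to a workspace, it can help to write the translated definition to a file and review it. The sketch below uses only the standard library and assumes the translated pipeline is a dictionary-like object with string keys; the return type of translate() is not stated here, so treat that as an assumption. The helper write_definition_for_review is hypothetical and is not wkmigrate's own file-based store.

```python
import json
from pathlib import Path
from typing import Any, Mapping


def write_definition_for_review(definition: Mapping[str, Any], path: str) -> Path:
    """Write a definition to a JSON file and return the path written."""
    target = Path(path)
    # Create the parent directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as handle:
        # default=str keeps the dump from failing on values such as datetimes.
        json.dump(definition, handle, indent=2, sort_keys=True, default=str)
    return target
```

For example, write_definition_for_review(translated_pipeline, "review/pipeline.json") would produce a sorted, indented file that is easy to compare between runs.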

Use the dump method to sync workflows into a target.

workspace_store.dump(translated_pipeline)
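
The load, translate, and dump steps can be combined into a small batch helper when several pipelines need to be migrated. This is a sketch rather than a wkmigrate feature: it assumes only that the source store exposes load(pipeline_name=...) and the target store exposes dump(...), as in the examples above, and the name migrate_pipelines is hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def migrate_pipelines(
    source_store: Any,
    target_store: Any,
    translate: Callable[[Any], Any],
    pipeline_names: Iterable[str],
) -> Tuple[List[str], Dict[str, str]]:
    """Load, translate, and dump each pipeline; return (migrated, failures)."""
    migrated: List[str] = []
    failures: Dict[str, str] = {}
    for name in pipeline_names:
        try:
            pipeline = source_store.load(pipeline_name=name)
            target_store.dump(translate(pipeline))
        except Exception as error:
            # Record the failure and continue so one bad pipeline
            # does not stop the rest of the batch.
            failures[name] = f"{type(error).__name__}: {error}"
        else:
            migrated.append(name)
    return migrated, failures
```

With the stores created earlier, a call might look like migrate_pipelines(factory_store, workspace_store, pipeline_translator.translate, ["<PIPELINE_NAME>"]). Passing translate as an argument keeps the helper independent of the library, which also makes it easy to exercise with stand-in stores.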
