wkmigrate is a Python library for converting orchestration definitions between various systems

Databricks Workflows Migrator (wkmigrate)

Project Description

wkmigrate is a Python library for migrating data pipelines from various frameworks to Databricks workflows. Users can create or migrate workflows programmatically with a small set of commands.

Pipeline definitions are read from a user-specified source system, translated for compatibility with Databricks workflows, and then either created directly or stored in JSON or YAML files.

Installation

Run pip install wkmigrate to install the package from PyPI.
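
To confirm that the package landed in the active environment, the installed version can be looked up with the standard library. This is a minimal sketch: the helper name installed_version is illustrative and is not part of wkmigrate.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("wkmigrate")
    if found is None:
        print("wkmigrate is not installed in this environment")
    else:
        print(f"wkmigrate version: {found}")
```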

Compatibility

wkmigrate is a standalone project. Using some features (e.g. serverless jobs compute options) may require a premium-tier Databricks workspace.

Using the Workflow Migrator

To use wkmigrate, install the library with %pip install wkmigrate in a notebook, or install the Python wheel directly in your environment.

Once the library has been installed, create source and target definition stores for the migration.

from wkmigrate.definition_store_builder import build_definition_store

# Create the source definition store (an Azure Data Factory instance):
factory_options = {
    "tenant_id": "<TENANT_ID>",
    "client_id": "<CLIENT_ID>",
    "client_secret": "<CLIENT_SECRET>",
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group_name": "<RESOURCE_GROUP_NAME>",
    "factory_name": "<FACTORY_NAME>"
}
factory_store = build_definition_store(
    "factory_definition_store", 
    factory_options
)

# Create the target definition store (a Databricks workspace):
workspace_options = {
    "authentication_type": "pat",
    "host_name": "<DATABRICKS_HOST_URL>",
    "pat": "<DATABRICKS_PERSONAL_ACCESS_TOKEN>",
}
workspace_store = build_definition_store(
    "workspace_definition_store", 
    workspace_options
)                        
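
The placeholders above stand in for real credentials, which are usually better kept out of source code. The following sketch reads them from environment variables instead. The option keys are the ones shown in the factory example; the environment variable names and the options_from_env helper are assumptions made for illustration and are not defined by wkmigrate.

```python
import os
from typing import Dict, Mapping

# Option keys come from the factory example above. The environment variable
# names on the right are hypothetical and can be renamed freely.
FACTORY_ENV_VARS = {
    "tenant_id": "AZURE_TENANT_ID",
    "client_id": "AZURE_CLIENT_ID",
    "client_secret": "AZURE_CLIENT_SECRET",
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "ADF_RESOURCE_GROUP_NAME",
    "factory_name": "ADF_FACTORY_NAME",
}


def options_from_env(
    key_to_env_var: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Build an options dict from environment variables, failing fast on gaps."""
    missing = [var for var in key_to_env_var.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(sorted(missing))}")
    return {key: env[var] for key, var in key_to_env_var.items()}
```

The resulting dictionary, for example factory_options = options_from_env(FACTORY_ENV_VARS), can then be passed to build_definition_store in place of the hardcoded one.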

Use the load method to get definitions from a source.

pipeline = factory_store.load(pipeline_name="<PIPELINE_NAME>")                      

Use pipeline_translator.translate() to translate definitions for compatibility with Databricks workflows.

from wkmigrate import pipeline_translator
translated_pipeline = pipeline_translator.translate(pipeline)

Use the dump method to sync workflows into a target.

workspace_store.dump(translated_pipeline)
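
The load, translate, and dump steps above can be combined into one function. This is a sketch rather than part of the library: migrate_pipeline is a hypothetical helper, and it relies only on the load(pipeline_name=...) and dump(...) calls shown above. The stores and the translate function are passed in as arguments, so the helper does not depend on how they were built.

```python
from typing import Any, Callable


def migrate_pipeline(
    source_store: Any,
    target_store: Any,
    translate: Callable[[Any], Any],
    pipeline_name: str,
) -> Any:
    """Load one pipeline from the source, translate it, and dump it into the target."""
    # Step 1: read the definition from the source system.
    pipeline = source_store.load(pipeline_name=pipeline_name)
    # Step 2: convert it into a definition the target understands.
    translated = translate(pipeline)
    # Step 3: write the translated definition to the target.
    target_store.dump(translated)
    return translated
```

With the objects created earlier, the call would look like migrate_pipeline(factory_store, workspace_store, pipeline_translator.translate, "<PIPELINE_NAME>").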

Download files

Source Distribution

wkmigrate-0.0.2.tar.gz (41.9 kB)

Uploaded Source

Built Distribution

wkmigrate-0.0.2-py3-none-any.whl (48.9 kB)

Uploaded Python 3

File details

Details for the file wkmigrate-0.0.2.tar.gz.

File metadata

  • Download URL: wkmigrate-0.0.2.tar.gz
  • Upload date:
  • Size: 41.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/24.6.0

File hashes

Hashes for wkmigrate-0.0.2.tar.gz
Algorithm Hash digest
SHA256 e1431be245c752aa4bc7c8e7316eec4af8d647d37170d4d1cd30d004e15dd76c
MD5 db7d81349dc54336c36daad989aedd6c
BLAKE2b-256 0b14deaf0681adfa29c930a8de2e0a0d0c7b50cefd7bae2bee7a6ebe01f24ba2
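
A downloaded file can be checked against the SHA256 digest listed above using Python's hashlib. This is a minimal sketch; the function names and the local file path in the usage note are illustrative assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, the check would be matches_published_hash(Path("wkmigrate-0.0.2.tar.gz"), "e1431be245c752aa4bc7c8e7316eec4af8d647d37170d4d1cd30d004e15dd76c"), assuming the archive sits in the current directory.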

File details

Details for the file wkmigrate-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: wkmigrate-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 48.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/24.6.0

File hashes

Hashes for wkmigrate-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 0a3063439662908de7f546c3ad09730cb6868f63425a63472885628ec983805f
MD5 bc8e5ed168c338e66ea53f43af1e0a6a
BLAKE2b-256 3b3f384f22d733c7403d793750d26fffe081ab980dcabc36bccf4fe70994cef2
