
Modern Data Integration Tool

A multi-headed data bridging package that uses pipeline manifests to integrate any modern (or legacy) data stack tools

Setup

Quick Install

python -m pip install mdit

Build from source

Clone the repository

git clone https://github.com/Broomva/mdit.git

Install the package

cd mdit && make install

Build manually

After cloning, create a virtual environment

conda create -n mdit python=3.10
conda activate mdit

Install the requirements

pip install -r requirements.txt

Run the Python installation

python setup.py install

Usage

The deployment requires a .env file created in the project's local folder:

touch .env

It should have a schema like this:

databricks_experiment_name=''
databricks_experiment_id=''
databricks_host=''
databricks_token=''
databricks_username=''
databricks_password=''
databricks_cluster_id=''
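
As a quick sanity check before running anything, here is a minimal sketch that loads the file and verifies the required keys are set (it assumes python-dotenv, which is not a stated dependency of mdit, and that mdit reads these values as ordinary environment variables):

import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

# Fail fast if a required value is missing or empty
required = ["databricks_host", "databricks_token", "databricks_cluster_id"]
missing = [key for key in required if not os.getenv(key)]
if missing:
    raise RuntimeError(f"Missing .env values: {missing}")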
With the environment in place, import the session helpers and connect:

from mdit import DatabricksMLFlowSession, DatabricksSparkSession

# Create a Databricks Spark session
spark = DatabricksSparkSession().get_session()

# Connect to the MLflow artifact server
mlflow_session = DatabricksMLFlowSession().get_session()
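
The returned objects can then be used directly. Below is a minimal sketch, assuming get_session() hands back a standard pyspark.sql.SparkSession and an mlflow-style tracking client (neither is confirmed above):

# Run a trivial query on the configured cluster (assumes a pyspark-compatible session)
df = spark.sql("SELECT 1 AS ok")
df.show()

# Record a run against the configured experiment (assumes the usual mlflow API)
with mlflow_session.start_run():
    mlflow_session.log_param("ok_rows", df.count())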
