TransPyData

A minimal framework for managing migrations

Overview

TransPyData implements a generic pipeline for performing migrations. It has two main components. The first is the TransPy class, which executes the migration pipeline according to a configuration. The second is the set of data service implementations (IDataInput, IDataProcess and IDataOutput), which manage how data is gathered, processed and sent to the new destination.

TransPy

The TransPy class manages the migration pipeline. It must be provided with an instance of each of the following:

IDataInput: Gathers the source data.
IDataProcess: Transforms and filters the data before passing it to the data output.
IDataOutput: Sends the data to the new destination.

NOTE: The data services are described in the "Data services" section below.

Apart from the data services, there are other optional configuration settings:

trans_py = TransPy()

config = {
  'datainput_source': [], # If working with single record pipeline this should be an iterable of data to feed IDataInput
  'datainput_by_one': False, # Enable single record pipeline on input
  'dataprocess_by_one': False, # Enable single record pipeline on processing
  'dataoutput_by_one': False, # Enable single record pipeline on output
}
trans_py.configure(config)

The values in the snippet are the defaults, so by default the migration moves all data through the pipeline at once.
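As an illustration, a configuration that switches the whole pipeline to single record mode might look like the following. It uses only the keys shown in the snippet above. The list of IDs is made-up sample data, and what each item in `datainput_source` means depends on the IDataInput implementation in use.

```python
# Hypothetical sample data: one item per record to fetch.
record_ids = [101, 102, 103]

# Same keys as the defaults above, with every stage set to work
# on a single record at a time.
single_record_config = {
    'datainput_source': record_ids,
    'datainput_by_one': True,
    'dataprocess_by_one': True,
    'dataoutput_by_one': True,
}
```

The resulting dictionary would then be passed to trans_py.configure(), as in the snippet above.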

All processing mode

When all data services have their "by_one" flag set to False, the migration moves all data through the pipeline at once. The TransPy instance calls the get_all method of the configured IDataInput to fetch all input data, passes the result to process_all of IDataProcess, and passes that result to send_all of IDataOutput. Finally, TransPy returns a list with the IDataOutput results.
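The flow described above can be sketched in plain Python. This is not the TransPy source code: the three classes are hypothetical stand-ins that only mimic the get_all, process_all and send_all methods named in the text, and run_all_at_once is an illustrative helper.

```python
class ListInput:
    """Hypothetical stand-in for an IDataInput implementation."""

    def __init__(self, rows):
        self._rows = rows

    def get_all(self):
        # Return every source record in one call.
        return list(self._rows)


class UpperCaseProcess:
    """Hypothetical stand-in for an IDataProcess implementation."""

    def process_all(self, rows):
        # Transform the whole batch at once.
        return [row.upper() for row in rows]


class CollectingOutput:
    """Hypothetical stand-in for an IDataOutput implementation."""

    def send_all(self, rows):
        # Pretend to send the batch and return one result per record.
        return [{'sent': row} for row in rows]


def run_all_at_once(datainput, dataprocess, dataoutput):
    # Each stage is called exactly once with the full data set.
    raw = datainput.get_all()
    processed = dataprocess.process_all(raw)
    return dataoutput.send_all(processed)


results = run_all_at_once(ListInput(['a', 'b']), UpperCaseProcess(), CollectingOutput())
print(results)  # [{'sent': 'A'}, {'sent': 'B'}]
```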

Single record mode

If all the "by_one" flags are True, records are queried one at a time and each one is moved through the whole pipeline. The IDataOutput results are accumulated and returned as a list at the end of processing, so the TransPy return type is the same as in all processing mode.
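A minimal sketch of this record-by-record flow follows. It is an illustration, not the library's implementation: the get_one, process_one and send_one callables are assumed names (the text above only names the "all" variants), and fake_db is made-up sample data.

```python
def run_one_by_one(source, get_one, process_one, send_one):
    """Move each record through every stage before fetching the next."""
    results = []
    for item in source:
        record = get_one(item)            # fetch a single record
        processed = process_one(record)   # transform that record
        results.append(send_one(processed))  # send it and keep the result
    return results


# Hypothetical sample data keyed by record ID.
fake_db = {1: 'ana', 2: 'bob'}

results = run_one_by_one(
    source=[1, 2],
    get_one=fake_db.get,
    process_one=str.title,
    send_one=lambda record: {'sent': record},
)
print(results)  # [{'sent': 'Ana'}, {'sent': 'Bob'}]
```

The return value is still a list with one output result per record, which matches the return type of all processing mode.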

The modes can also be mixed. If datainput and dataprocess are in "by_one" mode but dataoutput is not, the data is gathered and processed one record at a time, the IDataProcess results are accumulated, and IDataOutput is called once with all the data. Similarly, if dataprocess and dataoutput are in "by_one" mode but datainput is not, the data is gathered all at once and then piped one record at a time through IDataProcess and IDataOutput.
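The two mixed cases described above can be sketched as follows. This is an illustration under assumptions, not TransPy's code: both helper functions are hypothetical, and the "one" method names are guesses modelled on get_all, process_all and send_all.

```python
def input_and_process_by_one(source, get_one, process_one, send_all):
    # Gather and process record by record, accumulate the processed
    # records, then call the output once with the whole batch.
    processed = [process_one(get_one(item)) for item in source]
    return send_all(processed)


def process_and_output_by_one(get_all, process_one, send_one):
    # Gather everything at once, then pipe each record through
    # processing and output individually.
    return [send_one(process_one(record)) for record in get_all()]


# Hypothetical sample data and stages.
rows = {1: 'x', 2: 'yy'}
batches = []  # records every batch the output receives


def send_all(records):
    batches.append(list(records))
    return ['ok'] * len(records)


first = input_and_process_by_one([1, 2], rows.get, str.upper, send_all)
second = process_and_output_by_one(lambda: ['x', 'yy'], str.upper, lambda r: 'ok:' + r)

print(first)    # ['ok', 'ok']
print(batches)  # [['X', 'YY']] (the output was called a single time)
print(second)   # ['ok:X', 'ok:YY']
```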

Data services

under construction

Getting started

To start a migration, create an instance of TransPy and configure it. At a minimum, instances of IDataInput, IDataProcess and IDataOutput need to be provided. The data services might also need to be configured before the migration starts. Here is a code example:

import json

from transpydata import TransPy
from transpydata.config.datainput.MysqlDataInput import MysqlDataInput
from transpydata.config.dataprocess.NoneDataProcess import NoneDataProcess
from transpydata.config.dataoutput.RequestDataOutput import RequestDataOutput


def main():
    # Configure input
    mysql_input = MysqlDataInput()

    config = {
        'db_config': {
            'user': 'root',
            'password': 'TryingTh1ngs',
            'host': 'localhost',
            'port': '3306',
            'database': 'migration'
        },
        'get_one_query': None, # We'll go with all query
        'get_all_query': """
            SELECT s.staff_Id, s.staff_name, s.staff_grade, m.module_Id, m.module_name
            FROM staff s
            LEFT JOIN teaches t ON s.staff_Id = t.staff_Id
            LEFT JOIN module m ON t.module_Id = m.module_Id
        """,
        'all_query_params': {} # No where clause, no interpolation
    }
    mysql_input.configure(config)

    # Configure process
    none_process = NoneDataProcess()

    # Configure output
    request_output = RequestDataOutput()
    request_output.configure({
        'url': 'http://localhost:8008',
        'req_verb': 'POST',
        'headers': {
            'content-type': 'application/json',
            'accept': 'application/json',
            'x-app-id': 'MT1'
        },
        'encode_json': True,
        'json_response': True
    })

    # Configure TransPy
    trans_py = TransPy()
    trans_py.datainput = mysql_input
    trans_py.dataprocess = none_process
    trans_py.dataoutput = request_output

    res = trans_py.run()
    print(json.dumps(res))

if __name__ == '__main__':
    main()

A full working example can be found at examples/mysql_to_http/, which includes a docker-compose file to launch a MySQL instance and a web server.

Custom data services

For now, check the IDataInput, IDataProcess and IDataOutput interfaces to see what a custom data service needs to implement.

(I'll improve this section in the future)
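Until then, here is a rough sketch of what a custom input service could look like. It is deliberately standalone and does not subclass the real IDataInput, whose exact abstract methods should be checked in the package source. The configure and get_all names come from the examples above, get_one is an assumption, and the 'csv_text' config key is invented for this illustration.

```python
import csv
import io


class CsvTextDataInput:
    """Hypothetical input service that reads records from CSV text."""

    def __init__(self):
        self._rows = []

    def configure(self, config):
        # 'csv_text' is an invented key holding the raw CSV content.
        reader = csv.DictReader(io.StringIO(config['csv_text']))
        self._rows = list(reader)

    def get_all(self):
        # All processing mode: return every record at once.
        return list(self._rows)

    def get_one(self, index):
        # Single record mode (assumed signature): return the record
        # at the given position.
        return self._rows[index]


csv_input = CsvTextDataInput()
csv_input.configure({'csv_text': 'staff_Id,staff_name\n1,Ana\n2,Bob\n'})
print(csv_input.get_all())
print(csv_input.get_one(1))
```

A real implementation would inherit from the IDataInput interface and implement whatever methods it declares.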

Project details

Release history

0.4.2: May 30, 2021
0.4.1: May 30, 2021 (this version)
0.4.0: May 30, 2021
0.3.1: Dec 27, 2020
0.3.0: Dec 19, 2020
0.2.1: Dec 19, 2020
0.2.0: Dec 9, 2020
0.1.0: Dec 9, 2020

Download files

Source Distribution

transpydata-0.4.1.tar.gz (18.6 kB)

Uploaded May 30, 2021 Source

Built Distribution

transpydata-0.4.1-py3-none-any.whl (23.6 kB)

Uploaded May 30, 2021 Python 3

File details

Details for the file transpydata-0.4.1.tar.gz.

File metadata

Download URL: transpydata-0.4.1.tar.gz
Upload date: May 30, 2021
Size: 18.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/57.0.0 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.9.5

File hashes

Hashes for transpydata-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`9e1d89b7e3baa8abb519d5de216ac514ea65f8e8790d5bb7c2b462f126a4b939`
MD5	`327fc22bce817962b4a6f4e19d307ba3`
BLAKE2b-256	`9595ccd8e1071274210e38dea8e03f2125aecefd2217de6ca9b4ff7e27871c2c`

File details

Details for the file transpydata-0.4.1-py3-none-any.whl.

File metadata

Download URL: transpydata-0.4.1-py3-none-any.whl
Upload date: May 30, 2021
Size: 23.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/57.0.0 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.9.5

File hashes

Hashes for transpydata-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1b49c4b7a69729b54ad351583424b19c98fc1603bc958e4d47661b0334b13f14`
MD5	`2b12e23de29804b05767f6884575da89`
BLAKE2b-256	`eae88f09d89f41bcd52b375a874f89769660e799246d453ebe64c9889f3abfce`
