An engine for running component-based ML pipelines

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: Apache Software License
Operating System
- MacOS
- POSIX
- Unix
Programming Language
- Python :: 2
- Python :: 3

Project description

README

The mlcomp module is designed to process and execute complex 'MCenter' pipelines. A pipeline consists of one or more components chained together, such that the output of one component becomes the input to the next. Each pipeline has a particular purpose, such as training a model or generating inferences.

A single pipeline may include components written in different languages, such as Python, R and Java.
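
The chaining idea can be illustrated with a small sketch that does not depend on mlcomp. Plain functions stand in for components here, and the names source, upper and run_chain are hypothetical, chosen only for this illustration:

```python
# Illustrative only: plain functions stand in for pipeline components.
def source(inputs):
    # A source has no parents, so it ignores its input.
    return ["hello world"]


def upper(inputs):
    # Receives the list returned by the previous component.
    return [s.upper() for s in inputs]


def run_chain(components):
    """Feed each component's output list to the next one, in order."""
    data = None
    for component in components:
        data = component(data)
    return data


result = run_chain([source, upper])
print(result)  # ['HELLO WORLD']
```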

How to construct a component

Steps

Create a folder whose name corresponds to the component's name (e.g. string_source)

Create a component.json file (JSON format) inside this folder and make sure to fill in all the following fields:

  {
      "engineType": "Python",
      "language": "Python",
      "userStandalone": false,
      "name": "<Component name (e.g. string_source)>",
      "label": "<A label that is displayed in the UI>",
      "version": "<Component's version (e.g. 1.0.0)>",
      "group": "<One of the valid groups (e.g. Connectors)>",
      "program": "<The Python component main script (e.g. string_source.py)>",
      "componentClass": "<The component class name (e.g. StringSource)>",
      "useMLStats": <true|false - whether the component uses mlstats>,
      "inputInfo": [
          {
           "description": "<Description>",
           "label": "<Label name>",
           "defaultComponent": "",
           "type": "<A type used to verify matching connected legs>",
           "group": "<data|model|prediction|statistics|other>"
          },
          {...}
      ],
      "outputInfo": [
          <Same as inputInfo above>
      ],
      "arguments": [
          {
              "key": "<Unique argument key name>",
              "type": "int|long|float|str|bool",
              "label": "<A label that is displayed in the UI>",
              "description": "<Description>",
              "optional": <true|false>
          }
      ]
  }
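
Before handing a component to the engine, it can help to check that its description parses as JSON and contains every top-level field from the template above. The following is a minimal sketch: check_component_description is a hypothetical helper, not part of mlcomp, and the field values in the example (including the "str" leg type) are illustrative assumptions.

```python
import json

# Top-level keys taken from the component.json template above.
REQUIRED_KEYS = [
    "engineType", "language", "userStandalone", "name", "label",
    "version", "group", "program", "componentClass", "useMLStats",
    "inputInfo", "outputInfo", "arguments",
]


def check_component_description(text):
    """Parse a component.json string and return the list of missing keys."""
    description = json.loads(text)
    return [key for key in REQUIRED_KEYS if key not in description]


# Illustrative values only; they are assumptions, not a shipped component.
example = json.dumps({
    "engineType": "Python",
    "language": "Python",
    "userStandalone": False,
    "name": "string_source",
    "label": "String source",
    "version": "1.0.0",
    "group": "Connectors",
    "program": "string_source.py",
    "componentClass": "StringSource",
    "useMLStats": True,
    "inputInfo": [],
    "outputInfo": [
        {
            "description": "A string value",
            "label": "string",
            "defaultComponent": "",
            "type": "str",
            "group": "data",
        }
    ],
    "arguments": [
        {
            "key": "value",
            "type": "str",
            "label": "Value",
            "description": "The string to emit",
            "optional": True,
        }
    ],
})

missing = check_component_description(example)
print(missing)  # []
```

A malformed file (for example, one with an unterminated string) makes json.loads raise an error, which points at the offending position.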

Create the main component script, which contains the component's class. This class should inherit from the ConnectableComponent base class, imported from parallelm.components, and must implement the _materialize function with this prototype: def _materialize(self, parent_data_objs, user_data). Here is a complete, self-contained example:
```
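from parallelm.components import ConnectableComponent
from parallelm.mlops import mlops


class StringSource(ConnectableComponent):
    def __init__(self, engine):
        # Name the class explicitly: super(self.__class__, ...) recurses
        # forever if this class is ever subclassed.
        super(StringSource, self).__init__(engine)

    def _materialize(self, parent_data_objs, user_data):
        self._logger.info("Inside string source component")
        # Read the 'value' pipeline argument, falling back to a default.
        str_value = self._params.get('value', "default-string-value")

        # Report statistics through mlops.
        mlops.set_stat("Specific stat title", 1.0)
        mlops.set_stat("Specific stat title", 2.0)

        # The returned list becomes the input of the next component.
        return [str_value]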
```
Notes:
- A component can use the self._logger object to write log messages.
- A component can access pipeline parameters via the self._params dictionary.
- The _materialize function should return a list of objects, or None if it produces no output. The returned value is used as the input to the next component in the pipeline chain.

Place the component's main program (*.py) inside the component folder, along with its JSON description file and any other desired files.
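
The pipeline example later in this README refers to a string-sink component with an expected-value argument. A minimal sketch of what such a sink might look like follows. Its behavior (comparing the received string with expected-value) is an assumption inferred from the argument name, and the fallback base class is a stand-in that mimics only the attributes used in this README, so that the sketch can be run without the package installed.

```python
try:
    from parallelm.components import ConnectableComponent
except ImportError:
    # Stand-in base class (an assumption, not the real implementation).
    import logging

    class ConnectableComponent(object):
        def __init__(self, engine):
            self._engine = engine
            self._logger = logging.getLogger(self.__class__.__name__)
            self._params = {}


class StringSink(ConnectableComponent):
    def __init__(self, engine):
        super(StringSink, self).__init__(engine)

    def _materialize(self, parent_data_objs, user_data):
        # parent_data_objs is the list returned by the parent component.
        received = parent_data_objs[0]
        expected = self._params.get("expected-value")
        if expected is not None and received != expected:
            raise ValueError(
                "Expected {!r} but received {!r}".format(expected, received))
        self._logger.info("Sink received: %s", received)
        # A sink has no output for downstream components.
        return None
```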

How to construct a pipeline

Steps

Open any text editor and copy the following template:

  {
      "name": "Simple MCenter runner test",
      "engineType": "Python",
      "pipe": [
          {
              "name": "Source String",
              "id": 1,
              "type": "string-source",
              "parents": [],
              "arguments": {
                  "value": "Hello World: testing string source and sink"
              }
          },
          {
              "name": "Sink String",
              "id": 2,
              "type": "string-sink",
              "parents": [{"parent": 1, "output": 0}],
              "arguments": {
                  "expected-value": "Hello World: testing string source and sink"
              }
          }
      ]
  }

Notes:
- It is assumed that you have already constructed two components named string-source and string-sink.
- The output of the string-source component (the value returned from its _materialize function) becomes the input of the string-sink component (an input to its _materialize function).

Save the file under any desired name (the examples below use p1.json).
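
A pipeline file can be sanity-checked before running it. The sketch below verifies that component ids are unique and that every "parent" reference points at an existing id. check_pipeline is a hypothetical helper written for this illustration, not an mlcomp function, and the argument values are shortened for brevity.

```python
import json


def check_pipeline(pipeline):
    """Return a list of problems found in a pipeline description dict."""
    problems = []
    ids = [step["id"] for step in pipeline["pipe"]]
    if len(ids) != len(set(ids)):
        problems.append("duplicate component ids")
    known = set(ids)
    for step in pipeline["pipe"]:
        for link in step["parents"]:
            if link["parent"] not in known:
                problems.append(
                    "step {} refers to unknown parent {}".format(
                        step["id"], link["parent"]))
    return problems


pipeline = json.loads("""
{
    "name": "Simple MCenter runner test",
    "engineType": "Python",
    "pipe": [
        {"name": "Source String", "id": 1, "type": "string-source",
         "parents": [], "arguments": {"value": "Hello"}},
        {"name": "Sink String", "id": 2, "type": "string-sink",
         "parents": [{"parent": 1, "output": 0}],
         "arguments": {"expected-value": "Hello"}}
    ]
}
""")

print(check_pipeline(pipeline))  # []
```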

How to test

Once the ml-comp Python package is installed, the mlpiper command-line tool is available and can be used to execute the pipeline above and the components described in it.

There are three main commands:

- deploy - deploys a pipeline, along with the provided components, into a given folder. Once deployed, it can also be executed directly from that folder.
- run - deploys and executes the pipeline in one step.
- run-deployment - executes an already deployed pipeline.

Examples:

Prepare a deployment. The resulting directory can be copied to a Docker container and run there:
```
mlpiper -r ~/dev/components deploy -p p1.json -d /tmp/pp
```

Deploy and run. Useful for development and debugging:

mlpiper -r ~/dev/components run -p p1.json -d /tmp/pp

Run a deployment. Usually called non-interactively by another script:
```
mlpiper run-deployment --deployment-dir /tmp/pp
```
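
When mlpiper is driven from another script, the command lines above can be assembled programmatically. The sketch below builds argument lists using only the options that appear in the examples above. mlpiper_command is a hypothetical helper, and nothing is executed unless the final commented-out line is enabled.

```python
import os


def mlpiper_command(action, components_root=None, pipeline=None,
                    deployment_dir=None):
    """Build an argv list using only the options shown in the examples."""
    if action in ("deploy", "run"):
        # Expand "~" here, since no shell is involved to do it for us.
        root = os.path.expanduser(components_root)
        return ["mlpiper", "-r", root, action,
                "-p", pipeline, "-d", deployment_dir]
    if action == "run-deployment":
        return ["mlpiper", "run-deployment",
                "--deployment-dir", deployment_dir]
    raise ValueError("unknown action: {}".format(action))


deploy_cmd = mlpiper_command("deploy", "~/dev/components", "p1.json", "/tmp/pp")
run_cmd = mlpiper_command("run-deployment", deployment_dir="/tmp/pp")
print(" ".join(deploy_cmd))
print(" ".join(run_cmd))

# To actually execute (requires mlpiper on PATH):
# import subprocess; subprocess.check_call(run_cmd)
```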

Release history

- 1.3.2: Nov 26, 2019
- 1.3.1: Jun 21, 2019
- 1.2.2: May 1, 2019
- 1.2.1: Apr 23, 2019
- 1.2.0: Apr 3, 2019
- 1.1.6: Mar 26, 2019
- 1.1.5: Mar 23, 2019
- 1.1.4: Feb 19, 2019
- 1.1.3: Feb 17, 2019
- 1.1.2: Feb 14, 2019
- 1.1.1: Feb 14, 2019 (this version)
- 1.1.0: Feb 14, 2019
- 1.0.2: Feb 14, 2019

Download files

Source Distribution

ml-comp-1.1.1.tar.gz (46.7 kB)

Uploaded Feb 14, 2019 Source

Built Distributions

ml_comp-1.1.1-py3-none-any.whl (78.9 kB)

Uploaded Feb 14, 2019 Python 3

ml_comp-1.1.1-py2-none-any.whl (78.9 kB)

Uploaded Feb 14, 2019 Python 2

File details

Details for the file ml-comp-1.1.1.tar.gz.

File metadata

Download URL: ml-comp-1.1.1.tar.gz
Upload date: Feb 14, 2019
Size: 46.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.18.4 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.5

File hashes

Hashes for ml-comp-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`a32c491b204438214f24db800b7ae42a35e1668b1786ff294a08d02f0307ef33`
MD5	`d88e75665836bb8f7023b2f2eacd7df8`
BLAKE2b-256	`63ba5a54e9b2b4bdbf6915bf78116a4077773d4958243ed9b5090cc503f45c50`


File details

Details for the file ml_comp-1.1.1-py3-none-any.whl.

File metadata

Download URL: ml_comp-1.1.1-py3-none-any.whl
Upload date: Feb 14, 2019
Size: 78.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.18.4 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.5

File hashes

Hashes for ml_comp-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`87eef6b3560872da71a97ef29d3c398bd65374badaebfb39f8ebb607b864fe2b`
MD5	`266da77c874d2b5efb7312d8f6382929`
BLAKE2b-256	`c988c8798c1bcf2c613645dab0113093f136679fe70955619faf958a42a8ada5`


File details

Details for the file ml_comp-1.1.1-py2-none-any.whl.

File metadata

Download URL: ml_comp-1.1.1-py2-none-any.whl
Upload date: Feb 14, 2019
Size: 78.9 kB
Tags: Python 2
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.18.4 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.5

File hashes

Hashes for ml_comp-1.1.1-py2-none-any.whl
Algorithm	Hash digest
SHA256	`a4e52a916aef2bf68bb5d913658bd5058757cb35a5f9649b0fb5e2f590a9ed1b`
MD5	`7f9c598a8c5f6884f4183c70f146a6d3`
BLAKE2b-256	`baab67309457808cd077ea745737e58593ecf96877238acb721325a9a541e00e`
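
The SHA256 digests listed above can be compared against a downloaded file. The following is a minimal sketch using Python's standard hashlib module. The helper names sha256_of_file and matches are hypothetical, and the commented usage assumes the archive has been downloaded to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.lower()


# Example (assumes the file exists locally):
# matches("ml-comp-1.1.1.tar.gz",
#         "a32c491b204438214f24db800b7ae42a35e1668b1786ff294a08d02f0307ef33")
```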


ml-comp 1.1.1