
A package to design and run sequential ML pipelines


MLForge
=======

|build-status| |coverage| |wheel| |documentation|

MLForge is a small package for writing simple pipelines of calls (to methods, classes, …). The documentation is available at `ReadTheDocs <https://mlforge.readthedocs.io/en/latest/>`__.

It arose from the need to run several steps in a row and to add or remove steps in the pipeline easily.

This is a Work in Progress.

Installation
------------

To use MLForge, first install it using pip:

.. code:: bash

   (.venv) $ pip install mlforge

Basic Usage
-----------

MLForge executes a pipeline of tasks. You define the tasks in a configuration file or in your code, and the pipeline runs them in the order in which they are defined.

A Pipeline is normally created with a host object: an object that may contain some of the methods called in the pipeline, but whose main role is to store the results of those calls. If you don’t provide a host object, the pipeline stores the results in an internal dictionary, from which you can retrieve them with ``get_attribute``.
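
The storage rule described above can be pictured with a small stand-in. This sketch does not use MLForge: ``ResultStore``, ``store`` and ``Experiment`` are hypothetical names chosen for illustration, and the real ``Pipeline`` may differ in its details.

```python
class ResultStore:
    """Hypothetical stand-in: keeps results on a host object if one is
    given, otherwise in an internal dictionary."""

    def __init__(self, host=None):
        self.host = host
        self._attributes = {}  # only used when there is no host object

    def store(self, name, value):
        if self.host is not None:
            setattr(self.host, name, value)
        else:
            self._attributes[name] = value

    def get_attribute(self, name):
        if self.host is not None:
            return getattr(self.host, name)
        return self._attributes[name]


class Experiment:
    """A plain object playing the role of the host."""


host = Experiment()
with_host = ResultStore(host)
with_host.store("accuracy", 0.9)      # lands on the host object

without_host = ResultStore()
without_host.store("accuracy", 0.8)   # lands in the internal dictionary
```

With a host, the result is also reachable as a regular attribute (``host.accuracy``); without one, ``get_attribute`` is the only way to read it back.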

.. code:: python

   from mlforge import Pipeline

   # ClassName stands for any class of yours that defines method3 and method4.
   my_stages = [
       ('method1'),
       ('method2', {'param1': 'value2'}),
       ('method3', ClassName, {'param1': 'value1'}),
       ('new_attribute', 'method4', ClassName, {'param1': 'value1'}),
   ]
   pipeline = Pipeline().from_list(my_stages)
   pipeline.run()

This pipeline will execute the following tasks:

  1. Call the method ``method1``, which will be located in the host object or in globals.
  2. Call the method ``method2``, which will be located in the host object or in globals, passing the parameter ``param1`` with the value ``value2``.
  3. Call the method ``method3`` of the class ``ClassName``, passing the parameter ``param1`` with the value ``value1``.
  4. Call the method ``method4`` of the class ``ClassName``, passing the parameter ``param1`` with the value ``value1``, and store the result in a new attribute ``new_attribute``. To access the attribute you can use the method ``pipeline.get_attribute('new_attribute')``.
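
Items 1 and 2 say that a method name is looked up "in the host object or in globals". A minimal sketch of such a lookup follows. It is not taken from MLForge: ``resolve_callable`` is a hypothetical helper, and the order (host first, then a namespace dictionary) is an assumption.

```python
def resolve_callable(name, host=None, namespace=None):
    """Hypothetical lookup: try the host object first, then a namespace dict."""
    if host is not None and callable(getattr(host, name, None)):
        return getattr(host, name)
    # Fall back to the module globals unless a namespace is passed explicitly.
    namespace = globals() if namespace is None else namespace
    candidate = namespace.get(name)
    if callable(candidate):
        return candidate
    raise AttributeError(f"No callable named {name!r} was found")


class Host:
    def method1(self):
        return "from host"


def method2(param1):
    return f"from globals with {param1}"


host = Host()
# method1 is found on the host object.
first = resolve_callable("method1", host)()
# method2 is not a host method, so the namespace is searched instead.
second = resolve_callable("method2", host, {"method2": method2})(param1="value2")
```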

If you prefer to specify the stages in a separate YAML configuration file, you can then use MLForge as follows:

.. code:: python

   from mlforge import Pipeline

   pipeline = Pipeline().from_config('path/to/config.yaml')
   pipeline.run()

The configuration file is a YAML file that defines the tasks to be executed. The following is an example of a YAML configuration file:

.. code:: yaml

   step1:
     method: method
     class: SampleClass
   step2:
     attribute: object
     class: SampleClass
   step3:
     attribute: result1
     method: method
     class: SampleClass
     arguments:
       param2: there!

For each stage of the pipeline (specified in order), you can define the method to be executed, the class that contains the method, the arguments to be passed to the method, and the attribute to store the result of the method. Method arguments can be specified as key-value pairs in the arguments section.
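
To make the four keys concrete, the sketch below walks over a configuration that has already been parsed into a dictionary (the shape a YAML loader would typically return for the example file) and fills in the keys a stage leaves out. ``stages_from_config`` is a hypothetical helper written for this illustration; it is not part of MLForge and says nothing about how ``from_config`` is implemented.

```python
def stages_from_config(config):
    """Hypothetical helper: list the stages of a parsed config, in file order."""
    stages = []
    for stage_name, spec in config.items():  # dicts keep insertion order
        stages.append({
            "name": stage_name,
            "attribute": spec.get("attribute"),       # where the result is stored
            "method": spec.get("method"),             # method to call, if any
            "class": spec.get("class"),               # class holding the method
            "arguments": spec.get("arguments", {}),   # key-value parameters
        })
    return stages


# The example configuration, written as the dictionary a YAML loader would produce.
config = {
    "step1": {"method": "method", "class": "SampleClass"},
    "step2": {"attribute": "object", "class": "SampleClass"},
    "step3": {
        "attribute": "result1",
        "method": "method",
        "class": "SampleClass",
        "arguments": {"param2": "there!"},
    },
}

stages = stages_from_config(config)
```

Here ``step2`` has no ``method`` key, so only the class and the target attribute are set for it.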

Alternatively, you can define the tasks in your code and execute them as follows:

.. code:: python

   from mlforge import Pipeline, Stage

   stage1 = Stage(
       attribute_name='result',
       method_name='my_module.my_function',
       arguments={'arg1': 'value1'})
   stage2 = Stage(
       attribute_name='result2',
       method_name='my_module.my_function2',
       arguments={'arg1': 'result'})

   pipeline = Pipeline().add_stages([stage1, stage2])
   pipeline.run()

Syntax for the stages of the pipeline
-------------------------------------


In your code, define a list with the stages to be added to the pipeline.
Each of the stages can be specified as any of the following options:

Simply call a method of the host object:

.. code:: python

   'method_name',

Same, but wrapped in parentheses (Python treats ``('method_name')`` as a plain string, not as a one-element tuple):

.. code:: python

   ('method_name'),

Call the constructor of a class

.. code:: python

   (ClassHolder),

Call a method of a class

.. code:: python

   ('method_name', ClassHolder),

Call a method of the host object, and keep the result in a new attribute

.. code:: python

   ('new_attribute', 'method_name'),

Call the constructor of a class, and keep the result in a new attribute

.. code:: python

   ('new_attribute', ClassHolder),

Call a method of the host object, with specific parameters, and keep the
result in a new attribute

.. code:: python

   ('new_attribute', 'method_name', {'param1': 'value1', 'param2': 'value2'}),

Call a class method, and get the result in a new attribute

.. code:: python

   ('new_attribute', 'method_name', ClassHolder),

Call a method of the host object, with specific parameters

.. code:: python

   ('method_name', {'param1': 'value1', 'param2': 'value2'}),

Call a method of a specific class, with specific parameters.

.. code:: python

   ('method_name', ClassHolder, {'param1': 'value1'}),

Call a method of a specific class, with specific parameters, and keep
the result in a new attribute

.. code:: python

   ('new_attribute', 'method_name', ClassHolder, {'param1': 'value1'}),
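
The forms listed above can be told apart by the types of their elements. The sketch below shows one way to do that; it is not MLForge's parser, and ``ParsedStage`` and ``parse_stage`` are hypothetical names. One assumption deserves attention: ``('method_name', ClassHolder)`` and ``('new_attribute', ClassHolder)`` have the same shape, so this sketch treats the string as a method name only when the class actually defines a callable with that name.

```python
import inspect
from typing import Any, NamedTuple, Optional


class ParsedStage(NamedTuple):
    attribute: Optional[str]   # where to keep the result, if anywhere
    method: Optional[str]      # method to call; None means "call the constructor"
    cls: Optional[type]        # class holding the method, if given
    params: dict               # keyword arguments for the call


def parse_stage(spec: Any) -> ParsedStage:
    """Hypothetical parser for the stage forms listed above."""
    items = list(spec) if isinstance(spec, tuple) else [spec]
    # A trailing dict holds the parameters; a trailing class is the target class.
    params = items.pop() if items and isinstance(items[-1], dict) else {}
    cls = items.pop() if items and inspect.isclass(items[-1]) else None
    if len(items) > 2 or not all(isinstance(item, str) for item in items):
        raise ValueError(f"Unrecognised stage: {spec!r}")
    if len(items) == 2:
        attribute, method = items
    elif len(items) == 1:
        name = items[0]
        if cls is not None and not callable(getattr(cls, name, None)):
            # Assumption: the class has no such method, so the string names
            # the attribute that receives the constructed object.
            attribute, method = name, None
        else:
            attribute, method = None, name
    elif cls is not None:
        attribute, method = None, None  # bare class: call its constructor
    else:
        raise ValueError(f"Unrecognised stage: {spec!r}")
    return ParsedStage(attribute, method, cls, params)


class ClassHolder:
    """Example class used only to exercise the parser."""

    def method_name(self, param1=None):
        return param1


full = parse_stage(('new_attribute', 'method_name', ClassHolder, {'param1': 'value1'}))
constructor = parse_stage(('new_attribute', ClassHolder))
```

The name-based tie-break is only one possible rule; a different implementation could resolve the ambiguity in another way.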

To do
-----

-  Add a way to add a step at a specific position
-  Add a way to remove a step
-  Add a way to replace a step
-  Add a way to add a step before or after another step
-  And many other things…


.. |build-status| image:: https://github.com/renero/mlforge/actions/workflows/python-test.yml/badge.svg
    :target: https://github.com/renero/mlforge/actions/workflows/python-test.yml
    :alt: Tests Status

.. |coverage| image:: https://codecov.io/gh/renero/mlforge/graph/badge.svg?token=HRZAE9GS0I
    :target: https://codecov.io/gh/renero/mlforge
    :alt: Code Coverage

.. |wheel| image:: https://github.com/renero/mlforge/actions/workflows/python-publish.yml/badge.svg
    :target: https://pypi.org/project/mlforge/
    :alt: PyPi Publish Status

.. |documentation| image:: https://readthedocs.org/projects/mlforge/badge/?version=latest
    :target: https://mlforge.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status


