Simple ML pipeline platform

Project description


Proof of concept for a simple framework to create an ML pipeline.


  • Run ML training or inference with a simple JSON configuration.
  • Modularized interfaces for task components.
  • Cache task outputs for faster experiments.

Getting started


Prerequisite: Python 3.8+

# Install the core framework and standard tasks.
pip install irisml irisml-tasks irisml-tasks-training

Run an example job

# Install additional packages that are required for the example
pip install irisml-tasks-torchvision

# Run on local machine
irisml_run docs/examples/mobilenetv2_mnist_training.json

Available commands

# Run the specified pipeline. You can provide environment variables with the "-e" option; they are accessible through the $env variable in the JSON config.
irisml_run <pipeline_json_path> [-e <ENV_NAME>=<env_value>] [--no_cache] [--no_cache_read] [-v]

# Show information about the specified task. If <task_name> is not provided, shows a list of available tasks in the current environment.
irisml_show [<task_name>]

# Manage a cache storage on Azure Blob Storage
# list - Show a list of matched blobs.
# download - Download matched blobs.
# remove - Remove matched blobs.
# show - Show the contents of matched blobs.
irisml_cache <list|download|remove|show> [--mtime <+|->N] [--name NAME]
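
When launching jobs from a script, it can help to assemble the irisml_run command line programmatically. The following is a minimal sketch that uses only the options listed above. The helper name build_irisml_run_command is hypothetical, and repeating "-e" once per variable is an assumption, not documented behavior.

```python
import shlex
from typing import Dict, List, Optional


def build_irisml_run_command(pipeline_json_path: str,
                             env: Optional[Dict[str, str]] = None,
                             no_cache: bool = False,
                             verbose: bool = False) -> List[str]:
    """Build an argv list for irisml_run (hypothetical helper, not part of IrisML)."""
    command = ['irisml_run', pipeline_json_path]
    for name, value in (env or {}).items():
        # Assumption: one "-e NAME=value" pair is passed per variable.
        command += ['-e', f'{name}={value}']
    if no_cache:
        command.append('--no_cache')
    if verbose:
        command.append('-v')
    return command


command = build_irisml_run_command(
    'docs/examples/mobilenetv2_mnist_training.json',
    env={'DATASET_NAME': 'mnist'},
    no_cache=True,
)
print(shlex.join(command))
# To actually launch the job: subprocess.run(command, check=True)
```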

Pipeline definition

PipelineDefinition = {"tasks": List[TaskDefinition], "on_error": Optional[List[TaskDefinition]]}

TaskDefinition = {
    "task": <task module name>,
    "name": <optional unique name of the task>,
    "inputs": <list of input objects>,
    "config": <config for the task. Use the irisml_show command to find the available configurations.>
}

In TaskDefinition.inputs and TaskDefinition.config, you can use the following two variables.

  • $env.<variable_name> This variable will be replaced by the environment variable that was provided as arguments for irisml_run command.
  • $outputs.<task_name>.<field_name> This variable will be replaced by the outputs of the specified previous task.

An exception is raised at runtime if the specified variable is not found.
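
The substitution rules can be illustrated with a small resolver. This is a minimal sketch and not IrisML's actual implementation: the function resolve_variables, the task name load_dataset, and the field num_classes are hypothetical, and it assumes a variable always occupies a whole string value.

```python
import re
from typing import Any, Dict

_VARIABLE = re.compile(r'^\$(env|outputs)\.(.+)$')


def resolve_variables(value: Any, env: Dict[str, Any], outputs: Dict[str, Dict[str, Any]]) -> Any:
    """Recursively replace "$env.<name>" and "$outputs.<task>.<field>" strings."""
    if isinstance(value, dict):
        return {k: resolve_variables(v, env, outputs) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_variables(v, env, outputs) for v in value]
    if not isinstance(value, str):
        return value
    match = _VARIABLE.match(value)
    if not match:
        return value  # A plain string, not a variable reference.
    kind, path = match.groups()
    try:
        if kind == 'env':
            return env[path]
        task_name, field_name = path.split('.', 1)
        return outputs[task_name][field_name]
    except (KeyError, ValueError) as e:
        # Mirrors the documented behavior: unknown variables are an error.
        raise LookupError(f'Variable not found: {value}') from e


env = {'DATASET_NAME': 'mnist'}
outputs = {'load_dataset': {'num_classes': 10}}  # Hypothetical previous task outputs.
config = {'name': '$env.DATASET_NAME', 'num_classes': '$outputs.load_dataset.num_classes', 'lr': 0.1}
resolved = resolve_variables(config, env, outputs)
print(resolved)
```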

If a task raises an exception, the tasks specified in the on_error field are executed. The exception object is assigned to the "$env.IRISML_EXCEPTION" variable.
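
The on_error flow can be pictured as follows. This is a simplified sketch with plain callables standing in for tasks. The run_pipeline function is hypothetical, and re-raising the exception after the handlers finish is an assumption rather than documented behavior.

```python
from typing import Any, Callable, Dict, List

# Stand-in for a real task: a callable that receives the environment mapping.
TaskFn = Callable[[Dict[str, Any]], None]


def run_pipeline(tasks: List[TaskFn], on_error: List[TaskFn], env: Dict[str, Any]) -> None:
    """Run tasks in order; on failure, run the on_error tasks with the exception exposed."""
    try:
        for task in tasks:
            task(env)
    except Exception as e:
        # The exception object becomes available as IRISML_EXCEPTION.
        handler_env = {**env, 'IRISML_EXCEPTION': e}
        for handler in on_error:
            handler(handler_env)
        raise  # Assumption: the original failure still propagates to the caller.


seen = []


def failing_task(env: Dict[str, Any]) -> None:
    raise RuntimeError('boom')


def report_error(env: Dict[str, Any]) -> None:
    seen.append(str(env['IRISML_EXCEPTION']))


try:
    run_pipeline([failing_task], [report_error], {})
except RuntimeError:
    pass
print(seen)
```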

Pipeline cache

With the cache, you can modify and re-run a pipeline config at minimal cost. If the cache is enabled, IrisML calculates hash values for all task inputs and configs and uploads the task outputs to the specified storage. When it finds a task with the same hash values, it downloads the cached outputs and skips the task execution.
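
The idea behind hash-based caching can be sketched in a few lines. This is an illustration of the general technique under simplifying assumptions (JSON-serializable inputs, a local directory as storage, pickle for outputs), not IrisML's actual hashing or storage format. The names task_hash and run_with_cache are hypothetical.

```python
import hashlib
import json
import pathlib
import pickle
import tempfile
from typing import Any, Callable, Dict


def task_hash(task_name: str, inputs: Dict[str, Any], config: Dict[str, Any]) -> str:
    """Hash the task name, inputs and config; sort_keys makes the hash order-independent."""
    payload = json.dumps({'task': task_name, 'inputs': inputs, 'config': config}, sort_keys=True)
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()


def run_with_cache(cache_dir: pathlib.Path, task_name: str, inputs: Dict[str, Any],
                   config: Dict[str, Any], execute: Callable[[Dict[str, Any], Dict[str, Any]], Any]) -> Any:
    cache_path = cache_dir / task_name / task_hash(task_name, inputs, config)
    if cache_path.exists():
        return pickle.loads(cache_path.read_bytes())  # Cache hit: skip execution.
    outputs = execute(inputs, config)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_bytes(pickle.dumps(outputs))  # Store outputs for later runs.
    return outputs


cache_dir = pathlib.Path(tempfile.mkdtemp())
calls = []


def slow_task(inputs: Dict[str, Any], config: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(config)
    return {'result': inputs['x'] * config['scale']}


first = run_with_cache(cache_dir, 'multiply', {'x': 2}, {'scale': 3}, slow_task)
second = run_with_cache(cache_dir, 'multiply', {'x': 2}, {'scale': 3}, slow_task)  # Served from cache.
third = run_with_cache(cache_dir, 'multiply', {'x': 2}, {'scale': 4}, slow_task)   # Config changed, so it runs again.
print(len(calls))
```

Changing only the config of one task invalidates that task's hash while unchanged tasks are still served from the cache, which is what makes iterating on a pipeline config cheap.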

To enable the cache, specify the cache storage location by setting the IRISML_CACHE_URL environment variable. Currently, Azure Blob Storage and the local filesystem are supported.

To use Azure Blob Storage, a container URL must be provided. If the URL contains a SAS token, it will be used for authentication. Otherwise, interactive authentication and Managed Identity authentication will be used.
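
To check which authentication path a given container URL is likely to take, you can test whether it carries a SAS token. The sketch below relies on the fact that a SAS token is passed as query parameters and always includes a "sig" signature parameter. How IrisML itself detects the token is not stated here, so has_sas_token is a hypothetical helper and the URLs are placeholders.

```python
from urllib.parse import parse_qs, urlsplit


def has_sas_token(container_url: str) -> bool:
    """Return True if the URL query string contains a SAS signature parameter."""
    query = parse_qs(urlsplit(container_url).query)
    return 'sig' in query


# Placeholder URLs for illustration only.
with_sas = 'https://example.blob.core.windows.net/cache?sv=2022-11-02&sp=rwl&sig=PLACEHOLDER'
without_sas = 'https://example.blob.core.windows.net/cache'

print(has_sas_token(with_sas), has_sas_token(without_sas))
```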

Python API

To run a pipeline from Python code, you can use the following APIs.

import json
import pathlib
from irisml.core import JobRunner

job_description = json.loads(pathlib.Path('example.json').read_text())
runner = JobRunner(job_description)
# Run the same job twice with different environment variables.
runner.run({'DATASET_NAME': 'mnist'})
runner.run({'DATASET_NAME': 'cifar10'})

Available official tasks

To show the detailed help for each task, run the following command after installing the package.

irisml_show <task_name>


Task Description
assertion Assert the given input.
assign_class_to_strings Assigns a class to a string based on the class name being present in the string.
branch 'If' conditional branch.
calculate_cosine_similarity Calculate cosine similarity between two sets of vectors.
check_model_parameters Check Inf/NaN values in model parameters.
compare Compare two values.
compare_ints Compare two int values.
convert_detection_to_multilabel Convert targets or predictions of object detection to multilabel.
convert_string_to_string_list Convert a string to a list of strings.
deserialize_tensor Deserialize a pytorch tensor.
divide_float Floating point division.
download_azure_blob Download a single blob from Azure Blob Storage.
emulate_fp8_quantization Emulate FP8 quantization.
extract_image_bytes_from_dataset Extract images from a dataset and convert them to bytes.
get_current_time Get the current time in seconds since the epoch.
get_dataset_split Get a train/val split of a dataset.
get_dataset_stats Get statistics of a dataset.
get_dataset_subset Get a subset of a dataset.
get_fake_image_classification_dataset Generate a fake image classification dataset.
get_fake_image_text_classification_dataset Generate a fake image-text classification dataset.
get_fake_object_detection_dataset Generate a fake object detection dataset.
get_fake_phrase_grounding_dataset Generate a fake phrase grounding dataset.
get_fake_visual_question_answering_dataset Generate a fake visual question answering dataset.
get_int_from_json_strings Get an integer from a JSON string.
get_int_list_from_json_strings Get a list of ints from a JSON string.
get_item Get an item from the given list.
get_key_and_int_list_from_json_string Parse a JSON string and return a list of keys and a list of lists of ints.
get_kfold_cross_validation_dataset Get train/test dataset for k-fold cross validation.
get_secret_from_azure_keyvault Get a secret from Azure KeyVault.
get_topk Get the largest Topk values and indices.
join_filepath Join a given dir_path and a filename.
join_two_strings Join two strings to one string.
load_coco_detections Load coco detections from a JSON to a list of tensors.
load_float_tensor_jsonl Load a 2D float tensor from a JSONL file.
load_state_dict Load a state_dict from various sources.
load_str_list_jsonl Load a list of strings from a JSONL file.
load_strs_from_json_file Load strings from a JSON file.
load_tensor_list Load a list of tensors from file.
make_cached_dataset Save dataset cache on disk.
make_prompt_for_each_string Make a prompt for each string.
make_prompt_list_with_strings Make a list of prompts from a template and a list of strings.
make_prompt_with_strings Make a prompt with a list of strings.
make_random_choice_text_transform Make a text transform function that randomly chooses one of the substrings separated by the delimiter.
make_text_transform Make a text transform function.
map_int_list Map a list of integers to a list of integers.
pickling_object Pickling an object.
print Print or Pretty Print the input object.
print_environment_info Print various environment information to stdout/stderr.
read_file Reads a file and returns its contents as bytes.
repeat_tasks Repeat the given tasks for multiple times.
run_parallel Run the given tasks in parallel. A new process will be forked for each task. Each task must have a unique name.
run_profiler Run profiler on the given tasks.
run_sequential Run the given tasks in sequence. Each task must have a unique name.
save_file Save the given input binary to a file.
save_float_tensor_jsonl Save a 2D float tensor to a JSONL file.
save_images_from_dataset Save images from a dataset to disk.
save_jit_model Save an offline version of a pytorch model.
save_state_dict Save the model's state_dict to the specified file.
save_str_list_jsonl Save a list of strings to a JSONL file.
search_grid_sequential Grid search hyperparameters. Tasks are run in sequence.
serialize_tensor Serialize a pytorch tensor.
split_string Split string to a list of strings.
switch_pick Pick from vals based on conditions. The task returns the first val whose condition is True.
upload_azure_blob Upload a binary file to Azure Storage Blob.
upload_azure_blob_directory Upload a directory to Azure Blob Storage.


This package contains tasks related to pytorch training.

Task Description
append_classifier Append a classifier model to a given model. A predictor and a loss module will be added, too.
benchmark_dataset Benchmark dataset loading and preprocessing.
benchmark_model Benchmark a given model using a given dataset.
benchmark_model_with_grad_cache Benchmark a given model using a given dataset with grad caching. Useful for cases which require sub batching.
build_classification_prompt_dataset Create a classification prompt dataset.
build_zero_shot_classifier Create a zero-shot classification layer.
concatenate_datasets Concatenate the given two datasets together.
convert_vqa_dataset_to_image_text_classification_dataset Convert VQA dataset to image text classification dataset.
create_classification_prompt_generator Create a prompt generator for a classification task.
create_prompt_generator Create a prompt generator that returns a list of prompts for a given label.
evaluate_accuracy Calculate accuracy of the given prediction results.
evaluate_captioning Evaluate captioning prediction results.
evaluate_detection_average_precision Calculate mean average precision for object detection task results.
evaluate_phrase_grounding Calculate precision/recall for phrase grounding.
evaluate_phrase_grounding_recall Calculate recall for phrase grounding.
evaluate_string_matching_accuracy Calculate accuracy of string matching.
exclude_negative_samples_from_classification_dataset Exclude negative samples from classification dataset.
export_coco_from_torch_dataset Export coco dataset from a given torch dataset. Support IC and OD only.
export_onnx Export the given model as ONNX.
extract_val_by_key_from_jsonl Extract value for each entry in a JSONL by a key.
find_incorrect_classification_indices Find incorrect classification indices.
find_incorrect_classification_multilabel_indices Find incorrect classification indices for multilabel classification.
flatten_captioning_dataset Flatten a captioning dataset with multiple targets per image into a dataset with a single target per image.
get_questions_from_vqa_dataset Extracts questions from a VQA dataset.
get_subclass_dataset Get the sub-dataset with given class ids from a dataset.
get_targets_from_dataset Extract only targets from a given Dataset.
load_jsonl_vqa_dataset Load a VQA dataset from a jsonl file.
load_simple_classification_dataset Load a simple classification dataset from a directory of images and an index file.
make_classification_dataset_from_object_detection Convert an object detection dataset into a classification dataset.
make_classification_dataset_from_predictions Make a classification dataset from predictions.
make_detection_dataset_from_predictions Make a detection dataset from predictions.
make_feature_extractor_model Make a wrapper model to extract a feature vector from a vision model.
make_fixed_prompt_image_transform Make a transform function for image and a fixed prompt.
make_fixed_text_dataset Create a dataset with a list of strings.
make_image_text_contrastive_model Make a model for image-text contrastive training.
make_image_text_transform Make a transform function for image-text classification.
make_oversampled_dataset Make an oversampled dataset.
make_phrase_grounding_image_transform Make phrase grounding image transform.
make_prompt_list_image_transform Make a transform function for image and prompt list.
make_vqa_collate_function Creates a collate_function for Visual Question Answering (VQA) and Phrase Grounding task.
make_vqa_image_transform Creates a transform function for VQA task.
map_classification_predictions_to_detection Map classification predictions back to detection predictions or targets.
num_iters_to_epochs Convert number of iterations to number of epochs. Min value is 1.
predict Predict using a given model.
remove_empty_images_from_dataset Remove empty images from dataset.
sample_few_shot_dataset Few-shot sampling of an IC/OD dataset.
save_jsonl_vqa_dataset Save a VQA dataset to a JSONL file.
split_image_text_model Split an image-text model into an image model and a text model.
train Train a pytorch model.
train_with_gradient_cache Train a model using gradient cache. Useful for contrastive learning with a large model.


Task Description
create_azure_computervision_caption_model Create Azure Computer Vision Caption Model.
create_azure_computervision_classification_model Create Azure Computer Vision Classification Model.
create_azure_computervision_custom_model Create a model that runs inference with a custom model in Azure Computer Vision.
create_azure_computervision_ocr_model Create Azure Computer Vision OCR model.
create_azure_computervision_product_recognizer_model Create a model that runs inference with a product recognizer model in Azure Computer Vision.
create_azure_computervision_vectorization_model Create Azure Computer Vision Vectorization Model.
delete_azure_computervision_custom_model Delete Azure Computer Vision Custom Model.
train_azure_computervision_custom_model Train Azure Computer Vision Custom Model.


Task Description
create_azure_customvision_docker_model Create a model from an exported Azure Custom Vision Docker image.
create_azure_customvision_model Create a prediction model from an Azure Custom Vision project.
create_azure_customvision_project Create a new Azure Custom Vision project.
delete_azure_customvision_project Delete an Azure Custom Vision project.
export_azure_customvision_model Export a model from an Azure Custom Vision project.
train_azure_customvision_project Train an Azure Custom Vision project.


Task Description
call_azure_openai_completion Call Azure OpenAI Text Completion API.
create_azure_openai_chat_model Create a model that generates text using Azure OpenAI completion API.
create_azure_openai_completion_model Create a model that generates text using Azure OpenAI completion API.


Task Description
run_azureml_child Run tasks as a new child AzureML Run.


Task Description
launch_fiftyone Launch a fiftyone app.


Task Description
create_llava_model Create a LLaVA model from pretrained weights.


Adapter tasks for OnnxRuntime library.

Task Description
benchmark_onnx Benchmark a given onnx model using onnxruntime.
predict_onnx Predict using a given onnx model traced with the export_onnx task.


Adapter for models in timm library.

Task Description
create_timm_model Create a timm model.
create_timm_transform Create timm transforms.


Adapter tasks for torchmetrics library.

Task Description
evaluate_torchmetrics_classification_multiclass Evaluate predictions results using torchmetrics classification metrics for multiclass classification problems.
evaluate_torchmetrics_classification_multilabel Evaluate predictions results using torchmetrics classification metrics for multilabel classification problems.


Adapter tasks for torchvision library.

Task Description
create_torchvision_model Create a torchvision model.
create_torchvision_transform Create transform objects in torchvision library.
create_torchvision_transform_v2 Create torchvision transform v2 object from string expressions.
load_torchvision_dataset Load a dataset from torchvision package.


Adapter tasks for HuggingFace transformers library.

Task Description
cache_transformers_model_on_azure_blob Cache a model from transformers on Azure Blob Storage.
create_transformers_model Create a model using transformers library.
create_transformers_raw_tokenizer Create a Tokenizer using transformers library. Return the tokenizer as-is.
create_transformers_text_model Create a text-generation model using transformers library.
create_transformers_tokenizer Create a Tokenizer using transformers library.


Create a new task

To create a Task, you must define a module that contains a "Task" class. Here is a simple example:

# irisml/tasks/
import dataclasses
import irisml.core

@dataclasses.dataclass
class ChildConfig:  # Illustrative nested config; the name and field are placeholders.
  some_value: int = 0

class Task(irisml.core.TaskBase):  # The class name must be "Task".
  VERSION = '1.0.0'
  CACHE_ENABLED = True  # (default: True) This is optional.

  @dataclasses.dataclass
  class Inputs:  # You can remove this class if the task doesn't require inputs.
    int_value: int
    float_value: float

  @dataclasses.dataclass
  class Config:  # If there is no configuration, you can remove this class. All fields must be JSON-serializable.
    another_float: float
    child_dataclass: ChildConfig  # If you'd like to define a nested config, you can define another dataclass.

  @dataclasses.dataclass
  class Outputs:  # Can be removed if the task doesn't have outputs.
    float_value: float = 0  # If dry_run() is not implemented, Outputs fields must have a default value or default factory.

  def execute(self, inputs: Inputs) -> Outputs:
    return self.Outputs(inputs.int_value * inputs.float_value * self.config.another_float)

  def dry_run(self, inputs: Inputs) -> Outputs:  # This method is optional.
    return self.Outputs(0)  # Must return immediately without actual processing.

Each Task must define an "execute" method. The base class has empty implementations for Inputs, Config, Outputs and dry_run(). For details, see the documentation of the TaskBase class.
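
The requirement that Outputs fields have defaults when dry_run() is not implemented can be understood with a stand-in base class. The sketch below does not use irisml.core.TaskBase. SketchTaskBase is a hypothetical substitute, and its fallback dry_run(), which builds Outputs with no arguments, is an assumption about how such a default could work.

```python
import dataclasses


class SketchTaskBase:  # Hypothetical stand-in, NOT irisml.core.TaskBase.
    @dataclasses.dataclass
    class Outputs:
        pass

    def __init__(self, config=None):
        self.config = config

    def execute(self, inputs):
        raise NotImplementedError

    def dry_run(self, inputs):
        # Assumed fallback: construct Outputs without arguments. This only
        # works if every field has a default value or default factory.
        return self.Outputs()


class Task(SketchTaskBase):
    @dataclasses.dataclass
    class Inputs:
        int_value: int
        float_value: float

    @dataclasses.dataclass
    class Outputs:
        float_value: float = 0.0  # The default makes the fallback dry_run() possible.

    def execute(self, inputs):
        return self.Outputs(inputs.int_value * inputs.float_value)


task = Task()
inputs = Task.Inputs(int_value=3, float_value=0.5)
real = task.execute(inputs)  # Performs the actual computation.
dry = task.dry_run(inputs)   # Returns placeholder outputs immediately.
print(real.float_value, dry.float_value)
```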

Related repositories

