
mlflow-polylog

A polymorphic model logging utility for MLflow that enables seamless logging of machine learning models from various libraries without requiring explicit type specification or installation of all supported frameworks. It dynamically detects installed libraries and maps model types to the appropriate MLflow logging functions.

Features

  • Automatic detection of installed ML libraries (e.g., scikit-learn, XGBoost, LightGBM).
  • Type-agnostic model logging via a unified interface.
  • Extensible architecture for registering custom logging handlers.
  • Built-in support for popular frameworks like CatBoost, PyTorch, TensorFlow, and more.
  • Lazy initialization to minimize overhead.
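
The dynamic library detection described above can be sketched with the standard library alone: `importlib.util.find_spec` reports whether a package is importable without actually importing it, which keeps the check cheap. This is an illustrative sketch, not the package's actual implementation; the candidate list below is an assumption.

```python
import importlib.util

# Illustrative subset of frameworks a polymorphic logger could support
CANDIDATES = ["sklearn", "xgboost", "lightgbm", "catboost", "torch", "tensorflow"]

def detect_installed(candidates=CANDIDATES):
    """Return the subset of candidate packages that are importable.

    find_spec only consults the import system, so no candidate is
    actually imported (keeping startup overhead low).
    """
    return [name for name in candidates if importlib.util.find_spec(name) is not None]
```

Only the detected frameworks would then have their MLflow logging functions registered, which is what makes installing every supported library unnecessary.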

Installation

Install the package using pip:

pip install mlflow-polylog

This package requires Python >= 3.10 and MLflow >= 2.13.0. Optional dependencies for testing include pytest and various ML libraries (e.g., catboost, lightgbm, xgboost, scikit-learn).

For development, install optional extras:

  • Testing: pip install mlflow-polylog[tests]
  • Linting: pip install mlflow-polylog[lint]
  • Typing: pip install mlflow-polylog[typing]

Quick Start

Import the main functions and start logging models within an MLflow run:

import mlflow
from mlflow_polylog import log_model

with mlflow.start_run():
    # Assume 'model' is your trained model (e.g., from scikit-learn)
    log_model(model, artifact_path="model")

Usage Examples

Using log_model

The log_model function provides an imperative interface to log models similar to MLflow's native methods. It automatically selects the appropriate logging handler based on the model's type.

import mlflow
import numpy as np
from sklearn.linear_model import LogisticRegression
from mlflow_polylog import log_model

# Train a simple model on toy data
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0, 0, 1, 1])
model = LogisticRegression()
model.fit(X_train, y_train)

with mlflow.start_run():
    log_model(
        model,
        artifact_path="sklearn_model",
        input_example=X_train[:5],  # Optional: for model signature inference
    )

This logs the scikit-learn model to MLflow without needing to import or call mlflow.sklearn.log_model explicitly.

Using register_log

Register a custom logging function for a specific model type to extend the default behavior.

import mlflow
from typing import Any
from mlflow_polylog import register_log, log_model

# A custom model type; it subclasses mlflow.pyfunc.PythonModel so that
# the pyfunc flavor can serialize it
class CustomModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        return model_input

# Define a custom log function that wraps the model in pyfunc
def custom_log_func(model: Any, artifact_path: str, **kwargs):
    mlflow.pyfunc.log_model(artifact_path=artifact_path, python_model=model, **kwargs)

# Register it for the custom model type
register_log(CustomModel, custom_log_func)

# Now log an instance
custom_model = CustomModel()

with mlflow.start_run():
    log_model(custom_model, artifact_path="custom_model")

This adds support for CustomModel and uses the custom function when logging.
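
The dispatch behind `register_log` can be modeled as a mapping from model types to logging callables, with lookup walking the instance's method resolution order so a handler registered for a base class also covers its subclasses. The sketch below illustrates that idea; the class name and method names are hypothetical, not the package's actual API.

```python
from typing import Any, Callable, Dict, Type

class LogRegistry:
    """Toy model-type -> log-function registry with MRO-aware lookup."""

    def __init__(self) -> None:
        self._handlers: Dict[Type, Callable[..., Any]] = {}

    def register(self, model_type: Type, func: Callable[..., Any]) -> None:
        self._handlers[model_type] = func

    def log(self, model: Any, **kwargs: Any) -> Any:
        # Walk the MRO so a handler registered for a base class
        # also applies to its subclasses.
        for cls in type(model).__mro__:
            if cls in self._handlers:
                return self._handlers[cls](model, **kwargs)
        raise TypeError(f"No log handler registered for {type(model).__name__}")
```

Looking up by MRO rather than exact type is what lets one registered handler serve an entire model hierarchy (e.g., all scikit-learn estimators).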

Using PolymorphicModelLog

For more advanced usage, instantiate PolymorphicModelLog directly to manage logging mappings. This is useful in production environments where you need fine-grained control.

import mlflow
from mlflow_polylog import PolymorphicModelLog, get_default_log
from mlflow_polylog.type_mapping import TypeMapping

# Get the default log mapper
default_log = get_default_log()

# Create a custom mapping (example: add a wrapper for a specific type)
custom_mapping = TypeMapping({str: lambda m, **kw: print(f"Logging string model: {m}")})

# Initialize PolymorphicModelLog with combined mappings
poly_log = PolymorphicModelLog(TypeMapping(default_log._log_map, custom_mapping))

# Use it to log
with mlflow.start_run():
    poly_log(model, artifact_path="model")  # 'model' is your trained model

You can also chain additions using add_log:

poly_log = get_default_log().add_log(CustomType, custom_log_func)
poly_log(model_of_custom_type, artifact_path="model")
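
Combining the default mapping with custom entries, as in the examples above, amounts to a left-to-right dictionary merge in which later mappings take precedence. A minimal sketch of that semantics (the real `TypeMapping` may differ in details):

```python
def merge_mappings(*mappings):
    """Merge type->handler dicts left to right.

    Entries from later mappings override earlier ones, so custom
    handlers can shadow defaults for the same model type.
    """
    merged = {}
    for mapping in mappings:
        merged.update(mapping)
    return merged
```

Under this semantics, registering a custom handler for a type that already has a default simply replaces the default for that type while leaving all others intact.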

Supported Libraries

The package automatically supports logging for models from the following libraries if they are installed:

  • CatBoost
  • LightGBM
  • XGBoost
  • scikit-learn
  • PyTorch
  • TensorFlow
  • fastai
  • MXNet (Gluon)
  • StatsModels
  • Prophet
  • PaddlePaddle
  • spaCy
  • H2O
  • MLflow PyFunc (including callables)

Add more via register_log as needed.

Testing

Run tests with pytest:

python -m pytest tests

Some tests are marked as slow and require optional ML library installations.

Download files

Download the file for your platform.

Source Distribution

mlflow_polylog-0.0.1.tar.gz (11.0 kB)


Built Distribution


mlflow_polylog-0.0.1-py3-none-any.whl (10.3 kB)


File details

Details for the file mlflow_polylog-0.0.1.tar.gz.

File metadata

  • Download URL: mlflow_polylog-0.0.1.tar.gz
  • Upload date:
  • Size: 11.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.0

File hashes

Hashes for mlflow_polylog-0.0.1.tar.gz
  • SHA256: 8c601dd5aac8393100063f2b09c8fb3efec5c80b1a6da73f7162de27d9e2141d
  • MD5: 5a589f787370f9a2a124299166f67e04
  • BLAKE2b-256: b4e086d66ed510d46114e653226997e348568ae751b963440225aaee35b46f41


File details

Details for the file mlflow_polylog-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: mlflow_polylog-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 10.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.0

File hashes

Hashes for mlflow_polylog-0.0.1-py3-none-any.whl
  • SHA256: 619dd5e2dab378d7caffca001de938b74e1893614b16bbb221e169d585332a3c
  • MD5: 48aaab5c368534fa9e4391d2abaf7866
  • BLAKE2b-256: dbbc57b6d2673dd54ea1ab75833b2969b78a6f2d938f367783de14d0c58b9e94

