a unified mlflow wrapper for standardized ML model logging and registration.

Project description

mlops_wrapper

schema-agnostic mlflow wrapper for consistent experiment tracking across ML projects.

what is this?

a python wrapper around mlflow that handles:

run lifecycle (start, log, end)
autolog orchestration with intelligent fallback
framework-agnostic model logging (catboost, sklearn, xgboost, lightgbm, pytorch, tensorflow)
automatic signature inference with type conversion for robust serving
collision detection for params/metrics
error handling and tagging
AWS credentials setup for S3 artifact storage

core principle: wrapper is generic. your project builds metadata (tags, run names) from its config and passes them in.

why use this?

without wrapper:

# 30+ lines per training script
mlflow.set_tracking_uri(...)
mlflow.set_experiment(...)
mlflow.start_run(run_name=...)
mlflow.set_tag("tag1", value1)
mlflow.set_tag("tag2", value2)
# ... 20 more tags
mlflow.log_param("param1", value1)
# ... handle autolog failures
# ... detect collisions
# ... error handling
mlflow.end_run()

with wrapper:

@mlflow_experiment(
    experiment_name="my-project",
    run_name="exp1_catboost_20251225_143052",
    tags={...},  # project builds these
)
def train():
    return {'params': {...}, 'metrics': {...}}

design principles

schema-agnostic: no hardcoded assumptions about your config structure
metadata comes from project: wrapper accepts pre-built tags/run_name
autolog with fallback: tries autolog first, falls back to manual if it fails
collision detection: warns if your metrics clash with autolog
error transparency: logs failures as tags, doesn't swallow errors

responsibilities

wrapper handles:

mlflow lifecycle (start_run, end_run)
autolog enablement (before start_run per mlflow docs)
autolog failure detection (0 params logged = failure)
param/metric collision detection and warnings
error logging as tags (mlops.status, mlops.error_type)
AWS credentials setup (boto3 session from AWS_PROFILE)

your project handles:

building run names from your config schema
extracting tags from your domain (cost filters, data splits, etc.)
extracting model hyperparameters manually when autolog fails

installation

pip install mlops-wrapper

or for development:

pip install -e /path/to/mlops_wrapper_repo

usage

basic example (sklearn - autolog works)

from mlops_wrapper.mlflow_wrapper import mlflow_experiment

@mlflow_experiment(
    experiment_name="sklearn-test",
    run_name="rf_experiment_20251225",
    tags={'model': 'random_forest', 'team': 'ml-ops'},
    autolog_enabled=True
)
def train():
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=100, n_features=20)
    model = RandomForestClassifier(n_estimators=100, max_depth=5)
    model.fit(X, y)

    # autolog captures params automatically
    # return custom metrics
    return {
        'metrics': {'custom_score': 0.95}
    }

train()

catboost example (autolog fails - manual fallback)

@mlflow_experiment(
    experiment_name="catboost-test",
    run_name="cb_experiment_20251225",
    tags={'model': 'catboost', 'task': 'regression'},
    autolog_enabled=True  # will fail, wrapper detects and requires manual params
)
def train():
    from catboost import CatBoostRegressor
    import numpy as np

    X = np.random.rand(100, 10)
    y = np.random.rand(100)

    model = CatBoostRegressor(iterations=10, depth=3, verbose=False)
    model.fit(X, y)

    # autolog will capture 0 params (known mlflow issue #6073)
    # wrapper detects this and requires manual params
    return {
        'params': {  # required when autolog fails
            'model.iterations': 10,
            'model.depth': 3,
            'n_features': 10
        },
        'metrics': {
            'train_rmse': 0.15
        }
    }

train()

what happens:

warning: autolog enabled but captured 0 params
  this typically means:
    - framework has no autolog support (e.g., catboost)
    - model was not trained within the mlflow run context
  falling back to manual param logging

model logging example

@mlflow_experiment(
    experiment_name="catboost-model-logging",
    run_name="cb_with_model_20260128",
    tags={'model': 'catboost'},
    log_model=True  # enable model artifact logging (default: True)
)
def train():
    from catboost import CatBoostRegressor
    import pandas as pd
    import numpy as np

    # prepare data
    X_train = pd.DataFrame(np.random.rand(100, 10), columns=[f'f{i}' for i in range(10)])
    y_train = np.random.rand(100)

    model = CatBoostRegressor(iterations=50, depth=4, verbose=False)
    model.fit(X_train, y_train)

    predictions = model.predict(X_train)

    # return model with inputs for signature inference
    return {
        'params': {'iterations': 50, 'depth': 4},
        'metrics': {'train_rmse': 0.12},
        'model': model,  # triggers automatic framework detection and logging
        'model_inputs': X_train.head(5),  # for signature inference
        'model_predictions': predictions[:5]  # for signature output schema
    }

train()

what happens:

wrapper detects framework (catboost)
calls mlflow.catboost.log_model() with proper signature
converts int64 features to double (prevents serving errors with boolean features)
logs input_example for inference validation
adds tags: mlops.model_framework, mlops.model_logged_via

model logging control:

# local development - skip model logging to save storage
@mlflow_experiment(..., log_model=False)

# or via environment variable
export MLFLOW_LOG_MODEL=false  # overrides decorator parameter

important: to avoid duplicate model logging, disable autolog's model logging:

@mlflow_experiment(
    ...,
    autolog_config={'log_models': False}  # let explicit logging handle it
)

project-level integration pattern

# in your project's training/train.py

from mlflow_integration import build_run_name, build_tags  # your helpers
from mlops_wrapper.mlflow_wrapper import mlflow_experiment

def main():
    config = load_configs()

    # project builds metadata from config
    tags = build_tags(config)  # extracts 21+ domain-specific tags
    run_name = build_run_name(
        model_type=config['model']['type'],
        filters=config['data']['filters'],
        timestamp='20251225_143052'
    )

    # pass pre-built metadata to wrapper
    wrapped_train = mlflow_experiment(
        experiment_name="cost-prediction",
        run_name=run_name,
        tags=tags,
        tracking_uri="https://mlflow.example.com",
        autolog_enabled=True
    )(train_pipeline)

    wrapped_train(config)

autolog behavior

autolog enabled (default):

wrapper calls mlflow.autolog() before start_run()
after training, checks if params were captured
if 0 params logged: warns and requires manual params in return dict
if params logged: autolog worked, custom params/metrics optional

supported frameworks:

sklearn: ✅ works
xgboost: ✅ works
lightgbm: ✅ works
pytorch: ✅ works
tensorflow: ✅ works
catboost: ❌ broken (mlflow issue #6073) - manual fallback required

collision detection:

# autolog captures: accuracy, loss
# you return: accuracy, custom_metric

# wrapper behavior:
# - keeps autolog's accuracy (autolog wins)
# - logs your custom_metric
# - warns: "param collision detected: ['accuracy']"
# - suggests: "prefix custom metrics (e.g., 'custom.accuracy')"

model logging

supported frameworks (auto-detected):

catboost: ✅ mlflow.catboost.log_model()
sklearn: ✅ mlflow.sklearn.log_model()
xgboost: ✅ mlflow.xgboost.log_model()
lightgbm: ✅ mlflow.lightgbm.log_model()
pytorch: ✅ mlflow.pytorch.log_model()
tensorflow: ✅ mlflow.tensorflow.log_model()

signature handling:

auto-generates mlflow signature from model_inputs and model_predictions
converts int64 columns to double (prevents schema errors with boolean features)
handles missing values at inference time

when to use:

training runs: enable with log_model=True
local development: disable with log_model=False to save storage
CI/CD: control via MLFLOW_LOG_MODEL environment variable

return dict contract

{
    'params': dict,       # required if autolog fails, optional otherwise
    'metrics': dict,      # optional (custom metrics)
    'model': object,      # optional (trained model - triggers auto logging)
    'model_inputs': DataFrame/array,  # optional (for signature inference, recommend .head(5))
    'model_predictions': array,  # optional (for signature output schema)
    'signature': ModelSignature,  # optional (auto-generated if model_inputs provided)
    'input_example': DataFrame/array,  # optional (auto-generated from model_inputs)
    'artifact_path': str,  # optional (custom model artifact path, default: 'model')
    'artifacts': str,     # optional (path to additional artifact dir/file)
    'model_local_path': str,  # optional (legacy - for model registry)
    'model_name': str     # optional (legacy - for model registry)
}

error handling

@mlflow_experiment(experiment_name="test")
def train():
    raise ValueError("something broke")

# wrapper behavior:
# - catches exception
# - logs tags:
#   - mlops.status = "failed"
#   - mlops.error_type = "ValueError"
#   - mlops.error_message = "something broke"
# - re-raises exception
# - closes mlflow run

ci/cd integration

github actions example

- name: train model
  env:
    MLFLOW_TRACKING_URI: https://mlflow.company.com
    AWS_PROFILE: ml-dev # or set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY or github aws jwt auth
  run: |
    python training/train.py --run-prefix ci-${{ github.run_id }}

wrapper automatically picks up:

MLFLOW_TRACKING_URI env var
AWS_PROFILE for boto3 session

testing

verify wrapper is working

# test_wrapper.py
from mlops_wrapper.mlflow_wrapper import mlflow_experiment
import os

os.environ["MLFLOW_TRACKING_URI"] = "file:./mlruns"

@mlflow_experiment(
    experiment_name="test-run",
    run_name="verify-wrapper-works",
    tags={'test': 'true'}
)
def test_train():
    return {
        'params': {'test_param': 123},
        'metrics': {'test_metric': 0.99}
    }

test_train()
print("✓ check ./mlruns for logged experiment")

run:

python test_wrapper.py
mlflow ui --backend-store-uri file:./mlruns

open http://localhost:5000 and verify:

experiment "test-run" exists
run "verify-wrapper-works" logged
tags, params, metrics present

migration from manual mlflow

before (manual):

def train():
    mlflow.set_tracking_uri(...)
    mlflow.set_experiment(...)

    with mlflow.start_run(run_name=...):
        mlflow.set_tag(...)
        # training
        mlflow.log_param(...)
        mlflow.log_metric(...)
        mlflow.log_artifacts(...)
    # 50+ lines

after (wrapper):

@mlflow_experiment(experiment_name=..., run_name=..., tags={...})
def train():
    # training
    return {'params': {...}, 'metrics': {...}}
# 5 lines

requirements

python >=3.8
mlflow >=2.9.0 (supports 3.x)
boto3 >=1.28.0 (for S3 artifact storage)

contributing

issues and PRs welcome at https://github.com/Iyalvan/mlops_wrapper

license

see LICENSE file

Project details

Release history Release notifications | RSS feed

0.3.1

Apr 6, 2026

This version

0.3.0

Jan 28, 2026

0.1.4

Mar 24, 2025

0.1.2

Mar 23, 2025

0.1.1

Mar 12, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mlops_wrapper-0.3.0.tar.gz (20.5 kB view details)

Uploaded Jan 28, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

mlops_wrapper-0.3.0-py3-none-any.whl (13.1 kB view details)

Uploaded Jan 28, 2026 Python 3

File details

Details for the file mlops_wrapper-0.3.0.tar.gz.

File metadata

Download URL: mlops_wrapper-0.3.0.tar.gz
Upload date: Jan 28, 2026
Size: 20.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for mlops_wrapper-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`0c9345328f4a62f46966afbe8a00b6fc2beaf4339aad4acc2e8fbd46e10a1be0`
MD5	`15b5e25e19d8e185e5720aa157a17c89`
BLAKE2b-256	`c6b34ce259d95b23e6b3e9335fd58e155ac341dc4295a3321dcdc72e19d39720`

See more details on using hashes here.

File details

Details for the file mlops_wrapper-0.3.0-py3-none-any.whl.

File metadata

Download URL: mlops_wrapper-0.3.0-py3-none-any.whl
Upload date: Jan 28, 2026
Size: 13.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for mlops_wrapper-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`63b9168a5fe246084f458d3233f1d7995663afad4a2794cb72c292fe24fb9ac6`
MD5	`2f51ffe51bd91e7e6fd5b582033b6ccc`
BLAKE2b-256	`a7ede2623fc761b2606a1d6a46df4059f0f16b02a1bf186ff06342bbefe7a5af`

See more details on using hashes here.

mlops-wrapper 0.3.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

mlops_wrapper

what is this?

why use this?

design principles

responsibilities

installation

usage

basic example (sklearn - autolog works)

catboost example (autolog fails - manual fallback)

model logging example

project-level integration pattern

autolog behavior

model logging

return dict contract

error handling

ci/cd integration

github actions example

testing

verify wrapper is working

migration from manual mlflow

requirements

contributing

license

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes