CleverTap internal MLflow client library

ct-mlflow-lib

Thin MLflow client library for CleverTap data science products.

Application code should use only this package for tracking (metrics, params, artifacts). Do not import mlflow in product code; use ct_mlflow_lib helpers so logging is a no-op when no run is active.

Install

Pin a git tag in your requirements.txt or pyproject.toml:

pip install "git+https://github.com/CleverTap-DS/MLflow@v0.1.0#subdirectory=ct_mlflow_lib"

Or with uv:

uv add "ct-mlflow-lib @ git+https://github.com/CleverTap-DS/MLflow@v0.1.0#subdirectory=ct_mlflow_lib"

Optional: Keras support

To use MLflowKerasCallback, install with the keras extra:

pip install "ct-mlflow-lib[keras] @ git+https://github.com/CleverTap-DS/MLflow@v0.1.0#subdirectory=ct_mlflow_lib"

The core library works without TensorFlow. MLflowKerasCallback will raise an ImportError if TensorFlow is not installed.

Quick Start

The primary API is try_init_product(). Pass the task, product, and env names; the experiment name is built as {product}.{env}.{task}.

import logging

import ct_mlflow_lib

logger = logging.getLogger(__name__)

success, reason = ct_mlflow_lib.try_init_product(
    "train",
    product="recommendation",
    env="prod",
    tags={
        "account_id": "123",
        "catalog_id": "456",
    },
)

if not success:
    # Init failed: reason says why (missing config, connection error, ...)
    logger.warning(f"MLflow disabled: {reason}")
else:
    ct_mlflow_lib.log_metric("auc", 0.87)
    ct_mlflow_lib.log_param("lr", 0.01)
    ct_mlflow_lib.end_mlflow(exit_code=0)

Logging helpers (log_metric, log_metrics, log_param, log_params, set_tag, log_artifact) are safe to call even when init failed or no run is active: they no-op.

API Reference

try_init_product(task: str, product: str, env: str, tags: dict | None = None) -> tuple[bool, str | None]

Initialize MLflow for a product run. Returns a (success, reason) status tuple instead of raising exceptions.

Args:

  • task: Task segment only (e.g. "train", "eval"). Full experiment name is {product}.{env}.{task} (e.g. recommendation.prod.train).
  • product: Product name (e.g. "recommendation", "prediction").
  • env: Deployment environment (e.g. "prod", "staging", "dev").
  • tags: Optional dict of product-specific tags (merged with defaults: product, env).

Returns: Tuple of (success: bool, reason: str | None)

  • If success=True: MLflow is initialized and a run is active. reason=None.
  • If success=False: MLflow init failed. reason explains why (e.g. MLFLOW_TRACKING_URI is not set, or a connection error).

Never raises. Gracefully handles missing config, network errors, etc.
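
The "never raises" contract can be pictured with a minimal sketch. This is not the library's implementation: `try_call` and the `init` argument are hypothetical names, and the only behavior assumed is what this section states, namely that a failure is reported as (False, reason) rather than raised.

```python
from typing import Callable, Optional, Tuple


def try_call(init: Callable[[], None]) -> Tuple[bool, Optional[str]]:
    """Hypothetical sketch: turn any exception from init() into a status tuple."""
    try:
        init()  # stand-in for whatever sets up tracking
    except Exception as exc:  # broad on purpose: the caller must never see a raise
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


def failing_init() -> None:
    raise ConnectionError("tracking server unreachable")


ok_status = try_call(lambda: None)
failed_status = try_call(failing_init)
```

The caller then branches on the first element of the tuple and treats the second as a log message, which is how the Quick Start uses the real function.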

log_metric, log_metrics, log_param, log_params, set_tag, log_artifact

Delegate to MLflow when a run is active; otherwise no-op. Never raise.

end_mlflow(exit_code: int | None = None, error_message: str | None = None) -> None

End the current MLflow run and log exit status. Call this at the end of a job.

Never raises. Safe to call even if no run is active.

try:
    # ... job code that may raise ...
    ct_mlflow_lib.log_metric("auc", 0.87)
except Exception as e:
    ct_mlflow_lib.end_mlflow(exit_code=1, error_message=str(e))
    raise  # re-raise so the job itself still fails visibly
else:
    ct_mlflow_lib.end_mlflow(exit_code=0)
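
If many jobs repeat this pattern, it could be wrapped in a context manager. The sketch below is a suggestion, not part of the library: `tracked_job` is a hypothetical name, and it assumes only that the function passed in accepts the `exit_code` and `error_message` keyword arguments shown in the signature above.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def tracked_job(end_run: Callable[..., None]) -> Iterator[None]:
    """Hypothetical helper: end the run exactly once, with the right status."""
    try:
        yield
    except Exception as exc:
        end_run(exit_code=1, error_message=str(exc))
        raise  # keep the job's own failure visible to the caller
    else:
        end_run(exit_code=0)
```

In product code this would presumably be used as `with tracked_job(ct_mlflow_lib.end_mlflow): ...`, with the training logic inside the `with` block.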

is_mlflow_active() -> bool

Check if a run is currently active.

get_mlflow_run_id() -> str | None

Get the current run ID, or None if no run is active.

MLflowKerasCallback(prefix: str = "")

Keras callback for logging epoch-level metrics to MLflow. Use with model.fit().

import tensorflow as tf
from ct_mlflow_lib import MLflowKerasCallback

model = tf.keras.Sequential([...])
model.fit(
    x_train, y_train,
    epochs=10,
    callbacks=[MLflowKerasCallback(prefix="model")],
)

Experiment naming convention

The experiment name is always {product}.{env}.{task}.

| product | env | task | Experiment name |
| --- | --- | --- | --- |
| recommendation | prod | train | recommendation.prod.train |
| prediction | staging | train | prediction.staging.train |
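
The convention is simple enough to mirror in a few lines, which can be handy when looking up experiments from a notebook or a dashboard script. Both function names below are hypothetical and not part of ct_mlflow_lib; the inverse additionally assumes that no segment contains a dot.

```python
def experiment_name(product: str, env: str, task: str) -> str:
    """Hypothetical helper mirroring the {product}.{env}.{task} convention."""
    return f"{product}.{env}.{task}"


def split_experiment_name(name: str) -> tuple[str, str, str]:
    """Inverse of experiment_name; assumes no segment contains a dot."""
    product, env, task = name.split(".")
    return product, env, task


name = experiment_name("recommendation", "prod", "train")
parts = split_experiment_name("prediction.staging.train")
```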

Required environment variables

Set these in ECS task definitions, Batch job definitions, EC2 launch templates, or local .env.

| Variable | Required for init | Description | Example |
| --- | --- | --- | --- |
| MLFLOW_TRACKING_URI | Yes | MLflow server URL (no # fragment) | https://mlflow.example.com |

If MLFLOW_TRACKING_URI is missing, try_init_product() returns (False, reason) instead of raising.
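
A deployment script might want to catch a bad value before the job starts. The following preflight check is a sketch under assumptions: `tracking_uri_problem` is a hypothetical function, the library's own validation may differ, and the http(s)-only rule is inferred from the "MLflow server URL" description rather than stated by the source.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def tracking_uri_problem(environ: Mapping[str, str]) -> Optional[str]:
    """Return a reason string if the URI looks unusable, else None."""
    uri = environ.get("MLFLOW_TRACKING_URI", "").strip()
    if not uri:
        return "MLFLOW_TRACKING_URI is not set"
    parts = urlsplit(uri)
    # Assumption: the tracking server is always reached over http(s).
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return f"MLFLOW_TRACKING_URI is not an http(s) URL: {uri!r}"
    if "#" in uri:
        # URLs copied from the MLflow UI often carry a fragment like #/experiments.
        return "MLFLOW_TRACKING_URI must not contain a # fragment"
    return None
```

Passing `os.environ` checks the current process; passing a plain dict makes the function easy to test.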

Cloudflare Access (production)

When the tracking server is behind Cloudflare Access, batch jobs must send a service token on every HTTP request. This library registers an MLflow RequestHeaderProvider entry point that adds the headers when credentials are present.

| Variable | When | Description |
| --- | --- | --- |
| CF_ACCESS_CLIENT_ID | Production behind Access | Service token client ID |
| CF_ACCESS_CLIENT_SECRET | Production behind Access | Service token client secret |

If either variable is unset, the provider does not inject headers (safe for local dev or servers not behind Access).
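
The header logic can be sketched without MLflow installed. The class below is a hypothetical stand-in, not the provider this library registers: its two method names mirror MLflow's RequestHeaderProvider interface, the header names are the ones Cloudflare documents for Access service tokens, and treating an empty value the same as an unset one is an assumption.

```python
import os
from typing import Dict, Mapping, Optional


class AccessHeaderSketch:
    """Hypothetical stand-in for a header provider; not the library's class."""

    def __init__(self, environ: Optional[Mapping[str, str]] = None) -> None:
        self._environ = os.environ if environ is None else environ

    def in_context(self) -> bool:
        # Active only when BOTH credentials are present and non-empty.
        return bool(
            self._environ.get("CF_ACCESS_CLIENT_ID")
            and self._environ.get("CF_ACCESS_CLIENT_SECRET")
        )

    def request_headers(self) -> Dict[str, str]:
        if not self.in_context():
            return {}  # local dev or a server not behind Access
        return {
            "CF-Access-Client-Id": self._environ["CF_ACCESS_CLIENT_ID"],
            "CF-Access-Client-Secret": self._environ["CF_ACCESS_CLIENT_SECRET"],
        }
```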

Error handling examples

Example 1: MLflow is optional (typical case)

import logging

import ct_mlflow_lib

logger = logging.getLogger(__name__)
account_id = "123"  # placeholder; supply the real account ID

success, reason = ct_mlflow_lib.try_init_product(
    "train",
    product="recommendation",
    env="prod",
    tags={"account_id": account_id},
)

if not success:
    logger.warning(f"MLflow not available ({reason}), training without tracking")
else:
    logger.info(f"MLflow tracking enabled (run: {ct_mlflow_lib.get_mlflow_run_id()})")

# Safe without guards: helpers no-op when inactive
ct_mlflow_lib.log_metric("auc", 0.87)
ct_mlflow_lib.end_mlflow(exit_code=0)

Example 2: Wrap in product-specific helper

import logging

import ct_mlflow_lib

logger = logging.getLogger(__name__)

def init_mlflow_for_recommendation(account_id, catalog_id, recommendation_id, env):
    """Start a recommendation training run; return True if tracking is active."""
    success, reason = ct_mlflow_lib.try_init_product(
        "train",
        product="recommendation",
        env=env,
        tags={
            # Convert IDs to strings before passing them as tags
            "account_id": str(account_id),
            "catalog_id": str(catalog_id),
            "recommendation_id": str(recommendation_id),
        },
    )
    if not success:
        logger.warning(f"MLflow disabled: {reason}")
    return success

Development

Run pre-commit install before coding.

Install dev dependencies, then run tests:

cd ct_mlflow_lib
uv sync --group dev
pytest tests/

With pip only (no uv), use the optional dev extra:

pip install -e ".[dev]"
pytest tests/
