A Python library that integrates IBM COS with the MLflow artifact registry
MLflow IBM COS Registry
A Python library that integrates IBM Cloud Object Storage (COS) with MLflow for model registry capabilities. This package provides an extended MLflow artifact repository implementation that leverages IBM COS for storing, versioning, and retrieving machine learning models.
Features
- Store and manage ML models in IBM Cloud Object Storage
- Versioning with model fingerprinting
- Special handling of the "latest" model version
- Efficient caching to avoid redundant downloads
- Integration with MLflow's PyFunc model flavor
Installation
Install the package using pip or any other package manager:
pip install mlflow-ibmcos
Or install from source:
git clone https://github.com/donielix/mlflow-ibm-cos-registry.git
cd mlflow-ibm-cos-registry
pip install -e .
Requirements
- Python 3.8 or later
- IBM Cloud Object Storage account
- MLflow 2.15.0 or later
Quick Start
from mlflow_ibmcos import COSModelRegistry
# Initialize registry
registry = COSModelRegistry(
    bucket="my-model-bucket",
    model_name="text-classifier",
    model_version="latest",
    endpoint_url="https://s3.us-south.cloud-object-storage.appdomain.cloud",
    aws_access_key_id="your-access-key",
    aws_secret_access_key="your-secret-key"
)
# Log a model
registry.log_pyfunc_model_as_code(
    model_code_path="path/to/model_code.py",
    artifacts={"model": "path/to/model.pkl"}
)
# Download a model
local_path = registry.download_artifacts(dst_path="models")
# Load a model
model = registry.load_model(local_path)
# Make predictions
predictions = model.predict(data)
Authentication
The registry requires IBM COS credentials which can be provided in several ways:
- Direct parameters:
registry = COSModelRegistry(
    # Required parameters
    model_name="my-model",
    model_version="1.0.0",
    # Authentication parameters
    bucket="my-bucket",
    endpoint_url="https://s3.example.com",
    aws_access_key_id="your-access-key",
    aws_secret_access_key="your-secret-key"
)
- Environment variables:
export AWS_ENDPOINT_URL="https://s3.example.com"
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export COS_BUCKET_NAME="my-bucket"
registry = COSModelRegistry(
    model_name="my-model",
    model_version="1.0.0",
)
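The relationship between the two options can be pictured with a minimal sketch. The helper `resolve_cos_settings` is hypothetical and not part of `mlflow_ibmcos`. It assumes that an explicitly passed argument takes precedence and that the environment variables listed above fill in whatever is missing, which is how the `bucket` parameter is described in the API reference.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable names taken from the Authentication section.
ENV_NAMES = {
    "bucket": "COS_BUCKET_NAME",
    "endpoint_url": "AWS_ENDPOINT_URL",
    "aws_access_key_id": "AWS_ACCESS_KEY_ID",
    "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
}


def resolve_cos_settings(
    env: Optional[Mapping[str, str]] = None,
    **explicit: Optional[str],
) -> Dict[str, str]:
    """Hypothetical helper: explicit arguments win, environment variables fill the gaps."""
    env = os.environ if env is None else env
    settings: Dict[str, str] = {}
    missing = []
    for name, env_name in ENV_NAMES.items():
        # Assumption: an explicit argument overrides the environment variable.
        value = explicit.get(name) or env.get(env_name)
        if value:
            settings[name] = value
        else:
            missing.append(f"{name} (or ${env_name})")
    if missing:
        raise ValueError("Missing COS settings: " + ", ".join(missing))
    return settings
```

Calling `resolve_cos_settings(bucket="my-bucket")` with the three `AWS_*` variables exported would return a complete settings dictionary. With nothing set, it raises a `ValueError` that names every missing value.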
Usage Examples
Uploading Models
Log a PyFunc Model as Code
# Upload a model defined in a Python file
registry.log_pyfunc_model_as_code(
    model_code_path="path/to/model_code.py",
    artifacts={
        "model": "path/to/model.pkl",
        "encoder": "path/to/encoder.pkl"
    }
)
Log Model Artifacts Directly
# Upload model artifacts from a directory
registry.log_artifacts(local_dir="path/to/model_directory")
Downloading Models
# Download model artifacts to a specified directory
model_path = registry.download_artifacts(dst_path="models")
# Download and delete other versions
model_path = registry.download_artifacts(
    dst_path="models",
    delete_other_versions=True,
)
Working with Model Versions
Using the "latest" Tag
The "latest" tag is special and allows you to continually update a model:
registry = COSModelRegistry(
    model_name="my-model",
    model_version="latest",
    # authentication parameters...
)
# Each time you log artifacts, it will update the "latest" version
registry.log_artifacts("path/to/model_dir")
When downloading a model with the "latest" tag, the registry will automatically fetch updates if the remote fingerprint differs from the local one.
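The refresh rule for "latest" comes down to a single comparison. The function below is a hypothetical sketch of that decision, not the library's code, and it assumes fingerprints are plain strings and that a missing local copy is represented by `None`.

```python
from typing import Optional


def latest_needs_refresh(local_fingerprint: Optional[str], remote_fingerprint: str) -> bool:
    """Return True when the local copy of "latest" is missing or out of date."""
    if local_fingerprint is None:
        # Nothing has been downloaded yet, so a download is required.
        return True
    # A differing fingerprint means the remote model changed since the last download.
    return local_fingerprint != remote_fingerprint


# Three illustrative cases: no local copy, matching copy, stale copy.
cases = [(None, "abc"), ("abc", "abc"), ("abc", "def")]
decisions = [latest_needs_refresh(local, remote) for local, remote in cases]
```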
Using Version Numbers
For stable versioning:
registry = COSModelRegistry(
    model_name="my-model",
    model_version="1.0.0",  # Semantic versioning recommended
    # authentication parameters...
)
Version-tagged models are not overwritten when uploaded again. To publish a changed model, use a different version or the "latest" tag.
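The difference between the mutable "latest" tag and fixed version numbers can be summarised as a small policy function. This is only an illustration. `check_upload` and `looks_like_semver` are hypothetical names, the returned action strings are invented for the sketch, and how the library itself reports a refused overwrite is not specified here.

```python
import re

# Simplified MAJOR.MINOR.PATCH pattern. The full Semantic Versioning grammar
# also allows pre-release and build metadata, which this sketch ignores.
SEMVER_PATTERN = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


def looks_like_semver(model_version: str) -> bool:
    """Advisory check only: semantic versions are recommended, not required."""
    return SEMVER_PATTERN.match(model_version) is not None


def check_upload(model_version: str, already_exists: bool) -> str:
    """Hypothetical policy: only "latest" may be replaced in place."""
    if not already_exists:
        return "create"
    if model_version == "latest":
        return "overwrite"
    # A fixed version that already exists is left untouched.
    return "keep-existing"
```

Under this policy, re-uploading "1.0.0" leaves the stored model as it is, while re-uploading "latest" replaces it.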
Deleting Models
# Initialize registry pointing to the model version to delete
registry = COSModelRegistry(
    model_name="my-model",
    model_version="1.0.0",
    # authentication parameters...
)
# Delete the model (requires confirmation)
registry.delete_model_version(confirm=True)
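The `confirm` flag follows a common guard pattern for destructive operations. The sketch below shows that pattern against an in-memory dictionary standing in for the bucket. `delete_version` is a hypothetical function, and raising an exception on a missing confirmation is an assumption; the library may signal a refusal differently.

```python
from typing import Dict


def delete_version(store: Dict[str, bytes], key: str, confirm: bool = False) -> None:
    """Remove `key` from `store` only when the caller confirms explicitly."""
    if not confirm:
        # Refuse by default so a stray call cannot destroy a stored model.
        raise ValueError("Deletion is irreversible; pass confirm=True to proceed.")
    store.pop(key, None)
```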
API Reference
COSModelRegistry
The main class for interacting with the IBM COS model registry.
COSModelRegistry(
    model_name: str,
    model_version: str,
    bucket: Optional[str] = None,
    prefix: Optional[str] = None,
    **kwargs
)
Parameters:
- model_name: Name of the model
- model_version: Version of the model (can be a semantic version or "latest")
- bucket: IBM COS bucket name. If not provided, it will be fetched from the COS_BUCKET_NAME environment variable
- prefix: Custom prefix for the storage path (defaults to "traductor/registry")
- **kwargs: Additional parameters, including:
  - endpoint_url: IBM COS endpoint URL
  - aws_access_key_id: Access key for IBM COS
  - aws_secret_access_key: Secret key for IBM COS
  - config: Additional configuration for the S3 client
Main Methods:
- log_pyfunc_model_as_code(model_code_path, artifacts=None, **kwargs): Log a PyFunc model
- log_artifacts(local_dir, artifact_path=None): Log model artifacts
- download_artifacts(artifact_path=None, dst_path=None, delete_other_versions=False): Download model artifacts
- load_model(model_local_path, **kwargs): Load a downloaded model
- delete_model_version(confirm=False): Delete a model version
Fingerprinting
The registry uses fingerprinting to track model changes and optimize downloads:
- A SHA-512 hash of the model directory is created when logging a model
- When downloading, the fingerprints are compared to avoid redundant downloads
- For "latest" models, differences in fingerprints trigger automatic updates
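To make the idea of a directory fingerprint concrete, here is a minimal sketch that hashes a model directory with SHA-512 using only the standard library. It is an illustration of the technique, not the registry's actual algorithm. The function name `directory_fingerprint` and the choice to hash relative paths along with file contents are assumptions.

```python
import hashlib
from pathlib import Path


def directory_fingerprint(directory: str) -> str:
    """Return a SHA-512 hex digest covering every file path and file content."""
    root = Path(directory)
    digest = hashlib.sha512()
    # Sort paths so the result does not depend on filesystem listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Include the relative path so that renaming a file changes the fingerprint.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        with path.open("rb") as handle:
            # Read in chunks to keep memory use flat for large model files.
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
    return digest.hexdigest()
```

Two directories with identical files produce the same digest, so comparing a stored remote digest with a freshly computed local one is enough to decide whether a download can be skipped.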
Development
Setting Up Development Environment
- Clone the repository
- Install development dependencies:
pip install -e ".[dev]"
- Install pre-commit hooks:
pre-commit install
Running Tests
pytest tests/
For coverage report:
pytest --cov=mlflow_ibmcos tests/
Contact
For issues, questions, or contributions, please contact:
- Daniel Diego Horcajuelo (dadiego91@hotmail.com)