An easy-to-use, no boilerplate ML experiment tracker.
Project description
Squid ML
A no-boilerplate, ease-to-use AI/ML experiment tracker.
Do you find yourself spending more time setting up MLflow and the related infrastructure, compared to actually building models and data pipelines? Squid ML is here to help - log model training runs, artifacts, metrics, and more, using just 2 lines of code!
Features
- Quickly set up the tracking infrastructure: This repo uses MLfLow for logging experiments, MinIO as an artifact store, and PostgreSQL as the backend store for MLflow.
- Easily log experiments, runs, and artifacts: Use decorators to wrap the pipeline, which can then log the model training and model evaluation metrics.
Scikit-learn,PyTorch, andTensorFloware supported as of Feb 4, 2025.
Installation
- Ensure that
DockerandDocker Compose V2are installed and working on your machine. - Use pip to install the package.
pip install squid-ml
- Building the package from source (optional).
git clone https://github.com/ar-bansal/squid-ml.git
cd squid-ml
python -m build
pip install dist/squid_ml-0.1.1-py3-none-any.whl
Usage
- Start the tracking server: If you already have a tracking server set up, just call
mlflow.set_tracking_uri(...)with your tracking URI.
from squid import Server
# Default project_name is the current working directory's basename
tracking_server = Server(project_name="my-project")
tracking_server.start(
quiet=False, # Setting it to False will print logs on the terminal/cell
use_current_env=True # Automatically fetches Python and MLflow versions from your currently active environment's version for the best compatibility.
)
(OR)
# Python and mlflow version need to be specified the first time.
tracking_server.start(
quiet=False, # Setting it to False will print logs on the terminal/cell
python_version="3.10", # Match with your environment's version for the best compatibility
mlflow_version="2.18.0" # Match with your environment's version for the best compatibility
)
- Use the logging decorators: While wrapping your pipeline, add
*argsand**kwargsas parameters. While calling the function, passexperiment_nameas a keyword argument.
from squid import SklearnLogger
from sklearn.linear_model import LinearRegression
def train_model(model, X_train, y_train):
...
def evaluate_model(model, X_test, y_test):
...
# Default logging_kwargs={}
# Refer to mlflow.sklearn.autolog's documentation for more logging_kwargs
sklearn_logger = SklearnLogger(
logging_kwargs={
"serialization_format": mlflow.sklearn.SERIALIZATION_FORMAT_PICKLE
}
)
@sklearn_logger.log
def run_pipeline(X_train, X_test, y_train, y_test, model, *args, **kwargs):
# call train_model to get the model and training metrics
# call evaluate_model to get the validation/test metrics
# create a single dictionary with all the metrics that need to be logged
# return model and metrics to use the decorator to log the model and metrics
return model, metrics
def main():
model = LinearRegression()
X_train, X_test, y_train, y_test = ...
model, metrics = run_pipeline(
X_train,
X_test,
y_train,
y_test,
model,
experiment_name="my-experiment"
)
Notes
By default, the following values are used as username and passwords for the PostgreSQL and MinIO containers respectively:
DB_USERNAME=dbuser
DB_PASSWORD=dbpassword
ARTIFACT_STORE_ACCESS_KEY=storeuser
ARTIFACT_STORE_SECRET_KEY=storepassword
If you'd like to use different credentials for them, simply set the environment variables using os.
import os
os.environ["DB_USERNAME"] = mynewuser
os.environ["DB_PASSWORD"] = mynewpassword
os.environ["ARTIFACT_STORE_ACCESS_KEY"] = mynewuser
os.environ["ARTIFACT_STORE_SECRET_KEY"] = mynewpassword
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file squid_ml-0.1.2.tar.gz.
File metadata
- Download URL: squid_ml-0.1.2.tar.gz
- Upload date:
- Size: 13.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6f7aa190c8848a8b821e2f44f44a62c0ac3e170f036dc2e1c4c4518c24492be2
|
|
| MD5 |
9adbb471a6d90688b5d7ba97479efcbe
|
|
| BLAKE2b-256 |
fff16cd1894dbd58c7eb7cc69670c7ebb4d3f15b0cb443de4a0acc8f8c371aa4
|
File details
Details for the file squid_ml-0.1.2-py3-none-any.whl.
File metadata
- Download URL: squid_ml-0.1.2-py3-none-any.whl
- Upload date:
- Size: 12.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8058ddfcadbaee5455341a145265fa0dfd765c76f4e25d15b95010fae775e838
|
|
| MD5 |
a78066af956ebd269690bbe4d716fa70
|
|
| BLAKE2b-256 |
ca1f9eb3faf07b7d91f1b52aab5295f6fe4ba28600c06d3efb99b446b4135677
|