Quickly compare machine learning models across libraries and datasets.

This library is still in early development. Expect many more features to come :D

MLCompare is a Python package for running model comparison pipelines, with the aim of being both simple and flexible. It supports multiple popular ML libraries, retrieval from multiple online dataset repositories, common data processing steps, and results visualization. You can also use your own models and datasets within the pipelines.

Libraries
  • Scikit-learn
  • XGBoost

Datasets
  • Kaggle
  • Hugging Face
  • OpenML
  • Locally saved

Data Processing
  • Encode: One-hot | Label
  • Drop Columns
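
To make the data processing steps listed above concrete, here is a minimal pure-Python sketch of what label encoding, one-hot encoding, and dropping columns do to a small column of categories. It is an illustration only, not MLCompare's implementation; the helper names `label_encode`, `one_hot_encode`, and `drop_columns` and the sample values are assumptions made for this example.

```python
def label_encode(values):
    """Map each distinct category to an integer, assigned in sorted order."""
    mapping = {category: index for index, category in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


def one_hot_encode(values):
    """Return one 0/1 indicator column per distinct category."""
    categories = sorted(set(values))
    return {
        category: [int(value == category) for value in values]
        for category in categories
    }


def drop_columns(rows, columns):
    """Return copies of the row dictionaries without the named columns."""
    return [
        {key: value for key, value in row.items() if key not in columns}
        for row in rows
    ]


# Hypothetical sample data, used only to show the shape of the results.
soil = ["sandy", "clay", "loam", "clay"]
encoded, mapping = label_encode(soil)   # one integer column
indicators = one_hot_encode(soil)       # one indicator column per category
trimmed = drop_columns([{"id": 1, "soil": "clay"}], ["id"])
```

Label encoding keeps a single column but implies an ordering between categories, while one-hot encoding avoids that at the cost of one extra column per category.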

Installing

pip install mlcompare

Note that on macOS, both XGBoost and LightGBM require libomp, which can be installed with Homebrew:

brew install libomp
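
After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version through the standard library. This is a small sketch using `importlib.metadata`; the helper name `installed_version` is made up for this example and is not part of MLCompare.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if mlcompare is installed in this environment,
# otherwise prints None.
print(installed_version("mlcompare"))
```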

A Simple Example

Running a pipeline with multiple models and datasets is done by making a list of dictionaries for each and passing both lists to a pipeline function.

The example below downloads one dataset from OpenML and one from Kaggle, one-hot encodes some of the columns in the Kaggle dataset, and trains and evaluates a Random Forest and an XGBoost model on both datasets.

import mlcompare

datasets = [
    {
        "type": "openml",
        "id": 8,
        "target": "drinks",
    },
    {
        "type": "kaggle",
        "user": "gorororororo23",
        "dataset": "plant-growth-data-classification",
        "file": "plant_growth_data.csv",
        "target": "Growth_Milestone",
        "onehotEncode": ["Soil_Type", "Water_Frequency", "Fertilizer_Type"],
    }
]

models = [
    {
        "library": "sklearn",
        "name": "RandomForestRegressor",
    },
    {
        "library": "xgboost",
        "name": "XGBRegressor",
        "params": {"max_leaves": 40, "n_estimators": 200},
    },
]

mlcompare.full_pipeline(datasets, models)

In the case of the XGBoost model we passed in our own parameter values rather than using the defaults.
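
Because models are plain dictionaries, the list can also be generated in code, for instance to compare several parameter settings of the same model. The sketch below builds one dictionary per parameter combination using only the "library", "name", and "params" keys shown above. The helper `xgb_variants` and the chosen parameter values are hypothetical and only illustrate the idea.

```python
from itertools import product


def xgb_variants(param_grid):
    """Build one XGBRegressor model dictionary per combination in param_grid."""
    names = list(param_grid)
    return [
        {
            "library": "xgboost",
            "name": "XGBRegressor",
            "params": dict(zip(names, combo)),
        }
        for combo in product(*(param_grid[name] for name in names))
    ]


# Start with a default Random Forest, then add four XGBoost variants.
models = [{"library": "sklearn", "name": "RandomForestRegressor"}]
models += xgb_variants({"n_estimators": [100, 200], "max_depth": [4, 6]})

# The resulting list has the same shape as the hand-written one above and
# could be passed on in the same way:
# mlcompare.full_pipeline(datasets, models)
```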
