A package for automated machine learning, based on scikit-learn and Sklong, to tackle longitudinal machine learning classification tasks.

Auto-Sklong

A specialised Python library for Automated Machine Learning (AutoML) of longitudinal machine learning classification tasks, built upon GAMA

🌟 Exciting Update: We're delighted to introduce the brand new v0.1 documentation for Auto-Sklong! For a deep dive into the library's capabilities and features, please see the official documentation.

🎉 Now on PyPI: Auto-Sklong is published and can be installed with pip!

💡 About The Project

Auto-Scikit-Longitudinal, also called Auto-Sklong, is an automated machine learning (AutoML) library designed to analyse longitudinal data (currently focused on classification tasks) using various search methods: Bayesian Optimisation via SMAC3, Asynchronous Successive Halving, Evolutionary Algorithms, and Random Search, all through the General Automated Machine Learning Assistant (GAMA).
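To give a feel for one of these search methods, here is a minimal sketch of successive halving in plain Python. It is a simplified, synchronous toy and not GAMA's asynchronous implementation; the function names, the `eta` and `min_budget` parameters, and the toy scoring function are all assumptions made for illustration.

```python
import random


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the candidates each round, multiplying the budget by eta."""
    if eta < 2:
        raise ValueError("eta must be at least 2")
    survivors = list(configs)
    if not survivors:
        raise ValueError("at least one configuration is required")
    budget = min_budget
    while len(survivors) > 1:
        # Score every remaining candidate with the current budget, best first.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


def toy_evaluate(config, budget):
    # Hypothetical objective: best at config == 0.3, with noise that shrinks
    # as the budget grows (a stand-in for training longer or on more data).
    rng = random.Random(f"{config}:{budget}")
    return -abs(config - 0.3) + rng.uniform(-0.2, 0.2) / budget


candidates = [i / 10 for i in range(10)]
print(successive_halving(candidates, toy_evaluate))
```

Cheap, noisy evaluations are used to discard weak candidates early, so that most of the budget is spent on the promising ones.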

Auto-Sklong, built upon GAMA, offers a brand-new search space to tackle longitudinal machine learning classification problems, with a user-friendly interface similar to the popular Scikit-learn paradigm.

For further information, please visit the official documentation.

🛠️ Installation

pip install Auto-Sklong

You can also install a specific version of the library by pinning the version number, e.g. pip install Auto-Sklong==0.0.1. Refer to the Release Notes for the available versions.
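To confirm which version ended up in your environment, a small check with the standard library is enough. This is a sketch: the helper name `installed_version` is ours, and it assumes the distribution is registered under the name used in the pip command above.

```python
from importlib import metadata


def installed_version(distribution="Auto-Sklong"):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version, or None when Auto-Sklong is not installed.
print(installed_version())
```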

🚀 What's new compared to GAMA?

We extended @PGijsbers' open-source GAMA project with a new search space that leverages our other newly designed library, Scikit-Longitudinal (Sklong), to tackle longitudinal classification problems via Combined Algorithm Selection and Hyperparameter Optimization (CASH Optimization).

To the best of our knowledge, this was not previously possible with GAMA or any other AutoML library (see the Related Projects section of the official documentation nonetheless).
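For readers new to CASH, the idea is that the algorithm and its hyperparameters are chosen jointly from one search space. The sketch below illustrates this with a plain random search; the search space, algorithm names, and scoring function are hypothetical placeholders and do not reflect Auto-Sklong's actual search space or GAMA's internals.

```python
import random

# Hypothetical joint search space: each algorithm brings its own hyperparameters.
SEARCH_SPACE = {
    "decision_tree": {"max_depth": [2, 4, 8]},
    "k_nearest_neighbours": {
        "n_neighbours": [1, 5, 15],
        "weights": ["uniform", "distance"],
    },
}


def sample_configuration(space, rng):
    """Pick an algorithm first, then one value for each of its hyperparameters."""
    algorithm = rng.choice(sorted(space))
    hyperparameters = {name: rng.choice(values) for name, values in space[algorithm].items()}
    return algorithm, hyperparameters


def random_search(space, score, n_trials=20, seed=0):
    """Return the best (algorithm, hyperparameters) pair found and its score."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        candidate = sample_configuration(space, rng)
        candidate_score = score(*candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_score(algorithm, hyperparameters):
    # Placeholder objective; a real system would cross-validate a fitted pipeline.
    if algorithm == "decision_tree":
        return 0.70 + 0.01 * hyperparameters["max_depth"]
    return 0.75 if hyperparameters["weights"] == "distance" else 0.72


best, best_score = random_search(SEARCH_SPACE, toy_score)
print(best, best_score)
```

Because hyperparameters only make sense for the algorithm they belong to, the sampler draws the algorithm first and then samples within that algorithm's own ranges.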

While GAMA offers a way to update the search space, we had to extend GAMA to support a couple of new features. Note that, in the coming months, the version number of Auto-Sklong may increase quickly because of pull requests currently open on GAMA.

As soon as we are able to publish those on GAMA, there will be a compatibility refactoring to align Auto-Sklong with the most recent version of GAMA. As a result, this section will be removed appropriately.

💻 Developer Notes

For developers looking to contribute, please refer to the Contributing section of GAMA here and Scikit-Longitudinal here.

🛠️ Supported Operating Systems

Auto-Sklong is compatible with the following operating systems:

  • macOS
  • Linux 🐧
  • Windows 🪟 (we recommend running the library within a Docker container under a Linux distribution)

🚀 Getting Started

To run AutoML on your longitudinal data with Auto-Sklong, follow these two easy steps.

  • First, load and prepare your dataset using the LongitudinalDataset class of Sklong.

  • Second, use the GamaLongitudinalClassifier class of Auto-Sklong. After instantiating it, set its hyperparameters or keep the defaults, then call the familiar fit, predict, and predict_proba methods just as you would in Scikit-learn, as shown in the example below. Calling fit automatically searches for the best model and hyperparameters for your dataset.

Refer to the documentation for more information on the GamaLongitudinalClassifier class.

from sklearn.metrics import classification_report
from scikit_longitudinal.data_preparation import LongitudinalDataset
from gama.GamaLongitudinalClassifier import GamaLongitudinalClassifier

# Load your longitudinal dataset
dataset = LongitudinalDataset('./stroke.csv')
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies 
dataset.setup_features_group(input_data="elsa")

# Instantiate the AutoML system
automl = GamaLongitudinalClassifier(
    features_group=dataset.features_group(),
    non_longitudinal_features=dataset.non_longitudinal_features(),
    feature_list_names=dataset.data.columns,
)

# Run the AutoML system to find the best model and hyperparameters
automl.fit(dataset.X_train, dataset.y_train)

# Predictions and prediction probabilities on the held-out test split
label_predictions = automl.predict(dataset.X_test)
probability_predictions = automl.predict_proba(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, label_predictions))

# Export a reproducible script of the champion model
automl.export_script() 
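
If you set the temporal dependencies manually rather than using a preset such as "elsa", it helps to see what a features group could look like. The sketch below assumes Sklong's convention of a list of lists of column indices, one inner list per longitudinal attribute, ordered from the oldest to the most recent wave. It also assumes columns are named `<attribute>_wave_<n>`, as the target name in the example suggests. The helper `build_features_group` is hypothetical and not part of Sklong.

```python
import re
from collections import defaultdict

# Assumed naming convention: "<attribute>_wave_<n>", e.g. "smoke_wave_2".
WAVE_PATTERN = re.compile(r"^(?P<name>.+)_wave_(?P<wave>\d+)$")


def build_features_group(columns):
    """Return (features_group, non_longitudinal) as lists of column indices."""
    waves = defaultdict(list)
    non_longitudinal = []
    for index, column in enumerate(columns):
        match = WAVE_PATTERN.match(column)
        if match:
            waves[match["name"]].append((int(match["wave"]), index))
        else:
            non_longitudinal.append(index)
    # Order each attribute's column indices by wave number, oldest first.
    features_group = [[index for _, index in sorted(pairs)] for pairs in waves.values()]
    return features_group, non_longitudinal


columns = ["age", "smoke_wave_1", "bmi_wave_2", "smoke_wave_2", "bmi_wave_1", "sex"]
features_group, non_longitudinal = build_features_group(columns)
print(features_group)    # [[1, 3], [4, 2]]
print(non_longitudinal)  # [0, 5]
```

In practice the target column would be dropped from the feature columns before grouping, and the result should be checked against the Sklong documentation before it is passed to the classifier.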

📝 How to Cite?

The paper has been submitted to a conference. In the meantime, to cite the repository, use the "How to cite?" button in the top-right corner of the repository, or open the citation file CITATION.cff.

🔐 License

MIT License

Download files

Source Distribution: auto_sklong-0.0.2.tar.gz (399.4 kB)

Built Distribution: auto_sklong-0.0.2-py3-none-any.whl (134.9 kB, Python 3)