A package for automated machine learning, built on scikit-learn and Sklong, for longitudinal machine learning classification tasks.
Auto-Sklong
An Automated Machine Learning library for longitudinal classification built on GAMA and Scikit-longitudinal
💡 About The Project
Auto-Scikit-Longitudinal (Auto-Sklong) is an Automated Machine Learning (AutoML) library built on the General Machine Learning Assistant (GAMA) framework. It introduces a brand-new search space that leverages both Scikit-Longitudinal and Scikit-learn models to tackle longitudinal machine learning classification tasks.
For more scientific details, refer to our paper, published by IEEE in the proceedings of the 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).
Auto-Sklong comes with various search methods to explore the search space introduced, such as Bayesian Optimisation. For more details, visit the official documentation.
🛠️ Installation
[!NOTE] Want to use Jupyter Notebook, Marimo, Google Colab, or JupyterLab? Head to the Getting Started section of the documentation for full instructions! 🎉
To install Auto-Sklong:
✅ Install the latest version:
pip install auto-sklong
To install a specific version:
pip install auto-sklong==0.0.1
[!CAUTION]
Auto-Sklong is currently compatible with Python 3.9 only. Ensure you have this version installed before proceeding. This limitation stems from the Deep Forest dependency. Follow updates on this GitHub issue.
If you encounter errors, explore the installation section in the Getting Started of the documentation. If issues persist, open a GitHub issue.
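Since the caution above limits Auto-Sklong to Python 3.9, a quick interpreter check before running `pip install` can avoid a confusing dependency error. The sketch below is only an illustration: the helper name `is_supported_python` is hypothetical and not part of Auto-Sklong.

```python
import sys

# Supported interpreter version, taken from the caution note above.
SUPPORTED_PYTHON = (3, 9)


def is_supported_python(version_info=None):
    """Return True when the major.minor version equals 3.9.

    `version_info` defaults to the running interpreter; passing a tuple
    such as (3, 11, 0) makes the check easy to try out.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == SUPPORTED_PYTHON


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {found} detected: matches the supported 3.9 series.")
    else:
        print(f"Python {found} detected: Auto-Sklong currently targets Python 3.9 only.")
```

If the check fails, creating a dedicated Python 3.9 virtual environment before installing is usually the simplest route.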
🚀 Getting Started
Here's how to run AutoML on longitudinal data with Auto-Sklong:
from sklearn.metrics import classification_report
from scikit_longitudinal.data_preparation import LongitudinalDataset
from gama.GamaLongitudinalClassifier import GamaLongitudinalClassifier
# Load your dataset (replace 'stroke.csv' with your actual dataset path)
dataset = LongitudinalDataset('./stroke.csv')
# Set up the target column and split the data (replace 'class_stroke_wave_4' with your target)
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)
# Set up feature groups (temporal dependencies)
# Use a pre-set for ELSA data or define manually (See docs for details)
dataset.setup_features_group(input_data="elsa")
# Initialise the AutoML system
automl = GamaLongitudinalClassifier(
    features_group=dataset.feature_groups(),
    non_longitudinal_features=dataset.non_longitudinal_features(),
    feature_list_names=dataset.data.columns.tolist(),
    max_total_time=3600  # Adjust time as needed (in seconds)
)
# Fit the AutoML system
automl.fit(dataset.X_train, dataset.y_train)
# Make predictions
y_pred = automl.predict(dataset.X_test)
# Print the classification report
print(classification_report(dataset.y_test, y_pred))
More detailed examples and tutorials can be found in the documentation!
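For data without a pre-set such as `"elsa"`, the feature groups have to be defined manually. As we understand the Scikit-longitudinal convention, a feature group is a list of column indices holding the successive waves of one feature, and the remaining columns are non-longitudinal. The sketch below derives both from column names. It assumes a `<feature>_wave_<n>` naming scheme (modelled on `class_stroke_wave_4` above) and a column list with the target already removed; `build_feature_groups` is a hypothetical helper, not a library function.

```python
import re
from collections import defaultdict

# Assumed naming convention: "<feature>_wave_<n>", e.g. "smoke_wave_2".
WAVE_PATTERN = re.compile(r"^(?P<base>.+)_wave_(?P<wave>\d+)$")


def build_feature_groups(columns):
    """Split feature columns into longitudinal groups and the rest.

    `columns` lists the feature matrix columns in order (target excluded).
    Returns (feature_groups, non_longitudinal): each group holds the column
    indices of one feature sorted by wave number; non_longitudinal holds the
    indices of columns that do not follow the wave naming scheme.
    """
    waves = defaultdict(list)
    non_longitudinal = []
    for index, name in enumerate(columns):
        match = WAVE_PATTERN.match(name)
        if match:
            waves[match["base"]].append((int(match["wave"]), index))
        else:
            non_longitudinal.append(index)
    # Sorting the (wave, index) pairs orders each group chronologically.
    feature_groups = [
        [index for _, index in sorted(entries)] for entries in waves.values()
    ]
    return feature_groups, non_longitudinal


if __name__ == "__main__":
    columns = ["age", "smoke_wave_1", "smoke_wave_2", "bmi_wave_2", "bmi_wave_1", "sex"]
    groups, static = build_feature_groups(columns)
    print(groups)  # [[1, 2], [4, 3]]
    print(static)  # [0, 5]
```

Check the documentation for the exact format expected by `setup_features_group` before passing a manually built list.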
📝 How to Cite
If you use Auto-Sklong in your research, please cite our paper:
@INPROCEEDINGS{10821737,
author={Provost, Simon and Freitas, Alex A.},
booktitle={2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
title={Auto-Sklong: A New AutoML System for Longitudinal Classification},
year={2024},
volume={},
number={},
pages={2021-2028},
keywords={Pipelines;Optimization;Predictive models;Classification algorithms;Conferences;Bioinformatics;Biomedical computing;Automated Machine Learning;AutoML;Longitudinal Classification;Scikit-Longitudinal;GAMA},
doi={10.1109/BIBM62325.2024.10821737}}
🚀 What's New Compared to GAMA?
We enhanced @PGijsbers' open-source GAMA initiative by introducing a brand-new search space designed specifically for tackling longitudinal classification problems. This search space is powered by our custom library, Scikit-Longitudinal (Sklong), enabling Combined Algorithm Selection and Hyperparameter Optimization (CASH Optimization).
Unlike GAMA or other existing AutoML libraries, Auto-Sklong offers out-of-the-box support for longitudinal classification tasks, a capability not previously available.
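To make the CASH idea concrete: the optimiser searches jointly over which algorithm to use and which hyperparameters to give it. The sketch below is a toy random search over such a joint space, using only the standard library. The search space, the algorithm names, and the scoring function are placeholders made up for illustration; they are not Auto-Sklong's actual search space or search method.

```python
import random

# Hypothetical toy search space: algorithm name -> hyperparameter choices.
SEARCH_SPACE = {
    "decision_tree": {"max_depth": [3, 5, 10]},
    "random_forest": {"n_estimators": [50, 100], "max_depth": [5, 10]},
}


def sample_configuration(rng):
    """Draw one joint (algorithm, hyperparameters) candidate."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    params = {
        name: rng.choice(values)
        for name, values in SEARCH_SPACE[algorithm].items()
    }
    return algorithm, params


def random_search(score, n_trials=20, seed=0):
    """Return the best (score, algorithm, params) found in n_trials draws.

    `score` is any callable taking (algorithm, params) and returning a
    number where higher is better, e.g. a cross-validated metric.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        algorithm, params = sample_configuration(rng)
        value = score(algorithm, params)
        if best is None or value > best[0]:
            best = (value, algorithm, params)
    return best


def toy_score(algorithm, params):
    """Placeholder objective; a real one would train and validate a model."""
    return sum(params.values())


if __name__ == "__main__":
    print(random_search(toy_score, n_trials=30, seed=1))
```

A real AutoML system replaces the uniform sampling with a guided strategy such as Bayesian Optimisation, but the shape of the problem (one choice of algorithm plus its conditional hyperparameters) stays the same.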
Search Space Visualisation:
To better understand the proposed search space, refer to the visualisation below (read from left to right; each step adds one new component to a final pipeline candidate configuration):
[Figure: search space visualisation]
While GAMA offers some configurability for search spaces, we improved its functionality to better suit our needs. You can find the details of our contributions in the following pull requests:
- ConfigSpace Technology Integration for Enhanced GAMA Configuration and Management 🥇
- Search Methods Enhancements to Avoid Duplicate Evaluated Pipelines 🥈
- SMAC3 Bayesian Optimisation Integration 🆕
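One simple way to avoid evaluating the same pipeline twice, as the second contribution aims to, is to key every candidate by a canonical, hashable form of its configuration and reuse the stored score on a repeat. The sketch below illustrates that general idea only; `EvaluationCache` and `configuration_key` are hypothetical names, and the pull request may well implement deduplication differently.

```python
def configuration_key(algorithm, params):
    """Build a hashable key that ignores hyperparameter ordering.

    Assumes hyperparameter values are themselves hashable (numbers, strings).
    """
    return (algorithm, tuple(sorted(params.items())))


class EvaluationCache:
    """Wrap a scoring function so each distinct candidate is scored once."""

    def __init__(self, score):
        self._score = score
        self._results = {}
        self.evaluations = 0  # number of real (non-cached) evaluations

    def evaluate(self, algorithm, params):
        key = configuration_key(algorithm, params)
        if key not in self._results:
            self._results[key] = self._score(algorithm, params)
            self.evaluations += 1
        return self._results[key]


if __name__ == "__main__":
    cache = EvaluationCache(lambda algorithm, params: sum(params.values()))
    cache.evaluate("random_forest", {"n_estimators": 100, "max_depth": 5})
    # Same configuration, different key order: served from the cache.
    cache.evaluate("random_forest", {"max_depth": 5, "n_estimators": 100})
    print(cache.evaluations)  # 1
```

Because pipeline evaluation dominates the time budget in AutoML, skipping repeats leaves more of `max_total_time` for unexplored candidates.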
🔐 License
Auto-Sklong is licensed under the MIT License.