A package for automated machine learning, built on scikit-learn and Sklong, for longitudinal machine learning classification tasks.


Auto-Sklong

An Automated Machine Learning library for longitudinal classification built on GAMA and Scikit-longitudinal


💡 About The Project

Auto-Scikit-Longitudinal (Auto-Sklong) is an Automated Machine Learning (AutoML) library built on the General Machine Learning Assistant (GAMA) framework. It introduces a brand-new search space that combines Scikit-Longitudinal and Scikit-learn models to tackle longitudinal machine learning classification tasks.

Auto-Sklong provides several search methods for exploring this search space, including Bayesian Optimisation.

For more details, visit the official documentation.


🛠️ Installation

[!NOTE] Want to use Jupyter Notebook, Marimo, Google Colab, or JupyterLab? Head to the Getting Started section of the documentation for full instructions! 🎉

To install Auto-Sklong:

  1. ✅ Install the latest version:

    pip install auto-sklong
    

    To install a specific version:

    pip install auto-sklong==0.0.1
    

[!CAUTION] Auto-Sklong is currently compatible with Python 3.9 only. Ensure you have this version installed before proceeding.

This limitation stems from the Deep Forest dependency. Follow updates on this GitHub issue.
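Before installing, it can help to confirm that the active interpreter matches this constraint. The snippet below is a minimal sketch using only the standard library; the helper names `is_supported_python` and `installed_version` are hypothetical and are not part of Auto-Sklong.

```python
import sys
from importlib import metadata

# Supported interpreter version, taken from the compatibility note above.
SUPPORTED_PYTHON = (3, 9)


def is_supported_python(version_info=sys.version_info):
    """Return True when the major.minor version equals 3.9."""
    return tuple(version_info[:2]) == SUPPORTED_PYTHON


def installed_version(distribution="auto-sklong"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print("Python supported:", is_supported_python())
print("auto-sklong version:", installed_version())
```

If the first line prints `False`, creating a dedicated Python 3.9 virtual environment before running `pip install auto-sklong` is likely the simplest fix.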

If you encounter errors, see the installation section of the Getting Started guide in the documentation. If issues persist, open a GitHub issue.


🚀 Getting Started

Here's how to run AutoML on longitudinal data with Auto-Sklong:

from sklearn.metrics import classification_report
from scikit_longitudinal.data_preparation import LongitudinalDataset
from gama.GamaLongitudinalClassifier import GamaLongitudinalClassifier

# Load your dataset (replace 'stroke.csv' with your actual dataset path)
dataset = LongitudinalDataset('./stroke.csv')

# Set up the target column and split the data (replace 'class_stroke_wave_4' with your target)
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Set up feature groups (temporal dependencies)
# Use a pre-set for ELSA data or define manually (See docs for details)
dataset.setup_features_group(input_data="elsa")

# Initialise the AutoML system
automl = GamaLongitudinalClassifier(
    features_group=dataset.feature_groups(),
    non_longitudinal_features=dataset.non_longitudinal_features(),
    feature_list_names=dataset.data.columns.tolist(),
    max_total_time=3600  # Adjust time as needed (in seconds)
)

# Fit the AutoML system
automl.fit(dataset.X_train, dataset.y_train)

# Make predictions
y_pred = automl.predict(dataset.X_test)

# Print the classification report
print(classification_report(dataset.y_test, y_pred))
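The example above relies on the "elsa" pre-set for `setup_features_group`. When defining feature groups manually, it may help to see what such a structure could look like. The sketch below assumes that a feature group is a list of column indices for one attribute, ordered by wave, and that columns follow a hypothetical `<attribute>_wave_<n>` naming convention; the helper `build_feature_groups` is illustrative and not part of Scikit-Longitudinal or Auto-Sklong. Check the documentation for the exact format expected.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) naming convention: "<attribute>_wave_<n>".
WAVE_PATTERN = re.compile(r"^(?P<base>.+)_wave_(?P<wave>\d+)$")


def build_feature_groups(columns):
    """Group column indices by attribute, ordered by wave number.

    `columns` should list the feature columns only (target excluded).
    Returns (feature_groups, non_longitudinal): a list of index lists,
    and the indices of columns that do not match the wave pattern.
    """
    by_base = defaultdict(list)
    non_longitudinal = []
    for index, name in enumerate(columns):
        match = WAVE_PATTERN.match(name)
        if match:
            by_base[match["base"]].append((int(match["wave"]), index))
        else:
            non_longitudinal.append(index)
    # Sort each attribute's columns by wave so earlier waves come first.
    feature_groups = [
        [index for _, index in sorted(waves)] for waves in by_base.values()
    ]
    return feature_groups, non_longitudinal


columns = ["age", "smoke_wave_1", "smoke_wave_2", "bmi_wave_2", "bmi_wave_1", "sex"]
groups, static = build_feature_groups(columns)
print(groups)  # [[1, 2], [4, 3]]
print(static)  # [0, 5]
```

Ordering indices by wave rather than by column position keeps the temporal order intact even when the CSV columns are shuffled, as with the `bmi` columns here.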

📝 How to Cite

If you use Auto-Sklong in your research, please cite our paper:

@INPROCEEDINGS{10821737,
  author={Provost, Simon and Freitas, Alex A.},
  booktitle={2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)}, 
  title={Auto-Sklong: A New AutoML System for Longitudinal Classification}, 
  year={2024},
  volume={},
  number={},
  pages={2021-2028},
  keywords={Pipelines;Optimization;Predictive models;Classification algorithms;Conferences;Bioinformatics;Biomedical computing;Automated Machine Learning;AutoML;Longitudinal Classification;Scikit-Longitudinal;GAMA},
  doi={10.1109/BIBM62325.2024.10821737}}

🚀 What's New Compared to GAMA?

We enhanced @PGijsbers' open-source GAMA initiative by introducing a brand-new search space designed specifically for tackling longitudinal classification problems. This search space is powered by our custom library, Scikit-Longitudinal (Sklong), enabling Combined Algorithm Selection and Hyperparameter Optimization (CASH Optimization).

Unlike GAMA or other existing AutoML libraries, Auto-Sklong offers out-of-the-box support for longitudinal classification tasks—a capability not previously available.
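To make the idea of CASH optimization concrete, the sketch below searches jointly over which algorithm to use and which hyperparameters to give it. It is a toy illustration under stated assumptions, not Auto-Sklong's implementation: the search space, the component names, and the scoring function are hypothetical placeholders, and plain random search stands in for the library's actual search methods.

```python
import random

# Hypothetical toy search space: algorithm name -> hyperparameter choices.
SEARCH_SPACE = {
    "decision_tree": {"max_depth": [2, 4, 8]},
    "random_forest": {"n_estimators": [50, 100], "max_depth": [4, 8]},
}


def sample_candidate(rng):
    """Draw one (algorithm, hyperparameters) pair from the joint space."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    params = {
        name: rng.choice(values)
        for name, values in SEARCH_SPACE[algorithm].items()
    }
    return algorithm, params


def random_search(evaluate, n_trials, seed=0):
    """Return (best_score, (algorithm, params)) over n_trials random draws."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        candidate = sample_candidate(rng)
        score = evaluate(*candidate)
        if best is None or score > best[0]:
            best = (score, candidate)
    return best


def placeholder_score(algorithm, params):
    """Stand-in objective; a real run would use a cross-validated metric."""
    return -abs(params["max_depth"] - 4)


best_score, (best_algorithm, best_params) = random_search(placeholder_score, n_trials=20)
print(best_algorithm, best_params, best_score)
```

The point of the joint formulation is that the algorithm choice and its hyperparameters are sampled and scored together as one candidate, rather than tuning each algorithm separately and comparing afterwards.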

Search Space Viz.:

To better understand our proposed search space, refer to the visualisation below (read from left to right, each step being one new component to a final pipeline candidate configuration):

[Figure: Search Space Visualization]

While GAMA offers some configurability for search spaces, we improved its functionality to better suit our needs. You can find the details of our contributions in the following pull requests:

🔐 License

Auto-Sklong is licensed under the MIT License.
