Scikit-longitudinal is an open-source Python library for longitudinal data analysis that builds on Scikit-learn. It offers specialised tools for the challenges of repeated-measures data and is aimed at (medical) researchers, data scientists, and analysts.

Scikit-longitudinal

A specialised Python library for longitudinal data analysis built on Scikit-learn

🌟 Exciting Update: We're delighted to introduce the brand new v0.1 documentation for Scikit-longitudinal! For a deep dive into the library's capabilities and features, please visit the official documentation.

🎉 PyPI release available: Scikit-longitudinal is now published on PyPI!

💡 About The Project

Scikit-longitudinal (Sklong) is a machine learning library designed to analyse longitudinal data (currently focused on classification tasks). It offers tools and models for processing, analysing, and predicting longitudinal data, with a user-friendly interface that integrates with the Scikit-learn ecosystem.

For further information, please visit the official documentation.

🛠️ Installation

To install Sklong, take these two easy steps:

  1. Install the latest version of Sklong:
pip install Scikit-longitudinal

You can also install a different version of the library by specifying the version number, e.g. pip install Scikit-longitudinal==0.0.1. Refer to the Release Notes for the available versions.

  2. 📦 [MANDATORY] Update the required dependencies (Why? See here)

Scikit-longitudinal incorporates a modified version of Scikit-Learn called Scikit-Lexicographical-Trees, which is available on PyPI.

This revised version guarantees compatibility with the unique features of Scikit-longitudinal. Nevertheless, conflicts may occur with other dependencies in Scikit-longitudinal that also require Scikit-Learn. Follow these steps to prevent any issues when running your project.

🫵 Simple Setup: Command Line Installation

Say you want to try Sklong in a very simple environment, for example one without a proper pyproject.toml file (Poetry, PDM, etc.). Run the following command:

pip uninstall scikit-learn scikit-lexicographical-trees && pip install scikit-lexicographical-trees

Note: Although the main installation command installs both packages, it is advisable to verify that the version actually in use is Scikit-Lexicographical-Trees, to prevent conflicts.
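As a rough way to carry out the check described in the note above, the sketch below uses Python's standard importlib.metadata module to report which of the two distributions are present in the current environment. The helper names (installed_version, describe_sklearn_providers) are illustrative and not part of Sklong. This only inspects installed package metadata; it does not prove which package Python actually imports as sklearn.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe_sklearn_providers(lookup=installed_version):
    """Map each distribution that can provide `sklearn` to its installed version."""
    names = ("scikit-learn", "scikit-lexicographical-trees")
    return {name: lookup(name) for name in names}


# Print a short report for the current environment.
for name, version in describe_sklearn_providers().items():
    print(f"{name}: {version or 'not installed'}")
```

After the command-line setup above, one would expect scikit-lexicographical-trees to show a version and scikit-learn to be reported as not installed.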

🫵 Project Setup: Using `PDM` (or any other such as `Poetry`, etc.)

Imagine you have a project managed by PDM or another package manager. The example below uses PDM, but the process is similar for Poetry and others; consult their documentation for instructions on excluding a package.

To prevent dependency conflicts, exclude Scikit-Learn by adding the following configuration to your pyproject.toml file.

[tool.pdm.resolution]
excludes = ["scikit-learn"]

This exclusion ensures that Scikit-Lexicographical-Trees, which stands in for Scikit-learn, is the only provider of Scikit-learn functionality in your project.

💻 Developer Notes

For developers looking to contribute, please refer to the Contributing section of the official documentation.

🛠️ Supported Operating Systems

Scikit-longitudinal is compatible with the following operating systems:

  • macOS
  • Linux 🐧
  • Windows via Docker only (Docker uses Linux containers) 🪟 (you can try without Docker, but we have not tested it)

🚀 Getting Started

To perform longitudinal analysis with Scikit-Longitudinal, use the LongitudinalDataset class to prepare the dataset. To analyse your data, use the LexicoGradientBoostingClassifier (i.e. Gradient Boosting variant for Longitudinal Data) or another available estimator/preprocessor.

Following that, you can apply the familiar fit, predict, predict_proba, or transform methods in the same way as in Scikit-learn, as shown in the example below.

from sklearn.metrics import classification_report

from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

# Load the dataset and split it into training and test sets
dataset = LongitudinalDataset('./stroke.csv')
dataset.load_data_target_train_test_split(
  target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="Elsa")

# Configure the longitudinal Gradient Boosting variant
model = LexicoGradientBoostingClassifier(
  features_group=dataset.feature_groups(),
  threshold_gain=0.00015
)

# Train on the training split and predict on the test split
model.fit(dataset.X_train, dataset.y_train)
y_pred = model.predict(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, y_pred))
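To give an intuition for what "temporal dependencies" can look like when set manually, here is a minimal pure-Python sketch. It assumes, hypothetically, that longitudinal columns are named with a `_w<number>` suffix (e.g. bmi_w1, bmi_w2) and that a feature group is a list of column indices ordered by wave. Both the naming convention and the helper group_columns_by_wave are illustrative assumptions, not part of Sklong's API; refer to the official documentation for the exact format expected by setup_features_group.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: "<attribute>_w<wave number>", e.g. "bmi_w2".
WAVE_SUFFIX = re.compile(r"^(?P<base>.+)_w(?P<wave>\d+)$")


def group_columns_by_wave(columns):
    """Group column indices by attribute, ordered by wave number.

    Columns without a wave suffix, or attributes measured in a single
    wave only, are treated as non-longitudinal and left out.
    """
    waves = defaultdict(list)
    for index, name in enumerate(columns):
        match = WAVE_SUFFIX.match(name)
        if match:
            waves[match["base"]].append((int(match["wave"]), index))
    return [
        [index for _, index in sorted(members)]
        for members in waves.values()
        if len(members) > 1
    ]


columns = ["age", "bmi_w1", "smoke_w1", "bmi_w2", "smoke_w2", "bmi_w3"]
groups = group_columns_by_wave(columns)
print(groups)  # [[1, 3, 5], [2, 4]]
```

Here bmi is measured in three waves (columns 1, 3, 5) and smoke in two (columns 2, 4), while age has no wave suffix and is excluded from the groups.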

📝 How to Cite?

The paper has been submitted to a conference. In the meantime, to cite the repository, use the "How to cite?" button in the top-right corner of the repository, or open the citation file: CITATION.cff.

🔐 License

MIT License
