Scikit-longitudinal, an open-source Python library for longitudinal data analysis, builds on Scikit-learn's foundation. It offers specialised tools for the challenges of repeated-measures data and is aimed at (medical) researchers, data scientists, and analysts.
Scikit-longitudinal
A specialised Python library for longitudinal data analysis built on Scikit-learn
🌟 Exciting Update: We're delighted to introduce the brand new v0.1 documentation for Scikit-longitudinal! For a deep dive into the library's capabilities and features, please visit here.
🎉 Now available on PyPI: Scikit-longitudinal has been published here!
💡 About The Project
Scikit-longitudinal (Sklong) is a machine learning library designed to analyse longitudinal data (currently focused on classification tasks). It offers tools and models for processing, analysing, and predicting longitudinal data, with a user-friendly interface that integrates with the Scikit-learn ecosystem.
For further information, please visit the official documentation.
🛠️ Installation
To install Sklong, take these two easy steps:
- ✅ Install the latest version of Sklong:
pip install Scikit-longitudinal
You can also install a different version of the library by specifying the version number, e.g. pip install Scikit-longitudinal==0.0.1. Refer to the Release Notes.
- 📦 [MANDATORY] Update the required dependencies (Why? See here)
Scikit-longitudinal incorporates a modified version of Scikit-Learn called Scikit-Lexicographical-Trees, which can be found at this PyPI link. This revised version guarantees compatibility with the unique features of Scikit-longitudinal. Nevertheless, conflicts may occur with other dependencies in Scikit-longitudinal that also require Scikit-Learn. Follow these steps to prevent any issues when running your project.
🫵 Simple Setup: Command Line Installation
Say you want to try Sklong in a very simple environment, such as one without a proper pyproject.toml file (Poetry, PDM, etc.). Run the following command:
pip uninstall scikit-learn scikit-lexicographical-trees && pip install scikit-lexicographical-trees
Note: Although the main installation command installs both, it is advisable to verify that the version in use is Scikit-Lexicographical-Trees to prevent conflicts.
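One way to check the result of the command above is to ask Python which distributions are installed. The sketch below uses only the standard library. The helper name `installed_version` is illustrative, and the distribution names are assumed to match the PyPI names used in this README.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions taken from the pip commands in this README.
    for name in ("scikit-lexicographical-trees", "scikit-learn", "Scikit-longitudinal"):
        found = installed_version(name)
        print(f"{name}: {found if found is not None else 'not installed'}")
```

After the simple setup, one would expect scikit-lexicographical-trees to report a version and scikit-learn to be reported as not installed.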
🫵 Project Setup: Using `PDM` (or any other such as `Poetry`, etc.)
Imagine you have a project managed by PDM or another package manager. The example below uses PDM; the process is similar for Poetry and others, so consult their documentation for instructions on excluding a package.
To prevent dependency conflicts, exclude Scikit-Learn by adding the following configuration to your pyproject.toml file:
[tool.pdm.resolution]
excludes = ["scikit-learn"]
This exclusion ensures that Scikit-Lexicographical-Trees (used in place of Scikit-learn) works seamlessly within your project.
💻 Developer Notes
For developers looking to contribute, please refer to the Contributing section of the official documentation.
🛠️ Supported Operating Systems
Scikit-longitudinal is compatible with the following operating systems:
- MacOS
- Linux 🐧
- Windows 🪟 via Docker only (Docker uses Linux containers). You can try without Docker, but we haven't tested it.
🚀 Getting Started
To perform longitudinal analysis with Scikit-Longitudinal, use the LongitudinalDataset class to prepare the dataset. To analyse your data, use the LexicoGradientBoostingClassifier (i.e. a Gradient Boosting variant for longitudinal data) or another available estimator/preprocessor.
Following that, you can apply the familiar fit, predict, predict_proba, or transform methods in the same way as in Scikit-learn, as shown in the example below.
from sklearn.metrics import classification_report

from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

# Load the CSV and split it into train and test sets
dataset = LongitudinalDataset('./stroke.csv')
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="Elsa")

model = LexicoGradientBoostingClassifier(
    features_group=dataset.feature_groups(),
    threshold_gain=0.00015,
)

model.fit(dataset.X_train, dataset.y_train)
y_pred = model.predict(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, y_pred))
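The example above relies on a pre-set for its temporal dependencies. When they have to be set manually, they must be described for your own columns. The sketch below is not part of the library: it assumes that a features group is a list of lists of column indices, one inner list per longitudinal attribute with its waves in chronological order, and that columns follow a hypothetical `<attribute>_w<wave>` naming scheme. Check the official documentation for the exact format expected by `setup_features_group`.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme: "<attribute>_w<wave number>", e.g. "smoke_w2".
WAVE_SUFFIX = re.compile(r"^(?P<attribute>.+)_w(?P<wave>\d+)$")


def build_features_group(columns):
    """Group column indices by attribute, ordered by wave number."""
    waves_by_attribute = defaultdict(list)
    for index, name in enumerate(columns):
        match = WAVE_SUFFIX.match(name)
        if match:  # columns without a wave suffix are treated as non-longitudinal
            waves_by_attribute[match["attribute"]].append((int(match["wave"]), index))
    return [
        [index for _, index in sorted(waves)]
        for waves in waves_by_attribute.values()
    ]


columns = ["smoke_w1", "smoke_w2", "age", "cholesterol_w2", "cholesterol_w1"]
print(build_features_group(columns))  # [[0, 1], [4, 3]]
```

Here `age` is skipped because it has no wave suffix, and the two `cholesterol` columns are reordered so that wave 1 comes before wave 2.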
📝 How to Cite?
The paper has been submitted to a conference. In the meantime, to cite the repository, use the "How to cite?" button in the top-right corner of the repository, or open the citation file CITATION.cff.
🔐 License
Hashes for scikit_longitudinal-0.0.6.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 0eba83581c0e61196bff6ab314e62c273546df0d07a32f454f6e8ca23ce7a99e
MD5 | db348226b7390fbbffa1c9838228c7d0
BLAKE2b-256 | 235b1a2342cba81afc2b65a2e73925881dc4bb5fe43ef3cc2a6c41f362130b21
Hashes for scikit_longitudinal-0.0.6-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | eaeb6bf63ba31389b2af853392dca57bec912fdd6b4e79ded9aa3127a1fa0f17
MD5 | f16f244ec892c7e488ccecba077390ba
BLAKE2b-256 | 4e86cee9d40623e8d4c983f1772f9986574f36d2d67593076679cfc5f5dbd84e