Scikit-longitudinal, an open-source Python library for longitudinal data analysis, builds on Scikit-learn's foundation. It offers specialised tools for the challenges of repeated-measures data and is aimed at (medical) researchers, data scientists, and analysts.
Scikit-longitudinal
A specialised Python library for longitudinal data analysis built on Scikit-learn
🌟 Exciting Update: We're delighted to introduce the brand new v0.1 documentation for Scikit-longitudinal! For a deep dive into the library's capabilities and features, please visit here.
🎉 Available on PyPI: Scikit-longitudinal is now published on PyPI!
💡 About The Project
Scikit-longitudinal (Sklong) is a machine learning library designed to analyse longitudinal data (currently focused on classification tasks). It offers tools and models for processing, analysing, and predicting longitudinal data, with a user-friendly interface that integrates with the Scikit-learn ecosystem.
For further information, please visit the official documentation.
🛠️ Installation
To install Sklong, take these two easy steps:
- ✅ Install the latest version of Sklong:
pip install Scikit-longitudinal
You can also install a different version of the library by specifying the version number, e.g. pip install Scikit-longitudinal==0.0.1. Refer to the Release Notes for the available versions.
- 📦 [MANDATORY] Update the required dependencies (Why? See here)
Scikit-longitudinal incorporates a modified version of Scikit-Learn called Scikit-Lexicographical-Trees, which can be found at this PyPI link. This revised version guarantees compatibility with the unique features of Scikit-longitudinal. Nevertheless, conflicts may occur with other dependencies in Scikit-longitudinal that also require Scikit-Learn. Follow these steps to prevent any issues when running your project.
🫵 Simple Setup: Command Line Installation
Say you want to try Sklong in a very simple environment, for example one without a proper pyproject.toml file (Poetry, PDM, etc.).
Run the following command:
pip uninstall scikit-learn scikit-lexicographical-trees && pip install scikit-lexicographical-trees
Note: Although the main installation command installs both packages, it is advisable to verify that the Scikit-learn implementation actually in use is Scikit-Lexicographical-Trees, to prevent conflicts.
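The check suggested in the note above can be sketched in a few lines of Python. This is a minimal sketch that relies only on the standard library's importlib.metadata; the distribution names are taken from the pip command above, while the helper names installed_version and report_installed are hypothetical and not part of Sklong.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the pip command above.
DISTRIBUTIONS = ("scikit-learn", "scikit-lexicographical-trees")


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def report_installed(dist_names=DISTRIBUTIONS):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    # Hypothetical helper output: one line per distribution.
    for name, found in report_installed().items():
        print(f"{name}: {found or 'not installed'}")
```

After the command above, one would expect scikit-lexicographical-trees to report a version and scikit-learn to be reported as not installed.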
🫵 Project Setup: Using `PDM` (or any other such as `Poetry`, etc.)
Imagine you have a project managed by PDM or any other package manager. The example below demonstrates PDM; the process is similar for Poetry and others, so consult their documentation for instructions on excluding a package.
To prevent dependency conflicts, exclude Scikit-Learn by adding the following configuration to your pyproject.toml file:
[tool.pdm.resolution]
excludes = ["scikit-learn"]
This exclusion ensures that Scikit-Lexicographical-Trees (which stands in for Scikit-learn) is used seamlessly within your project.
💻 Developer Notes
For developers looking to contribute, please refer to the Contributing section of the official documentation.
🛠️ Supported Operating Systems
Scikit-longitudinal is compatible with the following operating systems:
- macOS
- Linux 🐧
- Windows via Docker only (Docker uses Linux containers) 🪟 (Running natively without Docker may work, but we have not tested it.)
🚀 Getting Started
To perform longitudinal analysis with Scikit-Longitudinal, use the LongitudinalDataset class to prepare the dataset. To analyse your data, use the LexicoGradientBoostingClassifier (i.e. a Gradient Boosting variant for longitudinal data) or another available estimator/preprocessor.
Following that, you can apply the familiar fit, predict, predict_proba, or transform methods in the same way as in Scikit-learn, as shown in the example below.
from sklearn.metrics import classification_report

from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

# Load the dataset and split it into train and test sets
dataset = LongitudinalDataset('./stroke.csv')
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="Elsa")

model = LexicoGradientBoostingClassifier(
    features_group=dataset.feature_groups(),
    threshold_gain=0.00015,
)

model.fit(dataset.X_train, dataset.y_train)
y_pred = model.predict(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, y_pred))
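To give an intuition for the "temporal dependencies" set in the example above, the sketch below groups column positions by longitudinal attribute across waves. It is an illustration only, not Sklong's implementation: it assumes that columns follow a `<name>_wave_<n>` naming pattern (as the target column class_stroke_wave_4 suggests) and that a features group can be represented as a list of index lists, one per attribute. The column names and the function group_columns_by_wave are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming convention: "<attribute>_wave_<number>".
WAVE_PATTERN = re.compile(r"^(?P<name>.+)_wave_(?P<wave>\d+)$")


def group_columns_by_wave(columns):
    """Group column indices by attribute name, ordered by wave number.

    Columns that do not match the wave pattern are treated as
    non-longitudinal and left out of every group.
    """
    grouped = defaultdict(list)
    for index, column in enumerate(columns):
        match = WAVE_PATTERN.match(column)
        if match:
            wave = int(match.group("wave"))
            grouped[match.group("name")].append((wave, index))
    # Sort each attribute's columns by wave and keep only the indices.
    return [[idx for _, idx in sorted(pairs)] for pairs in grouped.values()]


# Hypothetical column layout, for illustration only.
columns = ["age", "smoke_wave_1", "bmi_wave_1", "smoke_wave_2", "bmi_wave_2"]
groups = group_columns_by_wave(columns)
print(groups)
```

Here "age" is left out because it has no wave suffix, while the two smoke columns and the two bmi columns each form one group ordered from the earliest to the latest wave.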
📝 How to Cite?
The paper has been submitted to a conference. In the meantime, to cite the repository, use the "How to cite?" button in the top-right corner of the repository, or open the citation file: CITATION.cff.
🔐 License
Download files
Hashes for scikit_longitudinal-0.0.5.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 842f0bd138f6d831b028439683a43312d4a7f8c7b6599521411534984dc93716
MD5 | b4980a683bad8c9e73efc6221e0a6703
BLAKE2b-256 | 2f6305544789255234d5c7d1985b21b1eb26a7cf86719bc5cc115d70429a078b
Hashes for scikit_longitudinal-0.0.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | c45eb8cc795e375aa91cfbd40dbfe760dfcb40126c799d33375d337bb0aaaa61
MD5 | a2ba92255f6d60fbc439539c5ad8751a
BLAKE2b-256 | 643387b64443c2754a2e955386c41b525636d94aeb17b59817705260f4bddda7
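A downloaded file can be checked against the SHA256 digests listed above with Python's standard hashlib module. This is a minimal sketch; the function names sha256_of_file and matches_published_hash are hypothetical, and the path to the downloaded archive is whatever location you saved it to.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

To use it, call matches_published_hash with the path of the downloaded file and the SHA256 value from the matching table; a result of False suggests the download is corrupted or is not the published file.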