clinikit
Prepared by Berat Kaan SEVEN
A lightweight, sklearn-compatible Python toolkit for tabular machine
learning. clinikit bundles 14 hybrid classifiers, 5 experiment
protocols, calibration utilities, label-noise diagnostics, fairness
audits, and structured HTML reports behind a single drop-in package.
Research and development use only. This is an integration toolkit,
not a regulated product and not a research paper of original methods.
See CITATIONS.md for source-method references.
Why clinikit
clinikit is a complement to existing libraries, not a competitor.
| Library | Focus | Why clinikit is different |
|---|---|---|
| scikit-learn | General-purpose ML | Adds curated experiment protocols, audit utilities, and structured reporting |
| Cleanlab | Label noise only | Integrates Cleanlab plus neighborhood conflict and LOO into one diagnostics module |
| MAPIE | Conformal prediction only | Includes selective classification as one of 14 bundled models |
| Fairlearn / AIF360 | Fairness only | The audit module bundles fairness, leakage, and documentation helpers |
| AutoGluon | AutoML | Library-first; thin AutoML wrappers exist but no auto-magic by default |
| PyHealth | Deep learning for sequence / multimodal | Tabular-only, classical ML focused, lightweight |
Installation
pip install clinikit
Optional dependency groups:
pip install "clinikit[diagnostics]" # Cleanlab-based label-noise tools
pip install "clinikit[explain]" # SHAP and LIME wrappers
pip install "clinikit[automl]" # TabPFN, FLAML, AutoGluon wrappers
pip install "clinikit[synthetic]" # CTGAN / TVAE wrappers
pip install "clinikit[conformal]" # MAPIE conformal prediction
pip install "clinikit[all]" # Everything
Supported Python versions: 3.10, 3.11, 3.12, 3.13.
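To see which of the optional dependency groups above are usable in the current environment, a small check like the following can help. This is a minimal sketch, not part of clinikit: the helper name available_extras and the mapping of extras to import names are assumptions inferred from the upstream projects named in the install comments, and the packages clinikit actually pins may differ.

```python
from importlib.util import find_spec

# ASSUMPTION: import names guessed from the projects named in the extras
# comments (Cleanlab, SHAP, LIME, ...). clinikit's real extras may differ.
EXTRA_IMPORTS = {
    "diagnostics": ["cleanlab"],
    "explain": ["shap", "lime"],
    "automl": ["tabpfn", "flaml", "autogluon"],
    "synthetic": ["ctgan"],
    "conformal": ["mapie"],
}


def available_extras(extra_imports=EXTRA_IMPORTS):
    """Map each extra to True if all of its top-level modules can be found."""
    return {
        extra: all(find_spec(module) is not None for module in modules)
        for extra, modules in extra_imports.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        status = "installed" if ok else "missing"
        print(f"clinikit[{extra}]: {status}")
```

find_spec only locates a module without importing it, so the check stays fast even when heavy packages are installed.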
Quickstart
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from clinikit.datasets import load_pima
from clinikit.metrics import sensitivity, specificity
from clinikit.models import RuleAugmentedClassifier
X, y = load_pima(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = RuleAugmentedClassifier(base_estimator=LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Sensitivity:", sensitivity(y_test, y_pred))
print("Specificity:", specificity(y_test, y_pred))
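For readers unfamiliar with the two metrics printed above, here is a dependency-free sketch of the standard definitions: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The function names sensitivity_sketch and specificity_sketch are hypothetical and exist only for illustration; this is not clinikit's implementation, which may handle edge cases differently.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for a binary problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity_sketch(y_true, y_pred, positive=1):
    """True positive rate: share of actual positives that were detected."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity_sketch(y_true, y_pred, positive=1):
    """True negative rate: share of actual negatives that were rejected."""
    _, _, tn, fp = confusion_counts(y_true, y_pred, positive)
    return tn / (tn + fp) if (tn + fp) else float("nan")


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print("Sensitivity:", sensitivity_sketch(y_true, y_pred))  # 3 of 4 positives -> 0.75
print("Specificity:", specificity_sketch(y_true, y_pred))  # 3 of 4 negatives -> 0.75
```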
For a complete walkthrough, see examples/quickstart.ipynb or open it in Colab.
What is in the box
14 hybrid classifiers (clinikit.models)
All sklearn-compatible, all pass sklearn.utils.estimator_checks.check_estimator.
- RuleAugmentedClassifier
- BoundaryRefineClassifier
- SubgroupThresholdClassifier
- ErrorAwareCalibrator
- MonotonicBooster
- HardSampleWeightedEnsemble
- ClassConditionalImputer
- CrossDistributionDistiller
- SelectiveClassifier
- InstanceAdaptiveThreshold
- DialecticalEnsemble
- LatentSubtypeRouter
- IterativeLabelRefiner
- DualViewCoTrainer
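As an illustration of the idea behind one of the listed model families, selective classification lets a model abstain when it is not confident enough. The sketch below shows only that general technique in plain Python. It is not the SelectiveClassifier implementation, and the function names and the 0.8 threshold are assumptions chosen for the example.

```python
def selective_predict(probas, threshold=0.8):
    """Return the most probable class index per row, or None to abstain.

    probas is a list of per-class probability lists, such as the output of
    a classifier's predict_proba converted to plain lists.
    """
    decisions = []
    for row in probas:
        best = max(range(len(row)), key=row.__getitem__)
        decisions.append(best if row[best] >= threshold else None)
    return decisions


def coverage_and_accuracy(decisions, y_true):
    """Coverage is the share of samples answered; accuracy is over those only."""
    kept = [(d, t) for d, t in zip(decisions, y_true) if d is not None]
    coverage = len(kept) / len(y_true)
    accuracy = sum(d == t for d, t in kept) / len(kept) if kept else float("nan")
    return coverage, accuracy


probas = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90], [0.40, 0.60]]
y_true = [0, 1, 1, 0]
decisions = selective_predict(probas, threshold=0.8)
print(decisions)                                 # [0, None, 1, None]
print(coverage_and_accuracy(decisions, y_true))  # (0.5, 1.0)
```

Raising the threshold trades coverage for accuracy on the retained samples, which is the usual knob in this family of methods.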
Supporting modules
- preprocessing: imputers, scalers, outlier flags, missing indicators
- metrics: sensitivity, specificity, NPV, PPV, F2, MCC, Brier, ECE
- curves: ROC, PR, calibration, Decision Curve Analysis
- protocols: 5 experiment protocols (Defensible, MaxScore, OriginalOnly, Deployment, Audit)
- leaderboard: experiment tracking CSV with 38 columns
- report: HTML structured report generator (Jinja2 templates)
- audit: leakage detection, subgroup fairness, documentation checks
- governance: audit-trail manifest templates (documentation only)
- reproducibility: manifest files (data hash + config + library versions)
- datasets: UCI benchmarks (PIMA, Wisconsin, UCI Heart, Frankfurt)
- cli: Typer-based CLI with the commands train, benchmark, audit, validate, report
- thresholds, calibration, statistics, diagnostics, cost_sensitive, monitor, modelcard, cross_val, explainability, automl, external_val, time_split, active_learning, synthetic
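The reproducibility entry describes a manifest made of a data hash, the run config, and library versions. A minimal standard-library sketch of what such a manifest could contain is shown below. The function name build_manifest and the field names are hypothetical; the format clinikit actually writes is not documented here and may differ.

```python
import hashlib
import json
import platform
from importlib import metadata


def build_manifest(data_bytes, config, packages=()):
    """Collect a data hash, the run config, and installed package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package not installed in this environment
    return {
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "config": config,
        "python": platform.python_version(),
        "packages": versions,
    }


# Toy CSV content stands in for a real dataset file read in binary mode.
manifest = build_manifest(
    data_bytes=b"glucose,bmi,outcome\n148,33.6,1\n85,26.6,0\n",
    config={"test_size": 0.2, "random_state": 42},
    packages=("scikit-learn", "numpy"),
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Hashing the raw bytes rather than a parsed DataFrame keeps the digest independent of parser versions, at the cost of being sensitive to whitespace or line-ending changes in the file.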
Command-line interface
clinikit train --config config.yaml
clinikit benchmark --dataset pima --models all
clinikit audit --data data.csv --report audit.html
clinikit validate --model model.joblib --data data.csv
clinikit report --leaderboard runs.csv --out report.html
Project notes
clinikit is an integration toolkit. The methods it bundles are
adaptations of techniques published in the academic literature; see
CITATIONS.md for source-method references. It is
not a research paper of original methods, and it is not a regulated
product. Research and development use only.
Contributing
Contributions are welcome. Please read CONTRIBUTING.md
for the development workflow, coding standards, and pull-request
process. By participating, you agree to abide by the
Code of Conduct.
Citation
If you use clinikit in academic work, please cite it via the
CITATION.cff file, or use:
@software{clinikit,
  author = {SEVEN, Berat Kaan},
  title = {clinikit: a tabular machine-learning toolkit},
  year = {2026},
  url = {https://github.com/clinikit/clinikit},
  version = {0.1.0}
}
License
Distributed under the MIT License. See LICENSE for the full text.