Advanced benchmarking for machine learning models.

Benchmark-Adv-ML

Benchmark-Adv-ML is a Python package designed to facilitate advanced benchmarking and analysis of machine learning models. It provides comprehensive pipelines for model stability evaluation, autoencoder training, and survival clustering analysis, enabling users to evaluate model performance, generate predictions, and visualize results through various plots, including AUC curves, feature importance charts, and Kaplan-Meier survival plots.

Features
Installation
Usage
Command-Line Arguments
Dependencies
License
Author

Features

Model Stability Evaluation: Automatically trains several classifiers (Logistic Regression, Support Vector Classifier, Random Forest Classifier) over repeated runs to assess how stable their performance is.
Autoencoder Training: Implements an autoencoder for dimensionality reduction and feature extraction, customizable with various hyperparameters.
Survival Clustering Analysis: Clusters patients on their features and combines the clusters with clinical data to produce Kaplan-Meier survival plots and run log-rank tests.
Prediction and Metrics Generation: Generates and saves predictions, feature importance scores, and various performance metrics for each model and run.
Aggregation of Results: Aggregates results across runs and models for comprehensive analysis, facilitating comparison and evaluation.
Visualization Tools: Generates plots including AUC curves, AUC box plots, feature importance charts, radar charts for model performance comparison, and survival analysis plots.

Installation

You can install the package directly from PyPI:

pip install benchmark-adv-ml

Alternatively, install from source:

git clone https://github.com/VatsalPatel18/benchmark-adv-ml.git
cd benchmark-adv-ml
pip install .

Usage

The package provides a command-line interface (CLI) for ease of use. Below are examples of how to use each component.

Download example data

wget https://raw.githubusercontent.com/VatsalPatel18/benchmark-adv-ml/master/temp_data.csv

Benchmark Machine Learning Models

Run the benchmark ML pipeline to evaluate model stability across multiple runs.

benchmark-adv-ml benchmark --data ./your_dataset.csv --output ./final_results --prelim_output ./prelim_results --n_runs 10 --seed 42
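
To make the idea of a stability run concrete, here is a minimal sketch that uses only the Python standard library. It is not the package's implementation: the nearest-centroid scorer stands in for the real classifiers, and the names `stability_runs` and `auc_score`, the stratified split, the `seed + run` seeding scheme, and the synthetic data are all assumptions made for illustration.

```python
import math
import random
import statistics


def auc_score(labels, scores):
    """AUC = P(random positive scores higher than random negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    return [sum(col) / len(points) for col in zip(*points)]


def stability_runs(features, labels, n_runs=10, seed=42, test_size=0.2):
    """Return one held-out AUC per run; each run reshuffles with its own seed."""
    pos = [x for x, y in zip(features, labels) if y == 1]
    neg = [x for x, y in zip(features, labels) if y == 0]
    aucs = []
    for run in range(n_runs):
        rng = random.Random(seed + run)  # assumed scheme: base seed + run index
        test_x, test_y, train = [], [], {}
        # Stratified split: hold out the same fraction of each class.
        for label, group in ((1, pos), (0, neg)):
            shuffled = group[:]
            rng.shuffle(shuffled)
            n_test = max(1, round(len(shuffled) * test_size))
            test_x += shuffled[:n_test]
            test_y += [label] * n_test
            train[label] = shuffled[n_test:]
        # Stand-in model: distance to the negative centroid minus
        # distance to the positive centroid (higher = more positive).
        c_pos, c_neg = centroid(train[1]), centroid(train[0])
        scores = [math.dist(x, c_neg) - math.dist(x, c_pos) for x in test_x]
        aucs.append(auc_score(test_y, scores))
    return aucs


# Synthetic two-class data, generated only for this illustration.
data_rng = random.Random(0)
X = [(data_rng.gauss(2 * c, 1), data_rng.gauss(2 * c, 1)) for c in (0, 1) for _ in range(100)]
y = [0] * 100 + [1] * 100

aucs = stability_runs(X, y, n_runs=10, seed=42)
print(f"AUC over {len(aucs)} runs: mean={statistics.mean(aucs):.3f}, sd={statistics.stdev(aucs):.3f}")
```

The spread of the per-run AUC values, rather than any single score, is what a stability evaluation looks at: a model whose AUC varies widely between seeds is sensitive to how the data happen to be split.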

Train Autoencoder Model

Train and evaluate an autoencoder model for feature extraction.

benchmark-adv-ml autoencoder --data ./your_dataset.csv --sampleID 'PatientID' --output_dir ./final_results --prelim_output ./prelim_results --latent_dim 10 --epochs 50 --batch_size 32 --validation_split 0.1 --test_size 0.2 --seed 42

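Survival Clustering Analysis

Cluster patients on their features and combine the result with clinical data to produce Kaplan-Meier survival plots and log-rank tests.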

benchmark-adv-ml survival_clustering --data_path ./latent_features.csv --clinical_df_path ./clinical_data.csv --save_dir ./final_results
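
For readers unfamiliar with Kaplan-Meier curves, the estimator behind them is short enough to sketch in plain Python. The survival probability at time t is the product, over every event time t_i up to t, of (1 - d_i / n_i), where d_i is the number of events at t_i and n_i the number of patients still at risk. The package lists lifelines as a dependency and presumably relies on it for this step; the function below is a hypothetical stand-alone illustration, and the follow-up data are invented.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up time per patient.
    events: 1 if the event was observed at that time, 0 if the patient was censored.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        if deaths:  # censored-only times shrink the risk set but add no step
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Toy follow-up data, invented for illustration: the patient at t=2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
# t=1: S(t)=0.750
# t=3: S(t)=0.375
# t=4: S(t)=0.000
```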

Command-Line Arguments

Common Arguments

--data: Path to the existing CSV file containing the dataset.
--output: Directory to save the final results and plots.
--prelim_output: Directory to save the preliminary results (predictions).
--seed: Seed for random state (default is 42).

Benchmark Command Arguments

--target: Target column name in the dataset (default: 'label').
--n_runs: Number of runs for model stability evaluation (default: 20).
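
Because the benchmark depends on the target column being present, it can help to inspect the CSV header before starting a long run. The sketch below is a hypothetical pre-flight check, not part of the package; the helper name `split_columns` and the example column names are assumptions.

```python
import csv
import os
import tempfile


def split_columns(csv_path, target="label"):
    """Read only the header row and separate the target column from the features."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle))
    if target not in header:
        raise ValueError(f"target column {target!r} not found; columns are {header}")
    return [name for name in header if name != target], target


# Demonstration on a throwaway file with hypothetical column names.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(
            [["gene_a", "gene_b", "label"], [0.1, 1.2, 0], [0.4, 0.3, 1]]
        )
    features, target = split_columns(path)
    print(features, target)
```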

Autoencoder Command Arguments

--sampleID: Column name representing the sample or patient ID (default: 'sampleID').
--latent_dim: Dimensionality of the latent space (default: input_dim // 8).
--epochs: Number of training epochs (default: 50).
--batch_size: Training batch size (default: 32).
--validation_split: Proportion of training data to use as validation set (default: 0.1).
--test_size: Proportion of data to use as test set (default: 0.2).
--early_stopping: Enable early stopping (use flag to activate).
--patience: Patience for early stopping (default: 5).
--checkpoint: Enable model checkpointing (use flag to activate).

Survival Clustering Command Arguments

--data_path: Path to the CSV file containing patient features.
--clinical_df_path: Path to the CSV file containing clinical data.
--save_dir: Directory to save the results.
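
The log-rank test mentioned for this command compares the survival experience of two clusters. As a rough illustration of what the test computes, here is a two-group version in plain Python. It is a sketch under stated assumptions, not the package's code (which presumably uses lifelines); the function name and the follow-up times are invented.

```python
import math


def logrank_test(durations_a, events_a, durations_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 degree of freedom."""
    group_a = list(zip(durations_a, events_a))
    pooled = group_a + list(zip(durations_b, events_b))
    observed_a = expected_a = variance = 0.0
    for t in sorted({time for time, event in pooled if event == 1}):
        n_a = sum(1 for time, _ in group_a if time >= t)  # at risk in group A
        n = sum(1 for time, _ in pooled if time >= t)     # at risk overall
        d_a = sum(1 for time, event in group_a if time == t and event == 1)
        d = sum(1 for time, event in pooled if time == t and event == 1)
        observed_a += d_a
        expected_a += d * n_a / n  # events expected in A if the groups did not differ
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    stat = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 d.o.f.: P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Invented follow-up times for two hypothetical clusters; every event is observed.
early = [1, 2, 3, 4, 5]
late = [10, 11, 12, 13, 14]
stat, p = logrank_test(early, [1] * 5, late, [1] * 5)
print(f"chi2={stat:.2f}, p={p:.4f}")
```

A small p-value indicates that the clusters have different survival curves; identical groups give a statistic of 0 and a p-value of 1.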

Dependencies

Python 3.11+
numpy
pandas
scikit-learn
matplotlib
seaborn
tensorflow
lifelines
yellowbrick

License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. See the LICENSE file for details.

Author

Vatsal Patel - VatsalPatel18

Release history

0.2.8 (Dec 9, 2024)
0.2.7 (Dec 9, 2024), this version
0.2.6 (Oct 17, 2024)
0.2.5 (Sep 13, 2024)
0.2.2 (Sep 13, 2024)
0.2.0 (Sep 12, 2024)
0.1.24 (Sep 12, 2024)
0.1.23 (Sep 12, 2024)
0.1.22 (Sep 12, 2024)
0.1.21 (Sep 12, 2024)
0.1.20 (Sep 12, 2024)
0.1.19 (Sep 12, 2024)
0.1.18 (Sep 12, 2024)
0.1.17 (Sep 12, 2024)
0.1.16 (Sep 12, 2024)
0.1.15 (Sep 12, 2024)
0.1.14 (Sep 12, 2024)
0.1.13 (Sep 12, 2024)
0.1.12 (Sep 12, 2024)
0.1.11 (Sep 11, 2024)
0.1.10 (Sep 11, 2024)
0.1.9 (Sep 11, 2024)
0.1.8 (Sep 11, 2024)
0.1.7 (Sep 11, 2024)
0.1.6 (Aug 21, 2024)
0.1.4 (Aug 21, 2024)
0.1.2 (Aug 21, 2024)
0.1.1 (Aug 19, 2024)
0.1.0 (Aug 19, 2024)

Download files

Source Distribution

benchmark_adv_ml-0.2.7.tar.gz (80.2 kB)

Uploaded Dec 9, 2024 Source

Built Distribution

benchmark_adv_ml-0.2.7-py3-none-any.whl (89.3 kB)

Uploaded Dec 9, 2024 Python 3

File details

Details for the file benchmark_adv_ml-0.2.7.tar.gz.

File metadata

Download URL: benchmark_adv_ml-0.2.7.tar.gz
Upload date: Dec 9, 2024
Size: 80.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.11.7 Linux/6.8.0-45-generic

File hashes

Hashes for benchmark_adv_ml-0.2.7.tar.gz
Algorithm	Hash digest
SHA256	`03f65b6217e87abdbdd70d03f257749735deca36c394a51a1f79487b60a3dcb2`
MD5	`7108c86fd58e9b7e16c4e3b2d0c57aae`
BLAKE2b-256	`5f66f058234c0a549abfd420c892a2b36800729b99eb28f99bcdb0d714f77442`

File details

Details for the file benchmark_adv_ml-0.2.7-py3-none-any.whl.

File metadata

Download URL: benchmark_adv_ml-0.2.7-py3-none-any.whl
Upload date: Dec 9, 2024
Size: 89.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.11.7 Linux/6.8.0-45-generic

File hashes

Hashes for benchmark_adv_ml-0.2.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b2d790b9e8c0d311fcab5cb688ffc94f0d7c4cc4d54ebffcc9f8fb70b4d89162`
MD5	`f83be93d8f8966c559828a855a7b9743`
BLAKE2b-256	`25f244659af756f50e99209d383037d229c7c53fc48bdd2b0e80fc5cd6a11f19`
