Dynamic recursive feature elimination utilities built on scikit-learn.
dRFEtools - dynamic Recursive Feature Elimination
dRFEtools is a package for dynamic recursive feature elimination with
scikit-learn.
Authors: Apuã Paquola, Kynon Jade Benjamin, and Tarun Katipalli
Package developed in Python 3.11+.
In addition to scikit-learn, dRFEtools is built with NumPy, SciPy,
pandas, matplotlib, plotnine, and statsmodels. Currently, dynamic RFE supports
models that expose a coef_ or feature_importances_ attribute.
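As a rough illustration of that requirement, the sketch below pulls one importance value per feature from a fitted scikit-learn estimator. The helper name importance_vector is hypothetical and is not part of dRFEtools; averaging absolute coefficients across classes is an assumption made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression


def importance_vector(fitted_model):
    """Return one non-negative importance per feature (hypothetical helper)."""
    if hasattr(fitted_model, "feature_importances_"):
        # Tree ensembles expose impurity-based importances directly.
        return np.asarray(fitted_model.feature_importances_)
    if hasattr(fitted_model, "coef_"):
        # Linear models expose coefficients; use their magnitudes.
        coef = np.abs(np.atleast_2d(fitted_model.coef_))
        # Multi-class models have one row per class, so average the rows.
        return coef.mean(axis=0)
    raise AttributeError("model exposes neither coef_ nor feature_importances_")


X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=0
)
rf_importance = importance_vector(
    RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
)
lr_importance = importance_vector(LogisticRegression(max_iter=1000).fit(X, y))
```

An estimator with neither attribute raises an error here, which mirrors why such models cannot be ranked during elimination.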
This package provides several functions to run dynamic recursive feature elimination (dRFE) with random forest and linear models, for both classification and regression. Random forest workflows assume that out-of-bag (OOB) scoring is enabled; linear-model workflows build a developmental split internally. For both classification and regression, three metrics are calculated for feature selection:
Classification:
- Normalized mutual information
- Accuracy
- Area under the ROC curve (AUC)
Regression:
- R² (can be negative when the model fits worse than a constant prediction of the mean)
- Explained variance
- Mean squared error
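For reference, all six metrics are available in scikit-learn. The sketch below computes them on toy arrays; the dictionary keys and the data are illustrative assumptions, and the code does not show how dRFEtools calls these functions internally.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    explained_variance_score,
    mean_squared_error,
    normalized_mutual_info_score,
    r2_score,
    roc_auc_score,
)

# Toy classification labels, hard predictions, and predicted probabilities.
y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 1, 0])
y_prob = np.array([0.1, 0.6, 0.8, 0.9, 0.7, 0.2])

classification_metrics = {
    "nmi": normalized_mutual_info_score(y_true, y_pred),
    "accuracy": accuracy_score(y_true, y_pred),
    "roc_auc": roc_auc_score(y_true, y_prob),  # needs scores, not hard labels
}

# Toy regression targets with one good and one deliberately poor prediction.
t_true = np.array([1.0, 2.0, 3.0, 4.0])
t_good = np.array([1.1, 1.9, 3.2, 3.8])
t_poor = np.array([4.0, 1.0, 4.0, 1.0])

regression_metrics = {
    "r2": r2_score(t_true, t_good),
    "explained_variance": explained_variance_score(t_true, t_good),
    "mse": mean_squared_error(t_true, t_good),
}

# R² drops below zero when predictions are worse than the mean of t_true.
r2_poor = r2_score(t_true, t_poor)
```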
Package structure
The repository is organized into focused modules to match the runtime architecture:
- dRFEtools.py – core interfaces for random-forest and developmental-set elimination workflows.
- scoring/ – metric implementations for developmental splits and random-forest OOB scoring.
- lowess/ – helpers for smoothing elimination curves and extracting optimal feature counts.
- metrics/ – feature ranking utilities used during elimination.
- plotting.py – visualization helpers re-exported from the top-level package.
- cli.py – command-line entry points for running full dRFE pipelines.
- utils.py – shared helpers for normalizing results and persisting plots.
Citation
If you use dRFEtools, please cite the following:
Kynon J M Benjamin, Tarun Katipalli, Apuã C M Paquola, dRFEtools: dynamic recursive feature elimination for omics, Bioinformatics, Volume 39, Issue 8, August 2023, btad513, https://doi.org/10.1093/bioinformatics/btad513
PMID: 37632789
DOI: 10.1093/bioinformatics/btad513.
Installation
pip install --user dRFEtools
Tutorials
Two tutorials, one on optimization and one on classification, follow the 0.4.x API documented on Read the Docs.
Example code from the manuscript (scikit-learn simulation, biological simulation, and BrainSEQ Phase 1) is available at the link below.
https://github.com/LieberInstitute/dRFEtools_manuscript
Reference Manual
Core elimination functions
- rf_rfe – Runs random-forest feature elimination and returns a pair of standardized dictionaries: the full history keyed by feature count and the first elimination step. Each entry contains n_features, a metrics mapping appropriate for the task, the original indices, and the indices of surviving features.
- dev_rfe – Performs the same elimination loop for estimators that rely on a developmental split, yielding the same standardized result structure as rf_rfe.
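To make the shape of such an elimination loop concrete, here is a minimal sketch written with plain scikit-learn. It does not call rf_rfe or dev_rfe, whose signatures are not shown here. The function toy_dynamic_rfe, its elimination_rate parameter, and the dictionary keys are assumptions for illustration, as is the choice to drop a fixed fraction of the weakest features at each step.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def toy_dynamic_rfe(X, y, elimination_rate=0.2, random_state=0):
    """Hypothetical sketch: drop a fraction of the weakest features per step."""
    # Hold out a developmental split for scoring each feature subset.
    X_train, X_dev, y_train, y_dev = train_test_split(
        X, y, test_size=0.3, random_state=random_state
    )
    surviving = np.arange(X.shape[1])  # original column indices still in play
    history = {}
    while True:
        model = Ridge(alpha=1.0).fit(X_train[:, surviving], y_train)
        score = r2_score(y_dev, model.predict(X_dev[:, surviving]))
        # Record this step, keyed by the number of features used.
        history[len(surviving)] = {
            "n_features": len(surviving),
            "metrics": {"r2": score},
            "indices": surviving.copy(),
        }
        if len(surviving) == 1:
            break
        # Remove at least one feature, ranked by absolute coefficient.
        n_drop = max(1, int(len(surviving) * elimination_rate))
        order = np.argsort(np.abs(model.coef_))  # weakest first
        surviving = np.sort(surviving[order[n_drop:]])
    return history


X, y = make_regression(
    n_samples=150, n_features=20, n_informative=4, noise=1.0, random_state=0
)
history = toy_dynamic_rfe(X, y)
```

Because the number of dropped features scales with the number remaining, early steps remove several features at once and later steps remove one at a time.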
Ranking and scoring utilities
- features_rank_fnc – Ranks features during elimination and optionally persists the ranking table for each fold.
- Developmental-set metrics (dev_score_*) live under dRFEtools.scoring.dev.
- Random-forest OOB metrics (oob_score_*) live under dRFEtools.scoring.random_forest.
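The idea behind OOB scoring can be sketched with scikit-learn alone: a forest fitted with oob_score=True keeps out-of-bag predictions for every training sample, and metrics are computed from those instead of a held-out set. The snippet below is an assumption-laden illustration and does not reproduce the oob_score_* functions themselves; the dictionary keys are made up for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    normalized_mutual_info_score,
    roc_auc_score,
)

X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, random_state=0
)

# oob_score=True makes the forest keep out-of-bag predictions per sample.
forest = RandomForestClassifier(
    n_estimators=200, oob_score=True, random_state=0
).fit(X, y)

# Column k holds the OOB probability of class forest.classes_[k].
oob_prob = forest.oob_decision_function_
oob_label = forest.classes_[np.argmax(oob_prob, axis=1)]

oob_metrics = {
    "nmi": normalized_mutual_info_score(y, oob_label),
    "accuracy": accuracy_score(y, oob_label),
    "roc_auc": roc_auc_score(y, oob_prob[:, 1]),  # binary task: positive class
}
```

The accuracy computed this way should agree with the forest's own oob_score_ attribute, which is a convenient sanity check.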
LOWESS helpers
- extract_max_lowess – Identifies the optimal feature count from the LOWESS-smoothed elimination curve.
- extract_peripheral_lowess – Detects the inflection point associated with peripheral features.
- optimize_lowess_plot – Visualizes the LOWESS curve with annotations about the selected feature counts.
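The general technique of smoothing an elimination curve and reading off its maximum can be sketched with the lowess function from statsmodels. The synthetic curve, the frac value, and the use of a log10 feature-count axis are assumptions for illustration; the helpers listed above may choose the cutoff differently.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# Synthetic elimination curve: the score peaks near 64 features, plus mild noise.
rng = np.random.default_rng(0)
n_features = np.array([1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024])
log_n = np.log10(n_features)
score = 0.9 - 0.05 * (log_n - np.log10(64)) ** 2
score = score + rng.normal(0.0, 0.001, log_n.size)

# Smooth score against log10(feature count); output rows are sorted by x.
smoothed = lowess(score, log_n, frac=0.5, return_sorted=True)

# Take the feature count where the smoothed curve is highest.
best_row = int(np.argmax(smoothed[:, 1]))
best_n_features = int(round(10 ** smoothed[best_row, 0]))
```

Smoothing before taking the maximum keeps a single noisy step from dictating the selected feature count.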
Plotting functions
Plotting helpers are defined in dRFEtools.plotting and re-exported from the
top-level package:
- plot_metric – Render elimination trajectories for individual metrics.
- plot_with_lowess_vline – Overlay LOWESS-derived selection cutoffs on the metric trajectory plot.
Utilities and CLI
- normalize_rfe_result, get_feature_importances, and save_plot_variants are available under dRFEtools.utils and support the standardized dictionary-based API.
- ensure_path is provided to safely normalize user-supplied file paths.
- The command-line interface in dRFEtools.cli wraps the same workflows for CSV inputs: run python -m dRFEtools.cli --help to explore available commands, including custom metric selection and development-split sizing.
File details
Details for the file drfetools-0.4.0.tar.gz.
File metadata
- Download URL: drfetools-0.4.0.tar.gz
- Upload date:
- Size: 27.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.2.1 CPython/3.11.11 Linux/5.14.0-570.52.1.el9_6.x86_64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 56a33c791f3d0c7ea93e815f48609ed0bbe313cada6da64f82161f8a56424022 |
| MD5 | 5c8a0461370fe514df5dbca4a149395b |
| BLAKE2b-256 | ac86f039733af28ac51b03ef15ea82fe9a0a8798f1546aed6af3f7b2cac3d92c |
File details
Details for the file drfetools-0.4.0-py3-none-any.whl.
File metadata
- Download URL: drfetools-0.4.0-py3-none-any.whl
- Upload date:
- Size: 31.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.2.1 CPython/3.11.11 Linux/5.14.0-570.52.1.el9_6.x86_64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f1ac628e5866c5e01654eaeca4cd64b542bcbf77d50323478f390fd1a82da4a2 |
| MD5 | 6ae7fa92a34f05be97b5df1e5396ecab |
| BLAKE2b-256 | f3873a9e7599d00f9b29a240b3c4851d5aa7bbcddf55f60292b533366f844055 |