SHAP-based recursive feature elimination with cross-validation and early stopping

recursive-pietro

A drop-in, sklearn-compatible replacement for Probatus ShapRFECV: faster, cleaner, and usable inside sklearn pipelines.
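
To make the ranking step concrete: SHAP-based elimination typically scores each feature by its mean absolute SHAP value across samples and drops the lowest-ranked features each round. The sketch below shows that idea in plain Python on made-up SHAP values. `mean_abs_importance` and `features_to_drop` are hypothetical helpers written for illustration, not part of recursive-pietro, and the library's actual aggregation may differ.

```python
def mean_abs_importance(shap_rows):
    """Average |SHAP| per feature; shap_rows holds one list of values per sample."""
    n_samples = len(shap_rows)
    n_features = len(shap_rows[0])
    return [
        sum(abs(row[j]) for row in shap_rows) / n_samples
        for j in range(n_features)
    ]


def features_to_drop(feature_names, shap_rows, n_drop):
    """Return the n_drop feature names with the lowest mean |SHAP| (hypothetical helper)."""
    importance = mean_abs_importance(shap_rows)
    ranked = sorted(zip(importance, feature_names))  # ascending importance
    return [name for _, name in ranked[:n_drop]]


# Toy SHAP values for 3 samples and 4 features (made-up numbers).
names = ["f0", "f1", "f2", "f3"]
shap_rows = [
    [0.50, -0.01, 0.20, 0.00],
    [-0.40, 0.02, -0.10, 0.01],
    [0.60, -0.03, 0.30, -0.02],
]

dropped = features_to_drop(names, shap_rows, n_drop=2)
print(dropped)  # the two least important features, weakest first
```

In a full elimination loop this ranking would be recomputed after every refit, since SHAP values change once features are removed.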

Installation

pip install recursive-pietro

With optional boosting-library support:

pip install "recursive-pietro[lightgbm]"   # LightGBM early stopping
pip install "recursive-pietro[xgboost]"    # XGBoost early stopping
pip install "recursive-pietro[catboost]"   # CatBoost early stopping
pip install "recursive-pietro[plot]"       # matplotlib plotting
pip install "recursive-pietro[all]"        # everything

Quick start

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from recursive_pietro import ShapFeatureElimination

X, y = make_classification(n_samples=200, n_features=15, n_informative=5, random_state=42)

selector = ShapFeatureElimination(
    RandomForestClassifier(n_estimators=50, random_state=42),
    step=0.2,
    cv=3,
    scoring="roc_auc",
    random_state=42,
)

selector.fit(X, y)

# Selected features
print(selector.selected_features_)

# Reduce X to the selected features
X_reduced = selector.transform(X)
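
The README does not spell out how `step` is interpreted. As a rough illustration only, the sketch below assumes the convention used by similar tools: an integer `step` drops that many features per round, and a float `step` between 0 and 1 drops that fraction of the remaining features, with at least one feature dropped per round. `elimination_schedule` and `min_features` are hypothetical names, and recursive-pietro's real schedule may differ.

```python
import math


def elimination_schedule(n_features, step, min_features=1):
    """Feature counts after each round.

    ASSUMPTION (not confirmed by the library docs): an int step drops that
    many features per round; a float step drops floor(remaining * step)
    features, but never fewer than one.
    """
    counts = [n_features]
    current = n_features
    while current > min_features:
        if isinstance(step, float):
            n_drop = max(1, math.floor(current * step))
        else:
            n_drop = step
        current = max(min_features, current - n_drop)
        counts.append(current)
    return counts


# 15 features with step=0.2, mirroring the quick start settings.
schedule = elimination_schedule(15, 0.2)
print(schedule)
```

Under this assumption a fractional step removes many features early, when most are likely noise, and slows to one feature per round as the set shrinks.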

Early stopping (LightGBM / XGBoost / CatBoost)

# Needs the lightgbm extra (see Installation); reuses X, y from the quick start
from lightgbm import LGBMClassifier

selector = ShapFeatureElimination(
    LGBMClassifier(n_estimators=500, random_state=42),
    step=0.2,
    cv=5,
    scoring="roc_auc",
    early_stopping_rounds=50,
    eval_metric="auc",
)

selector.fit(X, y)
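
For readers unfamiliar with the term: in boosting libraries, early stopping generally means that training halts once the validation metric has failed to improve for a set number of consecutive rounds, and the best round is kept. The sketch below shows that rule in plain Python on made-up scores. `best_round_with_early_stopping` and `patience` are hypothetical names, and how recursive-pietro applies the rule inside each cross-validation fold is not stated here.

```python
def best_round_with_early_stopping(val_scores, patience):
    """Return (best_round, rounds_run) for a higher-is-better metric.

    Training stops once `patience` rounds pass without a strict improvement.
    """
    best_score = float("-inf")
    best_round = -1
    for i, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_round = score, i
        elif i - best_round >= patience:
            return best_round, i + 1  # stopped early after i + 1 rounds
    return best_round, len(val_scores)


# Made-up per-round validation AUC values.
scores = [0.70, 0.74, 0.78, 0.77, 0.78, 0.76, 0.75, 0.79]
best, ran = best_round_with_early_stopping(scores, patience=3)
print(best, ran)
```

Note the trade-off: with a short patience the loop stops before reaching the later 0.79, which is why a generous value such as 50 is common when `n_estimators` is large.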

Feature set selection strategies

After fitting, choose different feature sets from the elimination report:

selector.get_feature_set(method="best")              # highest validation score
selector.get_feature_set(method="best_parsimonious")  # fewest features within threshold
selector.get_feature_set(method="best_coherent")      # lowest std within threshold
selector.get_feature_set(method=10)                   # exactly 10 features
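
The comments above can be read as rules over a per-round report. The sketch below is one plausible reading in plain Python, not the library's implementation: it assumes "within threshold" means a mean validation score no more than `threshold` below the best one. The report layout, its column names (`n_features`, `val_mean`, `val_std`), the `threshold` default, and `pick_round` are all hypothetical.

```python
# Made-up elimination report: one row per round.
report = [
    {"n_features": 15, "val_mean": 0.905, "val_std": 0.020},
    {"n_features": 12, "val_mean": 0.915, "val_std": 0.018},
    {"n_features": 10, "val_mean": 0.920, "val_std": 0.025},
    {"n_features": 8, "val_mean": 0.914, "val_std": 0.012},
    {"n_features": 5, "val_mean": 0.912, "val_std": 0.015},
    {"n_features": 3, "val_mean": 0.880, "val_std": 0.030},
]


def pick_round(report, method, threshold=0.01):
    """Pick a report row by strategy (hypothetical helper, assumed semantics)."""
    if isinstance(method, int):
        matches = [r for r in report if r["n_features"] == method]
        if not matches:
            raise ValueError(f"no round with {method} features")
        return matches[0]
    best = max(report, key=lambda r: r["val_mean"])
    if method == "best":
        return best
    # Rounds whose mean score is within `threshold` of the best score.
    close = [r for r in report if best["val_mean"] - r["val_mean"] <= threshold]
    if method == "best_parsimonious":
        return min(close, key=lambda r: r["n_features"])
    if method == "best_coherent":
        return min(close, key=lambda r: r["val_std"])
    raise ValueError(f"unknown method: {method!r}")


for method in ("best", "best_parsimonious", "best_coherent", 10):
    print(method, pick_round(report, method)["n_features"])
```

On this toy report the three named strategies pick different rounds, which is the point of offering them: the top score, the smallest set that is nearly as good, and the most stable set that is nearly as good.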

sklearn pipeline support

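from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

from recursive_pietro import ShapFeatureElimination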

pipe = Pipeline([
    ("feature_selection", ShapFeatureElimination(
        RandomForestClassifier(n_estimators=50, random_state=42),
        step=1, cv=3, scoring="roc_auc",
    )),
    ("classifier", LogisticRegression()),
])

pipe.fit(X, y)

Plotting

Plotting requires the optional matplotlib extra:

pip install "recursive-pietro[plot]"

Then, on a fitted selector:

selector.plot()

License

MIT

Publisher: publish.yml on ReinierKoops/Recursive_pietro
