recursive-pietro

SHAP-based recursive feature elimination with cross-validation and early stopping.

Drop-in sklearn-compatible replacement for Probatus ShapRFECV — faster, cleaner, and works in pipelines.

Installation

pip install recursive-pietro

With optional boosting-library support:

pip install "recursive-pietro[lightgbm]"   # LightGBM early stopping
pip install "recursive-pietro[xgboost]"    # XGBoost early stopping
pip install "recursive-pietro[catboost]"   # CatBoost early stopping
pip install "recursive-pietro[plot]"       # matplotlib plotting
pip install "recursive-pietro[all]"        # everything
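
To see which of the optional extras above are usable in the current environment, a small standard-library check can help. This is only a sketch: `available_extras` is a hypothetical helper, not part of recursive-pietro, and the mapping assumes each extra corresponds to the import name shown.

```python
from importlib.util import find_spec

# Extra name -> top-level module it is assumed to pull in.
OPTIONAL_MODULES = {
    "lightgbm": "lightgbm",
    "xgboost": "xgboost",
    "catboost": "catboost",
    "plot": "matplotlib",
}


def available_extras():
    """Report, per extra, whether its module can be found without importing it."""
    return {
        extra: find_spec(module) is not None
        for extra, module in OPTIONAL_MODULES.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra:10s} {'installed' if ok else 'missing'}")
```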

Quick start

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from recursive_pietro import ShapFeatureElimination

X, y = make_classification(n_samples=200, n_features=15, n_informative=5, random_state=42)

selector = ShapFeatureElimination(
    RandomForestClassifier(n_estimators=50, random_state=42),
    step=0.2,
    cv=3,
    scoring="roc_auc",
    random_state=42,
)

selector.fit(X, y)

# Selected features
print(selector.selected_features_)

# Keep only the selected columns
X_reduced = selector.transform(X)
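
The `step=0.2` above is a fraction rather than a count. Assuming recursive-pietro follows the convention of the Probatus ShapRFECV it replaces (a float step removes that fraction of the remaining features each round, rounded down, but always at least one), the feature counts visited for 15 features can be sketched as follows. `elimination_schedule` and `min_features` are hypothetical names used for illustration, not part of the package.

```python
import math


def elimination_schedule(n_features, step, min_features=1):
    """List the feature counts evaluated in each round (assumed convention)."""
    counts = [n_features]
    while n_features > min_features:
        if isinstance(step, float):
            # Fractional step: drop a share of what is left, at least one.
            n_remove = max(1, math.floor(n_features * step))
        else:
            # Integer step: drop a fixed number per round.
            n_remove = step
        # Never go below the minimum number of features.
        n_remove = min(n_remove, n_features - min_features)
        n_features -= n_remove
        counts.append(n_features)
    return counts


print(elimination_schedule(15, 0.2))
# [15, 12, 10, 8, 7, 6, 5, 4, 3, 2, 1]
```

Under this assumption a fractional step removes features quickly at first and slows down as the set shrinks, while `step=1` evaluates every feature count.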

Early stopping (LightGBM / XGBoost / CatBoost)

from lightgbm import LGBMClassifier

selector = ShapFeatureElimination(
    LGBMClassifier(n_estimators=500, random_state=42),
    step=0.2,
    cv=5,
    scoring="roc_auc",
    early_stopping_rounds=50,
    eval_metric="auc",
)

selector.fit(X, y)
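
For intuition about `early_stopping_rounds=50`: boosting libraries generally stop adding trees once the validation metric has failed to improve for that many consecutive rounds. The sketch below illustrates that patience rule on a plain list of scores. `stop_round` and the scores are made up for illustration; this is not how recursive-pietro or LightGBM implement it internally.

```python
def stop_round(val_scores, patience):
    """Return (best_round, best_score) under a simple patience rule.

    Training is considered stopped once `patience` rounds have passed
    without a strictly better validation score (higher is better).
    """
    best_round, best_score = 0, val_scores[0]
    for i, score in enumerate(val_scores[1:], start=1):
        if score > best_score:
            best_round, best_score = i, score
        elif i - best_round >= patience:
            break  # no improvement for `patience` rounds
    return best_round, best_score


# Illustrative validation AUC per boosting round.
scores = [0.70, 0.75, 0.78, 0.77, 0.78, 0.76, 0.79]
print(stop_round(scores, patience=3))   # (2, 0.78)
print(stop_round(scores, patience=10))  # (6, 0.79)
```

A small patience saves training time but can stop before a late improvement, as the two calls show.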

Feature set selection strategies

After fitting, choose different feature sets from the elimination report:

selector.get_feature_set(method="best")              # highest validation score
selector.get_feature_set(method="best_parsimonious")  # fewest features within threshold
selector.get_feature_set(method="best_coherent")      # lowest std within threshold
selector.get_feature_set(method=10)                   # exactly 10 features
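
The four strategies can be pictured as different ways of picking one row from the elimination report. The sketch below assumes the report holds, per round, a feature count plus the mean and standard deviation of the validation score, and that "within threshold" means within a tolerance of the best mean score (the Probatus convention). `Round`, `pick_round`, the `threshold` default, and the numbers are all hypothetical and not taken from the package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Round:
    n_features: int
    val_mean: float
    val_std: float


def pick_round(report, method="best", threshold=0.01):
    """Pick one elimination round according to the named strategy."""
    best = max(report, key=lambda r: r.val_mean)
    if method == "best":
        return best
    if isinstance(method, int):
        # Exact feature count requested.
        for r in report:
            if r.n_features == method:
                return r
        raise ValueError(f"no round with {method} features")
    # Rounds whose mean score is close enough to the best one.
    close = [r for r in report if r.val_mean >= best.val_mean - threshold]
    if method == "best_parsimonious":
        return min(close, key=lambda r: r.n_features)
    if method == "best_coherent":
        return min(close, key=lambda r: r.val_std)
    raise ValueError(f"unknown method: {method!r}")


# Made-up report for illustration only.
report = [
    Round(15, 0.900, 0.030),
    Round(12, 0.912, 0.025),
    Round(10, 0.915, 0.028),
    Round(8, 0.910, 0.012),
    Round(7, 0.908, 0.020),
    Round(6, 0.890, 0.015),
]

print(pick_round(report, "best").n_features)               # 10
print(pick_round(report, "best_parsimonious").n_features)  # 7
print(pick_round(report, "best_coherent").n_features)      # 8
```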

sklearn pipeline support

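from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from recursive_pietro import ShapFeatureElimination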

pipe = Pipeline([
    ("feature_selection", ShapFeatureElimination(
        RandomForestClassifier(n_estimators=50, random_state=42),
        step=1, cv=3, scoring="roc_auc",
    )),
    ("classifier", LogisticRegression()),
])

pipe.fit(X, y)

Plotting

Plotting needs the plot extra:

pip install "recursive-pietro[plot]"

Then, on a fitted selector:

selector.plot()

License

MIT

Source Distribution

recursive_pietro-0.1.0.tar.gz (20.0 kB view details)

Uploaded Source

Built Distribution

recursive_pietro-0.1.0-py3-none-any.whl (15.8 kB view details)

Uploaded Python 3

File details

Details for the file recursive_pietro-0.1.0.tar.gz.

File metadata

  • Download URL: recursive_pietro-0.1.0.tar.gz
  • Upload date:
  • Size: 20.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for recursive_pietro-0.1.0.tar.gz
Algorithm Hash digest
SHA256 a41bac8adc04adfb02c938066605126571c2a51f5745a4de779dff7cd5c368cf
MD5 c742d227e5273e830e0017a4d98d8281
BLAKE2b-256 1b17468d4cc9925ad8d0a38b2a40566443dea33df1d0dc4c9da2b927bfc07b12

Provenance

The following attestation bundles were made for recursive_pietro-0.1.0.tar.gz:

Publisher: publish.yml on ReinierKoops/Recursive_pietro

File details

Details for the file recursive_pietro-0.1.0-py3-none-any.whl.

File hashes

Hashes for recursive_pietro-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a0f9ef1275e6be1789740f2abae509ad2636a52d37e1b4e477c731048e94c30b
MD5 8fa227149af8ddd61b3883ea367b835b
BLAKE2b-256 68b46a70504c4204cb1a048b71cc9802e471a07306d417b6061e8689a5fac02e

Provenance

The following attestation bundles were made for recursive_pietro-0.1.0-py3-none-any.whl:

Publisher: publish.yml on ReinierKoops/Recursive_pietro
