Rank feature importance across multiple ML models.

FeatRanker

Rank feature importance across multiple ML models using permutation importance.

FeatRanker trains a configurable set of scikit-learn, XGBoost, and CatBoost models on your data, computes permutation importance for every trained model, and returns per-model rankings plus an aggregated average ranking.

Item	Value
Package name	`featranker`
Import module	`featranker`
CLI command	`featranker`
Model config	`featranker/importance_config.yaml`
Default prep file	`featureCalc.py` (project root)

Installation
How It Works
Data Preparation
Quick Start
CLI Reference
Python API
Model Configuration
Available Models
Output Format
Troubleshooting

Installation

Install dependencies:

pip install -r requirements.txt

Install the package in editable (development) mode:

pip install -e .

Or install from PyPI:

pip install featranker

Requirements

Python ≥ 3.10
numpy, scikit-learn, pyyaml, tqdm, xgboost, lightgbm, catboost

How It Works

Load data — A user-defined prep class returns a feature dict with a "label" key.
Initialize models — Model definitions are read from importance_config.yaml and instantiated for the requested task and group.
Train models — Every initialized model is fitted on the feature matrix.
Rank features — Permutation importance is computed per model, and an overall average ranking is produced.
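The four steps above can be sketched with plain scikit-learn. This is an illustration under stated assumptions, not FeatRanker's actual implementation: the two models, `n_repeats=5`, scoring on the training data, and averaging the raw per-model scores to get the "average" entry are all choices made for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Step 1: load data as a feature dict with a "label" key (the prep-class format).
data = load_iris()
features = {name: data.data[:, i].tolist() for i, name in enumerate(data.feature_names)}
features["label"] = data.target.tolist()

feature_names = [key for key in features if key != "label"]
X = np.column_stack([features[name] for name in feature_names])
y = np.asarray(features["label"])

# Step 2: initialize models (hard-coded here; FeatRanker reads them from YAML).
models = {
    "logistic_regression": LogisticRegression(max_iter=2000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
}

# Steps 3 and 4: fit each model, then compute its permutation importance.
importances = {}
for name, model in models.items():
    model.fit(X, y)
    result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
    importances[name] = result.importances_mean

# Assumption: the aggregate is the mean of the per-model scores.
importances["average"] = np.mean(list(importances.values()), axis=0)


def to_ranking(scores):
    """Return single-entry dicts sorted by score, highest first."""
    pairs = sorted(zip(feature_names, scores), key=lambda p: p[1], reverse=True)
    return [{name: round(float(score), 4)} for name, score in pairs]


rankings = {name: to_ranking(scores) for name, scores in importances.items()}
print(rankings["average"])
```

Permutation importance measures how much a model's score drops when one feature column is shuffled, which is why the same procedure works for linear and tree models alike.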

Data Preparation

Before running FeatRanker you need a prep class — a Python class with a _calc_features() method that returns your data as a dict.

Expected return format

{
    "feature_1": [v1, v2, v3, ...],
    "feature_2": [v1, v2, v3, ...],
    ...
    "label":     [y1, y2, y3, ...],
}

Every feature key maps to a list of numeric values.
All lists (including "label") must have the same length.
The "label" key is required.
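To catch format problems before a run, the three rules above can be checked with a small helper. `validate_features` is a hypothetical function written for this README, not part of the `featranker` package, and its error messages are only illustrative.

```python
from numbers import Real


def validate_features(features):
    """Check the prep-class dict and split it into names, rows, and labels."""
    if "label" not in features:
        raise ValueError("'label' key missing")

    lengths = {key: len(values) for key, values in features.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"Feature length mismatch: {lengths}")

    names = [key for key in features if key != "label"]
    for name in names:
        if not all(isinstance(value, Real) for value in features[name]):
            raise TypeError(f"Feature {name!r} contains non-numeric values")

    # Transpose the column lists into one row per sample.
    rows = [list(row) for row in zip(*(features[name] for name in names))]
    return names, rows, list(features["label"])


names, rows, labels = validate_features(
    {"height": [1.0, 2.0, 3.0], "width": [4, 5, 6], "label": [0, 1, 0]}
)
print(names)   # ['height', 'width']
print(rows)    # [[1.0, 4], [2.0, 5], [3.0, 6]]
print(labels)  # [0, 1, 0]
```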

Where to put it

Option A — Edit the default file (simplest)

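Define your class in featureCalc.py at the project root, and run featranker from that directory, because the default prep file is looked up in the current working directory. The default class name is prepFeature; to use a different name, pass it with --prep-class.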

Option B — Use a separate file (no reinstall needed)

Keep your prep logic in any Python file and point to it at runtime:

featranker --prep-file ./my_features.py --prep-class MyPrepClass --task clf

Example prep class

from sklearn.datasets import load_iris

class IrisPrep:
    def _calc_features(self):
        data = load_iris()
        features = {
            data.feature_names[i]: data.data[:, i].tolist()
            for i in range(data.data.shape[1])
        }
        features["label"] = data.target.tolist()
        return features
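If your data lives in a file rather than a built-in dataset, a prep class can read it with the standard library. The sketch below is hypothetical: `CsvPrep`, `csv_path`, `label_column`, and `label_type` are names invented for this example, and it assumes FeatRanker instantiates the prep class without arguments, so the settings are kept in class attributes.

```python
import csv


class CsvPrep:
    """Hypothetical prep class that builds the feature dict from a CSV file."""

    csv_path = "data.csv"     # assumed location; change to your own file
    label_column = "target"   # assumed name of the label column
    label_type = float        # use int for class labels

    def _calc_features(self):
        with open(self.csv_path, newline="") as handle:
            rows = list(csv.DictReader(handle))
        if not rows:
            raise ValueError(f"No data rows found in {self.csv_path}")

        # One list of floats per feature column, keyed by the CSV header.
        features = {
            column: [float(row[column]) for row in rows]
            for column in rows[0]
            if column != self.label_column
        }
        features["label"] = [self.label_type(row[self.label_column]) for row in rows]
        return features
```

Saved as, say, csv_prep.py, a class like this would be selected with `--prep-file ./csv_prep.py --prep-class CsvPrep`.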

Quick Start

Implement _calc_features() in featureCalc.py (or your own file).
Run the CLI:

# Classification with all model families, using the default prepFeature class
featranker --task clf --group all

# Regression with tree models only, custom prep file and class
featranker --task reg --group tree \
    --prep-file ./my_features.py --prep-class DiabetesPrep

# Save results to results.json (the .json extension is appended automatically)
featranker --task clf --group linear --output results

CLI Reference

featranker --task {clf,reg} [--group {linear,tree,all}]
           [--prep-file PATH] [--prep-class NAME]
           [--output PATH]

Flag	Description	Default
`--task`	`clf` (classification) or `reg` (regression)	required
`--group`	`linear`, `tree`, or `all` (both)	`all`
`--prep-class`	Name of the prep class to instantiate	`prepFeature`
`--prep-file`	Path to the Python file containing the prep class	`featureCalc.py` in the current working directory
`--output`	File path for JSON output (`.json` appended if missing)	print to stdout
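For readers who want to see how the flags and defaults in the table fit together, here is a hypothetical `argparse` parser that mirrors them. It is reconstructed from the table only and is not the package's actual CLI code; `build_parser` and `output_path` are names made up for this sketch.

```python
import argparse


def build_parser():
    """Hypothetical parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="featranker")
    parser.add_argument("--task", choices=["clf", "reg"], required=True)
    parser.add_argument("--group", choices=["linear", "tree", "all"], default="all")
    parser.add_argument("--prep-file", default="featureCalc.py")
    parser.add_argument("--prep-class", default="prepFeature")
    parser.add_argument("--output", default=None)
    return parser


def output_path(path):
    """Append .json when it is missing; None means print to stdout."""
    if path is None or path.endswith(".json"):
        return path
    return path + ".json"


args = build_parser().parse_args(
    ["--task", "clf", "--group", "linear", "--output", "results"]
)
print(args.task, args.group, args.prep_class, output_path(args.output))
# clf linear prepFeature results.json
```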

Python API

Using `FeatureRanker` directly (default prep file)

When your prep class uses the default name prepFeature and lives in featureCalc.py at the project root:

from featranker import FeatureRanker

ranker = FeatureRanker(task="clf", group="all")
results = ranker.rankFeatures()

Using `build_ranker` with a custom prep file

build_ranker is a convenience factory that returns a fully initialized FeatureRanker instance (features loaded, models trained, ready to rank):

from featranker import build_ranker

ranker = build_ranker(
    task="reg",
    group="tree",
    prep_file="./my_features.py",
    prep_class="DiabetesPrep",
)
results = ranker.rankFeatures()

Constructor parameters

Parameter	Type	Description
`task`	`"clf"` or `"reg"`	Classification or regression
`group`	`"linear"`, `"tree"`, or `"all"`	Which model family to use
`prep_file`	`str` or `None`	Path to prep file (defaults to `featureCalc.py`)
`prep_class`	`str`	Name of the prep class (defaults to `"prepFeature"`)
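The `prep_file` and `prep_class` parameters imply that the class is loaded from a file path at runtime. One common way to do that in Python is with `importlib`, sketched below. `load_prep_class` is a hypothetical helper and the package may resolve the class differently; the error cases correspond to the first two rows of the Troubleshooting table.

```python
import importlib.util
from pathlib import Path


def load_prep_class(prep_file="featureCalc.py", prep_class="prepFeature"):
    """Import a Python file by path and return the named class from it."""
    path = Path(prep_file).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Prep file not found: {path}")

    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code

    # getattr raises AttributeError if the class name is misspelled.
    return getattr(module, prep_class)


# Example usage (assumes ./my_features.py defines MyPrepClass):
# prep = load_prep_class("./my_features.py", "MyPrepClass")()
# features = prep._calc_features()
```

Because a relative path is resolved against the current working directory, running from a different directory is enough to trigger the "not found" case.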

Model Configuration

Models are defined in featranker/importance_config.yaml, organized by task and group:

classification:
  linear:
    - name: logistic_regression
      import: sklearn.linear_model
      class: LogisticRegression
      params:
        max_iter: 2000
  tree:
    - name: random_forest
      import: sklearn.ensemble
      class: RandomForestClassifier
      params:
        random_state: 42

regression:
  linear:
    - ...
  tree:
    - ...

Each entry has four fields:

Field	Description
`name`	Display name used in output
`import`	Python module to import (e.g., `sklearn.ensemble`)
`class`	Class name to instantiate from that module
`params`	Dict of keyword arguments passed to the constructor (optional)

Edit this file to add, remove, or tune models. Changes take effect on the next run — no reinstall required.
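To make the four fields concrete, here is a sketch of how one config entry could be turned into a model object. `build_model` is a hypothetical helper, not the package's loader, and the entries are written as the plain dicts a YAML parser would produce. The third entry omits `params` to show the optional case.

```python
import importlib


def build_model(entry):
    """Import entry["import"], look up entry["class"], and instantiate it."""
    module = importlib.import_module(entry["import"])
    model_class = getattr(module, entry["class"])
    # "params" is optional, so fall back to no keyword arguments.
    return model_class(**(entry.get("params") or {}))


entries = [
    {
        "name": "logistic_regression",
        "import": "sklearn.linear_model",
        "class": "LogisticRegression",
        "params": {"max_iter": 2000},
    },
    {
        "name": "random_forest",
        "import": "sklearn.ensemble",
        "class": "RandomForestClassifier",
        "params": {"random_state": 42},
    },
    {
        "name": "naive_bayes_gaussian",
        "import": "sklearn.naive_bayes",
        "class": "GaussianNB",
    },
]

models = {entry["name"]: build_model(entry) for entry in entries}
print(models["logistic_regression"].max_iter)  # 2000
```

Any class that can be imported by module path and constructed from keyword arguments fits this pattern.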

Available Models

Classification — Linear

Name	Class
`logistic_regression`	`LogisticRegression`
`logistic_regression_l1`	`LogisticRegression` (L1)
`logistic_regression_l2`	`LogisticRegression` (L2)
`logistic_regression_elasticnet`	`LogisticRegression` (ElasticNet)
`linear_svm`	`LinearSVC`
`sgd_classifier`	`SGDClassifier`
`ridge_classifier`	`RidgeClassifier`
`perceptron`	`Perceptron`
`passive_aggressive`	`PassiveAggressiveClassifier`
`lda`	`LinearDiscriminantAnalysis`
`qda`	`QuadraticDiscriminantAnalysis`
`naive_bayes_gaussian`	`GaussianNB`
`naive_bayes_bernoulli`	`BernoulliNB`
`naive_bayes_multinomial`	`MultinomialNB`
`pls_da`	`PLSRegression`

Classification — Tree

Name	Class
`decision_tree`	`DecisionTreeClassifier`
`random_forest`	`RandomForestClassifier`
`extra_trees`	`ExtraTreesClassifier`
`bagging_tree`	`BaggingClassifier`
`adaboost`	`AdaBoostClassifier`
`gradient_boosting`	`GradientBoostingClassifier`
`hist_gradient_boosting`	`HistGradientBoostingClassifier`
`xgboost`	`XGBClassifier`
`catboost`	`CatBoostClassifier`

Regression — Linear

Name	Class
`linear_regression`	`LinearRegression`
`ridge_regression`	`Ridge`
`lasso_regression`	`Lasso`
`elasticnet_regression`	`ElasticNet`
`elasticnet_cv_regression`	`ElasticNetCV`
`pls_regression`	`PLSRegression`
`huber_regression`	`HuberRegressor`
`ransac_regression`	`RANSACRegressor`
`kernel_ridge_regression`	`KernelRidge`
`svr_regression`	`SVR`

Regression — Tree

Name	Class
`decision_tree_regressor`	`DecisionTreeRegressor`
`random_forest_regressor`	`RandomForestRegressor`
`extra_trees_regressor`	`ExtraTreesRegressor`
`adaboost_regressor`	`AdaBoostRegressor`
`gradient_boosting_regressor`	`GradientBoostingRegressor`
`hist_gradient_boosting_regressor`	`HistGradientBoostingRegressor`
`xgboost_regressor`	`XGBRegressor`
`catboost_regressor`	`CatBoostRegressor`

Output Format

The result is a dict (or JSON object) keyed by model name, with an additional "average" entry that aggregates across all models. Each value is a list of single-entry dicts sorted by score in descending order. Scores are rounded to four decimal places.

{
  "logistic_regression": [
    {"feature_a": 0.1234},
    {"feature_b": 0.0567},
    {"feature_c": 0.0012}
  ],
  "random_forest": [
    {"feature_b": 0.0890},
    {"feature_a": 0.0745},
    {"feature_c": 0.0023}
  ],
  "average": [
    {"feature_a": 0.0990},
    {"feature_b": 0.0729},
    {"feature_c": 0.0018}
  ]
}
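Because each ranking is a list of single-entry dicts, a little reshaping makes it easier to query. The helpers below are hypothetical names written for this example, applied to the sample output above. For a file written with `--output`, load it with `json.load` instead of parsing a string.

```python
import json

raw = """
{
  "logistic_regression": [{"feature_a": 0.1234}, {"feature_b": 0.0567}, {"feature_c": 0.0012}],
  "random_forest": [{"feature_b": 0.0890}, {"feature_a": 0.0745}, {"feature_c": 0.0023}],
  "average": [{"feature_a": 0.0990}, {"feature_b": 0.0729}, {"feature_c": 0.0018}]
}
"""
results = json.loads(raw)


def as_scores(ranking):
    """Flatten a ranking into {feature: score}, keeping the sorted order."""
    return {name: score for entry in ranking for name, score in entry.items()}


def as_positions(ranking):
    """Map each feature to its 1-based position in the ranking."""
    return {name: pos for pos, entry in enumerate(ranking, start=1) for name in entry}


top_two = list(as_scores(results["average"]))[:2]
print(top_two)  # ['feature_a', 'feature_b']
print(as_positions(results["random_forest"]))
# {'feature_b': 1, 'feature_a': 2, 'feature_c': 3}
```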

Troubleshooting

Symptom	Cause	Fix
`Prep file not found`	FeatRanker can't locate `featureCalc.py`	Run the command from the directory that contains `featureCalc.py`, or pass an explicit path with `--prep-file`
`AttributeError: … has no attribute 'X'`	The prep class name doesn't match what's in the file	Check spelling of `--prep-class` against the class defined in your prep file
`'label' key missing`	`_calc_features()` didn't include a `"label"` entry	Add `features["label"] = ...` to your return dict
Feature length mismatch	Feature lists have different lengths	Ensure every feature list and `"label"` have the same number of elements
Model training errors (printed, not fatal)	A model failed to converge or doesn't support the data	Check the printed warning; consider removing or tuning that model in `importance_config.yaml`

Release history

0.1.3 (Feb 10, 2026)
0.1.2 (Feb 10, 2026), this version
0.1.1 (Feb 6, 2026)
0.1.0 (Feb 6, 2026)

