Temporal-aware Boruta feature selection for quantitative finance. OOS-only importance with purged cross-validation.

boruta-quant

boruta-quant computes feature importance on validation data only, using purged cross-validation to prevent lookahead bias. It is built for financial time series, where temporal integrity matters.

Why boruta-quant?

Common feature-selection workflows (SHAP, sklearn permutation importance) typically compute importance on the training data. In financial time series, this can leak future information into feature rankings. boruta-quant addresses this in three ways:

- OOS-Only Importance: importance is computed exclusively on validation folds
- Purged Cross-Validation: train/test gap with purge and embargo windows
- Shadow Features: Boruta's all-relevant selection via shadow comparison
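
The shadow comparison in the last bullet can be illustrated with a minimal sketch. This is not boruta-quant's implementation: the importance values are synthetic, `binom_tail_ge` and `boruta_decisions` are hypothetical helpers, and the exact one-sided binomial test (with no multiple-testing correction) is an assumed stand-in for whatever decision rule the library applies. A feature scores a "hit" in a trial when its importance exceeds the chosen percentile of the shadow importances, and the hit count is then tested against chance.

```python
from math import comb

import numpy as np


def binom_tail_ge(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def boruta_decisions(real_imp, shadow_imp, percentile=100, alpha=0.05):
    """Classify features from per-trial importances (hypothetical helper).

    real_imp:   array of shape (n_trials, n_features)
    shadow_imp: array of shape (n_trials, n_shadows)
    """
    n_trials = real_imp.shape[0]
    # Per-trial threshold: the given percentile of the shadow importances
    # (percentile=100 is the maximum shadow importance).
    threshold = np.percentile(shadow_imp, percentile, axis=1)
    hits = (real_imp > threshold[:, None]).sum(axis=0)

    decisions = []
    for h in hits:
        h = int(h)
        p_more = binom_tail_ge(h, n_trials)              # P(at least h hits by chance)
        p_fewer = binom_tail_ge(n_trials - h, n_trials)  # P(at most h hits), by symmetry at p=0.5
        if p_more < alpha:
            decisions.append("accepted")
        elif p_fewer < alpha:
            decisions.append("rejected")
        else:
            decisions.append("tentative")
    return hits, decisions


rng = np.random.default_rng(0)
n_trials = 20
shadow = rng.normal(0.0, 0.01, size=(n_trials, 3))
real = np.column_stack([
    rng.normal(0.20, 0.01, size=n_trials),    # far above every shadow
    rng.normal(-0.20, 0.01, size=n_trials),   # far below every shadow
])
hits, decisions = boruta_decisions(real, shadow)
print(hits, decisions)
```

Features whose hit count is neither significantly high nor significantly low stay tentative in this sketch.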

Installation

# Basic (permutation importance only)
pip install boruta-quant

# With LightGBM (extras are quoted so shells such as zsh do not expand the brackets)
pip install "boruta-quant[lightgbm]"

# With SHAP support
pip install "boruta-quant[shap]"

# Everything
pip install "boruta-quant[all]"

Development

git clone https://github.com/BlackArbsCEO/boruta-quant.git
cd boruta-quant
uv sync --all-extras --dev

Quick Start

from boruta_quant import BorutaSelector, BorutaSelectorConfig
from boruta_quant.oracle import PermutationImportanceOracle
from boruta_quant.temporal import PurgedTemporalCV, PurgedCVConfig
from boruta_quant.metrics import rank_ic_scorer
from lightgbm import LGBMRegressor

# 1. Configure purged temporal CV
cv = PurgedTemporalCV(PurgedCVConfig(
    n_splits=5,
    purge_window_days=5,      # gap before validation fold
    embargo_window_days=5,    # gap after validation fold
    min_train_size=100,
    test_size_ratio=0.2,
))

# 2. Configure importance oracle (OOS-only)
oracle = PermutationImportanceOracle(
    scoring=rank_ic_scorer,   # Spearman rank correlation
    n_repeats=10,
    random_state=42,
)

# 3. Configure Boruta selector
selector = BorutaSelector(
    config=BorutaSelectorConfig(
        n_trials=20,          # Boruta iterations
        percentile=100,       # shadow threshold percentile
        alpha=0.05,           # significance level
        two_step=True,        # resolve tentative features
        random_state=42,
    ),
    oracle=oracle,
    cv=cv,
)

# 4. Fit: the model is passed here, not in the constructor.
# X, y, and timestamps are your own features, target, and observation times.
result = selector.fit(
    X, y,
    timestamps=timestamps,   # must be timezone-aware
    model=LGBMRegressor(n_estimators=100, random_state=42),
)

# 5. Results
print(result.accepted_features)    # confirmed important
print(result.rejected_features)    # confirmed unimportant
print(result.tentative_features)   # borderline (resolved if two_step=True)
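
The Quick Start assumes that `X`, `y`, and `timestamps` already exist. A minimal sketch of building synthetic inputs with pandas and NumPy follows. The column names, the daily frequency, and the choice of `DataFrame`, `Series`, and `DatetimeIndex` containers are assumptions for illustration, since the input types boruta-quant accepts are not stated here. The only requirements taken from the text are that timestamps are timezone-aware and sorted.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_obs = 500

# Timezone-aware daily timestamps (UTC); a naive index would have tz=None.
timestamps = pd.date_range("2020-01-01", periods=n_obs, freq="D", tz="UTC")

# Five synthetic features; only the first two drive the target.
X = pd.DataFrame(
    rng.normal(size=(n_obs, 5)),
    columns=[f"feature_{i}" for i in range(5)],
    index=timestamps,
)
noise = rng.normal(scale=0.5, size=n_obs)
y = (0.5 * X["feature_0"] - 0.3 * X["feature_1"] + noise).rename("target")

print(timestamps.tz, X.shape, y.shape)
```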

Importance Oracles

All oracles fit the model on training data but measure importance on validation data only.

| Oracle | How it works | When to use |
|---|---|---|
| `PermutationImportanceOracle` | Shuffles one feature in the validation set and measures the drop in score | Default: reliable, no refit needed |
| `DropColumnImportanceOracle` | Removes the feature, refits the model, and measures the drop in score | When the refit cost is acceptable |
| `BlockPermutationImportanceOracle` | Block-shuffles the feature, preserving autocorrelation structure | Autocorrelated time series |
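
To make the permutation row concrete, here is a minimal sketch of permutation importance measured on a validation fold only. It does not call boruta-quant: the data are synthetic, the model is an ordinary least-squares fit from NumPy, the score is negative mean squared error rather than rank IC, and `oos_permutation_importance` is a hypothetical helper.

```python
import numpy as np


def oos_permutation_importance(predict, X_val, y_val, n_repeats=10, seed=0):
    """Mean drop in validation score when each column is shuffled."""
    rng = np.random.default_rng(seed)

    def score(X):
        # Higher is better: negative mean squared error.
        return -np.mean((predict(X) - y_val) ** 2)

    baseline = score(X_val)
    importances = np.zeros(X_val.shape[1])
    for j in range(X_val.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X_val.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - score(X_perm))
        importances[j] = np.mean(drops)
    return importances


rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=400)   # only column 0 matters

# Time-ordered split: fit on the first 300 rows, evaluate on the last 100.
X_train, y_train = X[:300], y[:300]
X_val, y_val = X[300:], y[300:]

coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
importances = oos_permutation_importance(lambda A: A @ coef, X_val, y_val)
print(importances)
```

The model is never refit inside the loop, which is why the permutation approach is the cheaper of the two non-block oracles described above.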

Temporal Cross-Validation

     Training         Purge   Validation   Embargo
  |--------------| |-------| |----------| |-------|
  ^                                                ^
  train_start                               embargo_end

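- Purge: drops training observations just before the validation fold, whose information could otherwise leak into validation
- Embargo: drops observations just after the validation fold, so validation information cannot bleed forward into later training data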
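
The layout in the diagram can be sketched for a single walk-forward fold as follows. `purged_split` is a hypothetical helper, not part of boruta-quant, and the half-open validation interval is an assumption made for this example. The windows are expressed in calendar days to mirror `purge_window_days` and `embargo_window_days`.

```python
import numpy as np
import pandas as pd


def purged_split(timestamps, val_start, val_end, purge_days, embargo_days):
    """Return (train_idx, val_idx, embargo_end) for one walk-forward fold.

    Validation covers val_start <= t < val_end. Training stops purge_days
    before val_start. Rows before embargo_end should not be used by any
    training window that starts after this validation fold.
    """
    purge = pd.Timedelta(days=purge_days)
    embargo = pd.Timedelta(days=embargo_days)

    val_mask = (timestamps >= val_start) & (timestamps < val_end)
    train_mask = timestamps < (val_start - purge)
    embargo_end = val_end + embargo
    return np.flatnonzero(train_mask), np.flatnonzero(val_mask), embargo_end


timestamps = pd.date_range("2024-01-01", periods=100, freq="D", tz="UTC")
val_start = pd.Timestamp("2024-03-01", tz="UTC")
val_end = pd.Timestamp("2024-03-21", tz="UTC")

train_idx, val_idx, embargo_end = purged_split(timestamps, val_start, val_end, 5, 5)
print(len(train_idx), len(val_idx), embargo_end)
```

With daily data and a 5-day purge, five observations between the last training row and the first validation row are left out of both sets.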

Shadow Shuffle Modes

Shadow features are shuffled copies of real features. The shuffle mode controls how temporal structure is handled:

| Mode | Description | Use case |
|---|---|---|
| `ShuffleMode.RANDOM` | Standard i.i.d. permutation | Default: i.i.d. data |
| `ShuffleMode.BLOCK` | Block-preserving shuffle | Autocorrelated features |
| `ShuffleMode.ERA` | Shuffle within eras only | Regime-aware selection |
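
The difference between the three modes is easiest to see on a small array. The sketch below uses plain NumPy; `block_shuffle` and `era_shuffle` are hypothetical helpers written for this example, and the block size and era labels are arbitrary. How boruta-quant chooses blocks or defines eras is not covered here.

```python
import numpy as np


def block_shuffle(values, block_size, rng):
    """Permute contiguous blocks; order inside each block is preserved."""
    n_blocks = int(np.ceil(len(values) / block_size))
    blocks = [values[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]
    order = rng.permutation(n_blocks)
    return np.concatenate([blocks[i] for i in order])


def era_shuffle(values, eras, rng):
    """Permute values independently inside each era."""
    out = values.copy()
    for era in np.unique(eras):
        idx = np.flatnonzero(eras == era)
        out[idx] = rng.permutation(values[idx])
    return out


rng = np.random.default_rng(0)
values = np.arange(12)
eras = np.repeat([0, 1, 2], 4)    # three eras of four observations each

random_shadow = rng.permutation(values)        # i.i.d. permutation
block_shadow = block_shuffle(values, 3, rng)   # blocks of 3 stay intact
era_shadow = era_shuffle(values, eras, rng)    # values never leave their era

print(random_shadow, block_shadow, era_shadow, sep="\n")
```

The block shuffle keeps short-range autocorrelation inside each block, while the era shuffle keeps each value in its own regime and only breaks the ordering within it.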

Metrics

| Function | Description |
|---|---|
| `rank_ic` | Spearman correlation between predictions and actuals |
| `rank_ic_scorer` | sklearn-compatible scorer wrapping `rank_ic` |
| `directional_accuracy` | Fraction of correct sign predictions (up vs down) |
| `directional_accuracy_scorer` | sklearn-compatible scorer wrapping `directional_accuracy` |
| `auc_score` | Area under ROC curve |
| `auc_scorer` | sklearn-compatible scorer wrapping `auc_score` |
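
For readers unfamiliar with these metrics, the following sketch shows how rank IC and directional accuracy can be computed by hand. The `*_sketch` functions are hypothetical stand-ins, and their argument order is an assumption: the signatures of the library's own functions are not documented here. The scorer wrapper follows scikit-learn's `scorer(estimator, X, y)` calling convention.

```python
import numpy as np
import pandas as pd


def rank_ic_sketch(y_true, y_pred):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    r_true = pd.Series(y_true).rank().to_numpy()
    r_pred = pd.Series(y_pred).rank().to_numpy()
    return float(np.corrcoef(r_true, r_pred)[0, 1])


def directional_accuracy_sketch(y_true, y_pred):
    """Fraction of predictions whose sign matches the realized sign."""
    return float(np.mean(np.sign(y_true) == np.sign(y_pred)))


def rank_ic_scorer_sketch(estimator, X, y):
    """Scorer with the (estimator, X, y) signature that scikit-learn expects."""
    return rank_ic_sketch(y, estimator.predict(X))


y_true = np.array([0.02, -0.01, 0.03, -0.02, 0.01])
y_pred = np.array([0.01, -0.02, 0.05, 0.005, 0.02])

ic = rank_ic_sketch(y_true, y_pred)
acc = directional_accuracy_sketch(y_true, y_pred)
print(ic, acc)
```

Rank IC depends only on the ordering of predictions, so it is insensitive to their scale. Directional accuracy depends only on their sign.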

Design Principles

- OOS-Only: importance is never computed on training data
- Fail-Fast: invalid temporal data (naive or unsorted timestamps) raises immediately
- Type-Safe: runtime type enforcement with beartype, static checking with strict Pyright
- Explicit: all config parameters are required, with no hidden defaults
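
The fail-fast principle can be sketched as a small validation step run before any fitting. `validate_timestamps` is a hypothetical helper, and raising `ValueError` with these messages is an assumption: the exception types boruta-quant actually raises are not stated here.

```python
import pandas as pd


def validate_timestamps(timestamps: pd.DatetimeIndex) -> None:
    """Raise immediately on naive or unsorted timestamps."""
    if timestamps.tz is None:
        raise ValueError("timestamps must be timezone-aware")
    if not timestamps.is_monotonic_increasing:
        raise ValueError("timestamps must be sorted in ascending order")


good = pd.date_range("2024-01-01", periods=3, freq="D", tz="UTC")
validate_timestamps(good)   # passes silently

naive = pd.date_range("2024-01-01", periods=3, freq="D")
try:
    validate_timestamps(naive)
except ValueError as exc:
    naive_error = str(exc)
    print(naive_error)
```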

References

- Boruta Algorithm: Kursa & Rudnicki (2010)
- Advances in Financial Machine Learning: Lopez de Prado (2018), Ch. 7 (purged CV)

License

MIT — see LICENSE.

Version 0.1.0, released Feb 27, 2026.
