
WoeBoost: Weight of Evidence (WOE) Gradient Boosting


Author: xRiskLab
License: MIT License (2025)


WoeBoost is a Python 🐍 package designed to bridge the gap between the predictive power of gradient boosting and the interpretability required in high-stakes domains such as finance, healthcare, and law. It introduces an interpretable, evidence-driven framework for scoring tasks, inspired by the principles of Weight of Evidence (WOE) and the ideas of Alan M. Turing.

🔑 Key Features

  • 🌟 Gradient Boosting with Explainability: Combines the strength of gradient boosting with the interpretability of WOE-based scoring systems.
  • 📊 Calibrated Scores: Produces well-calibrated scores essential for decision-making in regulated environments.
  • 🤖 AutoML-like Enhancements:
    • Infers monotonic relationships automatically (infer_monotonicity).
    • Supports early stopping for efficient training (enable_early_stopping).
  • 🔧 Support for Missing Values & Categorical Inputs: Handles various data types seamlessly while maintaining interpretability.
  • 🛠️ Diagnostic Toolkit:
    • Partial dependence plots.
    • Feature importance analysis.
    • Decision boundary visualization.
  • 📈 WOE Inference Maker: Provides classical WOE calculations and bin-level insights.
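
The classical WOE calculation behind these features can be sketched as follows. This is a minimal illustration of the standard formula; `woe_per_bin` is a hypothetical helper, not part of the WoeBoost API:

```python
import numpy as np

def woe_per_bin(y, bin_ids):
    """Classical WOE per bin: ln(P(bin | good) / P(bin | bad)), with y=1 marking 'bad'."""
    y = np.asarray(y)
    bin_ids = np.asarray(bin_ids)
    n_good = (y == 0).sum()
    n_bad = (y == 1).sum()
    woe = {}
    for b in np.unique(bin_ids):
        mask = bin_ids == b
        p_good = (y[mask] == 0).sum() / n_good  # share of goods falling in this bin
        p_bad = (y[mask] == 1).sum() / n_bad    # share of bads falling in this bin
        woe[b] = float(np.log(p_good / p_bad))
    return woe
```

A bin with more goods than bads gets a positive WOE, and vice versa; a bin whose class mix matches the population gets a WOE of zero.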

⚙️ How It Works

  1. 🔍 Initialization: Starts with prior log odds, representing baseline probabilities.
  2. 📈 Iterative Updates: Each boosting iteration computes residuals for each binned feature and sums them into total evidence (WOE), updating the predictions.
  3. 🔗 Evidence Accumulation: Combines evidence from all iterations, producing a cumulative and interpretable scoring model.
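
The evidence accumulation in steps 1–3 can be sketched as follows. This is a minimal illustration of the principle (prior log odds plus summed WOE evidence, mapped to a probability), not WoeBoost's internal code:

```python
import math

def accumulate_evidence(prior_log_odds, woe_contributions):
    """Add summed WOE evidence to the prior log odds and map to a probability."""
    log_odds = prior_log_odds + sum(woe_contributions)
    return 1.0 / (1.0 + math.exp(-log_odds))  # logistic (sigmoid) transform
```

With no evidence, the model falls back to the prior: `accumulate_evidence(0.0, [])` yields a probability of 0.5.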

🧐 Why WoeBoost?

  • 💡 Interpretability: Every model step adheres to principles familiar to risk managers and data scientists, ensuring transparency and trust.
  • ✅ Alignment with Regulatory Requirements: Calibrated and interpretable results meet the demands of high-stakes applications.
  • ⚡ Flexibility: Works seamlessly with diverse data types and supports concurrency for feature binning with Python's concurrent.futures.
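
Concurrent feature binning of the kind mentioned above can be sketched with `concurrent.futures`. This is a hypothetical illustration; `bin_feature` and the quantile edges are stand-ins, not WoeBoost internals:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def bin_feature(col):
    """Quantile-bin a single feature column into 4 buckets."""
    edges = np.quantile(col, [0.25, 0.5, 0.75])
    return np.digitize(col, edges)

# Bin each feature column in a separate task.
X = np.random.default_rng(0).normal(size=(100, 4))
with ThreadPoolExecutor(max_workers=4) as executor:
    binned_columns = list(executor.map(bin_feature, X.T))
```

On free-threaded Python builds, such CPU-bound tasks can run truly in parallel; with the GIL, threads still help mainly when the work releases the GIL (e.g., inside NumPy).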

Installation ⤵

Standard Installation

Install the package using pip:

pip install woeboost

Free-Threaded Python Support (Experimental)

For significant performance improvements with free-threaded Python builds:

# Install with free-threaded dependencies
pip install woeboost[freethreaded]

# Or install free-threaded Python first, then WoeBoost
uv python install 3.14.0a5+freethreaded
pip install woeboost[freethreaded]

Benefits of free-threaded Python:

  • 3.67× faster training (measured on experimental free-threaded builds)
  • Automatic thread optimization (8 threads vs. 4 with the GIL)
  • No code changes required - WoeBoost auto-detects free-threading
  • Identical convergence and results, just faster computation

from woeboost import WoeLearner

# Automatically detects free-threading and optimizes thread count
learner = WoeLearner(n_tasks=8)  # Uses more tasks with free-threading
print(f"Free-threading detected: {learner.is_freethreaded}")

🧪 Free-Threaded Python Support

WoeBoost includes experimental support for free-threaded Python builds, providing significant performance improvements for CPU-bound operations:

  • 3.67× speedup for WoeBoost training with Python 3.14+freethreaded
  • Optimal performance at 8 threads (vs 4 with GIL)
  • Tested on Python 3.14.0a5+freethreaded (experimental builds)

Running Free-Threaded Tests

# Install free-threaded Python
uv python install 3.14.0a5+freethreaded

# Run free-threaded tests
./tests/run_freethreaded_tests.sh

See tests/README_FREETHREADED.md for detailed information.

💻 Example Usage

Below we provide two examples of using WoeBoost.

Training and inference with the WoeBoost classifier

from woeboost import WoeBoostClassifier

# Initialize the classifier
woe_model = WoeBoostClassifier(infer_monotonicity=True)

# Fit the model (X_train, y_train: your training features and binary target)
woe_model.fit(X_train, y_train)

# Predict probabilities, class labels, and scores
probas = woe_model.predict_proba(X_test)[:, 1]
preds = woe_model.predict(X_test)
scores = woe_model.predict_score(X_test)

Preparation of WOE inputs for logistic regression

from woeboost import WoeBoostClassifier

# Initialize the classifier
woe_model = WoeBoostClassifier(infer_monotonicity=True)

# Fit the model
woe_model.fit(X_train, y_train)

# Transform raw features into WOE space
X_woe_train = woe_model.transform(X_train)
X_woe_test = woe_model.transform(X_test)
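
The WOE-transformed features can then be fed to a standard logistic regression. A minimal sketch, assuming scikit-learn is installed and with random data standing in for the output of `woe_model.transform(...)`:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X_woe = rng.normal(size=(200, 3))          # stand-in for WOE-transformed features
y = (X_woe.sum(axis=1) > 0).astype(int)    # synthetic binary target

# Fit a plain logistic regression on the WOE-style inputs.
logit = LogisticRegression()
logit.fit(X_woe, y)
accuracy = logit.score(X_woe, y)
```

Because WOE features are already on a log-odds scale, the downstream logistic regression coefficients stay directly interpretable.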

📚 Documentation

📝 Changelog

For a changelog, see CHANGELOG.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


