Relational features, adaptive search, and confidence evaluation tools for scikit-learn
skboost
skboost is a lightweight Python library designed to boost your models, whether trees, linear models, neural networks, or anything else. It provides relational feature engineering, adaptive hyperparameter search, and confidence-aware evaluation tools that improve model performance and interpretability.
Installation
```bash
pip install skboost
```
Features
Preprocessing
RelationalFeaturesTransformer - Boost model performance with compact relational features
```python
from skboost.preprocessing import RelationalFeaturesTransformer

transformer = RelationalFeaturesTransformer(direction='larger')
X_transformed = transformer.fit_transform(X)

# For each feature, finds indices and distances to next larger/smaller values
# Adds O(N) features instead of O(N²) pairwise combinations
# Helps models learn inter-feature relationships
# Useful for: computer vision, ranking tasks, any data with meaningful ordering
```
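As an illustration of what "relational features" might mean here, a minimal NumPy sketch under the assumption that, for each feature in a row, the transformer records the index of and distance to the nearest strictly larger value among the other features. The function `relational_features` and this exact column layout are hypothetical, not skboost's implementation:

```python
import numpy as np

def relational_features(X):
    """Hypothetical sketch: for each feature in a row, find the index of
    and distance to the nearest strictly larger value in the same row."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    idx = np.zeros((n_samples, n_features))
    dist = np.zeros((n_samples, n_features))
    for r in range(n_samples):
        for c in range(n_features):
            larger = np.where(X[r] > X[r, c])[0]
            if larger.size:
                # closest strictly larger feature value in the same row
                j = larger[np.argmin(X[r, larger] - X[r, c])]
                idx[r, c] = j
                dist[r, c] = X[r, j] - X[r, c]
            else:
                idx[r, c] = -1   # sentinel: no larger value exists
                dist[r, c] = 0.0
    # 2 * n_features new columns: O(N) rather than O(N²) pairwise diffs
    return np.hstack([X, idx, dist])

X_demo = np.array([[1.0, 3.0, 2.0]])
X_aug = relational_features(X_demo)
```

For N features this appends 2N columns, which is consistent with the O(N) claim above.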
DualScalerTransformer - Boost classification with class-specific scaling
```python
from skboost.preprocessing import DualScalerTransformer

transformer = DualScalerTransformer()
X_scaled = transformer.fit_transform(X)

# Scales features separately for different target classes
# Useful for: imbalanced datasets, multi-class problems with different distributions
```
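One plausible reading of "class-specific scaling", sketched with hypothetical helpers `dual_scale_fit` and `dual_scale_transform` (skboost's actual algorithm and output shape may differ): learn per-class statistics at fit time, then emit one standardized copy of the features per class.

```python
import numpy as np

def dual_scale_fit(X, y):
    """Assumed behaviour, not skboost's exact algorithm:
    learn mean/std of each feature per target class."""
    stats = {}
    for c in np.unique(y):
        Xc = X[y == c]
        stats[c] = (Xc.mean(axis=0), Xc.std(axis=0) + 1e-12)
    return stats

def dual_scale_transform(X, stats):
    # one standardized copy of X per class, concatenated column-wise
    return np.hstack([(X - m) / s for m, s in stats.values()])

X = np.array([[0.0], [2.0], [10.0], [14.0]])
y = np.array([0, 0, 1, 1])
stats = dual_scale_fit(X, y)
X_scaled = dual_scale_transform(X, stats)
```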
Hyperparameter Tuning
zoom_search_cv - Boost efficiency with adaptive hyperparameter search
```python
from skboost.tuning import zoom_search_cv

best_params, best_score = zoom_search_cv(
    estimator, X, y,
    param_grid={'n_estimators': [50, 100, 150], 'max_depth': [3, 6, 9]},
    n_iter=3, cv=5
)

# Starts with 3 values per parameter, iteratively zooms around best region
# Works for both numeric and categorical hyperparameters
# Useful for: faster optimization than exhaustive grid search
```
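The coarse-then-zoom loop can be sketched for a single numeric parameter; `zoom_search` below is an illustrative stand-in for the idea, not skboost's signature:

```python
import numpy as np

def zoom_search(score_fn, low, high, n_iter=3, n_points=3):
    """Illustrative zoom search over one numeric parameter:
    evaluate a coarse grid, then re-grid around the best point."""
    for _ in range(n_iter):
        grid = np.linspace(low, high, n_points)
        scores = [score_fn(v) for v in grid]
        best = grid[int(np.argmax(scores))]
        step = (high - low) / (n_points - 1)
        # zoom: narrow the search range around the current best value
        low, high = best - step / 2, best + step / 2
    return best, max(scores)

# toy objective with a maximum at x = 7
best_x, best_score = zoom_search(lambda x: -(x - 7.0) ** 2, 0.0, 10.0)
```

Each iteration evaluates only `n_points` candidates, so total cost is `n_iter * n_points` fits per parameter instead of an exhaustive grid.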
Model Evaluation
confidence_report - Boost reliability with confidence-aware evaluation
```python
from skboost.evaluation import confidence_report, plot_confidence_report

reports = confidence_report(y_true, y_proba, thresholds=[0.5, 0.7, 0.9])
plot_confidence_report(reports)

# Shows precision/recall/f1 at different confidence thresholds per class
# Visualize which predictions your model is reliable on
# Useful for: production deployment decisions, finding usable subsets, model monitoring
```
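The underlying idea (score only the predictions whose top-class probability clears a threshold) can be sketched as follows; `confidence_metrics` and its coverage/accuracy output are a hypothetical simplification of what `confidence_report` returns:

```python
import numpy as np

def confidence_metrics(y_true, y_proba, thresholds):
    """Sketch of confidence-stratified evaluation: for each threshold,
    keep only predictions whose top-class probability clears it, then
    report coverage and accuracy on that confident subset."""
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    y_pred = y_proba.argmax(axis=1)
    conf = y_proba.max(axis=1)
    report = {}
    for t in thresholds:
        mask = conf >= t
        coverage = mask.mean()  # fraction of samples kept
        accuracy = (y_pred[mask] == y_true[mask]).mean() if mask.any() else float("nan")
        report[t] = {"coverage": coverage, "accuracy": accuracy}
    return report

y_true = [0, 1, 1, 0]
y_proba = [[0.9, 0.1], [0.2, 0.8], [0.55, 0.45], [0.6, 0.4]]
rep = confidence_metrics(y_true, y_proba, thresholds=[0.5, 0.7])
```

Raising the threshold trades coverage for reliability, which is the trade-off the plot above visualizes per class.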
Additional Tools
GroupDiffTransformer - Sequential feature engineering within groups
```python
from skboost.preprocessing import GroupDiffTransformer

transformer = GroupDiffTransformer(key_col='user_id')
X_transformed = transformer.fit_transform(X)

# Adds: difference from previous row, difference from first row per group
```
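The same per-group differences can be computed directly with pandas, which shows what the added columns contain (the column names `diff_prev` and `diff_first` are illustrative, not necessarily the transformer's):

```python
import pandas as pd

# Two users with sequential values; rows are assumed time-ordered.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "value":   [10, 13, 18, 5, 9],
})
g = df.groupby("user_id")["value"]
df["diff_prev"] = g.diff().fillna(0)                   # change vs previous row in group
df["diff_first"] = df["value"] - g.transform("first")  # change vs first row in group
```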
GroupValueCountsTransformer - Value frequency features within groups
```python
from skboost.preprocessing import GroupValueCountsTransformer

transformer = GroupValueCountsTransformer(group_col='session_id', value_col='action')
X_transformed = transformer.fit_transform(X)

# Adds: raw counts and normalized counts per group
```
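In plain pandas, the equivalent per-group frequency features might look like this (output column names are illustrative, not the transformer's actual ones):

```python
import pandas as pd

# How often does each 'action' occur within a 'session_id'?
df = pd.DataFrame({
    "session_id": ["a", "a", "a", "b", "b"],
    "action":     ["click", "click", "buy", "click", "buy"],
})
counts = (
    df.groupby("session_id")["action"]
      .value_counts()
      .unstack(fill_value=0)            # raw counts per group
)
normed = counts.div(counts.sum(axis=1), axis=0)  # normalized counts per group
df = df.merge(counts.add_prefix("n_").reset_index(), on="session_id")
```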
Quick Example
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

from skboost.preprocessing import RelationalFeaturesTransformer
from skboost.tuning import zoom_search_cv
from skboost.evaluation import confidence_report, plot_confidence_report

# Generate data
X, y = make_classification(n_samples=500, n_classes=3, n_informative=5, random_state=42)

# Add relational features
transformer = RelationalFeaturesTransformer(direction='larger')
X_boosted = transformer.fit_transform(X)

# Adaptive hyperparameter search
param_grid = {'n_estimators': [50, 100, 150], 'max_depth': [3, 6, 9]}
clf = RandomForestClassifier(random_state=42)
best_params, best_score = zoom_search_cv(clf, X_boosted, y, param_grid, n_iter=3)

# Train with best params and evaluate confidence
clf.set_params(**best_params)
clf.fit(X_boosted, y)
y_proba = clf.predict_proba(X_boosted)

# Confidence-stratified evaluation
reports = confidence_report(y, y_proba, thresholds=[0.5, 0.7, 0.9])
plot_confidence_report(reports)
```
Testing
```bash
pytest tests/
```
See tests/ directory for usage examples in test form.
License
MIT
File details
Details for the file skboost-0.2.0.tar.gz.
File metadata
- Download URL: skboost-0.2.0.tar.gz
- Upload date:
- Size: 12.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `0289cb142d00a6121ff59ad78da0cacd7c9e1e4e119154c5cf8ef32152cbbbfa` |
| MD5 | `70d1047a96d1be639451254ec94f361b` |
| BLAKE2b-256 | `5d7b1d3701cecfefb422babc102fab251b0f714b4d62b2fdc96c4d0b42859be3` |
File details
Details for the file skboost-0.2.0-py3-none-any.whl.
File metadata
- Download URL: skboost-0.2.0-py3-none-any.whl
- Upload date:
- Size: 11.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `f2415399ff26ab79f7f18bd05264511089c6764c9815489c4be85a679664c92f` |
| MD5 | `a9e140f638d851001a4c82305d507aa4` |
| BLAKE2b-256 | `452821e6c18fc4117e76f6c29d27a1204216206e91c1d211bfc0858f8d4cca96` |