Better Regressions

Advanced regression methods with an sklearn-like interface.

Current Features

  • Linear:
    • Configurable regularization: Ridge with given alpha / BayesianRidge / ARD
    • "Better bias" option to properly regularize the intercept term
  • AdaptiveLinear: Ridge regression with automatic per-feature shrinkage (similar in spirit to ARDRegression, but it uses a different mechanism and handles correlated features better)
  • Scaler:
    • Configurable preprocessing: Standard scaling (by second moment) / Quantile transformation with uniform/normal output / Power transformation
    • AutoScaler to automatically select the best scaling method based on a validation split
  • Smooth: Boosting-based regression using smooth functions for features
    • SuperSmoother: Adaptive-span smoother for arbitrarily complex functions.
    • Angle: Bagging of piecewise-linear functions. It is less flexible, which makes it more robust to overfitting.
  • Soft: Mixture of regressors based on quantile classification
  • Stabilize: Robust scaling & clipping transformation for features/targets
  • AutoClassifier: Classification with automatic model selection (LogisticRegression or XGBoost, with auto depth selection)
  • BinnedRegression: Bins features and target, then trains a classifier. This way it can learn non-linear relationships and it also models the target distribution (not only its mean).
  • EDA: Exploratory Data Analysis utilities
    • plot_distribution: Visualize sample distributions with fitted t-distribution parameters
    • plot_trend: Automatically detect and visualize relationships between variables, with Pearson/Spearman correlation
      • For discrete features: Shows violin plots with distribution at each value
      • For continuous features: Fits trend lines with variance estimation and confidence intervals
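
The Stabilize entry above mentions robust scaling and clipping. As a rough illustration of that idea only (this is not the library's implementation; the function names, the median/IQR choice, and the clip value of 3 are all assumptions made for the sketch), a NumPy version might look like this:

```python
import numpy as np

def fit_stabilizer(x, clip=3.0):
    """Learn a robust center and scale (hypothetical helper, not library API)."""
    median = np.median(x)
    q25, q75 = np.percentile(x, [25, 75])
    iqr = q75 - q25
    # Fall back to unit scale if the interquartile range collapses to zero.
    scale = iqr if iqr > 0 else 1.0
    return median, scale, clip

def stabilize(x, params):
    """Center by the median, scale by the IQR, then clip extreme values."""
    median, scale, clip = params
    return np.clip((x - median) / scale, -clip, clip)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 1000.0])  # one heavy outlier
params = fit_stabilizer(x)
z = stabilize(x, params)
print(z)
```

Because the median and IQR barely move when a single outlier is added, the bulk of the data keeps a sensible scale and the outlier is simply clipped to the boundary.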

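The BinnedRegression entry above describes binning features and target, then training a classifier, so that the model captures the target distribution and not only its mean. The sketch below illustrates that principle for a single feature. It is an assumption-laden toy, not the library's code: a smoothed count table stands in for the classifier, and `fit_binned` / `predict_binned` are hypothetical names.

```python
import numpy as np

def fit_binned(x, y, n_bins=8):
    """Estimate P(target bin | feature bin) from counts (stand-in for a classifier)."""
    # Quantile edges so every bin holds roughly the same number of samples.
    x_edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    y_edges = np.quantile(y, np.linspace(0, 1, n_bins + 1))
    # Using interior edges only, np.digitize returns bin ids in [0, n_bins - 1].
    x_bin = np.digitize(x, x_edges[1:-1])
    y_bin = np.digitize(y, y_edges[1:-1])
    counts = np.ones((n_bins, n_bins))  # add-one smoothing avoids empty rows
    np.add.at(counts, (x_bin, y_bin), 1)
    probs = counts / counts.sum(axis=1, keepdims=True)
    # Representative value per target bin: mean of the training targets in it.
    y_centers = np.array([y[y_bin == k].mean() for k in range(n_bins)])
    return x_edges, probs, y_centers

def predict_binned(model, x_new):
    """Return the per-bin target distribution and its expected value."""
    x_edges, probs, y_centers = model
    rows = probs[np.digitize(x_new, x_edges[1:-1])]
    return rows, rows @ y_centers

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=2000)
y = np.sin(x) + 0.1 * rng.normal(size=2000)  # non-linear relationship
model = fit_binned(x, y)
dist, mean = predict_binned(model, np.array([-1.2, 0.0, 1.2]))
print(mean)
```

Each row of `dist` is a full distribution over target bins, which is what allows quantiles or uncertainty estimates to be read off in addition to the mean.
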
Installation

pip install better-regressions

Basic Usage

from better_regressions import auto_angle, auto_linear, Linear, Scaler, AutoClassifier
from better_regressions.eda import plot_distribution, plot_trend
from sklearn.datasets import make_regression, make_moons
import numpy as np

X, y = make_regression(n_samples=100, n_features=5, noise=0.1)
model = auto_angle(n_breakpoints=2)
model.fit(X, y)
y_pred = model.predict(X)
print(repr(model))

# Classification example
Xc, yc = make_moons(n_samples=200, noise=0.3)
clf = AutoClassifier(depth="auto")
clf.fit(Xc, yc)
yc_pred = clf.predict(Xc)

# EDA example
plot_distribution(y, name="Target Distribution")
plot_trend(X[:, 0], y, name="Feature 0 vs Target")
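
The usage above fits and predicts on the same data, which is fine for a smoke test but says little about generalization. Since the estimators expose `fit` and `predict`, a held-out evaluation can be sketched with standard scikit-learn utilities. The helper name `holdout_r2` is hypothetical, and scikit-learn's `Ridge` is used purely as a stand-in so the snippet runs without this package installed:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

def holdout_r2(model, X, y, test_size=0.25, seed=0):
    """Fit on a training split and return R^2 on the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    model.fit(X_train, y_train)
    return r2_score(y_test, model.predict(X_test))

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
# Ridge is only a stand-in; any object with fit/predict, such as the
# auto_angle(n_breakpoints=2) model from the usage above, should slot in.
score = holdout_r2(Ridge(alpha=1.0), X, y)
print(f"held-out R^2: {score:.3f}")
```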

Building a new version

  1. Update __version__ in better_regressions/__init__.py and pyproject.toml
  2. python -m build
  3. python -m twine upload dist/*
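
Step 1 requires the version to be updated in two places, which is easy to get out of sync. A small pre-build check along the following lines could catch a mismatch. This is a sketch, not part of the project: it assumes both files declare the version as a plain quoted string (`__version__ = "..."` and `version = "..."`), and the helper names are hypothetical.

```python
import re
from pathlib import Path

# Matches a line starting with `__version__ = "x"` or `version = "x"`.
VERSION_RE = re.compile(
    r'^\s*(?:__version__|version)\s*=\s*["\']([^"\']+)["\']', re.MULTILINE
)

def extract_version(text):
    """Return the first declared version string, or None if absent."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None

def versions_match(init_text, pyproject_text):
    """True only if both texts declare the same version string."""
    v_init = extract_version(init_text)
    v_proj = extract_version(pyproject_text)
    return v_init is not None and v_init == v_proj

def check_repo(root="."):
    """Compare the two files named in step 1 (paths relative to the repo root)."""
    root = Path(root)
    init_text = (root / "better_regressions" / "__init__.py").read_text()
    pyproject_text = (root / "pyproject.toml").read_text()
    return versions_match(init_text, pyproject_text)

# Illustrative inputs only; real usage would call check_repo() before building.
init_text = '__version__ = "0.9.0"\n'
pyproject_text = '[project]\nname = "better-regressions"\nversion = "0.9.0"\n'
print(versions_match(init_text, pyproject_text))
```

Running `check_repo()` before `python -m build` and aborting on `False` would keep the two declarations from drifting apart.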
