

Documentation Status

This module provides quantile machine learning models for Python, in a plug-and-play fashion within the sklearn environment. This means that sklearn is practically the only dependency, and that all sklearn functionality applies to the models provided here without code changes.

The models implemented here share the trait that they are trained in exactly the same way as their non-quantile counterparts. The quantile information is only used in the prediction phase. The advantage of this (over, for example, Gradient Boosting Quantile Regression) is that several quantiles can be predicted at once without retraining the model, which leads to a significantly faster workflow overall. Note that the accuracy of this approach depends on the data. As the example in the documentation shows, with certain data characteristics different quantiles might require differently optimised parameters for optimal performance. The models implemented here allow this, but it requires using a single quantile during prediction, which loses the speed advantage described above.
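The "train once, choose quantiles at prediction time" idea can be illustrated with a toy nearest-neighbours regressor. This is a minimal NumPy sketch and not the package's implementation: the class `KNNQuantileSketch` and its `predict(X, q)` signature are hypothetical names chosen for illustration only.

```python
import numpy as np


class KNNQuantileSketch:
    """Toy k-nearest-neighbours quantile regressor (illustration only)."""

    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # Fitting is the same as for an ordinary KNN regressor: store the data.
        # No quantile is involved at this stage.
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y, dtype=float)
        return self

    def predict(self, X, q):
        X = np.asarray(X, dtype=float)
        # Pairwise squared Euclidean distances, shape (n_query, n_train).
        d2 = ((X[:, None, :] - self.X_[None, :, :]) ** 2).sum(axis=2)
        # Indices of the k nearest training points for every query row.
        nearest = np.argsort(d2, axis=1)[:, : self.n_neighbors]
        neighbor_targets = self.y_[nearest]  # shape (n_query, k)
        # Quantiles are computed only here, so any number of them can be
        # requested from the same fitted model.
        # Result shape: (n_quantiles, n_query).
        return np.quantile(neighbor_targets, q, axis=1)


# Synthetic data (made up for this sketch).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

model = KNNQuantileSketch(n_neighbors=20).fit(X, y)

# Three quantiles from one fitted model, no retraining.
lower, median, upper = model.predict(X[:5], q=[0.05, 0.5, 0.95])
```

Asking the same fitted `model` for other quantiles later, for instance `q=[0.25, 0.75]`, needs no further call to `fit`. A loss-based method such as gradient boosting with a quantile loss would instead need one fitted model per quantile.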

For guidance, see the docs (through the link in the badge). They include an example for quantile regression forests that follows exactly the same template as the Gradient Boosting Quantile Regression example in sklearn, for comparability.

Implemented:

  • Random Forest Quantile Regression

    • RandomForestQuantileRegressor: the main implementation
    • SampleRandomForestQuantileRegressor: an approximation that is much faster than the main implementation.
    • RandomForestMaximumRegressor: mathematically equivalent to the main implementation but much faster.
  • Extra Trees Quantile Regression

    • ExtraTreesQuantileRegressor: the main implementation
    • SampleExtraTreesQuantileRegressor: an approximation that is much faster than the main implementation.
  • Quantile K-nearest neighbors (KNeighborsQuantileRegressor)

Installation

The package can be installed with conda:

conda install --channel conda-forge sklearn-quantile

Example

An example of Random Forest Quantile Regression in action (both the main implementation and its approximation):

Usage example

Random Forest Quantile Regressor predicting the 5th, 50th and 95th percentile of the California housing dataset.

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn_quantile import RandomForestQuantileRegressor

# Load the data and hold out half of it for testing
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.5, random_state=0)

# Fit once; the requested quantiles are only used at prediction time
qrf = RandomForestQuantileRegressor(q=[0.05, 0.50, 0.95])
qrf.fit(X_train, y_train)

# One array of predictions per requested quantile
y_pred_5, y_pred_median, y_pred_95 = qrf.predict(X_test)
qrf.score(X_test, y_test)
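One simple sanity check on predictions like these is the share of test targets that fall between the lower and upper quantile predictions. For well-calibrated 5th and 95th percentiles, a value of roughly 0.9 would be expected. The helper `interval_coverage` below is a hypothetical function written for this sketch and is not part of sklearn-quantile. The numbers are made up to keep the example self-contained.

```python
import numpy as np


def interval_coverage(y_true, y_lower, y_upper):
    """Fraction of observations that lie inside [y_lower, y_upper]."""
    y_true = np.asarray(y_true)
    inside = (y_true >= np.asarray(y_lower)) & (y_true <= np.asarray(y_upper))
    return float(inside.mean())


# Hand-made illustration values, not model output.
y_true = [1.0, 2.0, 3.0, 4.0]
y_lo = [0.5, 2.5, 2.0, 3.0]
y_hi = [1.5, 3.5, 4.0, 3.5]

coverage = interval_coverage(y_true, y_lo, y_hi)  # two of four points are inside
```

With the arrays from the example above, the call would be `interval_coverage(y_test, y_pred_5, y_pred_95)`.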
