
Machine learning framework built on second-order optimization

Project description

peak-engines

PyPI version | License: CC BY 4.0 | API Reference

peak-engines is a machine learning framework that focuses on applying advanced optimization algorithms to build better models.

Installation

pip install peak-engines

Fit Logistic Regression Hyperparameters

The leave-one-out cross-validation (LOOCV) of logistic regression can be efficiently approximated. At a high level, this is how it works: for a given hyperparameter C, we

  1. Find the parameters b that optimize logistic regression for the given C.
  2. For each data index i, compute the Hessian H_{-i} and gradient g_{-i} of the log-likelihood with the ith data entry removed. (We can reuse the Hessian computed in step 1 to do this with minimal extra work.)
  3. Apply the Matrix Inversion Lemma to efficiently compute the inverse H_{-i}^{-1}.
  4. Use H_{-i}^{-1} and g_{-i} to take a single step of Newton's method, approximating the logistic regression coefficients b_{-i} with the ith entry removed.
  5. Finally, use the b_{-i}'s to approximate the out-of-sample predictions and estimate the leave-one-out cross-validation.
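The steps above can be sketched in plain NumPy. This is a hypothetical illustration of the algorithm for L2-penalized logistic regression, not the peak-engines internals; the function names `fit_logistic` and `alo_loocv` are made up for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, C, n_iter=25):
    """Step 1: Newton's method for L2-penalized logistic regression."""
    p = X.shape[1]
    b = np.zeros(p)
    for _ in range(n_iter):
        mu = sigmoid(X @ b)
        g = X.T @ (y - mu) - b / C          # gradient of the penalized log-likelihood
        W = mu * (1.0 - mu)
        H = -(X.T * W) @ X - np.eye(p) / C  # Hessian, reused in steps 2-3
        b = b - np.linalg.solve(H, g)
    return b, H

def alo_loocv(X, y, C):
    b, H = fit_logistic(X, y, C)
    Hinv = np.linalg.inv(H)
    mu = sigmoid(X @ b)
    W = mu * (1.0 - mu)
    preds = np.empty(len(y))
    for i in range(len(y)):
        x = X[i]
        # Steps 2-3: H_{-i} = H + W_i x x^T; invert via the Matrix Inversion Lemma.
        Hx = Hinv @ x
        Hinv_i = Hinv - W[i] * np.outer(Hx, Hx) / (1.0 + W[i] * (x @ Hx))
        # The full-data gradient is ~0 at the optimum, so g_{-i} is just
        # minus point i's contribution.
        g_i = -(y[i] - mu[i]) * x
        # Step 4: a single Newton step from b approximates b_{-i}.
        b_i = b - Hinv_i @ g_i
        # Step 5: out-of-sample probability for point i.
        preds[i] = sigmoid(x @ b_i)
    # Mean leave-one-out log-likelihood as the cross-validation score.
    return preds, np.mean(y * np.log(preds) + (1 - y) * np.log(1 - preds))

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 3))
y = (X @ np.array([1.0, -1.0, 0.5]) > 0).astype(float)
preds, alo_score = alo_loocv(X, y, C=1.0)
```

Because each held-out fit is a rank-one update of quantities already computed in step 1, the whole loop costs far less than refitting the model n times.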

See the paper A scalable estimate of the out-of-sample prediction error via approximate leave-one-out by Kamiar Rad and Arian Maleki for more details.

We can, furthermore, differentiate the approximate leave-one-out (ALO) metric with respect to the hyperparameters and quickly climb to the best-performing C. Here's how to do it with peak-engines:

Load an example dataset

from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

Find the best performing C

import peak_engines
model = peak_engines.LogisticRegressionModel()
model.fit(X, y)
print('C =', model.C_[0])

prints

C = 0.66474879

If we compute the LOOCV by brute force and compare it to the ALOOCV, we can see how accurate the approximation is:

[Figure: brute-force LOOCV vs. the approximate LOOCV across values of C]

Fit Ridge Regression Hyperparameters

By expressing cross-validation as an optimization objective and computing derivatives, peak-engines is able to efficiently find regularization parameters that lead to the best score on leave-one-out or generalized cross-validation. It furthermore scales to handle multiple regularizers. Here's an example of how it works:

import numpy as np
from sklearn.datasets import load_boston
X, y = load_boston(return_X_y=True)
from peak_engines import RidgeRegressionModel
model = RidgeRegressionModel(normalize=True)
# Fit will automatically find the alpha that minimizes the leave-one-out
# cross-validation. There's no need to provide a search space because
# peak_engines optimizes the LOOCV directly: it computes derivatives of the
# LOOCV with respect to the hyperparameters and quickly zeroes in on the best alpha.
model.fit(X, y)
print('alpha =', model.alpha_)

prints

alpha = 0.009274259071634289
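Treating the LOOCV as a differentiable objective is possible because ridge regression's leave-one-out error has an exact closed form. Here is a numpy-only sketch of that identity (not the actual peak_engines implementation; `ridge_loocv_mse` is an illustrative name):

```python
import numpy as np

def ridge_loocv_mse(X, y, alpha):
    """Exact LOOCV MSE for ridge regression via the hat matrix:
    the i-th leave-one-out residual is (y_i - yhat_i) / (1 - h_ii)."""
    p = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T)  # hat matrix
    loo_residuals = (y - H @ y) / (1.0 - np.diag(H))
    return np.mean(loo_residuals ** 2)

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.5]) + 0.3 * rng.standard_normal(50)
scores = {a: ridge_loocv_mse(X, y, a) for a in (0.01, 0.1, 1.0, 10.0)}
```

Because this expression is a smooth function of alpha, its derivative can be computed and a derivative-based optimizer can zero in on the minimizer with no search grid, which is the approach the prose above describes.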

Fit Warped Linear Regression

Let X and y denote the feature matrix and target vector of a regression dataset. Under the assumption of normally distributed errors, Ordinary Least Squares (OLS) finds the linear model that maximizes the likelihood of the dataset.

What happens when the errors aren't normally distributed? The model will be misspecified, and there's no reason to think its likelihood predictions will be accurate. This is where Warped Linear Regression can help. It introduces an extra step to OLS: it transforms the target vector using a malleable, monotonic function f parameterized by ψ, then adjusts the parameters to maximize the likelihood of the transformed dataset.
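As a toy illustration of the idea (a hypothetical sketch, not peak-engines' actual warping family or API), take f to be the classic Box-Cox power transform and score each candidate ψ = λ by the transformed-data Gaussian likelihood, remembering the Jacobian correction |f'(y)|; the names `boxcox` and `warped_ols_loglik` are made up for this sketch:

```python
import numpy as np

def boxcox(y, lam):
    """A simple monotonic warp of the targets; requires y > 0."""
    return np.log(y) if abs(lam) < 1e-8 else (y ** lam - 1.0) / lam

def warped_ols_loglik(X, y, lam):
    """Gaussian log-likelihood of OLS on the warped targets, plus the
    log-Jacobian (lam - 1) * sum(log y) of the Box-Cox warp."""
    z = boxcox(y, lam)
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    r = z - X @ b
    sigma2 = np.mean(r ** 2)
    n = len(y)
    return (-0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * n
            + (lam - 1.0) * np.sum(np.log(y)))

# Maximizing over lam selects the warp under which the residuals
# look most Gaussian; a linear model is then fit in the warped space.
```

For log-normally distributed targets, for instance, this score favors λ near 0 (the log warp) over λ = 1 (plain OLS), which is exactly the behavior the transformation step is meant to capture.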

By introducing this additional transformation step, Warped Linear Regression is more general-purpose than OLS while still retaining OLS's strong structure and interpretability. Here's how to use it:

Load an example dataset

from sklearn.datasets import load_boston
X, y = load_boston(return_X_y=True)

Fit a warped linear regression model

import peak_engines
model = peak_engines.WarpedLinearRegressionModel()
model.fit(X, y)

Visualize the warping function

import numpy as np
import matplotlib.pyplot as plt
y_range = np.arange(np.min(y), np.max(y), 0.01)
z = model.warper_.compute_latent(y_range)
plt.plot(y_range, z)
plt.xlabel('Median Housing Value in $1000s')
plt.ylabel('Latent Variable')
plt.scatter(y, model.warper_.compute_latent(y))

[Figure: the fitted warping function mapping median housing value to the latent variable]

Tutorials

Articles

Examples

Documentation

See doc/Reference.pdf

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See the tutorial on generating distribution archives.

Built Distributions

peak_engines-0.2.8-cp32-abi3-manylinux1_x86_64.whl (16.3 MB)

Uploaded CPython 3.2+

peak_engines-0.2.8-cp32-abi3-macosx_10_9_intel.whl (16.9 MB)

Uploaded CPython 3.2+ macOS 10.9+ intel

peak_engines-0.2.8-cp27-cp27mu-manylinux1_x86_64.whl (16.3 MB)

Uploaded CPython 2.7mu

peak_engines-0.2.8-cp27-cp27mu-macosx_10_9_intel.whl (16.9 MB)

Uploaded CPython 2.7mu macOS 10.9+ intel

peak_engines-0.2.8-cp27-cp27m-manylinux1_x86_64.whl (16.3 MB)

Uploaded CPython 2.7m

peak_engines-0.2.8-cp27-cp27m-macosx_10_9_intel.whl (16.9 MB)

Uploaded CPython 2.7m macOS 10.9+ intel

File details

Details for the file peak_engines-0.2.8-cp32-abi3-manylinux1_x86_64.whl.

File metadata

  • Download URL: peak_engines-0.2.8-cp32-abi3-manylinux1_x86_64.whl
  • Upload date:
  • Size: 16.3 MB
  • Tags: CPython 3.2+
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for peak_engines-0.2.8-cp32-abi3-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 dfc086ef678495bd98585c44439e1060f72a4f6464fd50e0d671f5d67d09494c
MD5 908997a76c33ed8a7b499ef076230d86
BLAKE2b-256 e6e041a6c8c9eb7188ece08dc7dc7b46e29000bf048ccbddb758ebcb41f62862

See more details on using hashes here.

File details

Details for the file peak_engines-0.2.8-cp32-abi3-macosx_10_9_intel.whl.

File metadata

  • Download URL: peak_engines-0.2.8-cp32-abi3-macosx_10_9_intel.whl
  • Upload date:
  • Size: 16.9 MB
  • Tags: CPython 3.2+, macOS 10.9+ intel
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for peak_engines-0.2.8-cp32-abi3-macosx_10_9_intel.whl
Algorithm Hash digest
SHA256 b0aa2ca6b228953dd73625002c54568f31f7da4fffca1efb6c8a7b728ae766f5
MD5 c7c08559de6c805a5f5672343ae09d06
BLAKE2b-256 21b3b28cfda3a336d6770da34465cda4ec132408e28d2d101da9c15500454457

See more details on using hashes here.

File details

Details for the file peak_engines-0.2.8-cp27-cp27mu-manylinux1_x86_64.whl.

File metadata

  • Download URL: peak_engines-0.2.8-cp27-cp27mu-manylinux1_x86_64.whl
  • Upload date:
  • Size: 16.3 MB
  • Tags: CPython 2.7mu
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for peak_engines-0.2.8-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 494cfb20e08d98a7c8ba836665c598811a7f5010a0955ce45c748086857bd901
MD5 e493ccd17fc0f44859eb27b552630da1
BLAKE2b-256 95a35af0985586d6e3b29ca26e637feb0375bcdc8f93936bd167d924ed2ef3e0

See more details on using hashes here.

File details

Details for the file peak_engines-0.2.8-cp27-cp27mu-macosx_10_9_intel.whl.

File metadata

  • Download URL: peak_engines-0.2.8-cp27-cp27mu-macosx_10_9_intel.whl
  • Upload date:
  • Size: 16.9 MB
  • Tags: CPython 2.7mu, macOS 10.9+ intel
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for peak_engines-0.2.8-cp27-cp27mu-macosx_10_9_intel.whl
Algorithm Hash digest
SHA256 5a72b2c4423c59ef6aedb5b62390d86347c82242b6ece047180cdf45dfa8aa3f
MD5 3ff8b0d0f654e06572d987f964913378
BLAKE2b-256 9deb675a5eb22a2d6a52d7adc8578e5aea7878bd32d34158c72f2f8a056d468f

See more details on using hashes here.

File details

Details for the file peak_engines-0.2.8-cp27-cp27m-manylinux1_x86_64.whl.

File metadata

  • Download URL: peak_engines-0.2.8-cp27-cp27m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 16.3 MB
  • Tags: CPython 2.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for peak_engines-0.2.8-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 b0a6d39dbed3b2f5aea97811d70be94ed9013a5610121c39535a730da8d9254e
MD5 2b277454d89615d50169a84f77a04473
BLAKE2b-256 7a5179f7c79b4a99c28b525a8f498fa6a51325a0c673e773ac4a5183d0ac7e0d

See more details on using hashes here.

File details

Details for the file peak_engines-0.2.8-cp27-cp27m-macosx_10_9_intel.whl.

File metadata

  • Download URL: peak_engines-0.2.8-cp27-cp27m-macosx_10_9_intel.whl
  • Upload date:
  • Size: 16.9 MB
  • Tags: CPython 2.7m, macOS 10.9+ intel
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for peak_engines-0.2.8-cp27-cp27m-macosx_10_9_intel.whl
Algorithm Hash digest
SHA256 79600d7242d574cd46f4bbc051f4b7ff67c1da38230d9722e25d8758e6a967d3
MD5 b341b65d0f4f6ec867a6b8d46764908c
BLAKE2b-256 57dfc7afeb593097fb4f17c11fcb63d9dadb067726a1b91c437683970c5080cf

See more details on using hashes here.
