Model variance with multiplicative variance trees

varvar

Python package to model variance in different ways

Multiplicative variance trees and the varvar algorithm

varvar is a greedy algorithm for fitting multiplicative variance trees.

varvar is to variance as lightgbm/xgboost/... are to expectation.
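To make the analogy concrete, here is a minimal sketch of a single greedy variance split. It is not varvar's actual implementation: it assumes zero-mean Gaussian noise and scores each candidate threshold by the gain in log-likelihood. The function name `best_variance_split` and the gain formula are illustrative assumptions.

```python
import numpy as np


def best_variance_split(x, e2, q):
    """Pick the threshold on x that best separates low and high variance.

    Hypothetical helper for illustration only. Candidates are the
    quantiles q of x. The score is the gain in zero-mean Gaussian
    log-likelihood when each side gets its own variance estimate
    (the mean of the squared residuals e2 on that side).
    """
    candidates = np.unique(np.quantile(x, q))
    n = len(x)
    # Up to constants, the negative log-likelihood of a node is n/2 * log(variance).
    total_cost = n * np.log(e2.mean())
    best_threshold, best_gain = None, 0.0
    for t in candidates:
        left = x <= t
        n_left = int(left.sum())
        n_right = n - n_left
        if n_left == 0 or n_right == 0:
            continue  # a split must leave samples on both sides
        cost = n_left * np.log(e2[left].mean()) + n_right * np.log(e2[~left].mean())
        gain = 0.5 * (total_cost - cost)
        if gain > best_gain:
            best_threshold, best_gain = t, gain
    return best_threshold, best_gain


# Noise whose standard deviation jumps from 1 to 30 at x = 300.
rng = np.random.RandomState(1729)
n = 20000
x = rng.uniform(-1000, 1000, n)
sigma = np.where(x <= 300, 1.0, 30.0)
e = sigma * rng.randn(n)

threshold, gain = best_variance_split(x, e**2, np.linspace(0, 1, 100)[1:-1])
print(threshold, gain)
```

The found threshold can only be one of the candidate quantiles, so it lands near 300 rather than exactly on it.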

There are currently two implementations of the varvar algorithm:

  1. using quantile search at every split (in varvar.qtrees)
  2. using histograms, with binning before starting (in varvar.htrees)

Quantile search is much slower, but can be more accurate.

This is similar to the "exact" and "hist" modes in xgboost, except that our "exact" algorithm does not scan every value: it goes over a small subset of each feature's values (exact quantiles).
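The histogram idea described above can be sketched as follows: bin the feature once, accumulate per-bin counts and sums of squared residuals, then score every bin boundary from cumulative sums. This is an illustrative sketch, not the code in varvar.htrees; the helper names `bin_feature` and `best_split_from_histogram` and the Gaussian log-likelihood gain are assumptions.

```python
import numpy as np


def bin_feature(x, q):
    """Bin x once, using its quantiles q as bin edges (hypothetical helper)."""
    edges = np.unique(np.quantile(x, q))
    # Bin i holds values with edges[i-1] < x <= edges[i]; the last bin holds x > edges[-1].
    bins = np.searchsorted(edges, x, side="left")
    return edges, bins


def best_split_from_histogram(bins, e2, num_bins):
    """Return (k, gain) for the best split "bin index <= k" (hypothetical helper)."""
    counts = np.bincount(bins, minlength=num_bins)
    sums = np.bincount(bins, weights=e2, minlength=num_bins)
    n, s = counts.sum(), sums.sum()
    # Cumulative sums give the left-side statistics of every possible split at once.
    n_left = np.cumsum(counts)[:-1]
    s_left = np.cumsum(sums)[:-1]
    n_right = n - n_left
    s_right = s - s_left
    with np.errstate(divide="ignore", invalid="ignore"):
        cost = n_left * np.log(s_left / n_left) + n_right * np.log(s_right / n_right)
    # Splits that leave one side empty are never selected.
    cost = np.where((n_left > 0) & (n_right > 0), cost, np.inf)
    gain = 0.5 * (n * np.log(s / n) - cost)
    k = int(np.argmax(gain))
    return k, float(gain[k])


rng = np.random.RandomState(1729)
n = 20000
x = rng.uniform(-1000, 1000, n)
sigma = np.where(x <= 300, 1.0, 30.0)
e = sigma * rng.randn(n)

edges, bins = bin_feature(x, np.linspace(0, 1, 100)[1:-1])
k, gain = best_split_from_histogram(bins, e**2, len(edges) + 1)
threshold = edges[k]  # "bin index <= k" is the same as "x <= edges[k]"
print(threshold, gain)
```

After binning, scoring a split costs time proportional to the number of bins rather than the number of samples, which is why this approach is faster than recomputing quantiles at every split.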

Both implementation modules have a multiplicative_variance_trees function.

Use varvar.treespredict.predict for prediction.

The trees are returned as plain Python types and can be serialized with pickle or even as JSON.
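A minimal sketch of both round trips is shown below. The nested list used here is a made-up placeholder, not the real layout of the object returned by multiplicative_variance_trees; substitute the actual return value.

```python
import json
import pickle

# Hypothetical stand-in for a fitted model made of plain Python types.
# The real structure returned by varvar may differ.
trees = [1.0, [[0, 295.0, 0.5, 2.0]]]

# pickle preserves Python types exactly, including tuples.
restored_pickle = pickle.loads(pickle.dumps(trees))

# JSON is portable and human-readable, but it turns tuples into lists,
# so a model containing tuples would not compare equal after a round trip.
restored_json = json.loads(json.dumps(trees))

print(restored_pickle == trees, restored_json == trees)
```

If predict relies on tuples specifically, pickle is the safer choice; this has not been checked against the varvar source.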

Here is an example:

from varvar.htrees import multiplicative_variance_trees  # takes time because numba compiles functions
from varvar.treespredict import predict
import numpy as np

# Simulate zero-mean noise whose standard deviation jumps from 1 to 30 at x = 300.
random = np.random.RandomState(1729)
n = 200000
x = random.uniform(-1000, 1000, n)
correct_threshold = 300
sigma = 1 * (x <= correct_threshold) + 30 * (x > correct_threshold)
e = sigma * random.randn(n)

# Fit a single tree with a single split to the squared residuals.
# Features are passed as a list of arrays, one array per feature.
trees = multiplicative_variance_trees(
    [x], e**2,
    num_trees=1, max_depth=1, mingain=1, learning_rate=1,
    q=np.linspace(0, 1, 100)[1:-1]
)
# Predictions are variances, so take the square root to compare with sigma.
preds = predict(trees, [x])

found_threshold = trees[1][0][1]
print(correct_threshold, found_threshold)  # 300, 295
print(np.sqrt(min(preds)), np.sqrt(max(preds)))  # 1, 30
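
One way to sanity-check variance predictions like these is to standardize the residuals: dividing each residual by its predicted standard deviation should give values with variance close to 1. The sketch below does not call varvar; `preds` is a hand-written stand-in for the output of predict, using the threshold of 295 reported above.

```python
import numpy as np

rng = np.random.RandomState(1729)
n = 200000
x = rng.uniform(-1000, 1000, n)
sigma = np.where(x <= 300, 1.0, 30.0)
e = sigma * rng.randn(n)

# Stand-in for predict(trees, [x]): one predicted variance per sample,
# with the split slightly off the true threshold of 300.
preds = np.where(x <= 295, 1.0, 30.0**2)

# Standardized residuals; their variance is near 1 if the predictions are well calibrated.
z = e / np.sqrt(preds)
print(z.var(), z[x <= 295].var(), z[x > 295].var())
```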
