Classifier-based non-parametric change point detection

Random Forests for Change Point Detection

Change point detection aims to identify structural breaks in the probability distribution of a time series. Existing methods either assume a parametric model for within-segment distributions or are based on ranks or distances and thus fail in scenarios with a reasonably large dimensionality.

changeforest implements a classifier-based algorithm that consistently estimates change points without any parametric assumptions, even in high-dimensional scenarios. It uses the out-of-bag probability predictions of a random forest to construct a pseudo-log-likelihood, which is then optimized with a computationally feasible two-step method.

See [1] for details.

Installation

To install from conda-forge (recommended), run

conda install -c conda-forge changeforest

To install from PyPI, run

pip install changeforest

Example

In the following example, we perform random forest-based change point detection on a simulated dataset with n=600 observations and covariance shifts at t=200, 400.

In [1]: import numpy as np
   ...: 
   ...: Sigma = np.full((5, 5), 0.7)
   ...: np.fill_diagonal(Sigma, 1)
   ...: 
   ...: rng = np.random.default_rng(12)
   ...: X = np.concatenate(
   ...:     (
   ...:         rng.normal(0, 1, (200, 5)),
   ...:         rng.multivariate_normal(np.zeros(5), Sigma, 200),
   ...:         rng.normal(0, 1, (200, 5)),
   ...:     ),
   ...:     axis=0,
   ...: )

The simulated dataset X follows the change in covariance (CIC) setup described in [1]. Observations in the first and last segments are drawn independently from a standard multivariate Gaussian distribution. Observations in the second segment are i.i.d. Gaussian with mean zero and unit variance, but with a covariance of ρ = 0.7 between coordinates. This is a challenging scenario because the means and marginal variances are identical across segments; only the dependence between coordinates changes.
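As a quick sanity check of this setup, the following sketch regenerates X with the same seed and compares the average off-diagonal sample correlation within each true segment. The helper name `mean_offdiag_corr` is introduced here for illustration only. The values named in the comment are population values, so the sample estimates should only be close to them.

```python
import numpy as np

# Regenerate the simulated data exactly as in the example above.
Sigma = np.full((5, 5), 0.7)
np.fill_diagonal(Sigma, 1)

rng = np.random.default_rng(12)
X = np.concatenate(
    (
        rng.normal(0, 1, (200, 5)),
        rng.multivariate_normal(np.zeros(5), Sigma, 200),
        rng.normal(0, 1, (200, 5)),
    ),
    axis=0,
)


def mean_offdiag_corr(segment):
    """Average of the off-diagonal entries of the sample correlation matrix."""
    corr = np.corrcoef(segment, rowvar=False)
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    return corr[off_diagonal].mean()


# Population values: 0 in the outer segments, 0.7 in the middle segment.
segment_corrs = [
    mean_offdiag_corr(X[a:b]) for a, b in [(0, 200), (200, 400), (400, 600)]
]
print(np.round(segment_corrs, 2))
```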

In [2]: from changeforest import changeforest
   ...: 
   ...: result = changeforest(X, "random_forest", "bs")
   ...: result
Out[2]: 
                    best_split max_gain p_value
(0, 600]                   412   19.603   0.005
 ¦--(0, 412]               201   62.981   0.005
 ¦   ¦--(0, 201]           194  -12.951    0.76
 ¦   °--(201, 412]         211   -9.211   0.545
 °--(412, 600]             418  -37.519   0.915

In [3]: result.split_points()
Out[3]: [201, 412]
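One way to quantify how close these estimates are to the true change points at t=200 and t=400 is the distance from each estimate to its nearest true change point, together with the Hausdorff distance between the two sets. The sketch below uses plain NumPy on the values printed above. The variable names are illustrative and not part of changeforest.

```python
import numpy as np

true_change_points = np.array([200, 400])
estimated = np.array([201, 412])  # values from Out[3]

# Pairwise absolute distances: rows are estimates, columns are true change points.
distances = np.abs(estimated[:, None] - true_change_points[None, :])

# Distance from each estimate to the closest true change point.
nearest_error = distances.min(axis=1)
print(nearest_error)  # [ 1 12]

# Hausdorff distance: the worst-case nearest-neighbour distance in either direction.
hausdorff = max(distances.min(axis=1).max(), distances.min(axis=0).max())
print(hausdorff)  # 12
```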

changeforest correctly identifies the change point around t=200 but is slightly off for the second change point, estimating t=412 instead of t=400. The changeforest function returns a BinarySegmentationResult. We use its plot method to investigate the gain curves maximized by the change point estimates:

result.plot().show()

Change point estimates are marked in red.

For method="random_forest" (and method="knn"), the changeforest algorithm uses a two-step approach to find an optimizer of the gain. This fits a classifier for three split candidates at the segment's 1/4, 1/2 and 3/4 quantiles, computes approximate gain curves using the resulting pseudo-log-likelihoods and selects the overall optimizer as a second guess. We can investigate the gain curves from the optimizer using the plot method of OptimizerResult. The initial guesses are marked in blue.

result.optimizer_result.plot().show()

One can observe that the approximate gain curves are piecewise linear, with maxima around the true underlying change points.
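To illustrate how an approximate gain curve can be computed from class probability predictions, the sketch below assumes a particular form of the gain. For a split s, it sums over the observations up to s the log of the predicted probability of the left class divided by that class's share at the initial guess, and over the observations after s the analogous term for the right class. This form is an assumption made for illustration; see [1] for the exact definition. The function `approximate_gain_curve` is hypothetical and not part of changeforest, and the probabilities are synthetic rather than out-of-bag predictions of a fitted forest.

```python
import numpy as np


def approximate_gain_curve(proba_right, initial_split, eps=1e-12):
    """Hypothetical helper: gain[s] for every split s = 0, ..., n.

    proba_right[i] is the predicted probability that observation i lies to
    the right of `initial_split`, the split the classifier was trained on.
    `initial_split` must lie strictly between 0 and n.
    """
    proba_right = np.clip(np.asarray(proba_right, dtype=float), eps, 1 - eps)
    n = len(proba_right)
    prior_left = initial_split / n
    prior_right = 1 - prior_left

    # Per-observation log-likelihood ratios for "left" and "right" membership.
    llr_left = np.log((1 - proba_right) / prior_left)
    llr_right = np.log(proba_right / prior_right)

    # gain[s] = sum(llr_left[:s]) + sum(llr_right[s:]), via cumulative sums.
    cum_left = np.concatenate(([0.0], np.cumsum(llr_left)))
    cum_right = np.concatenate(([0.0], np.cumsum(llr_right)))
    return cum_left + (cum_right[-1] - cum_right)


# Synthetic predictions: the classifier was trained with a split at 300,
# but the predicted probabilities switch at 200.
n, initial_split = 600, 300
proba_right = np.where(np.arange(n) < 200, 0.2, 0.8)

gain = approximate_gain_curve(proba_right, initial_split)
second_guess = int(np.argmax(gain))
print(second_guess)  # 200
```

With piecewise constant predictions like these, the resulting curve is piecewise linear and peaks where the predictions switch, even though the classifier was trained on a different split.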

The BinarySegmentationResult returned by changeforest is a tree-like object with attributes start, stop, best_split, max_gain, p_value, is_significant, optimizer_result, model_selection_result, left, right and segments. These are useful for investigating the output of the algorithm further.
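As an example of working with this tree structure, the sketch below walks a result recursively and collects best_split for every node flagged as significant. It relies only on the attributes listed above and assumes that left and right are None for segments that were not split further. The function `significant_splits` is a hypothetical helper. The stand-in tree is built by hand from the values in Out[2], with is_significant also set by hand, so that the snippet runs without fitting anything.

```python
from types import SimpleNamespace


def significant_splits(node):
    """Hypothetical helper: in-order list of significant split points."""
    if node is None or not node.is_significant:
        return []
    return (
        significant_splits(node.left)
        + [node.best_split]
        + significant_splits(node.right)
    )


def make_node(start, stop, best_split, is_significant, left=None, right=None):
    """Hand-built stand-in exposing only the attributes used above."""
    return SimpleNamespace(
        start=start,
        stop=stop,
        best_split=best_split,
        is_significant=is_significant,
        left=left,
        right=right,
    )


# Stand-in mirroring the tree printed in Out[2].
root = make_node(
    0, 600, 412, True,
    left=make_node(
        0, 412, 201, True,
        left=make_node(0, 201, 194, False),
        right=make_node(201, 412, 211, False),
    ),
    right=make_node(412, 600, 418, False),
)

print(significant_splits(root))  # [201, 412]
```

If the assumptions above hold, calling `significant_splits(result)` on the result from the example would be expected to agree with `result.split_points()`.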

The changeforest algorithm can be tuned with hyperparameters. See the project's documentation of the control parameters for their descriptions and default values. In Python, the parameters can be specified with the Control class, which can be passed to changeforest. The following will build random forests with 20 trees:

In [6]: from changeforest import Control
   ...: changeforest(X, "random_forest", "bs", Control(random_forest_n_estimators=20))
Out[6]: 
                            best_split max_gain p_value
(0, 600]                           592  -11.786    0.01
 ¦--(0, 592]                       121    -6.26   0.015
 ¦   ¦--(0, 121]                    13  -14.219   0.615
 ¦   °--(121, 592]                 416   21.272   0.005
 ¦       ¦--(121, 416]             201   37.157   0.005
 ¦       ¦   ¦--(121, 201]         192   -17.54    0.65
 ¦       ¦   °--(201, 416]         207   -6.701    0.74
 ¦       °--(416, 592]             584  -44.054   0.935
 °--(592, 600]     

The changeforest algorithm still detects change points around t=200, 400 but also returns two false positives.

Due to the nature of the change, method="change_in_mean" is unable to detect any change points at all:

In [7]: changeforest(X, "change_in_mean", "bs")
Out[7]: 
          best_split max_gain p_value
(0, 600]         589    8.318 
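The reason becomes visible when comparing per-segment sample means: all of them are close to zero, so a method that only looks for shifts in the mean has nothing to pick up. The sketch below regenerates X with the same seed as above and prints the coordinate-wise means of each true segment. The variable names are illustrative.

```python
import numpy as np

# Regenerate the simulated data exactly as in the example above.
Sigma = np.full((5, 5), 0.7)
np.fill_diagonal(Sigma, 1)

rng = np.random.default_rng(12)
X = np.concatenate(
    (
        rng.normal(0, 1, (200, 5)),
        rng.multivariate_normal(np.zeros(5), Sigma, 200),
        rng.normal(0, 1, (200, 5)),
    ),
    axis=0,
)

# One row per true segment, one column per coordinate.
segment_means = np.stack(
    [X[a:b].mean(axis=0) for a, b in [(0, 200), (200, 400), (400, 600)]]
)
print(np.round(segment_means, 2))
```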

References

[1] M. Londschien, S. Kovács and P. Bühlmann (2022), "Random Forests for Change Point Detection", working paper.
