A cython version of the changepoint R package

changepoint_cython

A Cython version of the changepoint and changepoint.np R packages; see https://github.com/cran/changepoint and https://github.com/cran/changepoint.np.

Implemented algorithms:

  • Binary Segmentation, PELT, and Segmentation Neighborhood: univariate and multivariate time series.
  • Nonparametric PELT: univariate time series.

Models implemented for the univariate case:

  • Normal (mean, variance, or mean and variance).
  • Poisson (mean and variance).
  • Exponential (mean and variance).
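
Each model above corresponds to a segment cost that the algorithms minimise. As a rough illustration only, the sketch below writes three such costs as twice the negative maximised log-likelihood with constant terms dropped, a common convention for penalised change-point methods. The function names are hypothetical, and the actual definitions in cost_function.pxd may differ in constants or scaling.

```python
import math


def normal_mean_cost(segment):
    """Normal model, change in mean: squared error around the segment mean."""
    n = len(segment)
    total = sum(segment)
    return sum(x * x for x in segment) - total * total / n


def poisson_cost(segment):
    """Poisson model: 2 * S * (log n - log S), with S the sum of the counts."""
    n = len(segment)
    total = sum(segment)
    if total == 0:
        return 0.0  # an all-zero segment is fitted perfectly by rate 0
    return 2.0 * total * (math.log(n) - math.log(total))


def exponential_cost(segment):
    """Exponential model: 2 * n * (log S - log n); values must be positive."""
    n = len(segment)
    total = sum(segment)
    return 2.0 * n * (math.log(total) - math.log(n))


# Splitting a series at a true change in mean lowers the total cost.
series = [0.0] * 3 + [10.0] * 3
cost_without_split = normal_mean_cost(series)
cost_with_split = normal_mean_cost(series[:3]) + normal_mean_cost(series[3:])
```

The change-point algorithms trade this reduction in cost against the penalty or against the requested number of change-points.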

Models implemented for the multivariate case:

  • Normal (mean).

Works on Windows and Linux with Python 3.x.

Install

PyPI

Package available on PyPI: https://pypi.org/project/changepoint-cython/

pip install changepoint-cython

From source

Download and run:

pip3 install -r requirements.txt
pip3 install .

Tests

Synthetic examples

python3 test_segmentation.py

python3 test_segmentation_multiple.py
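
The contents of the two test scripts are not reproduced here. As a hedged illustration of what a synthetic example can look like, the following standard-library sketch builds a piecewise-constant series with Gaussian noise and records where the mean changes. The helper name make_piecewise_series and its parameters are hypothetical and are not part of the package.

```python
import random


def make_piecewise_series(segment_means, segment_length, noise_sd=1.0, seed=0):
    """Return (series, true_cpts) for a noisy piecewise-constant signal.

    true_cpts holds the number of samples that precede each change in mean.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    series = []
    true_cpts = []
    for i, mean in enumerate(segment_means):
        series.extend(rng.gauss(mean, noise_sd) for _ in range(segment_length))
        if i < len(segment_means) - 1:
            true_cpts.append(len(series))
    return series, true_cpts


series, true_cpts = make_piecewise_series([0.0, 5.0, -2.0], segment_length=100)
```

The functions documented in this README take a pandas dataframe, so a list like this would need to be wrapped (for example with pd.DataFrame) before being passed to them.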

Test on well log data

See notebook "segmentation example on well log data.ipynb"

import pandas as pd
import numpy as np
import lasio
from pychangepoints import cython_pelt, algo_changepoints

Loading and cleaning data

well_log_las = lasio.read('./data/2120913D.las')
well_log_df = well_log_las.df()

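list_var = ['GR', 'NPOR', 'RHOZ', 'PEFZ']
well_log_data = well_log_df[list_var]

# Replace values that are not greater than -1 with NaN, then drop rows containing NaN
well_log_data = well_log_data[well_log_data > -1].dropna()
# Keep only the rows whose index is greater than 6542
well_log_data = well_log_data[well_log_data.index > 6542]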

Segmentation with PELT for multivariate time series.

penalty_value = 5
min_segment_size = 20
model = 'mbic_mean'
list_cpts, nb_cpts = algo_changepoints.pelt_multiple(well_log_data, penalty_value, min_segment_size, model)

Results of the segmentation

Documentation

PELT

Inputs:

  • data: data as pandas dataframe,
  • penalty_value: double, penalty value (BIC penalty),
  • min_segment_size: int, minimum segment size,
  • model: str, statistical model (see cost_function.pxd).
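
The penalty controls how many change-points are returned: larger values give fewer change-points. BIC-type penalties are commonly taken proportional to log(n) for a series of n samples. The helper below is a hypothetical sketch of that convention; the multiplier, and whether the package expects this exact scale, are assumptions to check against the source.

```python
import math


def log_penalty(n_samples, multiplier=2.0):
    """Return a BIC-style penalty: multiplier * log(number of samples)."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return multiplier * math.log(n_samples)


# For 1000 samples and the default multiplier this is roughly 13.8.
penalty_value = log_penalty(1000)
```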

Outputs:

  • list_cpts: array of the positions of the change-points,
  • nb_cpts: number of change-points.

Usage:

from pychangepoints import algo_changepoints

penalty_value = 5
min_segment_size = 20
model = 'mbic_mean'
list_cpts, nb_cpts = algo_changepoints.pelt(data, penalty_value, min_segment_size, model)

For the multivariate case, call pelt_multiple:

from pychangepoints import algo_changepoints

penalty_value = 5
min_segment_size = 20
model = 'mbic_mean'
list_cpts, nb_cpts = algo_changepoints.pelt_multiple(data, penalty_value, min_segment_size, model)
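
To make the role of penalty_value concrete, here is a minimal pure-Python sketch of the PELT recursion for a change in mean with a squared-error cost. It is an illustration under stated assumptions, not the package's Cython code: the function name pelt_mean is hypothetical, the minimum segment size is omitted for brevity, and change-points are reported as the number of samples preceding each change, which may differ from the package's indexing.

```python
def pelt_mean(series, penalty):
    """Sketch of PELT for a change in mean (squared-error segment cost)."""
    n = len(series)
    # Prefix sums let us evaluate any segment cost in constant time.
    s1, s2 = [0.0], [0.0]
    for x in series:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(start, end):
        """Squared error of series[start:end] around its own mean."""
        seg_sum = s1[end] - s1[start]
        return (s2[end] - s2[start]) - seg_sum * seg_sum / (end - start)

    best = [0.0] * (n + 1)   # best[t]: optimal penalised cost of series[:t]
    best[0] = -penalty       # so that the first segment carries no penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment of series[:t]
    candidates = [0]         # starts of the final segment still worth testing
    for t in range(1, n + 1):
        scores = [best[tau] + cost(tau, t) + penalty for tau in candidates]
        best[t] = min(scores)
        last[t] = candidates[scores.index(best[t])]
        # Pruning step: drop starts that can never be optimal again.
        candidates = [tau for tau, score in zip(candidates, scores)
                      if score - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts to recover the change-points.
    cpts = []
    t = n
    while last[t] > 0:
        t = last[t]
        cpts.append(t)
    return sorted(cpts)


list_cpts = pelt_mean([0.0] * 50 + [10.0] * 50, penalty=5.0)
nb_cpts = len(list_cpts)
```

The pruning line is what distinguishes PELT from plain optimal partitioning: without it the result is the same, but every earlier position is re-tested at each step.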

NP PELT

Inputs:

  • data: data as pandas dataframe,
  • penalty_value: double, penalty value (BIC penalty),
  • min_segment_size: int, minimum segment size,
  • nquantiles: int, number of quantiles,
  • method: str, cost function (optional, only one implemented).

Outputs:

  • list_cpts: array of the positions of the change-points,
  • nb_cpts: number of change-points.

Usage:

from pychangepoints import algo_changepoints

penalty_value = 5
min_segment_size = 20
model = 'mbic_nonparametric_ed'
nquantiles = 10
list_cpts, nb_cpts = algo_changepoints.np_pelt(data, penalty_value, min_segment_size, nquantiles, method = model)
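
The nquantiles input sets how many points of the empirical distribution are compared between segments. The sketch below is a simplified illustration of that idea, not the cost implemented by this package or by changepoint.np: it evaluates the empirical CDF of a segment at a few thresholds and sums the corresponding binomial log-likelihoods. The function names, the evenly spaced quantile placement, and the uniform weighting are all assumptions made for the example.

```python
import math


def quantile_thresholds(series, n_quantiles):
    """Pick n_quantiles evenly spaced empirical quantiles of the whole series."""
    ordered = sorted(series)
    n = len(ordered)
    return [ordered[min(n - 1, int(n * k / (n_quantiles + 1)))]
            for k in range(1, n_quantiles + 1)]


def ed_cost(segment, thresholds):
    """Simplified empirical-distribution cost of one segment."""
    m = len(segment)
    total = 0.0
    for thr in thresholds:
        # Fraction of the segment at or below the threshold (empirical CDF).
        f = sum(1 for x in segment if x <= thr) / m
        if 0.0 < f < 1.0:
            total += m * (f * math.log(f) + (1.0 - f) * math.log(1.0 - f))
    return -2.0 * total / len(thresholds)


low = list(range(20))          # values 0..19
high = list(range(100, 120))   # values 100..119
thresholds = quantile_thresholds(low + high, 4)
cost_whole = ed_cost(low + high, thresholds)
cost_split = ed_cost(low, thresholds) + ed_cost(high, thresholds)
```

Because the two halves have different distributions, cost_split is smaller than cost_whole, which is the signal a nonparametric search relies on.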

Binary Segmentation

Inputs:

  • data: pandas dataframe,
  • Q: int, number of changepoints,
  • min_segment_size: int, minimum segment size,
  • model: str, statistical model (see cost_function.pxd).

Output:

  • list_cpts: array of the positions of the change-points.

Usage:

from pychangepoints import algo_changepoints
Q = 5
min_segment_size = 20
model = 'mbic_mean'
list_cpts = algo_changepoints.binseg(data, Q, min_segment_size, model)

For the multivariate case, call binseg_multiple:

from pychangepoints import algo_changepoints
Q = 5
min_segment_size = 20
model = 'mbic_mean'
list_cpts = algo_changepoints.binseg_multiple(data, Q, min_segment_size, model)
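
As a hedged illustration of the idea behind binary segmentation, the pure-Python sketch below repeatedly adds the single split that reduces a squared-error cost the most, until Q change-points have been found or no split helps. The name binseg_mean is hypothetical, the cost is limited to a change in mean, and the package's own stopping rule and indexing may differ.

```python
def binseg_mean(series, n_cpts, min_size=1):
    """Sketch of greedy binary segmentation for a change in mean."""
    n = len(series)
    # Prefix sums let us evaluate any segment cost in constant time.
    s1, s2 = [0.0], [0.0]
    for x in series:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(start, end):
        """Squared error of series[start:end] around its own mean."""
        seg_sum = s1[end] - s1[start]
        return (s2[end] - s2[start]) - seg_sum * seg_sum / (end - start)

    boundaries = [0, n]  # current segment edges, always kept sorted
    cpts = []
    for _ in range(n_cpts):
        best_gain, best_split = 0.0, None
        # Look for the best single split inside every current segment.
        for start, end in zip(boundaries, boundaries[1:]):
            whole = cost(start, end)
            for split in range(start + min_size, end - min_size + 1):
                gain = whole - cost(start, split) - cost(split, end)
                if gain > best_gain:
                    best_gain, best_split = gain, split
        if best_split is None:
            break  # no split reduces the cost any further
        boundaries = sorted(boundaries + [best_split])
        cpts.append(best_split)
    return sorted(cpts)


list_cpts = binseg_mean([0.0] * 30 + [8.0] * 30 + [0.0] * 30, n_cpts=2)
```

Binary segmentation is greedy: an early split is never revisited, so the result is not guaranteed to be the best set of Q change-points.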

Segmentation neighborhood

Inputs:

  • data: pandas dataframe,
  • Q: int, number of changepoints,
  • model: str, statistical model (see cost_function.pxd).

Output:

  • list_cpts: array of the positions of the change-points.

Usage:

from pychangepoints import algo_changepoints
Q = 5
model = 'mbic_mean'
list_cpts = algo_changepoints.segneigh(data, Q, model)

For the multivariate case, call segneigh_multiple:

from pychangepoints import algo_changepoints
Q = 5
model = 'mbic_mean'
list_cpts = algo_changepoints.segneigh_multiple(data, Q, model)
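
Unlike the greedy search of binary segmentation, segment neighborhood searches exactly, by dynamic programming, for the best placement of a given number of change-points. The sketch below shows that recursion in pure Python for a change in mean with a squared-error cost. The name segneigh_mean is hypothetical, the sketch always returns exactly Q change-points, and its cost is quadratic in the series length; the package's behaviour and indexing may differ.

```python
import math


def segneigh_mean(series, n_cpts):
    """Sketch of segment neighborhood search for a change in mean."""
    n = len(series)
    n_segments = n_cpts + 1
    if n_segments > n:
        raise ValueError("more segments requested than samples")
    # Prefix sums let us evaluate any segment cost in constant time.
    s1, s2 = [0.0], [0.0]
    for x in series:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(start, end):
        """Squared error of series[start:end] around its own mean."""
        seg_sum = s1[end] - s1[start]
        return (s2[end] - s2[start]) - seg_sum * seg_sum / (end - start)

    # table[k][t]: lowest cost of splitting series[:t] into k segments.
    table = [[math.inf] * (n + 1) for _ in range(n_segments + 1)]
    prev = [[0] * (n + 1) for _ in range(n_segments + 1)]
    for t in range(1, n + 1):
        table[1][t] = cost(0, t)
    for k in range(2, n_segments + 1):
        for t in range(k, n + 1):
            for tau in range(k - 1, t):
                value = table[k - 1][tau] + cost(tau, t)
                if value < table[k][t]:
                    table[k][t] = value
                    prev[k][t] = tau

    # Backtrack from the full series to recover the segment starts.
    cpts = []
    t = n
    for k in range(n_segments, 1, -1):
        t = prev[k][t]
        cpts.append(t)
    return sorted(cpts)


list_cpts = segneigh_mean([0.0] * 30 + [8.0] * 30 + [0.0] * 30, n_cpts=2)
```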
