Dividing data into linear segments.
nunchaku: Dividing data into linear segments
nunchaku is a Python module for dividing data into linear segments.
It answers two questions:
- how many linear segments best fit the data without overfitting (by Bayesian model comparison);
- given the number of linear segments, where the boundaries between them are (by finding the posterior of the boundaries).
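To make the two questions concrete, here is a rough sketch that does not use nunchaku and is not its algorithm. It swaps in a much cruder technique: least-squares line fits, a brute-force search over a single boundary, and the Bayesian information criterion (BIC) as a large-sample stand-in for Bayesian model comparison. The helper names (`line_rss`, `bic`, `best_single_boundary`) and the synthetic data are assumptions made up for illustration.

```python
import numpy as np

def line_rss(x, y):
    """Residual sum of squares of a least-squares straight line through (x, y)."""
    coeffs = np.polyfit(x, y, 1)
    resid = y - np.polyval(coeffs, x)
    return float(np.sum(resid ** 2))

def bic(rss, n, k):
    """BIC for Gaussian noise with k free parameters (lower is better)."""
    return n * np.log(rss / n) + k * np.log(n)

def best_single_boundary(x, y, min_len=3):
    """Try every split index; return (index, total RSS) of the best two-segment fit."""
    best = None
    for i in range(min_len, len(x) - min_len + 1):
        rss = line_rss(x[:i], y[:i]) + line_rss(x[i:], y[i:])
        if best is None or rss < best[1]:
            best = (i, rss)
    return best

# Hypothetical data: a line rising to x = 5, then falling, plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 101)
y = np.where(x < 5.0, x, 10.0 - x) + rng.normal(0.0, 0.1, size=x.size)

n = x.size
rss_one = line_rss(x, y)                       # one segment
split, rss_two = best_single_boundary(x, y)    # two segments

bic_one = bic(rss_one, n, k=2)   # slope and intercept
bic_two = bic(rss_two, n, k=5)   # two slopes, two intercepts, one boundary

print("preferred number of segments:", 1 if bic_one < bic_two else 2)
print("estimated boundary near x =", x[split])
```

Unlike this sketch, which returns a single best split, nunchaku reports a posterior over the boundaries, so it also gives an uncertainty for each boundary position.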
Installation
For users, type in a terminal:
> pip install nunchaku
For developers, create a virtual environment and then install with Poetry:
> git clone https://git.ecdf.ed.ac.uk/s1856140/nunchaku.git
> cd nunchaku
> poetry install --with dev
Quickstart
The data x is a list or a 1D NumPy array, sorted in ascending order; the data y is a list or a NumPy array, with each row being one replicate of the measurement.
>>> from nunchaku import Nunchaku, get_example_data
>>> x, y = get_example_data()
>>> nc = Nunchaku(x, y, prior=[-5,5]) # load data and set prior of gradient
>>> # compare models with 1, 2, 3 and 4 linear segments
>>> numseg, evidences = nc.get_number(max_num=4)
>>> # get the mean and standard deviation of the boundary points
>>> bds, bds_std = nc.get_iboundaries(numseg)
>>> # get the information of all segments
>>> info_df = nc.get_info(bds)
>>> # plot the data and the segments
>>> nc.plot(info_df)
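To try the same workflow on data other than the bundled example, inputs in the shape described above could be built as follows. This is a sketch using only NumPy; the signal, noise level, and number of replicates are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# x: 1D and already sorted in ascending order
x = np.linspace(0.0, 10.0, 50)

# Hypothetical underlying signal: slope 2 up to x = 4, then flat
y_mean = np.where(x < 4.0, 2.0 * x, 8.0)

# y: one row per replicate, one column per x value
n_replicates = 3
y = y_mean + rng.normal(0.0, 0.2, size=(n_replicates, x.size))

print(x.shape, y.shape)
```

Arrays like these could then be passed to the constructor in place of the output of `get_example_data()`, with a gradient prior wide enough to cover the expected slopes.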
Documentation
Detailed documentation is available on Read the Docs.
Citation
A preprint is coming soon.