
Utilities to de-noise time series from random telegraph noise / peak splitting artefacts


py_peak_splitting

Description

This package contains an implementation of the random telegraph noise (RTN) removal tools outlined in:

Multi-level RTN removal tools for dynamic FBG strain measurements corrupted by peak splitting artefacts

The aim of this project is to find a pragmatic but effective way to remove jumps which offset the signal over a certain amount of time, by a certain amplitude. The article presents two methods which both follow the same principle:

The problem of denoising a signal containing jumps is translated into detecting and replacing outliers in the corresponding difference signal; once the outlier detection and replacement has been done, the cumulative sum of the de-noised difference signal is computed to arrive back at a de-noised version of the original signal.
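The principle can be illustrated with a few lines of NumPy. This is only a minimal sketch, not the package's implementation: the function name `remove_jumps`, the fixed threshold and the median replacement are assumptions chosen for brevity, whereas the methods in the article use more careful detection and replacement steps.

```python
import numpy as np

def remove_jumps(signal, threshold):
    """Toy jump removal: difference, replace outliers, cumulative sum."""
    signal = np.asarray(signal, dtype=float)
    diff = np.diff(signal)                    # sample-wise difference signal
    outliers = np.abs(diff) > threshold       # naive fixed-threshold detection
    cleaned = diff.copy()
    # Crude replacement (assumption): median of the nominal difference samples
    cleaned[outliers] = np.median(diff[~outliers])
    # The cumulative sum leads back to a signal; the first sample fixes the offset
    return np.concatenate(([signal[0]], signal[0] + np.cumsum(cleaned)))

# One period of a sine wave with a level shift between samples 400 and 600
t = np.linspace(0.0, 1.0, 1000)
clean = np.sin(2 * np.pi * t)
noisy = clean.copy()
noisy[400:600] += 5.0
restored = remove_jumps(noisy, threshold=1.0)
```

Each jump shows up as a single large sample in the difference signal, so replacing just those samples removes the whole offset interval after the cumulative sum.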

A typical case of measurement data containing jumps can be seen in Figure 1, which shows a corrupted signal before (black) and after noise removal (blue).

Figure 1. Left: Raw signal before (black) and after removing RTN/peak-splitting artefacts (blue). Right: Zoom-in on a smaller data subset, together with the corresponding difference samples, annotated as nominal samples (blue), outliers (red) and outlier replacements (green). Note that the mean value of the de-noised signal has not been set to a value that represents a realistic mean of the raw signal.

For more detailed information on the two algorithms contained in this package, please refer to the accompanying publication, which can be read here.

Background and motivation

Strain measurements using fiber Bragg grating (FBG) optical sensors are becoming more and more popular. However, in some cases these measurements can become corrupted by sudden jumps in the signal, which offset the signal over the course of e.g. several seconds. These jumps are caused by a defect in the FBG itself, which is referred to as peak splitting. The effects of peak-splitting artefacts on FBG strain measurements bear similarities with an additive multi-level telegraph noise process, in which the amplitudes and occurrences of the jumps are related to fiber deformation states.

Jumps in measurement data, as shown in Figure 2, can severely limit the quality or usability of the data. Figure 2 shows two raw signals with different jump amplitudes relative to the nominal difference between subsequent samples; the blue annotations indicate jumps that can be clearly separated from nominal samples, whereas the red annotations illustrate the case in which jump levels are not as straightforward to detect. This type of noise cannot be removed by classical linear filters (e.g. an n-th order Butterworth filter). The presence of jumps in the data motivates the need for a pre-processing tool that effectively removes RTN.

Figure 2. Multi level telegraph noise / peak splitting artefacts. Left: two raw signals containing artefacts of different severity. Upper right: large jump amplitudes in comparison with differences between subsequent signal samples. Lower right: jump amplitudes in same order of magnitude as difference samples.

The principal idea presented in the accompanying publication is to translate the de-noising of the raw signal containing jumps into the problem of de-noising a sample-wise difference signal containing outliers. The latter is shown in Figure 3, which also shows the corresponding histograms hinting at the distribution of the nominal and outlying difference samples. The blue and red annotations indicate the outlying difference samples corresponding to relatively large and small jumps respectively.

Figure 3. Sample-wise difference signals corresponding to the signals presented in Figure 2. The annotations highlight outlying difference samples which are attributed to RTN / peak-splitting artefacts.

In order to de-noise the difference signal, an outlier detection scheme will be used to label difference samples corresponding to a jump in the data as an outlier. After the outlier detection step, all outliers will be replaced by estimates. Once the de-noising process is completed, the cumulative sum is used to obtain a realization of the de-noised raw signal. If needed, the reconstructed signal can be corrected for a drift as well as a mean value, by maximizing the overlap between the raw and reconstructed signals.
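The drift and mean correction can be approximated in several ways. The sketch below is one simple possibility and not necessarily what the package does: it assumes the mismatch between the raw and reconstructed signals is a straight line and estimates it by an ordinary least-squares fit. The function name `correct_drift_and_mean` is hypothetical, and jumps that remain in the raw signal will bias such a fit to some degree.

```python
import numpy as np

def correct_drift_and_mean(raw, rec):
    """Add a least-squares linear trend to `rec` so that it overlaps `raw`."""
    raw = np.asarray(raw, dtype=float)
    rec = np.asarray(rec, dtype=float)
    n = np.arange(rec.size)
    # Fit residual = slope * n + intercept (drift and mean offset in one step)
    slope, intercept = np.polyfit(n, raw - rec, deg=1)
    return rec + slope * n + intercept

# Demonstration on a reconstruction that drifted away from the raw signal
n = np.arange(2000)
raw_demo = np.sin(2 * np.pi * n / 500)
rec_demo = raw_demo - (0.001 * n + 2.0)   # artificial drift and offset
corrected = correct_drift_and_mean(raw_demo, rec_demo)
```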

As mentioned, the accompanying article presents two methods which follow the outlined principle. Although sharing the underlying principle, they are different in the way they try to discriminate between nominal samples and outliers, as well as in sample subset selection for the replacement of individual outliers. The first method is based on using a threshold filter for outlier detection; the high level working principle of the method is described as follows:

  • Compute the sample wise difference signal
  • Derive an outlier threshold level from the histogram of difference samples
  • Replace outliers in difference signal using a linear regression on an outlier specific buffer (which does not contain outliers).
  • Take the cumulative sum
  • If necessary, apply drift and mean correction
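The steps above can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: all function names are hypothetical, and the threshold rule (the first empty bin of the histogram of absolute differences) is a stand-in for the histogram-based estimate described in the article. The buffer is taken as the `nbuffer` nominal samples closest to each outlier.

```python
import numpy as np

def threshold_from_histogram(diff, bins=100):
    """Assumed rule: threshold at the first empty bin of the |diff| histogram."""
    counts, edges = np.histogram(np.abs(diff), bins=bins)
    empty = np.flatnonzero(counts == 0)
    # No gap in the histogram: treat every sample as nominal
    return edges[empty[0]] if empty.size else np.inf

def replace_by_regression(diff, outliers, nbuffer=25):
    """Replace each outlier by a line fitted to nearby nominal samples."""
    cleaned = diff.copy()
    nominal = np.flatnonzero(~outliers)
    for i in np.flatnonzero(outliers):
        # Outlier-specific buffer: the nbuffer closest samples that are not outliers
        nearest = nominal[np.argsort(np.abs(nominal - i))[:nbuffer]]
        slope, intercept = np.polyfit(nearest, diff[nearest], deg=1)
        cleaned[i] = slope * i + intercept
    return cleaned

def threshold_reconstruct(signal, nbuffer=25):
    signal = np.asarray(signal, dtype=float)
    diff = np.diff(signal)                                    # difference signal
    outliers = np.abs(diff) > threshold_from_histogram(diff)  # detection
    cleaned = replace_by_regression(diff, outliers, nbuffer)  # replacement
    return np.concatenate(([signal[0]], signal[0] + np.cumsum(cleaned)))

# Sine wave with a level shift between samples 400 and 600
t = np.linspace(0.0, 1.0, 1000)
clean = np.sin(2 * np.pi * t)
noisy = clean.copy()
noisy[400:600] += 5.0
restored = threshold_reconstruct(noisy)
```

A single global threshold works well when jump amplitudes are clearly larger than the nominal differences; the drift and mean correction step is left out of this sketch.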

As an alternative to defining a single threshold based on the full signal, the second method defines threshold levels on shorter segments of the signal. The outlier detection step is based on a Hampel filter. The high-level working principle is outlined as follows:

  • Compute the sample wise difference signal
  • Segment the data into short segments and compute segment wise threshold levels based on a scale estimate for the segment's standard deviation.
  • Perform outlier detection on individual segments
  • Replace outliers in segments using a linear regression (after removing outliers from segment).
  • Take the cumulative sum
  • If necessary, apply drift and mean correction
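A segment-wise variant of the same idea might look like the sketch below. It is not the package's implementation: the function name `hampel_reconstruct` is hypothetical, and the scale estimate is assumed to be the usual Hampel choice of 1.4826 times the median absolute deviation (MAD), which approximates the standard deviation for Gaussian data.

```python
import numpy as np

def hampel_reconstruct(signal, nperseg=50, n_sigma=3.0):
    signal = np.asarray(signal, dtype=float)
    diff = np.diff(signal)                       # sample-wise difference signal
    cleaned = diff.copy()
    for start in range(0, diff.size, nperseg):   # loop over short segments
        seg = diff[start:start + nperseg]
        idx = np.arange(start, start + seg.size)
        med = np.median(seg)
        # Robust scale estimate for the segment's standard deviation
        sigma = 1.4826 * np.median(np.abs(seg - med))
        outliers = np.abs(seg - med) > n_sigma * sigma
        # Skip segments without outliers or with too few nominal samples to fit
        if not outliers.any() or np.count_nonzero(~outliers) < 2:
            continue
        # Linear regression on the segment after removing its outliers
        slope, intercept = np.polyfit(idx[~outliers], seg[~outliers], deg=1)
        cleaned[idx[outliers]] = slope * idx[outliers] + intercept
    return np.concatenate(([signal[0]], signal[0] + np.cumsum(cleaned)))

# Sine wave with a level shift between samples 400 and 600
t = np.linspace(0.0, 1.0, 1000)
clean = np.sin(2 * np.pi * t)
noisy = clean.copy()
noisy[400:600] += 5.0
restored = hampel_reconstruct(noisy, nperseg=50, n_sigma=3.0)
```

Because the threshold adapts to each segment, this approach can pick up jumps whose amplitude is closer to the nominal differences, at the cost of occasionally flagging nominal samples in very quiet segments.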

Basic use of both methods is outlined in the Basic examples section below. A more elaborate demonstration can be found in the Jupyter notebooks contained in ./py_peak_splitting/demo/ or in the Binder, which can be found here.

Installation requirements

py_peak_splitting is written for Python 3.6+. Downloading Anaconda3 is recommended for users not familiar with Python development. The numpy, scipy and matplotlib packages are required to use all functionalities of the threshold filter based reconstruction method, which includes an option to visualise threshold levels and overlay a kernel density estimate. The Hampel based reconstruction method only requires the numpy package.

The py_peak_splitting package can be installed using one of the methods listed below:

Install using Pip (Recommended)

  • Open a terminal or command line interface (or an anaconda prompt, if you're using anaconda)
  • run: pip install py_peak_splitting
  • The download and install should start

Manual install

  • Download or clone the package to a desired location on your machine
  • Extract the zip file at that particular location
  • Open a terminal or command line interface (or an anaconda prompt, if you are using anaconda)
  • Set the current directory of the terminal/command line interface to the package's root: PATH_TO_PACKAGE_LOCATION/py_peak_splitting
  • run: python setup.py install
  • The installation should start

Basic examples

This section contains the most basic usage examples. More elaborate examples are presented in a Binder which can be found here. For this section a dummy variable, a, holding a 1-dimensional numpy array with corrupted measurements will be used:

type(a) 
>> numpy.ndarray

a.shape
>> (60000,)
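If no measurement data is at hand, an array with the same type and shape can be generated synthetically. The snippet below is only a sketch for experimentation: the signal shape, noise level, number of switching instants and offset levels are arbitrary assumptions and do not represent real FBG data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 60000
t = np.arange(n)

# Smooth "strain" signal plus a small amount of measurement noise
clean = np.sin(2 * np.pi * t / 5000) + 0.01 * rng.standard_normal(n)

# Telegraph-like offsets: the level changes at randomly chosen sample indices
switches = np.sort(rng.choice(n, size=40, replace=False))
levels = rng.choice([0.0, 2.0, 5.0], size=switches.size + 1)

# Each sample gets the level of the interval between switches it falls into
a = clean + levels[np.searchsorted(switches, t, side="right")]
```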

Basic example of the Hampel filter based reconstruction method:

from py_peak_splitting.hampel_filter import Hampel as Ha

h = Ha(a, th='sig', th_val=3, re='plf', nperseg=50)
rec = h.reconstruction
clu = h.clusters
lab = h.labels

Basic example of the threshold filter based reconstruction method:

from py_peak_splitting.th_filter import ThFilter as Th

th = Th(arr=a)  
thv = th.estimate_th(aggressive=True, th=500, plot=False)
rec, clu, lab = th.recon_th(th=thv, method='polyfit', nbuffer=25)
rec = rec.flatten()

License

py_peak_splitting is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please check the license terms below and in the license file.

Creative Commons License

Disclaimer

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
