maximum entropy spectral analysis

Authors: Alessandro Martini, Stefano Schmidt, Walter del Pozzo

Emails: martini.alessandr@gmail.com, stefanoschmidt1995@gmail.com, walter.delpozzo@ligo.org

Copyright: Copyright (C) 2024 Alessandro Martini

Licence: CC BY 4.0

Version: 1.3.0

MAXIMUM ENTROPY SPECTRAL ANALYSIS FOR ACCURATE PSD COMPUTATION

memspectrum is a package for computing the power spectral density (PSD) of time series. It implements a fast numpy version of the Burg method for Maximum Entropy Spectral Analysis. The method is fast and reliable and shows better performance than other standard methods.

The method is based on the maximum entropy principle, which makes it possible to keep assumptions about unavailable information to a minimum. Furthermore, it provides a beautiful link between spectral analysis and the theory of autoregressive processes.

The PSD is expressed in terms of a set of coefficients a_k plus an overall scale factor P. The a_k are computed recursively with the Levinson recursion, which also returns P. Knowing these coefficients lets one describe the observed time series as an autoregressive process of order p, AR(p), where p + 1 is the length of the a_k array. The a_k are the autoregressive coefficients, and P can be interpreted as the variance of the white noise component of the process. The a_k turn out to be the "best linear predictor" for the time series under study: computing them with this method is equivalent to a least-squares fit of an AR(p) process to the data. The description is stationary by construction. Once the link with an AR(p) process is established, high-quality forecasting of the time series is straightforward.
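To make the recursion concrete, here is a minimal textbook-style sketch of Burg's method in plain numpy. It is not the memspectrum implementation: the function name burg_sketch is hypothetical, the order p is fixed by hand, the data are assumed to have zero mean, and the sign convention x_t + a_1 x_{t-1} + ... + a_p x_{t-p} = noise (with a_0 = 1) is an assumption that may differ from the one used by the package.

```python
import numpy as np

def burg_sketch(x, p):
    """Illustrative Burg recursion (hypothetical helper, not memspectrum's code).

    Returns (P, a_k) where a_k has length p + 1 with a_k[0] = 1 and
    P is the estimated variance of the white noise driving the AR(p) process.
    """
    x = np.asarray(x, dtype=float)
    a = np.array([1.0])
    P = np.mean(x ** 2)          # zeroth-order error power (zero-mean data assumed)
    f = x[1:].copy()             # forward prediction errors
    b = x[:-1].copy()            # backward prediction errors, delayed by one sample
    for _ in range(p):
        # Reflection coefficient minimising forward + backward error power
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        # Levinson update of the coefficients
        a_ext = np.concatenate([a, [0.0]])
        a = a_ext + k * a_ext[::-1]
        P *= 1.0 - k * k
        # Update the prediction errors and realign them for the next order
        f, b = (f + k * b)[1:], (b + k * f)[:-1]
    return P, a
```

For data generated by x_t = 0.9 x_{t-1} + noise, calling burg_sketch(x, 1) should return a second coefficient close to -0.9 under this sign convention.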

Installation & documentation

To install the package:

pip install memspectrum

It requires numpy and scipy.

Useful links:

On this repository, you can find a number of examples:

  • gwstrain.py: computes the PSD of a stretch of gravitational-wave data and performs some forecasting
  • sunspots.py: uses memspectrum to find an autoregressive process that describes sunspot data and to forecast it
  • sound_MESA.py: given an input audio (wav) file reproducing the sound of a waterfall, computes the PSD and generates synthetic noise resembling the original
  • generate_white_noise.py: samples white (Gaussian) noise from the power spectral density of Advanced LIGO
  • doc_examples.py: gathers all the pieces of code used throughout the documentation, so that you can run them all at once

For more advanced use, you can use the code help functionalities:

import memspectrum
help(memspectrum)
help(memspectrum.<function_name>)

Usage of memspectrum

To compute the PSD, the following steps are required:

  • Import the data
  • Import memspectrum and create an instance of MESA class:
from memspectrum import MESA
m = MESA()
  • Compute the autoregressive coefficients via the solve() method (required for further computations)
m.solve(data)
  • At this point you can compute the spectrum and forecast N future observations
spec, frequencies = m.spectrum(dt)
predicted_data = m.forecast(data, N)

Example

To compute (and plot) the spectrum of a (noisy) sinusoidal signal:

from memspectrum import MESA 
import numpy as np
import matplotlib.pyplot as plt

Generating the data:

N, dt = 1000, .01  # Number of samples and sampling interval
time = np.arange(0, N) * dt
frequency = 2  # Frequency of the sinusoid
# Sinusoid plus Gaussian noise (mean 0.4, unit standard deviation)
data = np.sin(2 * np.pi * frequency * time) + np.random.normal(.4, size = N)
plt.plot(time, data, color = 'k')

[Figure: data]

Solving MESA is needed to compute PSD or forecast.

M = MESA() 
M.solve(data) 

The spectrum can be computed on the sampling frequencies (generated automatically) or on a user-supplied frequency grid:

spectrum, frequencies = M.spectrum(dt)  #Computes on sampling frequencies 
user_frequencies = np.linspace(1.5, 2.5)
user_spectrum = M.spectrum(dt, user_frequencies) #Computes on desired frequency grid

The two spectra look like this:

[Figure: spectra]
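As an illustration of how an autoregressive PSD can be evaluated on any frequency grid, the sketch below computes P dt / |sum_k a_k exp(-2 pi i f k dt)|^2 directly. The function name ar_psd_sketch is hypothetical, and both the sign convention (a_0 = 1) and the two-sided normalisation are assumptions; check them against the output of MESA.spectrum before relying on them.

```python
import numpy as np

def ar_psd_sketch(a_k, P, dt, frequencies):
    """Evaluate the PSD of an AR(p) process on an arbitrary frequency grid.

    Hypothetical helper: a_k is assumed to start with a_k[0] = 1, P is the
    white noise variance and dt the sampling interval.
    """
    a_k = np.asarray(a_k, dtype=float)
    frequencies = np.asarray(frequencies, dtype=float)
    k = np.arange(a_k.size)
    # One row per frequency, one column per coefficient
    phases = np.exp(-2j * np.pi * dt * np.outer(frequencies, k))
    denominator = np.abs(phases @ a_k) ** 2
    return P * dt / denominator
```

Because the expression is analytic in the frequency, the grid does not need to match the sampling frequencies of the data; a dense grid around a spectral line, such as np.linspace(1.5, 2.5), works just as well.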

Forecasting

MESA can also be used to forecast future observations of a time series. For example, we take the first 900 points of the data and try to infer the upcoming signal: 1000 simulations of 100 points each are performed. The real observed data are then compared with the median estimate and the 90% credibility region:

M = MESA() 
M.solve(data[:-100]) 
forecast = M.forecast(data[:-100], length = 100, number_of_simulations = 1000, include_data = False) 
median = np.median(forecast, axis = 0) #Ensemble median 
p5, p95 = np.percentile(forecast, (5, 95), axis = 0) #90% credibility boundaries
	
plt.plot(time[:-100], data[:-100], color = 'k')
plt.fill_between(time[-100:], p5, p95, color = 'b', alpha = .5, label = '90% Cr.') 
plt.plot(time[-100:], data[-100:], color = 'k', linestyle = '-.', label = 'Observed data') 
plt.plot(time[-100:], median, color = 'r', label = 'median estimate') 

The forecast result is:

[Figure: forecast]
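The idea behind an ensemble forecast can be sketched with numpy alone: each simulation extends the series one sample at a time by applying the linear predictor to the last p values and adding a fresh Gaussian draw with variance P. The function ar_forecast_sketch and its arguments are hypothetical, and the convention x_t = -(a_1 x_{t-1} + ... + a_p x_{t-p}) + noise is an assumption; this is not necessarily how MESA.forecast is implemented.

```python
import numpy as np

def ar_forecast_sketch(data, a_k, P, length, number_of_simulations=1, seed=None):
    """Simulate future samples of an AR(p) process (hypothetical helper).

    Returns an array of shape (number_of_simulations, length).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    a_k = np.asarray(a_k, dtype=float)
    p = a_k.size - 1
    out = np.empty((number_of_simulations, p + length))
    out[:, :p] = data[data.size - p:]            # seed every simulation with the last p points
    noise = rng.normal(0.0, np.sqrt(P), size=(number_of_simulations, length))
    for t in range(p, p + length):
        past = out[:, t - p:t][:, ::-1]          # x_{t-1}, ..., x_{t-p}
        out[:, t] = -past @ a_k[1:] + noise[:, t - p]
    return out[:, p:]
```

The returned array has one row per simulation, so the ensemble median and percentiles can be taken along axis 0, as in the example above.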

Whitening

The autoregressive coefficients come in very handy for whitening the data: it is just a convolution between the data and the coefficients. This is implemented in the function MESA.whiten:

white_data = M.whiten(data, trim = None)
plt.plot(time[M.get_p():-M.get_p()], white_data, color = 'k')

You can tune how edge effects are removed by setting the trim option (if you set it to None, p points are removed from the time series). Here is what the whitened data look like:

[Figure: white_data]
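The convolution itself is short enough to sketch with numpy. The helper whiten_sketch below is hypothetical; dividing by sqrt(P) to obtain unit variance and using mode="valid" to drop the edge samples are assumptions, and they need not match the normalisation or the trimming applied by MESA.whiten.

```python
import numpy as np

def whiten_sketch(data, a_k, P):
    """Whiten data by convolving it with the AR coefficients (hypothetical helper).

    a_k is assumed to start with a_k[0] = 1. The output has len(data) - p
    samples, because only fully overlapping positions are kept.
    """
    data = np.asarray(data, dtype=float)
    a_k = np.asarray(a_k, dtype=float)
    # Each output sample is x_t + a_1 x_{t-1} + ... + a_p x_{t-p}
    residuals = np.convolve(data, a_k, mode="valid")
    return residuals / np.sqrt(P)
```

If the data really follow the AR(p) process described by a_k and P, the output is the sequence of white noise innovations rescaled to unit variance.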

Generating data from PSD

The module memspectrum.GenerateTimeSeries provides a function that constructs a time series with a user-given power spectral density. It can be called as

from memspectrum.GenerateTimeSeries import generate_data
# f, psd: whatever frequency array and PSD you like
time, time_series, frequency, frequency_series, psd = generate_data(f, psd, T, sampling_rate)

where T is the duration of the observation and sampling_rate is the inverse of the sampling interval.
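One common way to build such a time series is to draw random Fourier coefficients whose variance follows the PSD and transform them back to the time domain. The sketch below illustrates that technique only; the function noise_from_psd_sketch is hypothetical, it takes the PSD as a callable rather than as arrays, it assumes a two-sided PSD (the variance is the integral of the PSD between minus and plus the Nyquist frequency), and it may differ from what generate_data does.

```python
import numpy as np

def noise_from_psd_sketch(psd_func, T, sampling_rate, seed=None):
    """Draw Gaussian noise with a given two-sided PSD (hypothetical helper).

    psd_func maps an array of frequencies to PSD values.
    Returns (times, time_series).
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / sampling_rate
    N = int(round(T * sampling_rate))
    freqs = np.fft.rfftfreq(N, d=dt)
    S = np.asarray(psd_func(freqs), dtype=float)
    # For a stationary process, E|X_k|^2 = N * S(f_k) / dt with numpy's FFT convention
    sigma = np.sqrt(N * S / (2.0 * dt))
    X = sigma * (rng.normal(size=freqs.size) + 1j * rng.normal(size=freqs.size))
    # The zero-frequency and Nyquist bins must be real for a real time series
    X[0] = np.sqrt(2.0) * X[0].real
    if N % 2 == 0:
        X[-1] = np.sqrt(2.0) * X[-1].real
    times = np.arange(N) * dt
    return times, np.fft.irfft(X, n=N)
```

With a flat PSD equal to dt, the output should be white noise with a variance close to one, which is a quick sanity check on the normalisation.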

References

About

This project is the master's thesis of Alessandro Martini at the University of Pisa. A paper has been published on arXiv and is currently under peer review.

If you feel that you need to know more about the code, or you just want to say hi, feel free to contact one of the authors.
