
TimeSeries Extensions for SGN Framework


SGN-TS (SGN TimeSeries)

SGN-TS is a set of extensions to the core library sgn that adds functionality specific to time series analysis. This page documents the sgnts package, but there is a family of libraries that extend the functionality of SGN in other ways, including:

  • sgn: Base library for SGN
  • sgn-ligo: LIGO-specific utilities for SGN

Installation

To install SGN-TS, simply run:

pip install sgn-ts

More SGN-TS-specific documentation coming soon.

Developer's guide

Before reading this guide you should carefully read and understand the SGN developers guide.

The core motivation for SGN-TS (sgnts) is to build time series (TS) handling into SGN. This is appropriate for, e.g., signal processing applications. Nothing stops you from doing any of these things with SGN alone, but you will likely have to deal with some of the conceptual and technical hurdles that this library solves. That said, sgnts has many limitations, and you should understand them carefully in the context of your project. We are open to making changes that reach a wider audience, so please let us know your thoughts.

New concepts over SGN:

  • Data are now rigidly defined to be uniformly sampled time series. There is an expectation that elements will deal with data in a synchronous way.
  • Synchronization means that the continuity equation must be satisfied. Data cannot be produced at a higher rate in one source element than another, otherwise synchronous operations will be impossible without data "piling up" somewhere.
  • Time stamp bookkeeping accuracy is important. The library aims to keep single-sample timing accuracy even for applications that are designed to run uninterrupted for years. This requires a bit of rigidity in bookkeeping, but we try to hide as much as possible from the casual developer and user.

Buffers and Frames

The most important new class in sgnts is the TSFrame, which holds a list of SeriesBuffers.

Here we can get some familiarity with both of these objects and along the way, other classes and concepts relevant for sgnts.

>>> import numpy
>>> from sgnts.base.buffer import SeriesBuffer
>>> buf = SeriesBuffer(offset=0, sample_rate=2048, data=numpy.random.randn(2048))
>>> print (buf)
SeriesBuffer(offset=0, offset_end=16384, shape=(2048,), sample_rate=2048, duration=1000000000, data=[0.56649291 ... 1.39569688])

There is plenty to unpack here, so let's go step by step.

offset:

offset is globally meaningful throughout the application and acts as a precise surrogate for time, i.e., an absolute "time" reference for any element within an sgnts application that should not suffer from any rounding error. Technically offsets are defined as a cumulative number of samples passed defined at the maximum sample rate allowed by the application. This will be explained more below.

sample_rate:

sample_rate is the number of samples per second that a stretch of data contains. It is used to convert to actual time with nanosecond precision. In order to make certain guarantees about precision in sgnts, we currently only support power-of-2 sample rates from 1 Hz to a maximum, which defaults to 16384 Hz. The max sample rate and allowed rates are defined in offset.py.
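One way to see why an integer sample count makes a safer clock than accumulated nanoseconds: at 16384 Hz the sample period is not a whole number of nanoseconds. The sketch below is plain Python and does not use sgnts; the constant names are illustrative assumptions, not library identifiers, and the comparison is only an illustration of the rounding problem, not a description of the library's internals.

```python
from fractions import Fraction

NS_PER_SECOND = 10**9
MAX_RATE = 16384  # default maximum sample rate quoted above

# Exact sample period at the maximum rate: 61035.15625 ns, not an integer.
period_ns = Fraction(NS_PER_SECOND, MAX_RATE)

# One hour of samples at the maximum rate.
num_samples = MAX_RATE * 3600

# Naive clock: add a rounded integer period for every sample.
naive_ns = num_samples * round(period_ns)

# Count-based clock: keep the integer sample count and convert once.
exact_ns = num_samples * NS_PER_SECOND // MAX_RATE

drift_ns = naive_ns - exact_ns
print(float(period_ns))
print(drift_ns)
```

In this toy comparison the naive clock falls about 9.2 ms behind after one hour, while the count-based conversion lands exactly on 3600 seconds.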

data:

data is generally a numpy array that can be interpreted as (possibly multidimensional) time series data.

Now revisiting the above

>>> buf = SeriesBuffer(offset=0, sample_rate=2048, data=numpy.random.randn(2048))
>>> print (buf)
SeriesBuffer(offset=0, offset_end=16384, shape=(2048,), sample_rate=2048, duration=1000000000, data=[0.56649291 ... 1.39569688])

we see the following. The user specified data as a 2048-sample-long set of Gaussian-distributed random numbers. Since the sample_rate is also 2048 Hz, this is interpreted as 1 second of time series data. When printing the buffer you can see duration=1000000000, which is equal to 1e9 nanoseconds (time is stored as integer nanoseconds). You can see offset_end=16384, which indicates the number of samples that would be in this data if it were at the maximum sample rate. That is what an offset defines: a sample count assuming the maximum sample rate. It is critical for accurate internal bookkeeping. You also see shape=(2048,), which indicates a single-channel time series. Try the following for an example of multichannel audio:

>>> buf = SeriesBuffer(offset=0, sample_rate=2048, data=numpy.random.randn(2,2048))
>>> print (buf)
SeriesBuffer(offset=0, offset_end=16384, shape=(2, 2048), sample_rate=2048, duration=1000000000, data=[[ 0.01684876 ... -1.6963346 ]
 [-0.55875476 ...  0.58967178]])

Note what happens to the offset if you change the sample rate (and in this case also the data size):

>>> buf = SeriesBuffer(offset=0, sample_rate=1024, data=numpy.random.randn(2,1024))
>>> buf
SeriesBuffer(offset=0, offset_end=16384, shape=(2, 1024), sample_rate=1024, duration=1000000000, data=[[-0.13116052 ...  1.2223811 ]
 [-0.98786954 ... -0.56760618]])

The offset_end stays the same. Remember that the offset is the sample count at the theoretical maximum sample rate, which is defined in offset.py.
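The arithmetic behind the examples above can be sketched in plain Python. The helpers samples_to_offset and offset_to_ns are hypothetical names for illustration, not part of the sgnts API, and the floor division in the nanosecond conversion is an assumption about rounding.

```python
NS_PER_SECOND = 10**9
MAX_RATE = 16384  # assumed default maximum sample rate


def samples_to_offset(num_samples, sample_rate, max_rate=MAX_RATE):
    """Express a sample count at sample_rate as a count at max_rate."""
    if max_rate % sample_rate != 0:
        raise ValueError(f"{sample_rate} does not divide {max_rate}")
    return num_samples * (max_rate // sample_rate)


def offset_to_ns(offset, max_rate=MAX_RATE):
    """Convert an offset to integer nanoseconds (floor division assumed)."""
    return offset * NS_PER_SECOND // max_rate


# One second of data at 2048 Hz and at 1024 Hz spans the same offset range.
print(samples_to_offset(2048, 2048))
print(samples_to_offset(1024, 1024))
print(offset_to_ns(16384))
```

Both one-second buffers map to 16384 offset units, and 16384 offset units map to 1000000000 ns, which matches the offset_end and duration values printed above.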

Only power-of-two sample rates are allowed at present to ensure that bookkeeping remains simple and accurate.

>>> buf = SeriesBuffer(offset=0, sample_rate=1000, data=numpy.random.randn(2,1000))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 7, in __init__
  File "/Users/crh184/Library/Python/3.9/lib/python/site-packages/sgnts/base/buffer.py", line 38, in __post_init__
    raise ValueError("%s not in allowed rates %s" % (self.sample_rate, Offset.ALLOWED_RATES))
ValueError: 1000 not in allowed rates {32, 1, 2, 64, 4, 128, 256, 512, 8, 1024, 2048, 4096, 8192, 16, 16384}
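The set in the error message is every power of two from 1 to 16384. A minimal sketch of how such a set could be built and why it keeps offsets exact is shown below; allowed_rates is a hypothetical helper, not the function sgnts uses.

```python
def allowed_rates(max_rate=16384):
    """Return every power of two from 1 up to max_rate (hypothetical helper)."""
    if max_rate < 1 or max_rate & (max_rate - 1):
        raise ValueError(f"max_rate must be a power of two, got {max_rate}")
    return {1 << exponent for exponent in range(max_rate.bit_length())}


rates = allowed_rates()
print(sorted(rates))

# 1000 Hz is rejected because it is not a power of two.
print(1000 in rates)

# Every allowed rate divides the maximum rate, so one sample at any allowed
# rate is always a whole number of offset units.
print(all(16384 % rate == 0 for rate in rates))
```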

It is possible to increase the maximum sample rate globally in an application:

>>> import numpy
>>> from sgnts.base.buffer import SeriesBuffer
>>> from sgnts.base.offset import Offset
>>> Offset.set_max_rate(262144)
>>> buf = SeriesBuffer(offset=0, sample_rate=32768, data=numpy.random.randn(32768))
>>> print (buf)
SeriesBuffer(offset=0, offset_end=262144, shape=(32768,), sample_rate=32768, duration=1000000000, data=[-0.08916502 ...  0.89236118])

Buffers are not the primary data type passed between elements in sgnts. Rather, that role belongs to the TSFrame. TSFrames hold lists of buffers:

>>> import numpy
>>> from sgnts.base.buffer import SeriesBuffer, TSFrame
>>> 
>>> # An example of just one buffer
>>> buf1 = SeriesBuffer(offset=0, sample_rate=2048, data=numpy.random.randn(2048))
>>> frame = TSFrame(buffers=[buf1])
>>> print (frame)

	SeriesBuffer(offset=0, offset_end=16384, shape=(2048,), sample_rate=2048, duration=1000000000, data=[-0.04094335 ... -1.49758223])
>>> 
>>> # An example of two contiguous buffers
>>> buf1 = SeriesBuffer(offset=0, sample_rate=2048, data=numpy.random.randn(2048))
>>> buf2 = SeriesBuffer(offset=16384, sample_rate=2048, data=numpy.random.randn(2048))
>>> frame = TSFrame(buffers=[buf1, buf2])
>>> print (frame)

	SeriesBuffer(offset=0, offset_end=16384, shape=(2048,), sample_rate=2048, duration=1000000000, data=[-1.56771352 ... -0.20928693])
	SeriesBuffer(offset=16384, offset_end=32768, shape=(2048,), sample_rate=2048, duration=1000000000, data=[-1.00442217 ... -0.75684022])
>>> 
>>> # An example of two non contiguous buffers. NOTE THIS SHOULDN'T WORK!!
>>> buf1 = SeriesBuffer(offset=0, sample_rate=2048, data=numpy.random.randn(2048))
>>> buf2 = SeriesBuffer(offset=12345, sample_rate=2048, data=numpy.random.randn(2048))
>>> frame = TSFrame(buffers=[buf1, buf2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 8, in __init__
  File "/Users/crh184/Library/Python/3.9/lib/python/site-packages/sgnts/base/buffer.py", line 455, in __post_init__
    self.__sanity_check(self.buffers)
  File "/Users/crh184/Library/Python/3.9/lib/python/site-packages/sgnts/base/buffer.py", line 485, in __sanity_check
    assert off0 == sl.start
AssertionError

Note in the above that TSFrames only support contiguous buffers.
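The contiguity rule amounts to each buffer starting exactly where the previous one ends, in offset units. A minimal sketch of such a check, using a hypothetical Span tuple in place of real SeriesBuffer objects (this is not the library's own sanity check):

```python
from typing import NamedTuple, Sequence


class Span(NamedTuple):
    """Hypothetical stand-in for a buffer's offset bookkeeping."""

    offset: int
    offset_end: int


def is_contiguous(spans: Sequence[Span]) -> bool:
    """True when each span starts exactly where the previous one ends."""
    return all(prev.offset_end == cur.offset for prev, cur in zip(spans, spans[1:]))


# Mirrors the two cases above: offsets 0 then 16384, and 0 then 12345.
good = [Span(0, 16384), Span(16384, 32768)]
bad = [Span(0, 16384), Span(12345, 12345 + 16384)]
print(is_contiguous(good))
print(is_contiguous(bad))
```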

TSFrames offer some additional methods to describe their contents, e.g.,

>>> buf1 = SeriesBuffer(offset=0, sample_rate=2048, data=numpy.random.randn(2048))
>>> buf2 = SeriesBuffer(offset=16384, sample_rate=2048, data=numpy.random.randn(2048))
>>> frame = TSFrame(buffers=[buf1, buf2])
>>> 
>>> # Get the offset of the first buffer
>>> print (frame.offset)
0
>>> 
>>> # Get the offset end of the last buffer
>>> print (frame.end_offset)
32768
>>> 
>>> # Get the sample rate
>>> print (frame.sample_rate)
2048
>>> 
>>> # Iterate over the buffers
>>> for buf in frame:
...     print (buf)
... 
SeriesBuffer(offset=0, offset_end=16384, shape=(2048,), sample_rate=2048, duration=1000000000, data=[0.01658589 ... 0.76543937])
SeriesBuffer(offset=16384, offset_end=32768, shape=(2048,), sample_rate=2048, duration=1000000000, data=[0.76470737 ... 0.89438121])

TSFrames must be initialized with at least one buffer because metadata are derived from the buffer(s). If you want to have an empty frame, you still have to set one buffer with the correct metadata, e.g.,

>>> # empty buffer
>>> buf = SeriesBuffer(offset=0, sample_rate=2048, shape=(2048,), data=None)
>>> frame = TSFrame(buffers=[buf])

Advanced TSFrame techniques

There are shortcuts for producing a new empty TSFrame that might be useful if your goal is to just spit out some similar empty frames to fill in, e.g.,

>>> frame = TSFrame.from_buffer_kwargs(offset=0, sample_rate=2048, shape=(2048,))
>>> print (frame)

	SeriesBuffer(offset=0, offset_end=16384, shape=(2048,), sample_rate=2048, duration=1000000000, data=None)
>>> print (next(frame))

	SeriesBuffer(offset=16384, offset_end=32768, shape=(2048,), sample_rate=2048, duration=1000000000, data=None)
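Judging from the output above, next(frame) yields a frame of the same length that starts where the previous one ends. The offset arithmetic this implies can be sketched in plain Python; successive_spans is a hypothetical helper for illustration, not part of sgnts.

```python
from itertools import islice


def successive_spans(offset, offset_end):
    """Yield (offset, offset_end) pairs of equal length, back to back."""
    length = offset_end - offset
    while True:
        yield offset, offset_end
        offset, offset_end = offset_end, offset_end + length


# The first two pairs match the offsets printed for frame and next(frame).
first_three = list(islice(successive_spans(0, 16384), 3))
print(first_three)
```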

Writing a new source element
