A binary timeseries storage format, where the time axis is given via an expression.
BinaryTimeseries
Scope
This is the specification for a really simple binary file format for storing a regularly spaced sequence of
single-channel measurement data in a format that is efficient to write and to read. The basic assumption
is that the time axis t_i of a series of N measurements can be computed on the fly from the array indices:
for (int i=0; i<N; i++) {
    t_i = t_0 + i * Delta_t;
}
where t_0 is the (reference) timestamp of the first sample and Delta_t is the sampling interval.
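The index-to-time relation can be sketched in a few lines of Python. This is only an illustration of the formula; the function names `index_to_time` and `time_axis` are hypothetical and not part of the BinaryTimeseries package.

```python
def index_to_time(t_0, delta_t, i):
    """Timestamp of sample i: t_i = t_0 + i * Delta_t."""
    return t_0 + i * delta_t


def time_axis(t_0, delta_t, n):
    """Full time axis for n samples, computed from the indices alone."""
    return [index_to_time(t_0, delta_t, i) for i in range(n)]


print(time_axis(10.0, 0.5, 4))
```

Computing each timestamp as `t_0 + i * delta_t`, rather than repeatedly adding `delta_t` to a running sum, avoids accumulating floating-point round-off over long series.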
The data values y_i are stored as raw values y_i_raw, optionally with an offset scalingOffset and a scaling factor scalingFactor:
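for (int i=0; i<N; i++) {
    if (hasScaling) {
        // scaling is present: apply offset and factor to the raw value
        y_i = scalingOffset + y_i_raw * scalingFactor;
    } else {
        // no scaling: the raw value is the data value
        y_i = y_i_raw;
    }
}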
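As a minimal Python sketch of the scaling rule described above: when scaling is present, the offset and factor are applied to each raw value; otherwise the raw values are returned unchanged. The function name `apply_scaling` and its parameter names are hypothetical and chosen only for this illustration.

```python
def apply_scaling(raw_values, has_scaling, scaling_offset=0.0, scaling_factor=1.0):
    """Convert raw stored values into data values y_i."""
    if not has_scaling:
        # no scaling stored: raw values are the data values
        return list(raw_values)
    # y_i = scalingOffset + y_i_raw * scalingFactor
    return [scaling_offset + y_raw * scaling_factor for y_raw in raw_values]


print(apply_scaling([0, 1, 2], True, scaling_offset=1.0, scaling_factor=0.5))
```

One typical use of such a scaling is to store integer ADC counts as raw values and convert them to physical units on reading.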
The maximum number of samples that can be stored in this file format is limited by the maximum value of the (signed) int type, which is
Integer.MAX_VALUE == 2^31 - 1 == 2_147_483_647 \approx 2.1e9
This corresponds to a total duration of T_max = (2^31 - 1) * Delta_t.
In the case of raw double values as y_i_raw, the corresponding maximum file size is
(64 + Double.BYTES * 2_147_483_647) == (64 + 8 * (2^31 - 1)) \approx 16 GB
where 64 bytes are reserved for the file header information.
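The file size estimate can be reproduced with a short Python calculation. The 64-byte header and the limit of 2^31 - 1 samples are taken from the text above; the function name `max_file_size_bytes` is hypothetical.

```python
HEADER_BYTES = 64          # header size stated in the text above
MAX_SAMPLES = 2**31 - 1    # Integer.MAX_VALUE in Java


def max_file_size_bytes(bytes_per_sample):
    """Largest possible file size for a given raw sample width."""
    return HEADER_BYTES + bytes_per_sample * MAX_SAMPLES


size = max_file_size_bytes(8)   # 8 bytes per raw double value
print(size, "bytes, about", round(size / 2**30, 2), "GiB")
```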
Suppose an ADC samples at a frequency f = 1 MHz. Then the sampling interval is Delta_t = 1/f = 1 µs and the maximum time series length that can be stored in one file in this file format is T_max \approx 2147 s.
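The same estimate can be made for other sampling frequencies. The following Python sketch evaluates T_max = (2^31 - 1) * Delta_t; the function name `max_duration_seconds` is hypothetical.

```python
MAX_SAMPLES = 2**31 - 1    # Integer.MAX_VALUE in Java


def max_duration_seconds(sampling_frequency_hz):
    """T_max = (2^31 - 1) * Delta_t with Delta_t = 1 / f."""
    delta_t = 1.0 / sampling_frequency_hz
    return MAX_SAMPLES * delta_t


for f in (1e3, 1e6, 1e9):
    print(f, "Hz ->", max_duration_seconds(f), "s")
```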
The recommended file name extension for this file format is *.bts for Binary Time Series.
Fast subset reading
The main goal of this file format is to allow easy and fast reading of subsets of the whole time series. Because the time axis is equally spaced, the indices of the samples inside a given time interval can be computed directly. Using the definitions in Sec. 3 of the documentation (see below), these indices translate into byte offsets in the file, so a reader can seek to the computed position and read only the required data from there on.
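The idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the API of the BinaryTimeseries package: the function names `index_range` and `read_subset` are hypothetical, the samples are assumed to be raw double values stored contiguously after the 64-byte header mentioned in the Scope section, and big-endian byte order is assumed. The sketch also takes t_0, Delta_t and N as arguments instead of parsing the header; the actual header layout is defined in the specification.

```python
import math
import struct

HEADER_BYTES = 64   # header size stated in the Scope section
SAMPLE_BYTES = 8    # assumption: raw double values


def index_range(t_0, delta_t, n, t_lower, t_upper):
    """Return (first, last) indices with t_lower <= t_i <= t_upper, or None."""
    first = max(0, math.ceil((t_lower - t_0) / delta_t))
    last = min(n - 1, math.floor((t_upper - t_0) / delta_t))
    if first > last:
        return None   # the interval contains no samples
    return first, last


def read_subset(path, t_0, delta_t, n, t_lower, t_upper):
    """Seek to the first sample in the interval and read only that part."""
    bounds = index_range(t_0, delta_t, n, t_lower, t_upper)
    if bounds is None:
        return []
    first, last = bounds
    count = last - first + 1
    with open(path, "rb") as f:
        # byte offset of sample 'first' relative to the start of the file
        f.seek(HEADER_BYTES + first * SAMPLE_BYTES)
        data = f.read(count * SAMPLE_BYTES)
    # assumption: big-endian doubles (">"); see the specification
    return list(struct.unpack(">%dd" % count, data))
```

For example, with t_0 = 0, Delta_t = 0.5 and N = 10, the interval from 1.0 to 2.0 maps to the indices 2 to 4, so only three samples are read from the file, starting at byte offset 64 + 2 * 8 = 80.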
Documentation
The specification of this file format is available as a PDF in this repository: Binary Timeseries File Format Specification.
The LaTeX source code and the compiled PDF of this specification are also embedded (as resources) in the jar of the Java implementation on Maven Central.
Implementation
A Java implementation of this file format using a ByteBuffer as the file abstraction layer is available in this repository.
The latest release is available on Maven Central:
<dependency>
    <groupId>de.labathome</groupId>
    <artifactId>BinaryTimeseries</artifactId>
    <version>1.0.4</version>
</dependency>
A (currently read-only) Python implementation of this file format is available on PyPI:
pip install BinaryTimeseries
Usage
A starting point on how to use these classes is given in the following example files: