Python bindings for the protobuf zfits library

protozfits-python

Low-level reading and writing of zfits files using google protocol buffer objects.

To analyze data, you might be more interested in using a ctapipe plugin to load your data into ctapipe. Several such plugins, covering different CTA(O) prototypes, currently use this library as a dependency.

Note: before version 2.4, the protozfits python library was part of the adh-apis Repository.

To improve maintenance, the two repositories were decoupled and this repository now only hosts the python bindings (protozfits). The needed C++ libZFitsIO is built from a git submodule of adh-apis.

Installation

Users

This package is published to PyPI and conda-forge. PyPI packages include pre-compiled manylinux wheels (no macOS wheels though) and conda packages are built for Linux and macOS.

When using conda, it's recommended to use the miniforge conda distribution, as it is fully open source and comes with the faster mamba package manager.

So install using:

pip install protozfits

or

mamba install protozfits
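
Because the packaging changed with version 2.4, it can be handy to check which version actually ended up in your environment. The following is a minimal sketch using only the standard library; the helper name installed_version is made up for this example and is not part of protozfits:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is missing.

    Hypothetical helper for illustration; it only queries package metadata
    and does not import the package itself.
    """
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


found = installed_version("protozfits")
if found is None:
    print("protozfits is not installed in this environment")
else:
    print(f"protozfits {found} is installed")
```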

For development

This project is built using scikit-build-core, which supports editable installs that recompile the project on import when a couple of config options are set for pip. See https://scikit-build-core.readthedocs.io/en/latest/configuration.html#editable-installs.

To set up a development environment, create a venv, install the build requirements and then run the pip install command with the options given below:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install 'scikit-build-core[pyproject]' pybind11 'setuptools_scm[toml]'
$ pip install -e '.[all]' --no-build-isolation

You can now e.g. run the tests:

$ pytest src

scikit-build-core will automatically recompile the project when importing the library. Some caveats remain though, see the scikit-build-core documentation linked above.

Usage

If you are just starting with proto-z-fits files and would like to explore the file contents, try this:

Open a file

>>> from protozfits import File
>>> example_path = 'protozfits/tests/resources/example_9evts_NectarCAM.fits.fz'
>>> file = File(example_path)
>>> file
File({
    'RunHeader': Table(1xDataModel.CameraRunHeader),
    'Events': Table(9xDataModel.CameraEvent)
})

From this we learn that the file contains two Tables, named RunHeader and Events. Events contains 9 rows of type CameraEvent. There might be more tables with other types of rows in other files. For instance, LST calls its RunHeader CameraConfig.

Getting an event

Usually people just iterate over a whole Table like this:

for event in file.Events:
    # do something with the event
    pass

But if you happen to know exactly which event you want, you can also directly get an event, like this:

event_17 = file.Events[17]

You can also get a range of events, like this:

for event in file.Events[100:200]:
    # do something with events 100 until 200
    pass

It is not yet possible to specify negative indices; for example, file.Events[:-10] does not work.

If you happen to have a list, a generator, or any other iterable of event ids you are interested in, you can get the events in question like this:

interesting_event_ids = range(100, 200, 3)
for event in file.Events[interesting_event_ids]:
    # do something with interesting events
    pass
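
Since negative indices are not supported but an iterable of ids is, one possible workaround is to translate a slice with negative bounds into a range of non-negative ids first. This is only a sketch: non_negative_ids and FakeTable are made up for the example, FakeTable merely stands in for a real Table so the snippet runs without a data file, and it assumes the ids are the same zero-based row positions used in file.Events[17].

```python
def non_negative_ids(index, n_rows):
    """Turn a slice (possibly with negative bounds) into a range of row ids.

    slice.indices() resolves negative and missing bounds against n_rows.
    """
    start, stop, step = index.indices(n_rows)
    return range(start, stop, step)


class FakeTable:
    """Stand-in for a protozfits Table, used only for this demonstration."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, ids):
        # Mimics only the "iterable of ids" access described above.
        return [self._rows[i] for i in ids]


table = FakeTable(f"event_{i}" for i in range(20))

# Equivalent of the unsupported table[:-10]
ids = non_negative_ids(slice(None, -10), len(table))
selected = table[ids]
```

With a real file, the same idea would read file.Events[non_negative_ids(slice(None, -10), len(file.Events))], provided len() works on Events the way it does on RunHeader below.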

RunHeader

Even though there is usually only one run header per file, technically this single run header is stored in a Table. This table could contain multiple "rows" and to me it is not clear what this would mean... but technically it is possible.

At the moment I would recommend getting the run header out of the file we opened above like this (replace RunHeader with CameraConfig for LST data):

assert len(file.RunHeader) == 1
header = file.RunHeader[0]
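
If you work with files from different cameras, the two lines above can be wrapped in a small helper that tries both table names. This is a sketch, not part of protozfits: get_single_header is a hypothetical name, and it assumes that asking a File for a table it does not have raises AttributeError. The SimpleNamespace object only imitates a file so the snippet runs on its own.

```python
from types import SimpleNamespace


def get_single_header(file, table_names=("RunHeader", "CameraConfig")):
    """Return the only row of the first header table found in the file.

    Hypothetical helper: table_names lists the attribute names to try, in order.
    """
    for name in table_names:
        table = getattr(file, name, None)
        if table is None:
            continue
        if len(table) != 1:
            raise ValueError(
                f"expected exactly one row in {name}, found {len(table)}"
            )
        return table[0]
    raise LookupError(f"none of the tables {table_names} exist in this file")


# Stand-in for an LST file, which has CameraConfig instead of RunHeader
fake_lst_file = SimpleNamespace(CameraConfig=["lst header row"])
header = get_single_header(fake_lst_file)
```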

For now, I will just get the first event:

>>> event = file.Events[0]
>>> type(event)
<class 'protozfits.CameraEvent'>
>>> event._fields
('telescopeID', 'dateMJD', 'eventType', 'eventNumber', 'arrayEvtNum', 'hiGain', 'loGain', 'trig', 'head', 'muon', 'geometry', 'hilo_offset', 'hilo_scale', 'cameraCounters', 'moduleStatus', 'pixelPresence', 'acquisitionMode', 'uctsDataPresence', 'uctsData', 'tibDataPresence', 'tibData', 'swatDataPresence', 'swatData', 'chipsFlags', 'firstCapacitorIds', 'drsTagsHiGain', 'drsTagsLoGain', 'local_time_nanosec', 'local_time_sec', 'pixels_flags', 'trigger_map', 'event_type', 'trigger_input_traces', 'trigger_output_patch7', 'trigger_output_patch19', 'trigger_output_muon', 'gps_status', 'time_utc', 'time_ns', 'time_s', 'flags', 'ssc', 'pkt_len', 'muon_tag', 'trpdm', 'pdmdt', 'pdmt', 'daqtime', 'ptm', 'trpxlid', 'pdmdac', 'pdmpc', 'pdmhi', 'pdmlo', 'daqmode', 'varsamp', 'pdmsum', 'pdmsumsq', 'pulser', 'ftimeoffset', 'ftimestamp', 'num_gains')
>>> event.hiGain.waveforms.samples
array([241, 245, 248, ..., 218, 214, 215], dtype=int16)

An LST event will look something like so:

>>> event
CameraEvent(
    configuration_id=1
    event_id=1
    tel_event_id=1
    trigger_time_s=0
    trigger_time_qns=0
    trigger_type=0
    waveform=array([  0,   0, ..., 288, 263], dtype=uint16)
    pixel_status=array([ 0,  0,  0,  0,  0,  0,  0, 12, 12, 12, 12, 12, 12, 12], dtype=uint8)
    ped_id=0
    nectarcam=NectarCamEvent(
        module_status=array([], dtype=float64)
        extdevices_presence=0
        tib_data=array([], dtype=float64)
        cdts_data=array([], dtype=float64)
        swat_data=array([], dtype=float64)
        counters=array([], dtype=float64))
    lstcam=LstCamEvent(
        module_status=array([0, 1], dtype=uint8)
        extdevices_presence=0
        tib_data=array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8)
        cdts_data=array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
               0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8)
        swat_data=array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
               0, 0, 0, 0], dtype=uint8)
        counters=array([  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
                 0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
                 0,   0,   1,   0,   0,   0,  31,   0,   0,   0, 243, 170, 204,
                 0,   0,   0,   0,   0], dtype=uint8)
        chips_flags=array([    0,     0,     0,     0,     0,     0,     0,     0, 61440,
                 245, 61440,   250, 61440,   253, 61440,   249], dtype=uint16)
        first_capacitor_id=array([    0,     0,     0,     0,     0,     0,     0,     0, 61440,
                 251, 61440,   251, 61440,   241, 61440,   245], dtype=uint16)
        drs_tag_status=array([ 0, 12], dtype=uint8)
        drs_tag=array([   0,    0, ..., 2021, 2360], dtype=uint16))
    digicam=DigiCamEvent(
        ))
>>> event.waveform
array([  0,   0,   0, ..., 292, 288, 263], dtype=uint16)

event supports tab-completion, which I regard as very important while exploring. It is implemented using collections.namedtuple. I tried to create a useful string representation. It is very long, yes ... but I hope you can still enjoy it:

>>> event
CameraEvent(
    telescopeID=1
    dateMJD=0.0
    eventType=<eventType.NONE: 0>
    eventNumber=97750287
    arrayEvtNum=0
    hiGain=PixelsChannel(
        waveforms=WaveFormData(
            samples=array([241, 245, ..., 214, 215], dtype=int16)
            pixelsIndices=array([425, 461, ..., 727, 728], dtype=uint16)
            firstSplIdx=array([], dtype=float64)
            num_samples=0
            baselines=array([232, 245, ..., 279, 220], dtype=int16)
            peak_time_pos=array([], dtype=float64)
            time_over_threshold=array([], dtype=float64))
        integrals=IntegralData(
            gains=array([], dtype=float64)
            maximumTimes=array([], dtype=float64)
            tailTimes=array([], dtype=float64)
            raiseTimes=array([], dtype=float64)
            pixelsIndices=array([], dtype=float64)
            firstSplIdx=array([], dtype=float64)))
# [...]
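
Because events are nested namedtuples, the _fields attribute can also be used to list every leaf field as a dotted path, which is sometimes easier to scan than the full representation. The following is a minimal sketch: iter_field_paths is a made-up helper, and the small namedtuples below only imitate a few fields of a real CameraEvent so the snippet runs without a data file.

```python
from collections import namedtuple


def iter_field_paths(obj, prefix=""):
    """Yield dotted paths such as 'hiGain.waveforms.samples' for all leaf fields."""
    for name in obj._fields:
        value = getattr(obj, name)
        path = f"{prefix}{name}"
        if hasattr(value, "_fields"):
            # Nested namedtuple: descend one level
            yield from iter_field_paths(value, prefix=path + ".")
        else:
            yield path


# Stand-ins with a handful of the field names shown above
WaveFormData = namedtuple("WaveFormData", ["samples", "baselines"])
PixelsChannel = namedtuple("PixelsChannel", ["waveforms"])
CameraEvent = namedtuple("CameraEvent", ["telescopeID", "hiGain"])

event = CameraEvent(
    telescopeID=1,
    hiGain=PixelsChannel(waveforms=WaveFormData(samples=[241, 245], baselines=[232, 245])),
)
paths = list(iter_field_paths(event))
for path in paths:
    print(path)
```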

Table header

fits.fz files are still normal FITS files, and each Table in the file corresponds to a so-called "BINTABLE" extension, which has a header. You can access this header like this:

>>> file.Events
Table(100xDataModel.CameraEvent)
>>> file.Events.header
# this is just a selection of all the contents of the header
XTENSION= 'BINTABLE'           / binary table extension
BITPIX  =                    8 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional binary table
NAXIS1  =                  192 / width of table in bytes
NAXIS2  =                    1 / number of rows in table
TFIELDS =                   12 / number of fields in each row
EXTNAME = 'Events'             / name of extension table
CHECKSUM= 'BnaGDmS9BmYGBmY9'   / Checksum for the whole HDU
DATASUM = '1046602664'         / Checksum for the data block
DATE    = '2017-10-31T02:04:55' / File creation date
ORIGIN  = 'CTA'                / Institution that wrote the file
WORKPKG = 'ACTL'               / Workpackage that wrote the file
DATEEND = '1970-01-01T00:00:00' / File closing date
PBFHEAD = 'DataModel.CameraEvent' / Written message name
CREATOR = 'N4ACTL2IO14ProtobufZOFitsE' / Class that wrote this file
COMPILED= 'Oct 26 2017 16:02:50' / Compile time
TIMESYS = 'UTC'                / Time system
>>> file.Events.header['DATE']
'2017-10-31T02:04:55'
>>> type(file.Events.header)
<class 'astropy.io.fits.header.Header'>

The header is provided by astropy.
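
Since the header behaves like a mapping, a few keywords can be pulled out into a plain dict for logging or bookkeeping. This is a sketch under assumptions: summarize_header is a hypothetical helper, the choice of keywords is arbitrary, and a plain dict stands in for the astropy Header here (both offer a get method) so the snippet runs without a data file.

```python
from datetime import datetime


def summarize_header(header, keys=("EXTNAME", "PBFHEAD", "NAXIS2", "DATE")):
    """Collect selected header keywords; missing keywords become None."""
    summary = {key: header.get(key) for key in keys}
    if summary.get("DATE"):
        # DATE is stored as an ISO 8601 string, e.g. '2017-10-31T02:04:55'
        summary["DATE"] = datetime.fromisoformat(summary["DATE"])
    return summary


# Values copied from the header listing above
example_header = {
    "EXTNAME": "Events",
    "PBFHEAD": "DataModel.CameraEvent",
    "NAXIS2": 1,
    "DATE": "2017-10-31T02:04:55",
}
summary = summarize_header(example_header)
print(summary)
```

With a real file, you would pass file.Events.header instead of example_header.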

Pure protobuf mode

The library by default converts the protobuf objects into namedtuples and converts the AnyArray data type to numpy arrays. This has some runtime overhead. If, for example, you know exactly what you want from the file, you can get a speed-up by passing the pure_protobuf=True option:

>>> from protozfits import File
>>> file = File(example_path, pure_protobuf=True)
>>> event = next(file.Events)
>>> type(event)
<class 'ProtoDataModel_pb2.CameraEvent'>

Now iterating over the file is faster than before. But you have no tab-completion and some contents are less useful for you:

>>> event.eventNumber
97750288   # <--- just fine
>>> event.hiGain.waveforms.samples

type: S16
data: "\362\000\355\000 ... "   # <---- goes on "forever" .. raw bytes of the array data
>>> type(event.hiGain.waveforms.samples)
<class 'CoreMessages_pb2.AnyArray'>

You can convert these AnyArrays into numpy arrays like this:

>>> from protozfits import any_array_to_numpy
>>> any_array_to_numpy(event.hiGain.waveforms.samples)
array([242, 237, 234, ..., 218, 225, 229], dtype=int16)
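
To get a feeling for what such a conversion amounts to, the first four raw bytes printed above can be decoded by hand with the standard library. This is only an illustration, not how any_array_to_numpy is implemented; it assumes that type S16 means signed 16-bit integers stored in little-endian byte order.

```python
import struct

# The first four bytes of the 'data' field shown above (octal escapes)
raw = b"\362\000\355\000"

# Each signed 16-bit value takes two bytes; '<' selects little-endian, 'h' is int16
n_values = len(raw) // 2
values = struct.unpack(f"<{n_values}h", raw)
print(values)
```

The two decoded values, 242 and 237, match the first two entries of the numpy array above.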

Command-Line Tools

This module comes with a command-line tool that can re-compress zfits files using different options for the default and specific column compressions. This can also be used to extract the first N events from a large file, e.g. to produce smaller files for unit tests.

Usage:

$ python -m protozfits.recompress_zfits --help
