Arrow bindings for casacore

arcae implements a limited subset of the functionality of the more mature python-casacore package. It bypasses some existing limitations in python-casacore to provide safe, multi-threaded access to CASA formats, thereby enabling export into newer cloud-native formats such as Apache Arrow and Zarr.

Rationale

casacore and the python-casacore Python bindings provide access to the CASA Table Data System (CTDS) and Measurement Sets created within this system. As of casacore 3.5.0, the CTDS is subject to a number of limitations, notably around thread safety and the Python Global Interpreter Lock (GIL).

Resolving these concerns is potentially a major effort, involving invasive changes across the CTDS system.

In the time since the CTDS was developed, newer, open-source formats such as Apache Arrow and Zarr have been developed that are suitable for representing Radio Astronomy data.

  • The Apache Arrow project defines an in-memory columnar storage format that is portable across programming languages.

  • Translating CTDS data to Arrow is relatively simple, with some limitations mentioned below.

  • Arrow Tables can easily be exchanged between many different programming languages.

  • Once in Apache Arrow format, it is easy to store data in modern, cloud-native disk formats such as parquet and Zarr.

  • Converting CASA Tables to Arrow in the C++ layer avoids the GIL.

  • Access to non-thread-safe CASA Tables is constrained to a ThreadPool containing a single thread.

  • It also allows us to write astrometry routines in C++, potentially side-stepping thread-safety and GIL issues with the CASA Measures server.
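
The single-thread constraint mentioned above can be illustrated with a minimal Python sketch. arcae does this in its C++ layer; the `SerialisedResource` class and `record` function below are hypothetical names used only for illustration and are not part of arcae's API. The idea is that many caller threads may submit work, but every operation on the non-thread-safe object executes on one dedicated worker thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerialisedResource:
    """Hypothetical wrapper: all access to `resource` runs on one worker thread."""

    def __init__(self, resource):
        self.resource = resource
        # A pool with exactly one worker serialises every submitted call
        self._pool = ThreadPoolExecutor(max_workers=1)

    def run(self, fn, *args, **kwargs):
        """Schedule fn(resource, *args, **kwargs) on the worker; returns a Future."""
        return self._pool.submit(fn, self.resource, *args, **kwargs)

    def close(self):
        self._pool.shutdown(wait=True)


def record(resource, value):
    # Stand-in for a non-thread-safe operation; notes which thread executed it
    resource.append((value, threading.get_ident()))
    return value


wrapped = SerialisedResource([])

# Four caller threads submit work concurrently...
with ThreadPoolExecutor(max_workers=4) as callers:
    futures = list(callers.map(lambda v: wrapped.run(record, v), range(8)))

results = sorted(f.result() for f in futures)
# ...but every operation was executed by the same worker thread
worker_ids = {thread_id for _, thread_id in wrapped.resource}
wrapped.close()
```

Callers receive futures, so they can continue with other work (for example, converting previously read data) while the single worker services the table.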

Limitations

Arrow supports both 1D arrays and nested structures:

  1. Fixed-shape multi-dimensional data (e.g. visibility data) is currently represented as nested FixedSizeListArrays.

  2. Variably shaped multi-dimensional data (e.g. subtable data) is currently represented as nested ListArrays.

  3. Complex values are represented as an extra FixedSizeListArray nesting of two floats.

  4. Currently, it is not straightforward to convert between the above and NumPy via to_numpy calls on Arrow Arrays, but it is relatively simple to reinterpret the underlying data buffers from either API. This is done transparently in the getcol and putcol functions (see Usage below).

Going forward, FixedShapeTensorArray and VariableShapeTensorArray will provide more ergonomic structures for representing multi-dimensional data. First-class support for complex values in Apache Arrow will require implementing a C++ extension type within Arrow itself.

Some other edge cases have not yet been implemented, but could be with some thought.

  • Columns with unconstrained rank (ndim == -1) whose rows, in practice, have differing dimensions are not yet supported. Unconstrained-rank columns whose rows all have the same rank are catered for.

  • TpRecord columns are not yet handled. It is probably simplest to convert these rows to JSON and store them as strings.

  • TpQuantity columns are not yet handled. It may be possible to represent them as a run-time parametric Arrow DataType.
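
The JSON suggestion for TpRecord columns could look roughly like the sketch below. It is only an illustration of the idea: arcae does not currently implement this, the `records_to_json` and `json_to_records` helpers are hypothetical, and the example rows are made up. Each record is assumed to be representable as a plain Python dictionary.

```python
import json


def records_to_json(rows):
    """Hypothetical helper: encode each record (a dict) as a JSON string, keeping nulls."""
    return [None if row is None else json.dumps(row, sort_keys=True) for row in rows]


def json_to_records(strings):
    """Hypothetical helper: decode JSON strings back into dictionaries."""
    return [None if s is None else json.loads(s) for s in strings]


# Made-up record rows; a missing row is kept as None
rows = [{"name": "a", "values": [1, 2, 3]}, {}, None]
encoded = records_to_json(rows)
decoded = json_to_records(encoded)
```

The encoded list is a sequence of strings and nulls, which maps directly onto an Arrow string column.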

Installation

Binary wheels are provided for Linux and macOS on both x86_64 and arm64 architectures:

$ pip install arcae

Usage

Example usage with Arrow Tables:

import json
from pprint import pprint

import arcae
import pyarrow as pa
import pyarrow.parquet as pq

# Obtain (partial) Apache Arrow Table from a CASA Table
casa_table = arcae.table("/path/to/measurementset.ms")
arrow_table = casa_table.to_arrow()        # read entire table
arrow_table = casa_table.to_arrow(index=(slice(10, 20),))  # read a range of rows
assert isinstance(arrow_table, pa.Table)

# Print JSON-encoded Table and Column keywords
pprint(json.loads(arrow_table.schema.metadata[b"__arcae_metadata__"]))
pprint(json.loads(arrow_table.schema.field("DATA").metadata[b"__arcae_metadata__"]))

pq.write_table(arrow_table, "measurementset.parquet")

Some reading and writing functionality from python-casacore is replicated, with added support for some NumPy Advanced Indexing.

casa_table = arcae.table("/path/to/measurementset.ms", readonly=False)
# Get rows 10 and 2, and channels 16 to 32, and all correlations
data = casa_table.getcol("DATA", index=([10, 2], slice(16, 32), None))
# Write some modified data back
casa_table.putcol("DATA", data + 1*1j, index=([10, 2], slice(16, 32), None))
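
To make the selection concrete, the sketch below reproduces the same index on a plain NumPy array. One difference is worth noting: in NumPy itself, `None` inserts a new axis, whereas the comment above indicates that `None` in an arcae index selects everything along that dimension. The `to_numpy_index` helper is hypothetical and encodes only that assumption; arcae's full indexing semantics may differ in other respects.

```python
import numpy as np


def to_numpy_index(index):
    """Hypothetical helper: map None ("select all") to slice(None) for NumPy."""
    return tuple(slice(None) if i is None else i for i in index)


# Stand-in for a DATA column: 20 rows, 64 channels, 4 correlations
data = np.arange(20 * 64 * 4).reshape(20, 64, 4)

index = ([10, 2], slice(16, 32), None)
selection = data[to_numpy_index(index)]

# Rows are returned in the requested order: row 10 first, then row 2
first_row_matches = np.array_equal(selection[0], data[10, 16:32, :])
second_row_matches = np.array_equal(selection[1], data[2, 16:32, :])
```

As with NumPy slices, `slice(16, 32)` in this sketch covers channels 16 up to, but not including, 32.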

See the test cases for further use cases.

Exporting Measurement Sets to Arrow Parquet Datasets

Install the applications optional extra.

$ pip install arcae[applications]

Then, an export script is available:

$ arcae export /path/to/the.ms --nrow 50000
$ tree output.arrow/
output.arrow/
├── ANTENNA
│   └── data0.parquet
├── DATA_DESCRIPTION
│   └── data0.parquet
├── FEED
│   └── data0.parquet
├── FIELD
│   └── data0.parquet
├── MAIN
│   └── FIELD_ID=0
│       └── PROCESSOR_ID=0
│           ├── DATA_DESC_ID=0
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=1
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=2
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           └── DATA_DESC_ID=3
│               ├── data0.parquet
│               ├── data1.parquet
│               ├── data2.parquet
│               └── data3.parquet
├── OBSERVATION
│   └── data0.parquet

This data can be loaded into an Arrow Dataset:

>>> import pyarrow as pa
>>> import pyarrow.dataset as pad
>>> main_ds = pad.dataset("output.arrow/MAIN")
>>> spw_ds = pad.dataset("output.arrow/SPECTRAL_WINDOW")

Etymology

Noun: arca f (genitive arcae); first declension.

A chest, box, coffer, safe (a safe place for storing items, or anything of a similar shape).

Pronounced: ar-ki.

Download files

Source Distribution

  • arcae-0.4.0a1.tar.gz (127.7 kB)

Built Distributions

  • arcae-0.4.0a1-cp313-cp313-manylinux_2_28_x86_64.whl (34.0 MB): CPython 3.13, manylinux (glibc 2.28+), x86-64
  • arcae-0.4.0a1-cp313-cp313-manylinux_2_28_aarch64.whl (31.3 MB): CPython 3.13, manylinux (glibc 2.28+), ARM64
  • arcae-0.4.0a1-cp313-cp313-macosx_14_0_arm64.whl (12.9 MB): CPython 3.13, macOS 14.0+, ARM64
  • arcae-0.4.0a1-cp313-cp313-macosx_13_0_x86_64.whl (15.0 MB): CPython 3.13, macOS 13.0+, x86-64
  • arcae-0.4.0a1-cp312-cp312-manylinux_2_28_x86_64.whl (34.0 MB): CPython 3.12, manylinux (glibc 2.28+), x86-64
  • arcae-0.4.0a1-cp312-cp312-manylinux_2_28_aarch64.whl (31.3 MB): CPython 3.12, manylinux (glibc 2.28+), ARM64
  • arcae-0.4.0a1-cp312-cp312-macosx_14_0_arm64.whl (12.9 MB): CPython 3.12, macOS 14.0+, ARM64
  • arcae-0.4.0a1-cp312-cp312-macosx_13_0_x86_64.whl (15.0 MB): CPython 3.12, macOS 13.0+, x86-64
  • arcae-0.4.0a1-cp311-cp311-manylinux_2_28_x86_64.whl (33.9 MB): CPython 3.11, manylinux (glibc 2.28+), x86-64
  • arcae-0.4.0a1-cp311-cp311-manylinux_2_28_aarch64.whl (31.2 MB): CPython 3.11, manylinux (glibc 2.28+), ARM64
  • arcae-0.4.0a1-cp311-cp311-macosx_14_0_arm64.whl (12.9 MB): CPython 3.11, macOS 14.0+, ARM64
  • arcae-0.4.0a1-cp311-cp311-macosx_13_0_x86_64.whl (15.0 MB): CPython 3.11, macOS 13.0+, x86-64
  • arcae-0.4.0a1-cp310-cp310-manylinux_2_28_x86_64.whl (33.8 MB): CPython 3.10, manylinux (glibc 2.28+), x86-64
  • arcae-0.4.0a1-cp310-cp310-manylinux_2_28_aarch64.whl (31.2 MB): CPython 3.10, manylinux (glibc 2.28+), ARM64
  • arcae-0.4.0a1-cp310-cp310-macosx_14_0_arm64.whl (12.9 MB): CPython 3.10, macOS 14.0+, ARM64
  • arcae-0.4.0a1-cp310-cp310-macosx_13_0_x86_64.whl (15.0 MB): CPython 3.10, macOS 13.0+, x86-64

File details

Details for the file arcae-0.4.0a1.tar.gz.

File metadata

  • Download URL: arcae-0.4.0a1.tar.gz
  • Upload date:
  • Size: 127.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for arcae-0.4.0a1.tar.gz
Algorithm Hash digest
SHA256 d6795af617ea1bea4d52b11c0522c612a6422a590046690b69a37f133c097525
MD5 d22dce1b0790755e4797c674ced84e58
BLAKE2b-256 b72895b47872c93466d67debaa8304f1a9da6ff10adfa32592a4f08cfa331bc9

See more details on using hashes here.

Provenance

The following attestation bundles were made for arcae-0.4.0a1.tar.gz:

Publisher: ci.yml on ratt-ru/arcae

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
