Arrow bindings for casacore

arcae implements a limited subset of the functionality of the more mature python-casacore package. It bypasses some existing limitations in python-casacore to provide safe, multi-threaded access to CASA formats, thereby enabling export into newer cloud-native formats such as Apache Arrow and Zarr.

Rationale

casacore and the python-casacore Python bindings provide access to the CASA Table Data System (CTDS) and Measurement Sets created within this system. The CTDS, as of casacore 3.5.0, is subject to the following limitations:

Resolving these concerns is potentially a major effort, involving invasive changes across the CTDS system.

In the time since the CTDS was developed, newer, open-source formats such as Apache Arrow and Zarr have been developed that are suitable for representing Radio Astronomy data.

  • The Apache Arrow project defines a language-independent, in-memory columnar format.

  • Translating CTDS data to Arrow is relatively simple, with some limitations mentioned below.

  • It is easy to exchange Arrow Tables between many different programming languages.

  • Once in Apache Arrow format, it is easy to store data in modern, cloud-native disk formats such as parquet and Zarr.

  • Converting CASA Tables to Arrow in the C++ layer avoids the Python Global Interpreter Lock (GIL).

  • Access to non-thread-safe CASA Tables is constrained to a ThreadPool containing a single thread.

  • It also allows us to write astrometry routines in C++, potentially side-stepping thread-safety and GIL issues with the CASA Measures server.
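
The single-thread constraint in the list above can be illustrated in pure Python. This is a minimal sketch of the general pattern only, not arcae's implementation (which lives in the C++ layer); `UnsafeResource` and `SerialisedResource` are hypothetical names introduced for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class UnsafeResource:
    """Hypothetical stand-in for an object that is not thread-safe."""

    def __init__(self):
        self.threads_seen = set()

    def read(self, row):
        # Record which thread actually touched the resource
        self.threads_seen.add(threading.get_ident())
        return row * 2


class SerialisedResource:
    """Funnels every call through a pool containing exactly one thread."""

    def __init__(self, resource):
        self._resource = resource
        self._pool = ThreadPoolExecutor(max_workers=1)

    def read(self, row):
        # Returns a Future; the read itself runs on the single worker thread
        return self._pool.submit(self._resource.read, row)

    def close(self):
        self._pool.shutdown(wait=True)


resource = UnsafeResource()
serialised = SerialisedResource(resource)

# Many caller threads may request reads concurrently,
# but only the single worker thread ever touches the resource
with ThreadPoolExecutor(max_workers=8) as callers:
    futures = list(callers.map(serialised.read, range(100)))
results = [f.result() for f in futures]
serialised.close()
```

Callers still see concurrent, future-based access, while the resource itself is only ever used from one thread.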

Limitations

Arrow supports both 1D arrays and nested structures:

  1. Fixed-shape multi-dimensional data (e.g. visibility data) is currently represented as nested FixedSizeListArrays.

  2. Variably shaped multi-dimensional data (e.g. subtable data) is currently represented as nested ListArrays.

  3. Complex values are represented as an extra FixedSizeListArray nesting of two floats.

  4. Currently, it is not trivially trivial (repetition intended here) to convert between the above and numpy via to_numpy calls on Arrow Arrays, but it is relatively trivial to reinterpret the underlying data buffers from either API. This is done transparently in getcol and putcol functions (see usage below).

Going forward, FixedShapeTensorArray and VariableShapeTensorArray will provide more ergonomic structures for representing multi-dimensional data. First-class support for complex values in Apache Arrow will require implementing a C++ extension type within Arrow itself.

Some other edge cases have not yet been implemented, but could be with some thought.

  • Columns with unconstrained rank (ndim == -1) whose rows, in practice, have differing dimensions. Unconstrained rank columns whose rows actually have the same rank are catered for.

  • Not yet able to handle TpRecord columns. It is probably simplest to convert these rows to JSON and store them as strings.

  • Not yet able to handle TpQuantity columns. Possible to represent as a run-time parametric Arrow DataType.
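
The JSON idea suggested for TpRecord columns can be sketched with the standard library. This is not implemented in arcae; the record contents below are hypothetical and only illustrate storing one JSON string per row.

```python
import json

# Hypothetical record-like rows, standing in for TpRecord values
records = [
    {"telescope": "A", "weights": [1.0, 0.5]},
    {"telescope": "B", "nested": {"flag": True}},
]

# One JSON string per row; sort_keys makes the encoding deterministic
encoded = [json.dumps(record, sort_keys=True) for record in records]

# Decoding each string recovers the original record
decoded = [json.loads(text) for text in encoded]
```

The resulting list of strings could then be stored in an ordinary string column.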

Installation

Binary wheels are provided for Linux and macOS on both x86_64 and arm64 architectures.

$ pip install arcae

Usage

Example usage with Arrow Tables:

import json
from pprint import pprint

import arcae
import pyarrow as pa
import pyarrow.parquet as pq

# Obtain (partial) Apache Arrow Table from a CASA Table
casa_table = arcae.table("/path/to/measurementset.ms")
arrow_table = casa_table.to_arrow()        # read entire table
arrow_table = casa_table.to_arrow(index=(slice(10, 20),))  # read rows 10 to 19 only
assert isinstance(arrow_table, pa.Table)

# Print JSON-encoded Table and Column keywords
pprint(json.loads(arrow_table.schema.metadata[b"__arcae_metadata__"]))
pprint(json.loads(arrow_table.schema.field("DATA").metadata[b"__arcae_metadata__"]))

pq.write_table(arrow_table, "measurementset.parquet")

Some reading and writing functionality from python-casacore is replicated, with added support for some NumPy Advanced Indexing.

casa_table = arcae.table("/path/to/measurementset.ms", readonly=False)
# Get rows 10 and 2, and channels 16 to 32, and all correlations
data = casa_table.getcol("DATA", index=([10, 2], slice(16, 32), None))
# Write some modified data back
casa_table.putcol("DATA", data + 1*1j, index=([10, 2], slice(16, 32), None))
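
For readers less familiar with NumPy advanced indexing, the selection made by that index tuple can be mimicked on a plain NumPy array. This sketch assumes, following the comment in the example above, that `None` means "select the whole axis"; `to_numpy_index` is a hypothetical helper and the (row, channel, correlation) ordering of the synthetic array is an assumption.

```python
import numpy as np


def to_numpy_index(index):
    """Translate None entries into full slices (assumed to mean 'select all')."""
    return tuple(slice(None) if i is None else i for i in index)


# Synthetic stand-in for a DATA column: (row, channel, correlation)
column = np.arange(20 * 64 * 4).reshape(20, 64, 4)

index = ([10, 2], slice(16, 32), None)
selection = column[to_numpy_index(index)]
```

Rows are returned in the requested order (10 first, then 2), so `selection[0]` corresponds to row 10.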

See the test cases for further use cases.

Exporting Measurement Sets to Arrow Parquet Datasets

Install the applications optional extra.

$ pip install arcae[applications]

Then, an export script is available:

$ arcae export /path/to/the.ms --nrow 50000
$ tree output.arrow/
output.arrow/
├── ANTENNA
│   └── data0.parquet
├── DATA_DESCRIPTION
│   └── data0.parquet
├── FEED
│   └── data0.parquet
├── FIELD
│   └── data0.parquet
├── MAIN
│   └── FIELD_ID=0
│       └── PROCESSOR_ID=0
│           ├── DATA_DESC_ID=0
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=1
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=2
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           └── DATA_DESC_ID=3
│               ├── data0.parquet
│               ├── data1.parquet
│               ├── data2.parquet
│               └── data3.parquet
├── OBSERVATION
│   └── data0.parquet

This data can be loaded into an Arrow Dataset:

>>> import pyarrow as pa
>>> import pyarrow.dataset as pad
>>> main_ds = pad.dataset("output.arrow/MAIN")
>>> spw_ds = pad.dataset("output.arrow/SPECTRAL_WINDOW")

Etymology

Noun: arca f (genitive arcae); first declension. A chest, box, coffer, or safe (a safe place for storing items, or anything of a similar shape).

Pronounced: ar-ki.
