Arrow bindings for casacore

Project description

arcae implements a limited subset of the functionality of the more mature python-casacore package. It bypasses some existing limitations in python-casacore to provide safe, multi-threaded access to CASA formats, thereby enabling export into newer cloud-native formats such as Apache Arrow and Zarr.

Rationale

casacore and the python-casacore Python bindings provide access to the CASA Table Data System (CTDS) and Measurement Sets created within this system. The CTDS, as of casacore 3.5.0, is subject to several limitations, notably around thread safety and Python's Global Interpreter Lock (GIL).

Resolving these concerns is potentially a major effort, involving invasive changes across the CTDS system.

In the time since the CTDS was developed, newer, open-source formats such as Apache Arrow and Zarr have been developed that are suitable for representing Radio Astronomy data.

  • The Apache Arrow project defines a language-independent, in-memory columnar storage format.

  • Translating CTDS data to Arrow is relatively simple, with some limitations mentioned below.

  • It is easy to pass Arrow Tables between many different languages.

  • Once in Apache Arrow format, it is easy to store data in modern, cloud-native disk formats such as Parquet and Zarr.

  • Converting CASA Tables to Arrow in the C++ layer avoids the GIL.

  • Access to non-thread-safe CASA Tables is constrained to a ThreadPool containing a single thread.

  • It also allows us to write astrometry routines in C++, potentially side-stepping thread-safety and GIL issues with the CASA Measures server.
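
The single-thread ThreadPool idea above can be sketched in Python. arcae does this in its C++ layer, so the following is only an analogy under stated assumptions: SerializedResource and FakeTable are hypothetical names invented for this illustration and are not part of arcae.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerializedResource:
    """Runs every call on a non-thread-safe object on one worker thread."""

    def __init__(self, factory):
        # A pool with a single worker guarantees that calls never overlap
        self._pool = ThreadPoolExecutor(max_workers=1)
        # Create the resource on the worker so only that thread ever touches it
        self._resource = self._pool.submit(factory).result()

    def call(self, method, *args, **kwargs):
        """Schedule resource.method(*args, **kwargs) and return a Future."""
        bound = getattr(self._resource, method)
        return self._pool.submit(bound, *args, **kwargs)

    def close(self):
        self._pool.shutdown(wait=True)


class FakeTable:
    """Hypothetical stand-in for a non-thread-safe table."""

    def __init__(self):
        self.thread_ids = set()

    def read(self, row):
        # Record which thread performed the access
        self.thread_ids.add(threading.get_ident())
        return row * 2


table = SerializedResource(FakeTable)
futures = [table.call("read", row) for row in range(8)]
results = [future.result() for future in futures]
table.close()
```

Callers on any thread receive Futures, while the wrapped object only ever sees one thread, which is the property that makes access to a non-thread-safe table safe.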

Limitations

Arrow supports both 1D arrays and nested structures:

  1. Fixed-shape multi-dimensional data (e.g. visibility data) is currently represented as nested FixedSizeListArrays.

  2. Variably shaped multi-dimensional data (e.g. subtable data) is currently represented as nested ListArrays.

  3. Complex values are represented as an extra FixedSizeListArray nesting of two floats.

  4. Currently, it is not trivial to convert between the above and NumPy via to_numpy calls on Arrow Arrays, but it is relatively simple to reinterpret the underlying data buffers from either API. This is done transparently in the getcol and putcol functions (see usage below).
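
To illustrate why reinterpreting buffers is simple, the sketch below uses only the Python standard library to mimic the flat value buffer that sits beneath nested fixed-size lists. It is not arcae or pyarrow code: the shape (2 rows, 3 channels, 2 correlations) and the helper complex_at are assumptions made for this example, and validity bitmaps are ignored.

```python
from array import array

# Hypothetical shape: 2 rows, 3 channels, 2 correlations of complex values
nrow, nchan, ncorr = 2, 3, 2

# Flat float32 buffer laid out as [re, im, re, im, ...], mimicking the
# innermost child of nested fixed-size lists (complex = list of two floats)
flat = array("f", range(nrow * nchan * ncorr * 2))

# Reinterpret the same memory as a 4-D view without copying
view = memoryview(flat).cast("B").cast("f", shape=[nrow, nchan, ncorr, 2])


def complex_at(r, c, p):
    """Read the complex value at (row, chan, corr) straight from the flat buffer."""
    offset = ((r * nchan + c) * ncorr + p) * 2
    return complex(flat[offset], flat[offset + 1])


value = complex_at(1, 2, 0)
same_value = complex(view[1, 2, 0, 0], view[1, 2, 0, 1])
```

Because every nesting level has a fixed size, an element's position is pure arithmetic on the indices, so the same bytes can be viewed as a multi-dimensional array.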

Going forward, FixedShapeTensorArray and VariableShapeTensorArray will provide more ergonomic structures for representing multi-dimensional data. First-class support for complex values in Apache Arrow will require implementing a C++ extension type within Arrow itself.

Some other edge cases have not yet been implemented, but could be with some thought.

  • Columns with unconstrained rank (ndim == -1) whose rows, in practice, have differing dimensions. Unconstrained rank columns whose rows actually have the same rank are catered for.

  • TpRecord columns are not yet handled. It is probably simplest to convert these rows to JSON and store them as strings.

  • TpQuantity columns are not yet handled. It may be possible to represent them as a run-time parametric Arrow DataType.
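
The JSON approach suggested for record-valued rows could look roughly like the following. This is a minimal sketch, not arcae behaviour: the rows are invented stand-ins for a TpRecord column, and plain dicts are assumed to be an adequate model of a record.

```python
import json

# Hypothetical record-valued rows standing in for a TpRecord column
rows = [
    {"name": "a", "flags": [1, 0, 1]},
    {"name": "b", "nested": {"band": "L"}},
]

# Encode each record as a JSON string; sort_keys gives a stable encoding
encoded = [json.dumps(row, sort_keys=True) for row in rows]

# Decoding recovers the original records
decoded = [json.loads(text) for text in encoded]
```

The encoded values are ordinary strings, so they would fit a plain string column at the cost of losing typed access to the record fields.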

Installation

Binary wheels are provided for Linux and macOS on both x86_64 and arm64 architectures:

$ pip install arcae

Usage

Example usage with Arrow Tables:

import json
from pprint import pprint

import arcae
import pyarrow as pa
import pyarrow.parquet as pq

# Obtain (partial) Apache Arrow Table from a CASA Table
casa_table = arcae.table("/path/to/measurementset.ms")
arrow_table = casa_table.to_arrow()        # read entire table
arrow_table = casa_table.to_arrow(index=(slice(10, 20),))  # read only a slice of rows
assert isinstance(arrow_table, pa.Table)

# Print JSON-encoded Table and Column keywords
pprint(json.loads(arrow_table.schema.metadata[b"__arcae_metadata__"]))
pprint(json.loads(arrow_table.schema.field("DATA").metadata[b"__arcae_metadata__"]))

pq.write_table(arrow_table, "measurementset.parquet")

Some reading and writing functionality from python-casacore is replicated, with added support for some NumPy Advanced Indexing.

casa_table = arcae.table("/path/to/measurementset.ms", readonly=False)
# Get rows 10 and 2, and channels 16 to 32, and all correlations
data = casa_table.getcol("DATA", index=([10, 2], slice(16, 32), None))
# Write some modified data back
casa_table.putcol("DATA", data + 1*1j, index=([10, 2], slice(16, 32), None))
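
To make the index tuple above concrete, the sketch below applies the same kind of index to plain nested lists. The select helper is hypothetical and written only for this illustration; it assumes one entry per dimension, with None meaning "everything" and list entries returned in the requested order, as NumPy would do for this index. Whether arcae matches this in every detail is an assumption.

```python
def select(data, index):
    """Apply a per-dimension index tuple to nested lists.

    Each entry is a list of ints, a slice, or None (meaning "everything").
    """
    if not index:
        return data
    head, rest = index[0], index[1:]
    if head is None:
        picked = data
    elif isinstance(head, slice):
        picked = data[head]
    else:
        picked = [data[i] for i in head]
    return [select(item, rest) for item in picked]


# Toy data of shape (12 rows, 4 channels, 2 correlations); each leaf
# records its own (row, chan, corr) coordinates
data = [
    [[(r, c, p) for p in range(2)] for c in range(4)]
    for r in range(12)
]

# Rows 10 and 2, channels 1 and 2, all correlations
out = select(data, ([10, 2], slice(1, 3), None))
```

Here out has shape (2, 2, 2) and its first row comes from row 10 of the input, reflecting the order of the requested row list.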

See the test cases for further use cases.

Exporting Measurement Sets to Arrow Parquet Datasets

Install the optional applications extra:

$ pip install arcae[applications]

Then, an export script is available:

$ arcae export /path/to/the.ms --nrow 50000
$ tree output.arrow/
output.arrow/
├── ANTENNA
│   └── data0.parquet
├── DATA_DESCRIPTION
│   └── data0.parquet
├── FEED
│   └── data0.parquet
├── FIELD
│   └── data0.parquet
├── MAIN
│   └── FIELD_ID=0
│       └── PROCESSOR_ID=0
│           ├── DATA_DESC_ID=0
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=1
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=2
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           └── DATA_DESC_ID=3
│               ├── data0.parquet
│               ├── data1.parquet
│               ├── data2.parquet
│               └── data3.parquet
├── OBSERVATION
│   └── data0.parquet
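
The KEY=VALUE directory names under MAIN resemble Hive-style partitioning. Assuming that is the intent, the partition values of a file can be recovered from its path alone, as in this standard-library sketch (partition_values is a hypothetical helper, not part of arcae):

```python
from pathlib import PurePosixPath


def partition_values(path):
    """Return {key: value} for every KEY=VALUE directory component in path."""
    values = {}
    for part in PurePosixPath(path).parent.parts:
        key, sep, value = part.partition("=")
        if sep:  # only components that contain "=" are partition directories
            values[key] = value
    return values


path = "output.arrow/MAIN/FIELD_ID=0/PROCESSOR_ID=0/DATA_DESC_ID=2/data1.parquet"
keys = partition_values(path)
```

When reading such a layout with pyarrow, passing partitioning="hive" to pyarrow.dataset.dataset asks Arrow to perform this parsing itself and expose the keys as columns.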

This data can be loaded into an Arrow Dataset:

>>> import pyarrow as pa
>>> import pyarrow.dataset as pad
>>> main_ds = pad.dataset("output.arrow/MAIN")
>>> spw_ds = pad.dataset("output.arrow/SPECTRAL_WINDOW")

Etymology

Noun: arca f (genitive arcae); first declension. A chest, box, coffer, safe (a safe place for storing items, or anything of a similar shape).

Pronounced: ar-ki.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

arcae-0.3.0.tar.gz (120.6 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

arcae-0.3.0-cp313-cp313-manylinux_2_28_x86_64.whl (33.6 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.28+ x86-64

arcae-0.3.0-cp313-cp313-manylinux_2_28_aarch64.whl (31.0 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.28+ ARM64

arcae-0.3.0-cp313-cp313-macosx_14_0_arm64.whl (12.9 MB view details)

Uploaded CPython 3.13macOS 14.0+ ARM64

arcae-0.3.0-cp313-cp313-macosx_13_0_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.13macOS 13.0+ x86-64

arcae-0.3.0-cp312-cp312-manylinux_2_28_x86_64.whl (33.6 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ x86-64

arcae-0.3.0-cp312-cp312-manylinux_2_28_aarch64.whl (31.0 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ ARM64

arcae-0.3.0-cp312-cp312-macosx_14_0_arm64.whl (12.9 MB view details)

Uploaded CPython 3.12macOS 14.0+ ARM64

arcae-0.3.0-cp312-cp312-macosx_13_0_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.12macOS 13.0+ x86-64

arcae-0.3.0-cp311-cp311-manylinux_2_28_x86_64.whl (33.6 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ x86-64

arcae-0.3.0-cp311-cp311-manylinux_2_28_aarch64.whl (30.9 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ ARM64

arcae-0.3.0-cp311-cp311-macosx_14_0_arm64.whl (12.9 MB view details)

Uploaded CPython 3.11macOS 14.0+ ARM64

arcae-0.3.0-cp311-cp311-macosx_13_0_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.11macOS 13.0+ x86-64

arcae-0.3.0-cp310-cp310-manylinux_2_28_x86_64.whl (33.5 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ x86-64

arcae-0.3.0-cp310-cp310-manylinux_2_28_aarch64.whl (30.9 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ ARM64

arcae-0.3.0-cp310-cp310-macosx_14_0_arm64.whl (12.9 MB view details)

Uploaded CPython 3.10macOS 14.0+ ARM64

arcae-0.3.0-cp310-cp310-macosx_13_0_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.10macOS 13.0+ x86-64
