Arrow bindings for casacore

arcae implements a limited subset of the functionality of the more mature python-casacore package. It bypasses some existing limitations in python-casacore to provide safe, multi-threaded access to CASA formats, thereby enabling export into newer cloud-native formats such as Apache Arrow and Zarr.

Rationale

casacore and the python-casacore Python bindings provide access to the CASA Table Data System (CTDS) and Measurement Sets created within this system. The CTDS, as of casacore 3.5.0, is subject to the following limitations:

Resolving these concerns is potentially a major effort, involving invasive changes across the CTDS system.

Since the CTDS was developed, newer open-source formats such as Apache Arrow and Zarr have emerged that are suitable for representing Radio Astronomy data.

  • The Apache Arrow project defines a language-independent, in-memory columnar format.

  • Translating CTDS data to Arrow is relatively simple, with some limitations mentioned below.

  • It is easy to exchange Arrow Tables between many different languages.

  • Once in Apache Arrow format, it is easy to store data in modern, cloud-native disk formats such as parquet and Zarr.

  • Converting CASA Tables to Arrow in the C++ layer avoids the GIL.

  • Access to non-thread-safe CASA Tables is constrained to a ThreadPool containing a single thread.

  • It also allows us to write astrometry routines in C++, potentially side-stepping thread-safety and GIL issues with the CASA Measures server.
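The single-thread constraint mentioned above can be illustrated with a minimal Python sketch. arcae applies this pattern in its C++ layer; the `SerializedResource` and `FakeTable` names below are hypothetical and exist only for illustration. Every call on the non-thread-safe object is submitted to a pool with exactly one worker, so many caller threads can request data while only one thread ever touches the object.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerializedResource:
    """Hypothetical wrapper: funnels all access to an object through one thread."""

    def __init__(self, factory):
        self._pool = ThreadPoolExecutor(max_workers=1)
        # Create the resource on the worker thread so it is only ever used there
        self._resource = self._pool.submit(factory).result()

    def call(self, fn, *args):
        # Returns a Future; the call itself runs on the single worker thread
        return self._pool.submit(fn, self._resource, *args)

    def close(self):
        self._pool.shutdown(wait=True)


class FakeTable:
    """Hypothetical stand-in for a table object that is not thread-safe."""

    def __init__(self):
        self.rows = list(range(100))
        self.thread_ids = set()

    def read(self, start, stop):
        self.thread_ids.add(threading.get_ident())
        return self.rows[start:stop]


table = SerializedResource(FakeTable)

# Several caller threads request chunks concurrently
with ThreadPoolExecutor(max_workers=4) as callers:
    chunks = list(
        callers.map(
            lambda start: table.call(FakeTable.read, start, start + 10).result(),
            range(0, 100, 10),
        )
    )

worker_threads = table.call(lambda t: set(t.thread_ids)).result()
table.close()
```

All reads are observed on a single thread, regardless of how many callers were active.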

Limitations

Arrow supports both 1D arrays and nested structures:

  1. Fixed-shape multi-dimensional data (e.g. visibility data) is currently represented as nested FixedSizeListArrays.

  2. Variably shaped multi-dimensional data (e.g. subtable data) is currently represented as nested ListArrays.

  3. Complex values are represented as an extra FixedSizeListArray nesting of two floats.

  4. Currently, it is not straightforward to convert between the above and numpy via to_numpy calls on Arrow Arrays, but it is relatively simple to reinterpret the underlying data buffers from either API. This is done transparently in the getcol and putcol functions (see usage below).

Going forward, FixedShapeTensorArray and VariableShapeTensorArray will provide more ergonomic structures for representing multi-dimensional data. First-class support for complex values in Apache Arrow will require implementing a C++ extension type within Arrow itself.

Some other edge cases have not yet been implemented, but could be with some thought.

  • Columns with unconstrained rank (ndim == -1) whose rows, in practice, have differing dimensions. Unconstrained rank columns whose rows actually have the same rank are catered for.

  • TpRecord columns are not yet handled. It is probably simplest to convert these rows to JSON and store them as strings.

  • TpQuantity columns are not yet handled. It may be possible to represent them as a run-time parametric Arrow DataType.

Installation

Binary wheels are provided for Linux and macOS, for both x86_64 and arm64 architectures.

$ pip install arcae

Usage

Example usage with Arrow Tables:

import json
from pprint import pprint

import arcae
import pyarrow as pa
import pyarrow.parquet as pq

# Obtain (partial) Apache Arrow Table from a CASA Table
casa_table = arcae.table("/path/to/measurementset.ms")
arrow_table = casa_table.to_arrow()        # read entire table
arrow_table = casa_table.to_arrow(index=(slice(10, 20),))  # read rows 10 to 19
assert isinstance(arrow_table, pa.Table)

# Print JSON-encoded Table and Column keywords
pprint(json.loads(arrow_table.schema.metadata[b"__arcae_metadata__"]))
pprint(json.loads(arrow_table.schema.field("DATA").metadata[b"__arcae_metadata__"]))

pq.write_table(arrow_table, "measurementset.parquet")

Some reading and writing functionality from python-casacore is replicated, with added support for some NumPy Advanced Indexing.

casa_table = arcae.table("/path/to/measurementset.ms", readonly=False)
# Get rows 10 and 2, channels 16 to 31 (the slice end is exclusive), and all correlations
data = casa_table.getcol("DATA", index=([10, 2], slice(16, 32), None))
# Write some modified data back
casa_table.putcol("DATA", data + 1*1j, index=([10, 2], slice(16, 32), None))
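To make the index tuple concrete, the selection can be mimicked on an in-memory NumPy array. This sketch assumes the tuple is interpreted per dimension as (row, channel, correlation) and that `None` selects a whole dimension, as the comment in the example suggests; in plain NumPy the equivalent of that `None` is `:`, since NumPy itself treats `None` as a new axis. The array shape is invented for illustration.

```python
import numpy as np

# Hypothetical DATA column: 20 rows, 64 channels, 4 correlations
data_column = np.zeros((20, 64, 4), dtype=np.complex64)
data_column[10] = 1 + 0j
data_column[2] = 2 + 0j

# Rows 10 and 2 (in that order), channels 16..31, all correlations
selection = data_column[[10, 2], 16:32, :]

# Write modified values back to the same positions
data_column[[10, 2], 16:32, :] = selection + 1j
```

The row order in the result follows the order of the index list, so the first entry of `selection` corresponds to row 10 and the second to row 2.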

See the test cases for further use cases.

Exporting Measurement Sets to Arrow Parquet Datasets

Install the applications optional extra.

$ pip install "arcae[applications]"

Then, an export script is available:

$ arcae export /path/to/the.ms --nrow 50000
$ tree output.arrow/
output.arrow/
├── ANTENNA
│   └── data0.parquet
├── DATA_DESCRIPTION
│   └── data0.parquet
├── FEED
│   └── data0.parquet
├── FIELD
│   └── data0.parquet
├── MAIN
│   └── FIELD_ID=0
│       └── PROCESSOR_ID=0
│           ├── DATA_DESC_ID=0
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=1
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=2
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           └── DATA_DESC_ID=3
│               ├── data0.parquet
│               ├── data1.parquet
│               ├── data2.parquet
│               └── data3.parquet
├── OBSERVATION
│   └── data0.parquet

This data can be loaded into an Arrow Dataset:

>>> import pyarrow as pa
>>> import pyarrow.dataset as pad
>>> main_ds = pad.dataset("output.arrow/MAIN")
>>> spw_ds = pad.dataset("output.arrow/SPECTRAL_WINDOW")

Etymology

Noun: arca f (genitive arcae); first declension. A chest, box, coffer, safe (safe place for storing items, or anything of a similar shape).

Pronounced: ar-ki.
