Arrow bindings for casacore

Project description

arcae implements a limited subset of the functionality of the more mature python-casacore package. It bypasses some existing limitations in python-casacore to provide safe, multi-threaded access to CASA formats, thereby enabling export into newer cloud-native formats such as Apache Arrow and Zarr.

Rationale

casacore and the python-casacore Python bindings provide access to the CASA Table Data System (CTDS) and Measurement Sets created within this system. The CTDS, as of casacore 3.5.0, is subject to the following limitations:

Resolving these concerns is potentially a major effort, involving invasive changes across the CTDS system.

In the time since the CTDS was developed, newer, open-source formats such as Apache Arrow and Zarr have been developed that are suitable for representing Radio Astronomy data.

  • The Apache Arrow project defines a language-independent, in-memory columnar storage format.

  • Translating CTDS data to Arrow is relatively simple, with some limitations mentioned below.

  • It’s easy to exchange Arrow Tables between many different languages.

  • Once in Apache Arrow format, it is easy to store data in modern, cloud-native disk formats such as parquet and Zarr.

  • Converting CASA Tables to Arrow in the C++ layer avoids the GIL.

  • Access to non-thread-safe CASA Tables is constrained to a ThreadPool containing a single thread.

  • It also allows us to write astrometry routines in C++, potentially side-stepping thread-safety and GIL issues with the CASA Measures server.
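The single-thread constraint in the list above can be illustrated in pure Python. This is only a sketch of the general pattern, not arcae's C++ implementation: UnsafeTable and SerializedTable are hypothetical names invented for the illustration, and the "table" is a trivial stand-in object.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class UnsafeTable:
    """Hypothetical stand-in for a resource that is not thread-safe."""

    def __init__(self):
        self.thread_ids = set()

    def read(self, row):
        # Record which thread actually touched the resource
        self.thread_ids.add(threading.get_ident())
        return row * 2


class SerializedTable:
    """Funnel every call through a pool that owns exactly one thread."""

    def __init__(self, table):
        self._table = table
        self._pool = ThreadPoolExecutor(max_workers=1)

    def read(self, row):
        # Returns a Future; the read itself runs on the pool's only thread
        return self._pool.submit(self._table.read, row)

    def close(self):
        self._pool.shutdown(wait=True)


unsafe = UnsafeTable()
table = SerializedTable(unsafe)

# Several caller threads submit work concurrently ...
with ThreadPoolExecutor(max_workers=4) as callers:
    futures = list(callers.map(table.read, range(8)))

# ... but the unsafe object is only ever used from one thread
results = [f.result() for f in futures]
table.close()
```

Callers get concurrency at the submission level while the wrapped object never sees more than one thread, which is the property the bullet describes.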

Limitations

Arrow supports both 1D arrays and nested structures:

  1. Fixed-shape multi-dimensional data (e.g. visibility data) is currently represented as nested FixedSizeListArrays.

  2. Variably shaped multi-dimensional data (e.g. subtable data) is currently represented as nested ListArrays.

  3. Complex values are represented as an extra FixedSizeListArray nesting of two floats.

  4. Currently, it is not trivially trivial (repetition intended here) to convert between the above and numpy via to_numpy calls on Arrow Arrays, but it is relatively trivial to reinterpret the underlying data buffers from either API. This is done transparently in getcol and putcol functions (see usage below).

Going forward, FixedShapeTensorArray and VariableShapeTensorArray will provide more ergonomic structures for representing multi-dimensional data. First-class support for complex values in Apache Arrow will require implementing a C++ extension type within Arrow itself.

Some other edge cases have not yet been implemented, but could be with some thought.

  • Columns with unconstrained rank (ndim == -1) whose rows, in practice, have differing dimensions. Unconstrained rank columns whose rows actually have the same rank are catered for.

  • Not yet able to handle TpRecord columns. Probably simplest to convert these rows to json and store as a string.

  • Not yet able to handle TpQuantity columns. Possible to represent as a run-time parametric Arrow DataType.
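The JSON idea suggested for TpRecord columns might look roughly like the following. This is a sketch of the suggestion only, since arcae does not implement it: the records are hypothetical plain dicts, and how casacore records would be turned into dicts is left open.

```python
import json

# Hypothetical record-valued rows, shown here as plain dicts
records = [
    {"name": "ant0", "offset": [0.0, 1.5, -2.0]},
    {"name": "ant1", "nested": {"flag": True}},
]

# One JSON string per row; sort_keys keeps the encoding deterministic
encoded = [json.dumps(record, sort_keys=True) for record in records]

# Readers recover the structure by decoding each string
decoded = [json.loads(text) for text in encoded]
```

A list of strings like `encoded` could then be stored as an ordinary Arrow string column.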

Installation

Binary wheels are provided for Linux and macOS for both x86_64 and arm64 architectures.

$ pip install arcae

Usage

Example usage with Arrow Tables:

import json
from pprint import pprint

import arcae
import pyarrow as pa
import pyarrow.parquet as pq

# Obtain (partial) Apache Arrow Table from a CASA Table
casa_table = arcae.table("/path/to/measurementset.ms")
arrow_table = casa_table.to_arrow()        # read entire table
arrow_table = casa_table.to_arrow(index=(slice(10, 20),))  # read rows 10 to 19 only
assert isinstance(arrow_table, pa.Table)

# Print JSON-encoded Table and Column keywords
pprint(json.loads(arrow_table.schema.metadata[b"__arcae_metadata__"]))
pprint(json.loads(arrow_table.schema.field("DATA").metadata[b"__arcae_metadata__"]))

pq.write_table(arrow_table, "measurementset.parquet")

Some reading and writing functionality from python-casacore is replicated, with added support for some NumPy Advanced Indexing.

casa_table = arcae.table("/path/to/measurementset.ms", readonly=False)
# Get rows 10 and 2, and channels 16 to 32, and all correlations
data = casa_table.getcol("DATA", index=([10, 2], slice(16, 32), None))
# Write some modified data back
casa_table.putcol("DATA", data + 1*1j, index=([10, 2], slice(16, 32), None))

See the test cases for further use cases.
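As a rough mental model of the index tuple used above, the following NumPy-only sketch treats None as "select everything along this axis". The select helper is hypothetical and the array is synthetic; arcae's actual behaviour is defined by its own implementation and test cases.

```python
import numpy as np


def select(array, index):
    """Apply an arcae-style index tuple to an in-memory array (illustrative only)."""
    # Assumption: None selects the whole axis, as in the getcol example
    key = tuple(slice(None) if i is None else i for i in index)
    return array[key]


# Synthetic (row, chan, corr) data
data = np.arange(20 * 64 * 4).reshape(20, 64, 4)

# Rows 10 and 2 (in that order), channels 16 to 31, all correlations
subset = select(data, ([10, 2], slice(16, 32), None))
```

Note that the rows come back in the order requested, so the first row of `subset` corresponds to row 10 and the second to row 2.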

Multi-threaded read support

When opening a CASA table, arcae can open multiple instances of the table in separate threads and multiplex read operations over them. This mode of operation is intended to saturate the number of I/O requests in-flight by submitting read operations from multiple threads:

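import concurrent.futures as cf

import arcae

table = arcae.table("/path/to/measurementset.ms", ninstances=8, readonly=True)
total_rows = table.nrow()
chunk = 10_000

with cf.ThreadPoolExecutor(max_workers=8) as pool:
  futures = []
  for start_row in range(0, total_rows, chunk):
    # Clamp the final chunk without overwriting the table's total row count
    nrow = min(chunk, total_rows - start_row)
    futures.append(pool.submit(table.getcol, "DATA", startrow=start_row, nrow=nrow))

  # as_completed returns an iterator of finished futures, not a context manager
  for future in cf.as_completed(futures):
    data = future.result()
    ...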
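The chunking pattern in the read example can be tried without a Measurement Set. In this sketch fake_getcol is a hypothetical stand-in for a table read and row_chunks is an illustrative helper; neither is part of arcae.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def row_chunks(nrow, chunk):
    """Yield (startrow, nrow) pairs covering the range [0, nrow)."""
    for start in range(0, nrow, chunk):
        yield start, min(chunk, nrow - start)


def fake_getcol(column, startrow, nrow):
    """Hypothetical stand-in for a table read: returns the requested row numbers."""
    return list(range(startrow, startrow + nrow))


total_rows = 25_000
chunks = list(row_chunks(total_rows, 10_000))

with ThreadPoolExecutor(max_workers=4) as pool:
    # Map each future back to its start row
    futures = {
        pool.submit(fake_getcol, "DATA", startrow=start, nrow=count): start
        for start, count in chunks
    }
    # as_completed yields futures in completion order, so key results by start row
    results = {futures[f]: f.result() for f in as_completed(futures)}

# Reassemble the chunks in row order
rows_read = [row for start in sorted(results) for row in results[start]]
```

Because results arrive in completion order rather than submission order, keeping the start row alongside each future makes it straightforward to reassemble the column afterwards.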

Multi-threaded write support

arcae only supports writing when a single instance of the table is opened (ninstances=1). Writing to a table when multiple instances are opened is an unsafe operation and arcae raises an error if this is attempted. Future versions of arcae may support this. In the meantime, support for this ability can be inspected via the arcae.safe_multithreaded_writes function:

assert not arcae.safe_multithreaded_writes()

table = arcae.table("/path/to/measurementset.ms", ninstances=8, readonly=False)
table.putcol("DATA", ...)  # Fails, ninstances > 1

Exporting Measurement Sets to Arrow Parquet Datasets

Install the applications optional extra.

$ pip install arcae[applications]

Then, an export script is available:

$ arcae export /path/to/the.ms --nrow 50000
$ tree output.arrow/
output.arrow/
├── ANTENNA
│   └── data0.parquet
├── DATA_DESCRIPTION
│   └── data0.parquet
├── FEED
│   └── data0.parquet
├── FIELD
│   └── data0.parquet
├── MAIN
│   └── FIELD_ID=0
│       └── PROCESSOR_ID=0
│           ├── DATA_DESC_ID=0
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=1
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=2
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           └── DATA_DESC_ID=3
│               ├── data0.parquet
│               ├── data1.parquet
│               ├── data2.parquet
│               └── data3.parquet
├── OBSERVATION
│   └── data0.parquet

This data can be loaded into an Arrow Dataset:

>>> import pyarrow as pa
>>> import pyarrow.dataset as pad
>>> main_ds = pad.dataset("output.arrow/MAIN")
>>> spw_ds = pad.dataset("output.arrow/SPECTRAL_WINDOW")

Etymology

Noun: arca f (genitive arcae); first declension. A chest, box, coffer, safe (safe place for storing items, or anything of a similar shape).

Pronounced: ar-ki.
