Arrow bindings for casacore

arcae implements a limited subset of the functionality of the more mature python-casacore package. It bypasses some existing limitations in python-casacore to provide safe, multi-threaded access to CASA formats, thereby enabling export into newer cloud-native formats such as Apache Arrow and Zarr.

Rationale

casacore and the python-casacore Python bindings provide access to the CASA Table Data System (CTDS) and Measurement Sets created within this system. The CTDS, as of casacore 3.5.0, is subject to the following limitations:

Resolving these concerns is potentially a major effort, involving invasive changes across the CTDS system.

Since the CTDS was developed, newer open-source formats such as Apache Arrow and Zarr have emerged that are suitable for representing Radio Astronomy data.

  • The Apache Arrow project defines a language-independent, in-memory columnar data format.

  • Translating CTDS data to Arrow is relatively simple, with some limitations mentioned below.

  • It’s easy to exchange Arrow Tables between many different programming languages.

  • Once in Apache Arrow format, it is easy to store data in modern, cloud-native disk formats such as parquet and Zarr.

  • Converting CASA Tables to Arrow in the C++ layer avoids the GIL

  • Access to CASA Tables, which are not thread-safe, is constrained to a ThreadPool containing a single thread.

  • It also allows us to write astrometry routines in C++, potentially side-stepping thread-safety and GIL issues with the CASA Measures server.
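The single-thread constraint mentioned above can be illustrated in pure Python. This is only a sketch of the general pattern, not arcae's implementation (which lives in the C++ layer); `SerializedResource` and `FakeTable` are hypothetical names invented for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerializedResource:
    """Funnel every call on a non-thread-safe object through one worker thread."""

    def __init__(self, factory):
        # A pool with exactly one worker runs submitted tasks one at a time,
        # always on the same thread.
        self._pool = ThreadPoolExecutor(max_workers=1)
        # Create the resource on the worker thread too.
        self._resource = self._pool.submit(factory).result()

    def call(self, func, *args):
        """Schedule func(resource, *args) on the worker and return a Future."""
        return self._pool.submit(func, self._resource, *args)

    def close(self):
        self._pool.shutdown(wait=True)


class FakeTable:
    """Hypothetical stand-in for a non-thread-safe table handle."""

    def __init__(self):
        self.rows = list(range(100))

    def read(self, start, stop):
        # Report which thread performed the read, along with the data
        return threading.get_ident(), self.rows[start:stop]


table = SerializedResource(FakeTable)
futures = [table.call(FakeTable.read, i, i + 10) for i in range(0, 100, 10)]
results = [f.result() for f in futures]
table.close()

# Every read ran on the same worker thread, never on the caller's thread
worker_ids = {thread_id for thread_id, _ in results}
```

Callers on any thread can submit work and wait on the returned futures, while the table itself is only ever touched by the pool's single worker.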

Limitations

Arrow supports both 1D arrays and nested structures:

  1. Fixed-shape multi-dimensional data (e.g. visibility data) is currently represented as nested FixedSizeListArrays.

  2. Variably shaped multi-dimensional data (e.g. subtable data) is currently represented as nested ListArrays.

  3. Complex values are represented as an extra FixedSizeListArray nesting of two floats.

  4. Currently, it is not trivially trivial (repetition intended here) to convert between the above and numpy via to_numpy calls on Arrow Arrays, but it is relatively trivial to reinterpret the underlying data buffers from either API. This is done transparently in getcol and putcol functions (see usage below).

Going forward, FixedShapeTensorArray and VariableShapeTensorArray will provide more ergonomic structures for representing multi-dimensional data. First-class support for complex values in Apache Arrow will require implementing a C++ extension type within Arrow itself.

Some other edge cases have not yet been implemented, but could be with some thought.

  • Columns with unconstrained rank (ndim == -1) whose rows, in practice, have differing dimensions. Unconstrained rank columns whose rows actually have the same rank are catered for.

  • Not yet able to handle TpRecord columns. It is probably simplest to convert these rows to JSON and store them as strings.

  • Not yet able to handle TpQuantity columns. Possible to represent as a run-time parametric Arrow DataType.
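The JSON idea suggested above for TpRecord columns could look roughly like the following. This is a hypothetical illustration using only the standard library: the record contents and the helper names `records_to_json` and `json_to_records` are invented, and nothing here reflects an existing arcae feature.

```python
import json


def records_to_json(records):
    """Serialise each record (one dict per row) to a compact JSON string."""
    return [json.dumps(rec, sort_keys=True, separators=(",", ":")) for rec in records]


def json_to_records(strings):
    """Decode the JSON strings back into dicts."""
    return [json.loads(s) for s in strings]


# Hypothetical record rows, standing in for values of a TpRecord column
rows = [
    {"name": "scan-1", "flags": {"valid": True}, "weights": [1.0, 0.5]},
    {"name": "scan-2", "flags": {"valid": False}, "weights": []},
]
encoded = records_to_json(rows)
decoded = json_to_records(encoded)
```

The encoded list holds one string per row, so it could be stored in an ordinary Arrow string column.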

Installation

Binary wheels are provided for Linux and macOS on both x86_64 and arm64 architectures.

$ pip install arcae

Usage

Example usage with Arrow Tables:

import json
from pprint import pprint

import arcae
import pyarrow as pa
import pyarrow.parquet as pq

# Obtain (partial) Apache Arrow Table from a CASA Table
casa_table = arcae.table("/path/to/measurementset.ms")
arrow_table = casa_table.to_arrow()        # read entire table
arrow_table = casa_table.to_arrow(index=(slice(10, 20),))  # read rows 10 to 20
assert isinstance(arrow_table, pa.Table)

# Print JSON-encoded Table and Column keywords
pprint(json.loads(arrow_table.schema.metadata[b"__arcae_metadata__"]))
pprint(json.loads(arrow_table.schema.field("DATA").metadata[b"__arcae_metadata__"]))

pq.write_table(arrow_table, "measurementset.parquet")

Some reading and writing functionality from python-casacore is replicated, with added support for some NumPy Advanced Indexing.

casa_table = arcae.table("/path/to/measurementset.ms", readonly=False)
# Get rows 10 and 2, and channels 16 to 32, and all correlations
data = casa_table.getcol("DATA", index=([10, 2], slice(16, 32), None))
# Write some modified data back
casa_table.putcol("DATA", data + 1*1j, index=([10, 2], slice(16, 32), None))

See the test cases for further use cases.
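For readers less familiar with NumPy Advanced Indexing, the selection used in the getcol and putcol calls above can be mimicked on a plain NumPy array. The `column` array is a hypothetical stand-in for a DATA column of shape (row, channel, correlation). Note one assumption: in the arcae calls, None appears to select everything along an axis (per the "all correlations" comment), whereas in NumPy itself None inserts a new axis, so `:` is used here instead.

```python
import numpy as np

# Hypothetical stand-in for a DATA column: (row, channel, correlation)
column = np.arange(20 * 64 * 4, dtype=np.float32).astype(np.complex64)
column = column.reshape(20, 64, 4)
original = column.copy()

rows, chans = [10, 2], slice(16, 32)

# A list of rows combined with a channel slice; rows come back in the
# order requested (10 first, then 2). Advanced indexing returns a copy.
selection = column[rows, chans, :]

# Write modified values back to exactly the same positions
column[rows, chans, :] = selection + 1j
```

Here `selection` has shape (2, 16, 4), and only the selected rows and channels of `column` change.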

Exporting Measurement Sets to Arrow Parquet Datasets

Install the applications optional extra.

$ pip install arcae[applications]

Then, an export script is available:

$ arcae export /path/to/the.ms --nrow 50000
$ tree output.arrow/
output.arrow/
├── ANTENNA
│   └── data0.parquet
├── DATA_DESCRIPTION
│   └── data0.parquet
├── FEED
│   └── data0.parquet
├── FIELD
│   └── data0.parquet
├── MAIN
│   └── FIELD_ID=0
│       └── PROCESSOR_ID=0
│           ├── DATA_DESC_ID=0
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=1
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=2
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           └── DATA_DESC_ID=3
│               ├── data0.parquet
│               ├── data1.parquet
│               ├── data2.parquet
│               └── data3.parquet
├── OBSERVATION
│   └── data0.parquet

This data can be loaded into an Arrow Dataset:

>>> import pyarrow as pa
>>> import pyarrow.dataset as pad
>>> main_ds = pad.dataset("output.arrow/MAIN")
>>> spw_ds = pad.dataset("output.arrow/SPECTRAL_WINDOW")

Etymology

Noun: arca f (genitive arcae); first declension. A chest, box, coffer, safe (safe place for storing items, or anything of a similar shape).

Pronounced: ar-ki.
