Arrow bindings for casacore

Project description

arcae implements a limited subset of the functionality of the more mature python-casacore package. It bypasses some existing limitations in python-casacore to provide safe, multi-threaded access to CASA formats, thereby enabling export into newer cloud-native formats such as Apache Arrow and Zarr.

Rationale

casacore and the python-casacore Python bindings provide access to the CASA Table Data System (CTDS) and Measurement Sets created within this system. The CTDS, as of casacore 3.5.0, is subject to the following limitations:

Resolving these concerns is potentially a major effort, involving invasive changes across the CTDS system.

In the time since the CTDS was developed, newer, open-source formats such as Apache Arrow and Zarr have been developed that are suitable for representing Radio Astronomy data.

  • The Apache Arrow project defines a language-independent, in-memory columnar storage format.

  • Translating CTDS data to Arrow is relatively simple, with some limitations mentioned below.

  • It is easy to pass Arrow Tables between many different programming languages.

  • Once in Apache Arrow format, it is easy to store data in modern, cloud-native disk formats such as parquet and Zarr.

  • Converting CASA Tables to Arrow in the C++ layer avoids the GIL.

  • Access to non-thread-safe CASA Tables is constrained to a ThreadPool containing a single thread.

  • It also allows us to write astrometry routines in C++, potentially side-stepping thread-safety and GIL issues with the CASA Measures server.
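The single-thread constraint mentioned above can be illustrated in pure Python. This is only a sketch of the general pattern (funnelling every call on a non-thread-safe object through one dedicated worker thread); it is not arcae's C++ implementation, and `SerialisedResource` and `FakeTable` are hypothetical names introduced for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerialisedResource:
    """Hypothetical wrapper: every call on `resource` runs on one dedicated thread."""

    def __init__(self, resource):
        self._resource = resource
        # A pool with a single worker serialises all access to the resource
        self._pool = ThreadPoolExecutor(max_workers=1)

    def submit(self, method_name, *args, **kwargs):
        """Schedule `resource.method_name(*args, **kwargs)` and return a Future."""
        method = getattr(self._resource, method_name)
        return self._pool.submit(method, *args, **kwargs)

    def close(self):
        self._pool.shutdown(wait=True)


class FakeTable:
    """Stand-in for a non-thread-safe table; records which threads touch it."""

    def __init__(self):
        self.thread_ids = set()

    def read_row(self, row):
        self.thread_ids.add(threading.get_ident())
        return row * 2


fake = FakeTable()
table = SerialisedResource(fake)

# Requests arrive from several caller threads...
with ThreadPoolExecutor(max_workers=4) as callers:
    futures = list(callers.map(lambda r: table.submit("read_row", r), range(8)))

# ...but the table itself is only ever touched by the single worker thread
results = [f.result() for f in futures]
table.close()
```

Callers receive futures and can wait on them concurrently, while the wrapped object never sees more than one thread.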

Limitations

Arrow supports both 1D arrays and nested structures:

  1. Fixed-shape multi-dimensional data (e.g. visibility data) is currently represented as nested FixedSizeListArrays.

  2. Variably shaped multi-dimensional data (e.g. subtable data) is currently represented as nested ListArrays.

  3. Complex values are represented as an extra FixedSizeListArray nesting of two floats.

  4. Currently, it is not trivially trivial (repetition intended here) to convert between the above and numpy via to_numpy calls on Arrow Arrays, but it is relatively trivial to reinterpret the underlying data buffers from either API. This is done transparently in getcol and putcol functions (see usage below).

Going forward, FixedShapeTensorArray and VariableShapeTensorArray will provide more ergonomic structures for representing multi-dimensional data. First class support for complex values in Apache Arrow will require implementing a C++ extension type within Arrow itself:

Some other edge cases have not yet been implemented, but could be with some thought.

  • Columns with unconstrained rank (ndim == -1) whose rows, in practice, have differing dimensions. Unconstrained rank columns whose rows actually have the same rank are catered for.

  • Not yet able to handle TpRecord columns. Probably simplest to convert these rows to json and store as a string.

  • Not yet able to handle TpQuantity columns. Possible to represent as a run-time parametric Arrow DataType.
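The JSON idea suggested for TpRecord columns could look something like the following. This is a sketch of the proposal only, since arcae does not yet handle such columns; the record contents are invented, and plain Python dicts stand in for casacore records.

```python
import json

# Hypothetical per-row record values standing in for a TpRecord column
record_rows = [
    {"type": "direction", "refer": "J2000", "m0": {"value": 1.5, "unit": "rad"}},
    {"type": "direction", "refer": "J2000", "m0": {"value": -0.25, "unit": "rad"}},
]

# One JSON string per row; such strings fit in an ordinary string column
encoded = [json.dumps(record, sort_keys=True) for record in record_rows]

# Consumers decode each row back into a nested structure when needed
decoded = [json.loads(text) for text in encoded]
```

The cost of this approach is that the nested fields are opaque to Arrow until decoded.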

Installation

Binary wheels are provided for Linux and macOS for both x86_64 and arm64 architectures.

$ pip install arcae

Usage

Example usage with Arrow Tables:

import json
from pprint import pprint

import arcae
import pyarrow as pa
import pyarrow.parquet as pq

# Obtain (partial) Apache Arrow Table from a CASA Table
casa_table = arcae.table("/path/to/measurementset.ms")
arrow_table = casa_table.to_arrow()        # read entire table
arrow_table = casa_table.to_arrow(index=(slice(10, 20),))  # read part of the table
assert isinstance(arrow_table, pa.Table)

# Print JSON-encoded Table and Column keywords
pprint(json.loads(arrow_table.schema.metadata[b"__arcae_metadata__"]))
pprint(json.loads(arrow_table.schema.field("DATA").metadata[b"__arcae_metadata__"]))

pq.write_table(arrow_table, "measurementset.parquet")

Some reading and writing functionality from python-casacore is replicated, with added support for some NumPy Advanced Indexing.

casa_table = arcae.table("/path/to/measurementset.ms", readonly=False)
# Get rows 10 and 2 (in that order), channels 16 to 31 (the slice end is exclusive), and all correlations
data = casa_table.getcol("DATA", index=([10, 2], slice(16, 32), None))
# Write some modified data back
casa_table.putcol("DATA", data + 1*1j, index=([10, 2], slice(16, 32), None))
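For readers more familiar with NumPy, the selection above corresponds roughly to the following indexing on an in-memory array. This is an analogy under the assumption that `None` in an arcae index selects the whole dimension, as the comment in the example suggests; in NumPy itself `None` means `newaxis`, so `:` is used instead. The array shape is made up.

```python
import numpy as np

# Hypothetical DATA-like array with shape (row, chan, corr);
# every element holds its own row number for easy checking
nrow, nchan, ncorr = 12, 64, 4
data = np.zeros((nrow, nchan, ncorr), dtype=np.complex64)
data[...] = np.arange(nrow)[:, None, None]

rows = [10, 2]

# Rows 10 and 2 in that order, channels 16..31, all correlations
selection = data[rows, 16:32, :]

# Write modified values back into the same region
data[rows, 16:32, :] = selection + 1j
```

Because the row index is a list, NumPy returns a copy rather than a view, so the result has to be written back explicitly, mirroring the `getcol` and `putcol` pair.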

See the test cases for further use cases.

Exporting Measurement Sets to Arrow Parquet Datasets

Install the applications optional extra.

$ pip install "arcae[applications]"

Then, an export script is available:

$ arcae export /path/to/the.ms --nrow 50000
$ tree output.arrow/
output.arrow/
├── ANTENNA
│   └── data0.parquet
├── DATA_DESCRIPTION
│   └── data0.parquet
├── FEED
│   └── data0.parquet
├── FIELD
│   └── data0.parquet
├── MAIN
│   └── FIELD_ID=0
│       └── PROCESSOR_ID=0
│           ├── DATA_DESC_ID=0
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=1
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=2
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           └── DATA_DESC_ID=3
│               ├── data0.parquet
│               ├── data1.parquet
│               ├── data2.parquet
│               └── data3.parquet
├── OBSERVATION
│   └── data0.parquet

This data can be loaded into an Arrow Dataset:

>>> import pyarrow as pa
>>> import pyarrow.dataset as pad
>>> main_ds = pad.dataset("output.arrow/MAIN")
>>> spw_ds = pad.dataset("output.arrow/SPECTRAL_WINDOW")

Etymology

Noun: arca f (genitive arcae); first declension. A chest, box, coffer, safe (safe place for storing items, or anything of a similar shape).

Pronounced: ar-ki.

Download files

Source Distribution

arcae-0.4.0a2.tar.gz (128.4 kB view details)

Uploaded Source

Built Distributions

arcae-0.4.0a2-cp313-cp313-manylinux_2_28_x86_64.whl (34.0 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.28+ x86-64

arcae-0.4.0a2-cp313-cp313-manylinux_2_28_aarch64.whl (31.3 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.28+ ARM64

arcae-0.4.0a2-cp313-cp313-macosx_14_0_arm64.whl (12.9 MB view details)

Uploaded CPython 3.13macOS 14.0+ ARM64

arcae-0.4.0a2-cp313-cp313-macosx_13_0_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.13macOS 13.0+ x86-64

arcae-0.4.0a2-cp312-cp312-manylinux_2_28_x86_64.whl (34.0 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ x86-64

arcae-0.4.0a2-cp312-cp312-manylinux_2_28_aarch64.whl (31.3 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ ARM64

arcae-0.4.0a2-cp312-cp312-macosx_14_0_arm64.whl (12.9 MB view details)

Uploaded CPython 3.12macOS 14.0+ ARM64

arcae-0.4.0a2-cp312-cp312-macosx_13_0_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.12macOS 13.0+ x86-64

arcae-0.4.0a2-cp311-cp311-manylinux_2_28_x86_64.whl (33.9 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ x86-64

arcae-0.4.0a2-cp311-cp311-manylinux_2_28_aarch64.whl (31.2 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ ARM64

arcae-0.4.0a2-cp311-cp311-macosx_14_0_arm64.whl (12.9 MB view details)

Uploaded CPython 3.11macOS 14.0+ ARM64

arcae-0.4.0a2-cp311-cp311-macosx_13_0_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.11macOS 13.0+ x86-64

arcae-0.4.0a2-cp310-cp310-manylinux_2_28_x86_64.whl (33.8 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ x86-64

arcae-0.4.0a2-cp310-cp310-manylinux_2_28_aarch64.whl (31.2 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ ARM64

arcae-0.4.0a2-cp310-cp310-macosx_14_0_arm64.whl (12.9 MB view details)

Uploaded CPython 3.10macOS 14.0+ ARM64

arcae-0.4.0a2-cp310-cp310-macosx_13_0_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.10macOS 13.0+ x86-64

File details

Details for the file arcae-0.4.0a2.tar.gz.

File metadata

  • Download URL: arcae-0.4.0a2.tar.gz
  • Upload date:
  • Size: 128.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for arcae-0.4.0a2.tar.gz
Algorithm Hash digest
SHA256 8d99128d531d236fb141c7ff189b3fe38a0836bd3e96ee087b553a2086d83576
MD5 8fcdc289bbbe9a0b0ed033f69d5ed965
BLAKE2b-256 433276c1ea8581f1b0f22b7afdfcfc17ef2286d94602c12dc155afe26beff86f

Provenance

The following attestation bundles were made for arcae-0.4.0a2.tar.gz:

Publisher: ci.yml on ratt-ru/arcae

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
