Arrow bindings for casacore

arcae implements a limited subset of functionality from the more mature python-casacore package. It bypasses some existing limitations in python-casacore to provide safe, multi-threaded access to CASA formats, thereby enabling export into newer cloud-native formats such as Apache Arrow and Zarr.

Rationale

casacore and the python-casacore Python bindings provide access to the CASA Table Data System (CTDS) and Measurement Sets created within this system. The CTDS, as of casacore 3.5.0, is subject to the following limitations:

Resolving these concerns is potentially a major effort, involving invasive changes across the CTDS system.

Since the CTDS was developed, newer open-source formats such as Apache Arrow and Zarr have emerged that are suitable for representing radio astronomy data.

  • The Apache Arrow project defines a language-independent, in-memory columnar format.

  • Translating CTDS data to Arrow is relatively simple, with some limitations mentioned below.

  • Arrow Tables are easy to exchange between many different programming languages.

  • Once in Apache Arrow format, it is easy to store data in modern, cloud-native disk formats such as Parquet and Zarr.

  • Converting CASA Tables to Arrow in the C++ layer avoids the GIL.

  • Access to non-thread-safe CASA Tables is constrained to a ThreadPool containing a single thread.

  • It also allows us to write astrometry routines in C++, potentially side-stepping thread-safety and GIL issues with the CASA Measures server.
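The single-thread constraint mentioned above can be illustrated in pure Python. This is only a sketch of the general pattern, not arcae's C++ implementation: `SingleThreadProxy` and `FakeTable` are hypothetical names introduced here, and the standard-library `ThreadPoolExecutor` stands in for the C++ thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class SingleThreadProxy:
    """Funnel every call on a non-thread-safe object through one worker thread."""

    def __init__(self, factory):
        # One worker only: all submitted calls are serialised on the same thread.
        self._pool = ThreadPoolExecutor(max_workers=1)
        # Create the object on the worker so it is only ever used there.
        self._obj = self._pool.submit(factory).result()

    def call(self, method, *args, **kwargs):
        """Schedule obj.method(*args, **kwargs) on the worker; returns a Future."""
        return self._pool.submit(getattr(self._obj, method), *args, **kwargs)

    def close(self):
        self._pool.shutdown(wait=True)


class FakeTable:
    """Hypothetical stand-in for a non-thread-safe table handle."""

    def read(self, row):
        # Report which thread actually performed the access.
        return row, threading.get_ident()


proxy = SingleThreadProxy(FakeTable)
futures = [proxy.call("read", row) for row in range(8)]
results = [f.result() for f in futures]
thread_ids = {tid for _, tid in results}
proxy.close()
```

Callers on any number of threads can submit work, yet `thread_ids` contains a single identifier because every access runs on the one worker.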

Limitations

Arrow supports both 1D arrays and nested structures:

  1. Fixed-shape multi-dimensional data (e.g. visibility data) is currently represented as nested FixedSizeListArrays.

  2. Variably shaped multi-dimensional data (e.g. subtable data) is currently represented as nested ListArrays.

  3. Complex values are represented as an extra FixedSizeListArray nesting of two floats.

  4. Converting between the above and NumPy via to_numpy calls on Arrow Arrays is currently not straightforward, but reinterpreting the underlying data buffers from either API is relatively simple. This is done transparently in the getcol and putcol functions (see usage below).

Going forward, FixedShapeTensorArray and VariableShapeTensorArray will provide more ergonomic structures for representing multi-dimensional data. First-class support for complex values in Apache Arrow will require implementing a C++ extension type within Arrow itself.

Some other edge cases have not yet been implemented, but could be with some thought.

  • Columns with unconstrained rank (ndim == -1) whose rows, in practice, have differing dimensions. Unconstrained rank columns whose rows actually have the same rank are catered for.

  • Not yet able to handle TpRecord columns. It is probably simplest to convert these rows to JSON and store them as strings.

  • Not yet able to handle TpQuantity columns. Possible to represent as a run-time parametric Arrow DataType.
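The JSON idea for record-valued rows could look roughly like the following. This is a sketch of the suggestion only, since TpRecord columns are not yet handled; the record contents and the `record_to_json` helper are hypothetical.

```python
import json

# Hypothetical record-valued rows, expressed as plain Python dicts.
records = [
    {"name": "obs-1", "flags": [1, 0, 1], "nested": {"unit": "Hz", "value": 1.4e9}},
    {"name": "obs-2", "flags": [], "nested": {"unit": "Hz", "value": 8.56e8}},
]


def record_to_json(record):
    """Serialise one record to a compact, deterministic JSON string."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


# One string per row; these could then be stored in a string column.
encoded = [record_to_json(r) for r in records]
decoded = [json.loads(s) for s in encoded]
```

Sorting keys keeps the encoding stable across runs, which helps when comparing exported files.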

Installation

Binary wheels are provided for Linux and macOS for both x86_64 and arm64 architectures.

$ pip install arcae

Usage

Example usage with Arrow Tables:

import json
from pprint import pprint

import arcae
import pyarrow as pa
import pyarrow.parquet as pq

# Obtain (partial) Apache Arrow Table from a CASA Table
casa_table = arcae.table("/path/to/measurementset.ms")
arrow_table = casa_table.to_arrow()        # read entire table
arrow_table = casa_table.to_arrow(index=(slice(10, 20),))  # read a range of rows
assert isinstance(arrow_table, pa.Table)

# Print JSON-encoded Table and Column keywords
pprint(json.loads(arrow_table.schema.metadata[b"__arcae_metadata__"]))
pprint(json.loads(arrow_table.schema.field("DATA").metadata[b"__arcae_metadata__"]))

pq.write_table(arrow_table, "measurementset.parquet")

Some reading and writing functionality from python-casacore is replicated, with added support for some NumPy Advanced Indexing.

casa_table = arcae.table("/path/to/measurementset.ms", readonly=False)
# Get rows 10 and 2, channels 16 to 31 (the slice end is exclusive), and all correlations
data = casa_table.getcol("DATA", index=([10, 2], slice(16, 32), None))
# Write some modified data back
casa_table.putcol("DATA", data + 1*1j, index=([10, 2], slice(16, 32), None))
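To make the index tuple concrete, the sketch below applies the equivalent selection to an in-memory NumPy array. It assumes that arcae's index follows NumPy semantics per dimension, with None selecting a whole axis as the comment above indicates; the array shape and contents are hypothetical.

```python
import numpy as np

# Hypothetical DATA column: 20 rows, 64 channels, 4 correlations.
data = np.zeros((20, 64, 4), dtype=np.complex64)
data[10] = 10
data[2] = 2

rows = [10, 2]
# NumPy spells "the whole axis" as ":" (slice(None)).
selection = data[rows, 16:32, :]

# Write modified values back to the same region.
data[rows, 16:32, :] = selection + 1j
```

The selection has shape (2, 16, 4): two rows in the requested order, sixteen channels, and every correlation. Channels outside 16 to 31 are left untouched by the write.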

See the test cases for further use cases.

Exporting Measurement Sets to Arrow Parquet Datasets

Install the applications optional extra.

$ pip install arcae[applications]

Then, an export script is available:

$ arcae export /path/to/the.ms --nrow 50000
$ tree output.arrow/
output.arrow/
├── ANTENNA
│   └── data0.parquet
├── DATA_DESCRIPTION
│   └── data0.parquet
├── FEED
│   └── data0.parquet
├── FIELD
│   └── data0.parquet
├── MAIN
│   └── FIELD_ID=0
│       └── PROCESSOR_ID=0
│           ├── DATA_DESC_ID=0
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=1
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           ├── DATA_DESC_ID=2
│           │   ├── data0.parquet
│           │   ├── data1.parquet
│           │   ├── data2.parquet
│           │   └── data3.parquet
│           └── DATA_DESC_ID=3
│               ├── data0.parquet
│               ├── data1.parquet
│               ├── data2.parquet
│               └── data3.parquet
├── OBSERVATION
│   └── data0.parquet

This data can be loaded into an Arrow Dataset:

>>> import pyarrow as pa
>>> import pyarrow.dataset as pad
>>> main_ds = pad.dataset("output.arrow/MAIN")
>>> spw_ds = pad.dataset("output.arrow/SPECTRAL_WINDOW")

Etymology

Noun: arca f (genitive arcae); first declension. A chest, box, coffer, safe (a safe place for storing items, or anything of a similar shape).

Pronounced: ar-ki.
