An ALMA Simulation package for a more civilized era.
ALMASim
ALMASim is a library-first Python environment for simulating ALMA observations, exploring ALMA metadata, downloading science products, and building ML-ready radio/mm-wave datasets.
It provides reusable services in src/almasim that can be driven by CLI scripts, Jupyter notebooks, a FastAPI backend, or direct Python code — all through the same staged API.
Table of Contents
- Key Capabilities
- Architecture
- Installation
- Quick Start
- Staged Simulation API
- Skymodels
- Compute Backends
- Metadata and Downloads
- Backend Service
- Examples
- Documentation
- Contributing
- License
Key Capabilities
Simulation
- Build clean sky cubes from point, Gaussian, extended, molecular-cloud, diffuse, Galaxy Zoo, and Hubble-100 source models
- Simulate single-pointing ALMA interferometric observations with multi-configuration support (12m, 7m, TP)
- PWV-aware per-channel noise model
- Additive astrophysical background sky — faint dusty galaxies, diffuse emission, or combined
- Optional serendipitous source injection
- Iterative CLEAN-style deconvolution with resumable state
- TP+INT feather-style image combination
Data Products
- Dirty cube, dirty visibilities, beam cube, UV mask cube, U/V coordinate cubes
- Interferometric, total-power, and combined TP+INT image cubes
- ML-ready HDF5 shards (clean cube + dirty cube + dirty visibilities + UV mask + metadata)
- Native MeasurementSet (.ms) export via CASA tools or python-casacore
Metadata and Archive
- Query ALMA observations via TAP with rich inclusion/exclusion filters
- Normalise TAP columns into stable application fields
- Resolve DataLink products and download ALMA data products in parallel
- Unpack raw ASDMs into MeasurementSets
- Apply delivered calibration to produce calibrated science MSs
Compute
- Synchronous, local multiprocess, Dask, Slurm, and Kubernetes backends
- Backend-agnostic simulation service layer
Architecture
src/almasim/ ← installable library (pip install almasim)
services/
simulation.py ← staged pipeline entry points
interferometry/ ← UV sampling, baselines, noise, TP
imaging/ ← deconvolution, TP+INT combination
metadata/ ← TAP queries, normalisation
products/ ← MS export, HDF5 shards, cube export
compute/ ← backend abstraction
archive/ ← ASDM unpack, calibration apply
astro/ ← spectral lines, redshift, parameters
skymodels/ ← source model implementations
backend/ ← FastAPI service (Docker: ghcr.io/…/almasim-backend)
frontend/ ← Svelte UI (requires Docker Compose)
examples/ ← CLI scripts and Jupyter notebooks
The library layer owns all domain logic. The backend is a thin adapter over library services. CLI scripts and notebooks call the same staged services directly.
Installation
Library only (cross-platform)
pip install almasim
With CASA tools (Linux x86-64 only)
casatools and casatasks wheels are Linux-only. Install the optional [casa] extra on a supported Linux system:
pip install "almasim[casa]"
The [casa] extra enables:
- Native MeasurementSet export via casatools
- ASDM-to-MS conversion via casatasks.importasdm
- Calibration application via casatasks.applycal
Without [casa], all simulation, imaging, metadata, and download features still work. The MS export path falls back to python-casacore if available:
pip install "almasim[ms-casacore]"
From source (development)
git clone https://github.com/MicheleDelliVeneri/ALMASim.git
cd ALMASim
pip install uv
uv sync --group dev
Backend service (Docker Compose)
The FastAPI backend and Svelte frontend require Docker Compose:
git clone https://github.com/MicheleDelliVeneri/ALMASim.git
cd ALMASim
docker compose up
The backend image is available pre-built from GHCR:
docker pull ghcr.io/micheledelliveneri/almasim-backend:latest
Quick Start
Query ALMA metadata
from almasim.services.metadata.tap.service import query_by_science_type, InclusionFilters
df = query_by_science_type(
include=InclusionFilters(science_keyword=["Galaxies"], band=[6])
)
print(df[["ALMA_source_name", "Band", "spatial_resolution"]].head())
Run a simulation from a metadata row
from almasim import SimulationParams, run_simulation
from pathlib import Path
params = SimulationParams.from_metadata_row(
row, # pandas Series from a metadata query
idx=0,
main_dir=Path("src/almasim"),
output_dir=Path("output"),
project_name="my_project",
)
result = run_simulation(params)
Use the staged API
from almasim import (
SimulationParams,
generate_clean_cube,
simulate_observation,
image_products,
export_results,
)
params = SimulationParams.from_metadata_row(row, idx=0, ...)
cube_result = generate_clean_cube(params)
obs_result = simulate_observation(params, cube_result)
img_result = image_products(params, obs_result)
export_results(params, cube_result, obs_result, img_result)
Staged Simulation API
The pipeline is split into four composable stages:
| Stage | Function | What it does |
|---|---|---|
| 1 | generate_clean_cube() | Build sky cube from skymodel, apply background |
| 2 | simulate_observation() | Run interferometric + TP simulation, return dirty products |
| 3 | image_products() | Deconvolve, combine INT+TP, build image cubes |
| 4 | export_results() | Write cubes, ML shards, parameter summaries to disk |
run_simulation() orchestrates all four in sequence.
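To make the composition concrete, here is a minimal, self-contained sketch of how four stages can be chained by an orchestrator. It does not import ALMASim: the Params dataclass and the stage_* functions are hypothetical stand-ins, and the real stage functions take and return much richer objects.

```python
from dataclasses import dataclass


@dataclass
class Params:
    """Hypothetical stand-in for SimulationParams."""
    project_name: str
    n_channels: int


def stage_clean_cube(params):
    # Stage 1 stand-in: one flux value per channel instead of a real cube.
    return {"clean": [1.0] * params.n_channels}


def stage_observe(params, cube):
    # Stage 2 stand-in: the "dirty" data is the clean data scaled by a fake response.
    return {"dirty": [0.5 * v for v in cube["clean"]]}


def stage_image(params, obs):
    # Stage 3 stand-in: undo the fake response to mimic deconvolution.
    return {"image": [2.0 * v for v in obs["dirty"]]}


def stage_export(params, cube, obs, img):
    # Stage 4 stand-in: return a summary instead of writing files.
    return {
        "project": params.project_name,
        "channels": len(img["image"]),
        "products": ["clean", "dirty", "image"],
    }


def run_all(params):
    """Chain the four stages in order, passing each result forward."""
    cube = stage_clean_cube(params)
    obs = stage_observe(params, cube)
    img = stage_image(params, obs)
    return stage_export(params, cube, obs, img)


summary = run_all(Params(project_name="demo", n_channels=4))
print(summary)
```

Because each stage only depends on the parameters and the results of earlier stages, any stage can be re-run or swapped without touching the others.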
write_ml_dataset_shard() exports an HDF5 shard (clean cube + dirty cube + dirty visibilities + UV mask + metadata) independently of the main export path.
estimate_simulation_footprint() returns resolved pixel count, channel count, cell size, beam size, and raw output size in GiB — useful for pre-run capacity checks.
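The arithmetic behind a capacity check of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: estimate_cube_gib and its parameters are hypothetical, the cube is assumed square, and samples are assumed to be 4-byte floats unless told otherwise.

```python
def estimate_cube_gib(n_pix, n_channels, bytes_per_sample=4, n_products=1):
    """Raw size in GiB of n_products cubes of shape n_pix x n_pix x n_channels."""
    n_bytes = n_pix * n_pix * n_channels * bytes_per_sample * n_products
    return n_bytes / 1024**3


# One 1024 x 1024 x 256 cube of 4-byte samples.
single = estimate_cube_gib(n_pix=1024, n_channels=256)

# Five products of the same shape (the count of five is only an example).
several = estimate_cube_gib(n_pix=1024, n_channels=256, n_products=5)
print(single, several)
```

Doubling the pixel count per side quadruples the footprint, so image size usually dominates the estimate.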
Full reference: Simulation docs
Skymodels
| Source type | Description |
|---|---|
| point | Point source (PSF and CLEAN validation) |
| gaussian | 2-D Gaussian (compact extended source) |
| extended | TNG-backed realistic extended emission |
| galaxy-zoo | Galaxy Zoo image morphology prior |
| hubble-100 | Hubble Top-100 image morphology prior |
| molecular | Molecular cloud structured emission |
| diffuse | Correlated diffuse emission field |
All skymodels accept explicit source_offset_x_arcsec / source_offset_y_arcsec to shift the science target from phase center.
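As a rough illustration of what an arcsecond offset means on the pixel grid, the sketch below converts an offset into a pixel position. The function offset_to_pixel and its arguments are hypothetical, and the sign convention (positive offsets move toward larger pixel indices) is an assumption; sky conventions often flip the x axis.

```python
def offset_to_pixel(offset_x_arcsec, offset_y_arcsec, cell_size_arcsec, n_pix):
    """Return the (x, y) pixel index of a source shifted from the image centre."""
    center = n_pix // 2
    # Offsets are rounded to the nearest whole pixel.
    dx = round(offset_x_arcsec / cell_size_arcsec)
    dy = round(offset_y_arcsec / cell_size_arcsec)
    x, y = center + dx, center + dy
    if not (0 <= x < n_pix and 0 <= y < n_pix):
        raise ValueError("offset places the source outside the image")
    return x, y


# A 256-pixel image with 0.1 arcsec cells; the source sits 1.0 arcsec
# along x and -0.5 arcsec along y from the centre.
position = offset_to_pixel(1.0, -0.5, cell_size_arcsec=0.1, n_pix=256)
print(position)
```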
Additive background sky (independent of the main source):
| Mode | Effect |
|---|---|
| blank_field_dsfg | Faint dusty star-forming galaxies |
| dusty_diffuse | Correlated low-spatial-frequency dusty background |
| combined | Both of the above |
Full reference: Skymodels docs
Compute Backends
Select via SimulationParams.compute_backend:
| Backend | Use case |
|---|---|
| sync | Notebooks, examples, debugging |
| local | Local CPU parallelism |
| dask | Distributed execution, cluster scheduling |
| slurm | HPC job submission |
| kubernetes | Cluster-native environments |
Full reference: Compute docs
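The idea of a backend-agnostic service layer can be sketched as a small registry that maps a backend name to a runner with a common signature. Everything here is hypothetical (BACKENDS, run_tasks, the runner functions), and a thread pool stands in for the local multiprocess backend to keep the example short; it is not how ALMASim dispatches work internally.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sync(func, items):
    # Run tasks one after another in the calling thread.
    return [func(item) for item in items]


def run_pool(func, items, max_workers=4):
    # Run tasks concurrently; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


# Hypothetical registry keyed by backend name.
BACKENDS = {"sync": run_sync, "local": run_pool}


def run_tasks(backend, func, items):
    """Look up the runner for a backend name and execute the tasks with it."""
    try:
        runner = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown compute backend: {backend!r}") from None
    return runner(func, items)


def square(x):
    return x * x


results = {name: run_tasks(name, square, [1, 2, 3]) for name in BACKENDS}
print(results)
```

The calling code never changes when the backend does, which is the property the table above relies on.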
Metadata and Downloads
Query metadata via TAP
from almasim.services.metadata.tap.service import (
query_by_science_type,
InclusionFilters,
ExclusionFilters,
)
df = query_by_science_type(
include=InclusionFilters(
science_keyword=["Galaxies"],
band=[6, 7],
public_only=True,
science_only=True,
),
exclude=ExclusionFilters(solar=True),
)
Download products
from pathlib import Path
from almasim.services.download import resolve_products, run_download_job
# df is the DataFrame returned by the TAP query above
products = resolve_products(df["member_ous_uid"].tolist())
run_download_job(products, destination=Path("downloads"), extract_tar=True)
Full reference: Metadata docs · Downloads docs
Backend Service
The FastAPI backend exposes library services over HTTP and drives the Svelte frontend.
| Endpoint group | Purpose |
|---|---|
| /api/v1/metadata | TAP queries and metadata management |
| /api/v1/simulation | Simulation job submission and status |
| /api/v1/download | Product resolution and download jobs |
| /api/v1/imaging | Deconvolution and combination products |
| /api/v1/visualizer | Output browsing and product inspection |
| /health | Health check |
| /docs | Interactive OpenAPI docs (Swagger UI) |
Start locally for development:
cd backend
uv run uvicorn app.main:app --reload --port 8000
Full reference: Frontend docs
Examples
All examples use the sync compute backend and require no running scheduler.
| Script | Description |
|---|---|
| examples/query_metadata_cli.py | Query TAP, export metadata and product CSVs |
| examples/download_products_cli.py | Resolve and download ALMA products |
| examples/archive_ms_cli.py | Unpack ASDMs and apply calibration |
| examples/staged_pipeline_cli.py | Full pipeline: query → simulate → ML shard |
| examples/imaging_cli.py | Synthetic imaging + iterative deconvolution |
# Query metadata for Band 6 galaxy observations
python examples/query_metadata_cli.py \
--science-keyword Galaxies --band 6 \
--save-csv examples/output/metadata.csv
# Run a staged simulation from the first metadata row
python examples/staged_pipeline_cli.py \
--metadata-csv examples/output/metadata.csv \
--row-idx 0 --project-name demo \
--ml-shard-path examples/output/demo.h5
# Iterative deconvolution demo
python examples/imaging_cli.py \
--output-dir examples/output/imaging --cycles 180 --gain 0.12
Notebook equivalents: staged_pipeline_notebook.ipynb · query_metadata_notebook.ipynb · download_products_notebook.ipynb
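For readers unfamiliar with CLEAN-style deconvolution, the following is a minimal 1-D Högbom CLEAN written in plain Python. It is a textbook sketch, not ALMASim's implementation: hogbom_clean_1d is a hypothetical name, the default gain and cycle count simply mirror the values passed to imaging_cli.py above, and the toy beam has no sidelobes.

```python
def hogbom_clean_1d(dirty, psf, gain=0.12, cycles=180, threshold=1e-6):
    """Minimal 1-D Hogbom CLEAN.

    psf must be peak-normalised with its peak at index len(psf) // 2.
    Returns (model, residual) as lists the same length as dirty.
    """
    residual = list(dirty)
    model = [0.0] * len(dirty)
    center = len(psf) // 2
    for _ in range(cycles):
        # Locate the strongest residual sample.
        peak_idx = max(range(len(residual)), key=lambda i: abs(residual[i]))
        peak = residual[peak_idx]
        if abs(peak) < threshold:
            break
        # Move a fraction of the peak into the model ...
        component = gain * peak
        model[peak_idx] += component
        # ... and subtract the matching shifted, scaled beam from the residual.
        for j, value in enumerate(psf):
            k = peak_idx + j - center
            if 0 <= k < len(residual):
                residual[k] -= component * value
    return model, residual


# A point source of flux 2.0 at index 5, already convolved with the toy beam.
psf = [0.5, 1.0, 0.5]
dirty = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
model, residual = hogbom_clean_1d(dirty, psf)
print(round(model[5], 4))
```

A lower gain converges more slowly but is more stable on extended emission, which is why the loop count and gain are usually tuned together.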
End-to-end archive pipeline (Marimo)
examples/e2e_archive_pipeline.py is a reactive Marimo notebook that covers the full archive workflow interactively: query ALMA metadata → resolve DataLink products → download → unpack ASDMs → apply calibration.
# Install dev dependencies (includes marimo)
uv sync --group dev
# Interactive editing mode — cells re-run automatically as you edit
marimo edit examples/e2e_archive_pipeline.py
# Read-only app mode — run the pipeline step-by-step via the UI
marimo run examples/e2e_archive_pipeline.py
Steps 4 (unpack) and 5 (calibrate) require CASA tools (Linux x86-64 only):
pip install "almasim[casa]"
The notebook saves query filter presets as .query.json files so they can be reloaded across sessions.
Documentation
Full documentation: micheledelliveneri.github.io/ALMASim
| Section | Topics |
|---|---|
| Quick Start | Installation, first simulation |
| Simulation | Staged API, SimulationParams, outputs |
| Interferometry | UV sampling, baselines, multi-config |
| Noise | PWV-aware noise model |
| Background Sky | Additive astrophysical background |
| Skymodels | Source models reference |
| Imaging | Deconvolution, TP+INT combination |
| Metadata | TAP queries, filters |
| Downloads | Product download workflow |
| Compute Backends | Sync, Dask, Slurm, Kubernetes |
| Frontend | Svelte UI workflows |
Build docs locally:
uv sync --group dev
uv run sphinx-build -b html docs/source docs/build/html
Contributing
git clone https://github.com/MicheleDelliVeneri/ALMASim.git
cd ALMASim
uv sync --group dev
uv run pytest --ignore=illustris_python
uv run ruff check .
uv run ruff format .
A release is published automatically when a version tag is pushed:
# 1. Bump version in pyproject.toml and src/almasim/__version__.py
# 2. Commit and tag
git tag v2.1.11
git push origin v2.1.11
The release pipeline then:
- Validates that the tag matches pyproject.toml
- Runs the full lint + test suite
- Publishes wheel and sdist to PyPI via OIDC trusted publisher
- Creates a GitHub Release with auto-generated changelog and attached artifacts
- Builds and pushes the backend Docker image to GHCR
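The first step, checking a tag against pyproject.toml, can be sketched as below. This is an assumption-laden illustration rather than the actual release workflow: project_version and tag_matches are hypothetical helpers, the version is assumed to be declared statically under [project], and tags are assumed to carry a leading "v".

```python
import re

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters
    tomllib = None


def project_version(pyproject_text):
    """Read [project].version from the text of a pyproject.toml."""
    if tomllib is not None:
        return tomllib.loads(pyproject_text)["project"]["version"]
    # Fallback: a simple pattern match, adequate only for plain version lines.
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE)
    if match is None:
        raise KeyError("version")
    return match.group(1)


def tag_matches(tag, pyproject_text):
    """True when a tag such as 'v2.1.11' equals the declared version."""
    version = tag[1:] if tag.startswith("v") else tag
    return version == project_version(pyproject_text)


PYPROJECT = '[project]\nname = "almasim"\nversion = "2.1.11"\n'
ok = tag_matches("v2.1.11", PYPROJECT)
stale = tag_matches("v2.1.10", PYPROJECT)
print(ok, stale)
```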
One-time PyPI setup: register a trusted publisher on PyPI with owner MicheleDelliVeneri, repo ALMASim, workflow release.yml, environment pypi.
License
ALMASim is released under the GNU General Public License v3.