
Transformer-based foundation model for galaxy images (and general astronomy)

ICML arXiv arXiv License: MIT All Contributors

astroPT: a Large Observation Model for astronomy 🔭

Welcome to our simple repository for training astronomical large observation models. This repository began its life as Andrej Karpathy's nanoGPT, and has been altered so that it is usable for imagery data. Within train.py you will find a ~300-line boilerplate training loop and within model.py you will find a ~300-line GPT model definition with an MLP tokeniser and a regressive loss.

Check out the UniverseTBD Discord for updates: https://discord.gg/MNEVegvfJq

install

You can install via pip from PyPI:

pip install astropt

Or, if you are working from a local git clone, you can install with uv:

uv sync

how to run

To load and run a pre-trained AstroPT model from HuggingFace you can use the load_astropt function:

from astropt.model_utils import load_astropt

model, model_args = load_astropt(
    repo_id="smith42/astropt_sparse",
    path="astropt/p16k10",
    weights_filename="ckpt.pt",
)
model = model.to("cuda")

where repo_id is the HuggingFace repository ID, and path is the path within the repository that contains the AstroPT model checkpoint.

results

AstroPT v1.0.0 has been trained on 8.6M galaxy grz band *.png postage stamps downloaded from DESI-LS DR8 to see if neural scaling laws apply to galaxian data (in other words, to see if more galaxy data == more better model).
We tried to make the astroPT model as simple as possible so that other modalities can be easily folded in. We also chose to use a causally trained autoregressive transformer model as our backbone so that our work can more easily integrate with the wider deep learning FOSS community.

Our pretraining task is feeding in our galaxy images patch-by-patch and predicting the next patch in our galaxy patch sequence. We follow ViT and define a patch as a 16 by 16 pixel square, and feed the galaxy patches in a spiral order:

[Figure: a galaxy image divided into patches, fed in spiral order]
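To make the patch ordering concrete, here is a minimal sketch in plain Python of splitting an image into 16 by 16 patches and reading them out in a spiral. It is an illustration, not the code in this repository: the function names are hypothetical, and the choice to start at the central patch and spiral outwards clockwise is an assumption, since the text above only says the order is a spiral.

```python
PATCH_SIZE = 16  # patch side length in pixels, as stated above


def split_into_patches(image, patch_size=PATCH_SIZE):
    """Split a 2D image (list of rows) into a dict of (grid_row, grid_col) -> patch."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be multiples of patch_size")
    patches = {}
    for grid_row in range(height // patch_size):
        for grid_col in range(width // patch_size):
            top, left = grid_row * patch_size, grid_col * patch_size
            patches[(grid_row, grid_col)] = [
                row[left:left + patch_size] for row in image[top:top + patch_size]
            ]
    return patches


def spiral_order(n_rows, n_cols):
    """Grid cells visited in an outward spiral from the central cell.

    ASSUMPTION: centre-out and clockwise (right, down, left, up).
    """
    r, c = (n_rows - 1) // 2, (n_cols - 1) // 2
    order = [(r, c)]
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    run_length, direction = 1, 0
    while len(order) < n_rows * n_cols:
        for _ in range(2):  # each run length is walked twice before it grows
            dr, dc = moves[direction]
            for _ in range(run_length):
                r, c = r + dr, c + dc
                # The walk may leave the grid; only in-grid cells are recorded.
                if 0 <= r < n_rows and 0 <= c < n_cols:
                    order.append((r, c))
            direction = (direction + 1) % 4
        run_length += 1
    return order


def patch_sequence(image, patch_size=PATCH_SIZE):
    """Return the image's patches as a list in spiral order."""
    patches = split_into_patches(image, patch_size)
    n_rows, n_cols = len(image) // patch_size, len(image[0]) // patch_size
    return [patches[cell] for cell in spiral_order(n_rows, n_cols)]


# Toy single-channel 48x48 "image" whose pixel value encodes its position.
image = [[y * 48 + x for x in range(48)] for y in range(48)]
sequence = patch_sequence(image)
```

In next-patch pretraining, the model would see `sequence[:i]` and be asked to predict `sequence[i]`.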

The trained model results are promising -- below we show our full training run validation losses across a parameter sweep of {1,5,12,21,89,309,830,2100}M trainable parameters:

[Figure: validation loss curves for each model size in the parameter sweep]
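One common way to check for scaling-law behaviour is to fit a power law to loss against parameter count. The sketch below does this with a least-squares fit in log-log space. The model sizes are the ones listed above, but the losses are synthetic values generated from a made-up power law purely to exercise the fit; they are not AstroPT results, and the pure power-law form is itself an assumption (real curves can flatten at large sizes).

```python
import math


def fit_power_law(n_params, losses):
    """Least-squares fit of loss = a * N**(-b) in log-log space. Returns (a, b)."""
    xs = [math.log(n) for n in n_params]
    ys = [math.log(loss) for loss in losses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Model sizes from the parameter sweep, in trainable parameters.
sizes = [1e6, 5e6, 12e6, 21e6, 89e6, 309e6, 830e6, 2100e6]

# SYNTHETIC losses from an invented power law; NOT measured AstroPT losses.
true_a, true_b = 10.0, 0.1
synthetic_losses = [true_a * n ** (-true_b) for n in sizes]

a, b = fit_power_law(sizes, synthetic_losses)
```

With real validation losses in place of `synthetic_losses`, a good straight-line fit in log-log space would support a power-law trend over the range tested.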

We also test our astroPT models on some scientifically-useful downstream tasks by taking the models' penultimate layer outputs and finetuning linear probes to predict emergent physical properties of the galaxies:

[Figure: linear probe results on downstream galaxy property predictions]
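For readers new to linear probing, here is a minimal sketch of the idea: freeze the model, take one embedding per galaxy, and fit only a linear map from embedding to property. The embeddings below are random stand-ins rather than real astroPT outputs, the function names are hypothetical, and ordinary least squares is an assumption; the text above does not say which solver was used.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_linear_probe(embeddings, targets):
    """Ordinary least squares via the normal equations; last weight is the bias."""
    rows = [list(e) + [1.0] for e in embeddings]  # append a constant bias feature
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict(weights, embedding):
    """Apply a fitted probe (weights plus trailing bias) to one embedding."""
    return sum(w * x for w, x in zip(weights, list(embedding) + [1.0]))


# Random stand-in "embeddings" and a made-up linear target, for illustration only.
rng = random.Random(0)
true_weights = [0.5, -1.0, 2.0, 0.3]  # three feature weights, then the bias
embeddings = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
targets = [predict(true_weights, e) for e in embeddings]

probe = fit_linear_probe(embeddings, targets)
```

Because only the probe is trained, how well it predicts a property says something about what the frozen embeddings already encode.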

In the above pic, $M_g$ and $M_z$ are the absolute magnitudes (or brightness at a fixed distance) of the galaxies in the g and z bands, $g - r$ and $r - z$ are colours (the differences in magnitude between pairs of telescope filter bands), redshift is a proxy for the distance to the galaxies, sSFR is the specific star formation rate (the total mass of new stars born each year in the galaxies per unit stellar mass), and $M_{*}$ is the total mass of stars within the galaxies. "smooth?", "disc?", "artefact?", "edge on?" and "tight spiral?" are morphological properties of the galaxies as labelled by citizen scientists.
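As a quick reference for the quantities above, the sketch below writes out the textbook definitions in Python. The function names are hypothetical and this is not how the catalogue values were derived: the absolute magnitude formula here ignores K-corrections and extinction, and for galaxies the distance would be a luminosity distance. sSFR is taken as star formation rate divided by stellar mass, the usual definition.

```python
import math


def absolute_magnitude(apparent_mag, distance_pc):
    """M = m - 5 * log10(d / 10 pc), ignoring K-corrections and extinction."""
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)


def colour(mag_bluer_band, mag_redder_band):
    """Colour index, e.g. g - r: the magnitude difference between two bands."""
    return mag_bluer_band - mag_redder_band


def specific_sfr(sfr_msun_per_yr, stellar_mass_msun):
    """Specific star formation rate in 1/yr: SFR divided by stellar mass."""
    return sfr_msun_per_yr / stellar_mass_msun


# Illustrative inputs only; these are not measurements from the dataset.
m_at_10pc = absolute_magnitude(15.0, 10.0)      # at 10 pc, M equals m
m_at_1kpc = absolute_magnitude(15.0, 1000.0)    # 100x further: M is 10 mag brighter
g_minus_r = colour(17.5, 17.0)
ssfr = specific_sfr(2.0, 1e10)
```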

The cool thing to take away from these plots is that the surrogate task loss (predicting the next patch in a sequence of ViT-like galaxy image patches) is correlated with astronomically "useful" downstream tasks 🤯🚀.

Finally, check out our UMAP projection of astroPT-87M's penultimate layer outputs of our validation set. We colour each point with an emergent physical galaxy property described above. The structure suggests that the model has learnt some knowledge about physics simply from our next-token prediction pretraining task!

[Figure: hexbin UMAP projections, coloured by galaxy property]

pretrained weights, and full galaxy dataset

Check out the paper here: https://arxiv.org/abs/2405.14930.

We of course release all our model weights, checkpointed across our full training runs, on HuggingFace 🤗.

We also release our full dataset and galaxy metadata on HuggingFace 🔥.

contributors

Ryan Roberts 💻 🤔 🖋
Mike Smith 💻 🤔 🖋 🔣
mhuertascompany 🤔 🖋
Malgorzata Siudek 🤔 🖋 💻 🔣
gimarso 🤔 💻
