Solar phenomena prediction models

Project description

SDO FM v2: A Multi-Instrument Foundation Model for the Solar Dynamics Observatory with Transferable Downstream Applications

Python 3.11+ | PyTorch | PyTorch Lightning | License: MIT

Introduction

SDOFMv2 is a multi-instrument foundation model for analyzing Solar Dynamics Observatory (SDO) data and supporting large-scale, data-driven heliophysics research. Building on the original SDOFM framework, this version addresses earlier limitations such as restricted temporal coverage and reconstruction artifacts, improving spatial coherence and global consistency.

Model architecture

Pretraining uses a Masked Autoencoder (MAE) built on a Vision Transformer (ViT). During this phase, a% of the image patches are masked and the remaining (100 - a)% are processed by the encoder. The decoder then reconstructs all patches, and the model is optimized with a customized loss function.
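As a rough illustration of the masking step, the sketch below splits patch indices into a visible set and a masked set for a given ratio. It uses only the Python standard library. The function name `split_patches`, the 512-pixel image side, the 16-pixel patch size, and the 75% ratio are illustrative assumptions, not values taken from this repository.

```python
import random


def split_patches(num_patches: int, mask_ratio: float, seed: int = 0):
    """Randomly split patch indices into (visible, masked) lists.

    Hypothetical helper for illustration; not part of sdofmv2.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be between 0 and 1")
    num_masked = round(num_patches * mask_ratio)
    # Shuffle the indices with a seeded generator for reproducibility.
    order = list(range(num_patches))
    random.Random(seed).shuffle(order)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Assumed geometry: a 512x512 image cut into 16x16 patches.
image_side, patch_side = 512, 16
num_patches = (image_side // patch_side) ** 2

visible, masked = split_patches(num_patches, mask_ratio=0.75)
print(len(visible), len(masked))  # 256 768
```

In an MAE, only the visible indices would be passed to the encoder, while the decoder predicts every patch, masked or not.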


Getting Started

Prerequisites

  • Linux or macOS
  • Python 3.11+
  • NVIDIA GPU + CUDA toolkit (Recommended for training)

Environment Setup

We recommend using mamba to manage dependencies.

Important Hardware Note: The sdofmv2_environment.yml file is configured for CUDA 12.8 by default. If your hardware or drivers require a different CUDA version (e.g., CUDA 11.8), open sdofmv2_environment.yml and modify the pip section at the bottom to match your system (e.g., change cu128 to cu118) before running the setup commands.
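If you prefer to script that edit, the following sketch swaps the CUDA tag in the environment file's text. The helper names `retarget_cuda` and `retarget_file` are hypothetical, and the sample line only assumes that the pip section references a PyTorch wheel index ending in `cu128`; check your copy of the file before relying on it.

```python
from pathlib import Path


def retarget_cuda(text: str, old: str = "cu128", new: str = "cu118") -> str:
    """Return `text` with every occurrence of the old CUDA tag replaced.

    Hypothetical helper for illustration; not part of sdofmv2.
    """
    if old not in text:
        raise ValueError(f"tag {old!r} not found; nothing to change")
    return text.replace(old, new)


def retarget_file(path: str, old: str = "cu128", new: str = "cu118") -> None:
    """Rewrite the environment file in place with the new CUDA tag."""
    env_file = Path(path)
    env_file.write_text(retarget_cuda(env_file.read_text(), old, new))


# Assumed example of a line in the pip section (not copied from the repository).
sample = "- --extra-index-url https://download.pytorch.org/whl/cu128"
print(retarget_cuda(sample))
```

Calling `retarget_file("sdofmv2_environment.yml")` from the repository root would apply the same change to the real file.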

Using Mamba:

# Clone the repository
git clone https://github.com/Joaggi/sdofmv2.git
cd sdofmv2

# Create and activate the environment (This automatically installs PyTorch and the local package)
mamba env create -f sdofmv2_environment.yml
mamba activate sdofmv2
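Once the environment is active, a quick check like the one below can confirm that PyTorch imports and whether it sees a GPU. The function name `describe_torch` is a hypothetical helper; `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()` are standard PyTorch attributes.

```python
def describe_torch() -> str:
    """Summarize the PyTorch install, or report that it is missing."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    # torch.version.cuda is None for CPU-only builds.
    return (
        f"PyTorch {torch.__version__}, "
        f"built for CUDA {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


print(describe_torch())
```

If the reported CUDA version does not match your driver, revisit the hardware note above before training.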

Repository Structure

.
├── configs/                # YAML configurations for experiments
│   ├── downstream/         # Configs for downstream tasks (F10.7, solar wind)
│   └── pretrain/           # Configs for MAE pretraining (AIA, HMI)
├── notebooks/              # Jupyter notebooks for analysis and visualization
│   ├── analysis/           # Attention maps, PCA, and masking analysis
│   └── downstream_apps/    # Notebooks showing how to use the downstream scripts (F10.7, missing data)
├── scripts/                # Executable scripts for training and testing
│   ├── pretrain.py         # Main pretraining script
│   ├── finetuning_*.py     # Scripts for downstream finetuning
│   └── test.py             # Script for evaluating checkpoints
├── src/                    # Core source code package
│   └── sdofmv2/
│       ├── core/           # Base model architectures and modules
│       ├── tasks/          # PyTorch Lightning modules (model & data module) for downstream tasks
│       └── utils/          # Helper functions, physical constants and metrics
├── pyproject.toml          # Project metadata and build dependencies
└── sdofmv2_environment.yml # Mamba environment definition file

How to Use

(Note: It is recommended to run all scripts from the root directory of the repository so that file paths to configs/ and src/ resolve correctly.)
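A small guard such as the following can catch a wrong working directory before a long job starts. The helper `missing_repo_dirs` is hypothetical; it only checks for the `configs/` and `src/` directories named in the note.

```python
from pathlib import Path


def missing_repo_dirs(root: str = ".", required=("configs", "src")) -> list[str]:
    """Return the required directory names that are absent under `root`."""
    base = Path(root)
    return [name for name in required if not (base / name).is_dir()]


missing = missing_repo_dirs()
if missing:
    print(f"Not at the repository root; missing: {', '.join(missing)}")
else:
    print("Working directory looks like the repository root")
```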

1. Data Preparation

Before training or running inference, prepare the dataset by running the download script, which takes the target directory as an argument:

python scripts/download_data_cache.py --target_dir ./assets/

2. Training the Model

To train the model from scratch, execute the pretraining script and pass the relevant configuration file.

python scripts/pretrain.py --config-name pretrain_mae_AIA.yaml

3. Inference and Evaluation

To evaluate a pretrained checkpoint on the test set:

python scripts/test.py --config-name pretrain_mae_AIA.yaml

4. Downstream Finetuning

To finetune the model on a specific downstream task (e.g., solar wind forecasting):

python scripts/finetuning_solarwind.py --config-name finetune_solarwind_config.yaml
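Steps 2 to 4 can also be chained from a small driver. The sketch below only assembles the command lines shown in those steps and, when asked, runs them in order with `subprocess.run`. The names `STEPS`, `build_command`, and `run_pipeline` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Script and config pairs copied from the commands in steps 2-4.
STEPS = [
    ("scripts/pretrain.py", "pretrain_mae_AIA.yaml"),
    ("scripts/test.py", "pretrain_mae_AIA.yaml"),
    ("scripts/finetuning_solarwind.py", "finetune_solarwind_config.yaml"),
]


def build_command(script: str, config_name: str) -> list[str]:
    """Build the argument list for one step using the current interpreter."""
    return [sys.executable, script, "--config-name", config_name]


def run_pipeline(steps=STEPS, dry_run: bool = True) -> list[list[str]]:
    """Print each command and, unless dry_run is set, run it.

    check=True stops the pipeline at the first failing step.
    """
    commands = [build_command(script, config) for script, config in steps]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
    return commands


run_pipeline(dry_run=True)  # set dry_run=False to actually launch the jobs
```

Keeping `dry_run=True` as the default makes it easy to inspect the exact commands before committing GPU time to them.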

Results & Visualizations

Sample visualization: the first row displays the original ground-truth images. The second and third rows show the model's reconstructions at masking ratios of 0% and 50%, respectively.


Citation

If you find this repository or model useful in your academic research, please consider citing our work:

@misc{sdofmv2,
  author = {Hong, Jinsu and Martin, Daniela and Gallego, Joseph},
  title = {SDOFMv2: A Multi-Instrument Foundation Model for the Solar Dynamics Observatory with Transferable Downstream Applications},
  year = {2026},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Joaggi/sdofmv2}},
  note = {Jinsu Hong, Daniela Martin, and Joseph Gallego contributed equally to this work}
}

Contributing

Contributions, bug reports, and feature requests are welcome! Please feel free to check the issues page or submit a pull request.

File details

Details for the file sdofmv2-0.1.1.tar.gz.

File metadata

  • Download URL: sdofmv2-0.1.1.tar.gz
  • Upload date:
  • Size: 54.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sdofmv2-0.1.1.tar.gz
Algorithm    Hash digest
SHA256       f7183e1bf93b7081c48c175b77ee2252b615c1c1bb4eec99380431c36000e1a7
MD5          6ef4a6e4a2a0d3c19536cb4f94cca250
BLAKE2b-256  f489ac603fbfc4627f0196244df2dc7a509e145ecdeded636b3c369f2a88fa9b
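
To check a downloaded file against the SHA256 digest listed above, a sketch using Python's standard `hashlib` module could look like this. The helper name `sha256_of` and the assumption that the archive sits in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for sdofmv2-0.1.1.tar.gz.
EXPECTED_SHA256 = "f7183e1bf93b7081c48c175b77ee2252b615c1c1bb4eec99380431c36000e1a7"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


archive = Path("sdofmv2-0.1.1.tar.gz")  # assumed download location
if archive.exists():
    print("match" if sha256_of(str(archive)) == EXPECTED_SHA256 else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```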

Provenance

The following attestation bundles were made for sdofmv2-0.1.1.tar.gz:

Publisher: publish.yml on Joaggi/sdofmv2

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sdofmv2-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: sdofmv2-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 63.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sdofmv2-0.1.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       2e5797f1ac8c9385ed37e0afba2eeec570bb3d13b0194204f6087aa6d3ee84cd
MD5          80e43d06b8aeb58ed56fb1fe628c2342
BLAKE2b-256  7e75930af86aeee9154e1becb22addb2bf02e8056908e7147558a604169f5d38

Provenance

The following attestation bundles were made for sdofmv2-0.1.1-py3-none-any.whl:

Publisher: publish.yml on Joaggi/sdofmv2

