Solar phenomena prediction models

Project description

SDOFMv2: A Multi-Instrument Foundation Model for the Solar Dynamics Observatory with Transferable Downstream Applications

Python 3.11+ | PyTorch | PyTorch Lightning | License: MIT

Introduction

SDOFMv2 is a multi-instrument foundation model for analyzing Solar Dynamics Observatory (SDO) data and supporting large-scale, data-driven heliophysics research. Building on the original SDOFM framework, this version addresses earlier limitations, such as restricted temporal coverage and reconstruction artifacts, to improve spatial coherence and global consistency.

Model architecture

A Masked Autoencoder (MAE) built on a Vision Transformer (ViT) is used for pretraining. During this phase, a% of the image patches are masked, and only the remaining (100 - a)% are processed by the encoder. The decoder then reconstructs all patches, and training is optimized with a customized loss function.
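The masking step described above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: the function name split_patches, the uniform random selection, and the seed parameter are assumptions made for the example.

```python
import random


def split_patches(num_patches, mask_percent, seed=None):
    """Split patch indices into (visible, masked) for one image.

    Hypothetical helper: mask_percent plays the role of "a" in the text.
    """
    if not 0 <= mask_percent <= 100:
        raise ValueError("mask_percent must be between 0 and 100")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # uniform random permutation of patch indices
    num_masked = int(num_patches * mask_percent / 100)
    masked = sorted(order[:num_masked])    # hidden from the encoder
    visible = sorted(order[num_masked:])   # the (100 - a)% the encoder sees
    return visible, masked


visible, masked = split_patches(num_patches=1024, mask_percent=75, seed=0)
print(len(visible), len(masked))  # 256 768
```

In an MAE, only the visible indices would be passed to the encoder, while the decoder receives placeholders for the masked positions and is asked to reconstruct every patch.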


Getting Started

Prerequisites

  • Linux or macOS
  • Python 3.11+
  • NVIDIA GPU + CUDA toolkit (Recommended for training)

Environment Setup

We recommend using mamba to manage dependencies.

Important Hardware Note: The sdofmv2_environment.yml file is configured for CUDA 12.8 by default. If your hardware or drivers require a different CUDA version (e.g., CUDA 11.8), open sdofmv2_environment.yml and edit the pip section at the bottom to match your system (e.g., change cu128 to cu118) before running the setup commands.
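If you prefer to script the edit rather than change the file by hand, the substitution can be sketched as follows. The helper name retarget_cuda and the SAMPLE text are hypothetical; the actual contents of the pip section in sdofmv2_environment.yml are not shown here, so inspect the file before and after the change.

```python
def retarget_cuda(env_text, old_tag="cu128", new_tag="cu118"):
    """Return env_text with every old_tag replaced by new_tag."""
    if old_tag not in env_text:
        raise ValueError(f"{old_tag!r} not found; nothing to change")
    return env_text.replace(old_tag, new_tag)


# Hypothetical excerpt, for illustration only (not copied from the repository).
SAMPLE = """\
  - pip:
      - --extra-index-url https://download.pytorch.org/whl/cu128
      - torch
"""

print(retarget_cuda(SAMPLE))

# To apply it to the real file (run from the repository root):
# from pathlib import Path
# path = Path("sdofmv2_environment.yml")
# path.write_text(retarget_cuda(path.read_text()))
```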

Using Mamba:

# Clone the repository
git clone https://github.com/Joaggi/sdofmv2.git
cd sdofmv2

# Create and activate the environment (This automatically installs PyTorch and the local package)
mamba env create -f sdofmv2_environment.yml
mamba activate sdofmv2
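After activating the environment, a quick sanity check can confirm that the interpreter and PyTorch match the prerequisites. The sketch below is an assumption-laden convenience, not part of the repository: check_environment is a hypothetical helper, and the import name sdofmv2 is inferred from the src/sdofmv2 directory.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 11)):
    """Collect basic facts about the active Python environment."""
    report = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # Assumed import name for the local package.
        "sdofmv2_installed": importlib.util.find_spec("sdofmv2") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch
        except ImportError:
            report["torch_installed"] = False
        else:
            report["torch"] = torch.__version__
            report["cuda_build"] = torch.version.cuda  # None on CPU-only builds
            report["cuda_available"] = torch.cuda.is_available()
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If cuda_available is False on a machine with an NVIDIA GPU, the CUDA tag in the environment file is a likely place to look.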

Repository Structure

.
├── configs/                # YAML configurations for experiments
│   ├── downstream/         # Configs for downstream tasks (F10.7, solar wind)
│   └── pretrain/           # Configs for MAE pretraining (AIA, HMI)
├── notebooks/              # Jupyter notebooks for analysis and visualization
│   ├── analysis/           # Attention maps, PCA, and masking analysis
│   └── downstream_apps/    # Usage notebooks for the F10.7 and missing-data downstream applications
├── scripts/                # Executable scripts for training and testing
│   ├── pretrain.py         # Main pretraining script
│   ├── finetuning_*.py     # Scripts for downstream finetuning
│   └── test.py             # Script for evaluating checkpoints
├── src/                    # Core source code package
│   └── sdofmv2/
│       ├── core/           # Base model architectures and modules
│       ├── tasks/          # PyTorch Lightning modules (model & data module) for downstream tasks
│       └── utils/          # Helper functions, physical constants and metrics
├── pyproject.toml          # Project metadata and build dependencies
└── sdofmv2_environment.yml # Mamba environment definition file

How to Use

(Note: Run all scripts from the root directory of the repository so that file paths to configs/ and src/ resolve correctly.)
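A small guard like the one below can catch the wrong working directory before a long job starts. It is a sketch only: missing_repo_dirs is a hypothetical helper, and the directory names come from the repository structure listed above.

```python
from pathlib import Path

# Top-level directories expected at the repository root.
REQUIRED_DIRS = ("configs", "scripts", "src")


def missing_repo_dirs(root="."):
    """Return the expected top-level directories that are absent under root."""
    root = Path(root)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]


missing = missing_repo_dirs()
if missing:
    print("Not at the repository root; missing:", ", ".join(missing))
else:
    print("Working directory looks like the repository root.")
```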

1. Data Preparation

Before training or running inference, you need to prepare the dataset. The following command runs the data-cache download script and writes to ./assets/:

python scripts/download_data_cache.py --target_dir ./assets/

2. Training the Model

To train the model from scratch, execute the pretraining script and pass the relevant configuration file.

python scripts/pretrain.py --config-name pretrain_mae_AIA.yaml

3. Inference and Evaluation

To evaluate a pre-trained checkpoint on the test set:

python scripts/test.py --config-name pretrain_mae_AIA.yaml

4. Downstream Finetuning

To finetune the model on a specific downstream task (e.g., solar wind forecasting):

python scripts/finetuning_solarwind.py --config-name finetune_solarwind_config.yaml
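The four steps above can be chained from a single Python driver. The sketch below only reuses the script paths and arguments shown in this section; the STEPS table, build_command, and run_steps are hypothetical names, and the driver defaults to a dry run that prints each command without executing it.

```python
import subprocess
import sys

# Commands copied from the steps above, keyed by a short label.
STEPS = {
    "download": ["scripts/download_data_cache.py", "--target_dir", "./assets/"],
    "pretrain": ["scripts/pretrain.py", "--config-name", "pretrain_mae_AIA.yaml"],
    "test": ["scripts/test.py", "--config-name", "pretrain_mae_AIA.yaml"],
    "finetune": [
        "scripts/finetuning_solarwind.py",
        "--config-name",
        "finetune_solarwind_config.yaml",
    ],
}


def build_command(step):
    """Return the full argument list for one step, using the active interpreter."""
    return [sys.executable, *STEPS[step]]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for step in steps:
        command = build_command(step)
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)


run_steps(["download", "pretrain", "test", "finetune"])  # dry run: prints only
```

Call run_steps(..., dry_run=False) from the repository root to execute the steps in order.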

Results & Visualizations

Sample Visualization

The first row displays the original ground-truth images. The second and third rows show the model's reconstructed images using masking ratios of 0% and 50%, respectively.


Citation

If you find this repository or model useful in your academic research, please consider citing our work:

@misc{sdofmv2,
  author = {Hong, Jinsu and Martin, Daniela and Gallego, Joseph},
  title = {SDOFMv2: A Multi-Instrument Foundation Model for the Solar Dynamics Observatory with Transferable Downstream Applications},
  year = {2026},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Joaggi/sdofmv2}},
  note = {Jinsu Hong, Daniela Martin, and Joseph Gallego contributed equally to this work}
}

Contributing

Contributions, bug reports, and feature requests are welcome! Please feel free to check the issues page or submit a pull request.

Download files

Source Distribution

sdofmv2-0.1.0.tar.gz (54.5 kB)

Uploaded Source

Built Distribution

sdofmv2-0.1.0-py3-none-any.whl (63.8 kB)

Uploaded Python 3

File details

Details for the file sdofmv2-0.1.0.tar.gz.

File metadata

  • Download URL: sdofmv2-0.1.0.tar.gz
  • Upload date:
  • Size: 54.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sdofmv2-0.1.0.tar.gz
Algorithm Hash digest
SHA256 f93952490ae4d069f3386bff7ec41e420d2b86811d08dc5e7bc2e5250ae0902b
MD5 850fdd2aa2e342f43f30ee69af6d2882
BLAKE2b-256 0403b58a169046ad4db47f242c421891cad6e9ce1d91a8733f0cf614e62673b5
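
A downloaded file can be checked against the digests listed above with Python's standard hashlib module. This is a minimal sketch: file_digest and matches are hypothetical helper names, and the file is assumed to be in the current directory.

```python
import hashlib

# SHA256 digest listed above for sdofmv2-0.1.0.tar.gz.
EXPECTED_SHA256 = "f93952490ae4d069f3386bff7ec41e420d2b86811d08dc5e7bc2e5250ae0902b"


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected, algorithm="sha256"):
    """Return True when the file's digest equals the expected hex string."""
    return file_digest(path, algorithm) == expected.lower()


# Example (assumes the archive has been downloaded to the working directory):
# print(matches("sdofmv2-0.1.0.tar.gz", EXPECTED_SHA256))
```

The same helper works for the MD5 digest by passing algorithm="md5".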

Provenance

The following attestation bundles were made for sdofmv2-0.1.0.tar.gz:

Publisher: publish.yml on Joaggi/sdofmv2

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sdofmv2-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: sdofmv2-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 63.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sdofmv2-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f9f102e84db85768b48c98c8362e1528e5694f940406cb42fb9815f47b070335
MD5 b0a581d1de34c7e6c4fca84c3e89c0be
BLAKE2b-256 29ab35b6e6558b1f5e8d5430608759c8e57683bed4ab9398aa8eb85a9b1e1e66

Provenance

The following attestation bundles were made for sdofmv2-0.1.0-py3-none-any.whl:

Publisher: publish.yml on Joaggi/sdofmv2

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
