Solar Filaments data augmentation demo package

Torch Compatible Augmentation Engine For Solar Filaments v0.0.1

An ML-Ready Filament Augmentation Toolkit with Labeled Magnetic Helicity Sign

ABSTRACT

A halo Coronal Mass Ejection (CME) can have a devastating impact on Earth: it can damage satellites and electrical transmission facilities and disrupt radio transmissions. The sign of a filament's magnetic helicity can be used to predict the orientation of the magnetic field associated with an occurring CME, and therefore the occurrence of a geomagnetic storm.

With the deluge of image data produced by ground-based and space-borne observatories and the unprecedented success of computer vision algorithms in detecting and classifying objects (events) in images, identifying filaments' chirality appears to be a well-suited problem for this domain. More specifically, Deep Learning algorithms with a Convolutional Neural Network (CNN) backbone are designed for this very type of problem. The main challenge is that these supervised algorithms are data-hungry: their large number of model parameters demands millions of labeled instances to learn from. Datasets of filaments with manually identified chirality, however, are costly to build. This scarcity exists primarily because data annotation is tedious, especially since identifying filaments' chirality requires domain expertise.

In response, we created a pipeline that augments filaments from existing labeled instances. This Python toolkit provides an unlimited supply of augmented (new) filaments with labeled magnetic helicity signs. Using an existing dataset of manually labeled, H-alpha based filaments as input seeds, collected from August 2000 to 2016 from Big Bear Solar Observatory (BBSO) full-disk solar images, we generate new filament instances by passing labeled filaments through a pipeline of chirality-preserving transformation functions. The augmentation engine is fully compatible with PyTorch, a popular deep learning library, and generates data according to the user's requirements.

Requirements

Linux/Mac/Windows OS: Installation


pip install augmentation_engine
Looking in indexes: https://test.pypi.org/simple/
Collecting filamentaugmentation
  Downloading https://test-files.pythonhosted.org/packages/45/45/4a66660af6982aec42ef1055c510120d3b6c105cbff4979b20a3a752c66d/filamentaugmentation-0.0.2-py3-none-any.whl (2.7 kB)
Installing collected packages: filamentaugmentation
Successfully installed filamentaugmentation-0.0.2
Note: you may need to restart the kernel to use updated packages.

Import Required Libraries

import os
from torchvision import transforms

from filament_augmentation.loader.filament_dataloader import FilamentDataLoader
from filament_augmentation.generator.filament_dataset import FilamentDataset
from filament_augmentation.metadata.filament_metadata import FilamentMetadata

To find the number of left, right and unidentified chiralities in a time interval:

  • The code snippet below gives the chirality distribution, i.e., the counts of left, right and unidentified chiralities, for the interval passed as start_time and end_time.
  • Here petdata contains Big Bear Solar Observatory (BBSO) full-disk solar images from 1 to 9 August 2015.
  • The start and end times should use the format YYYY-MM-DD HH:MM:SS.
  • ann_file, the annotation file, is a JSON file of manually labelled, H-alpha based filaments that comes with petdata.
__file__ = 'augmentation_process.ipynb'
bbso_json = os.path.abspath(
    os.path.join(os.path.dirname(__file__), 'petdata', 'bbso_json_data', '2015_chir_data.json'))
filamentInfo = FilamentMetadata(ann_file=bbso_json, start_time='2015-08-01 00:00:15',
                                end_time='2015-08-30 23:59:59')
filamentInfo.get_chirality_distribution()
(22, 30, 185)
  • To generate extra filaments of left, right or unidentified chirality, either to balance the data or to obtain the ratios required for training an ML algorithm, a filament dataset class should be initialized. It is quite similar to a PyTorch dataset class.
  • The transform list should be a list of torchvision transformations.
  • The filament ratio is a tuple that takes (L, R, U).
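
To see why balancing matters, the distribution returned above can be turned into class fractions. This is a minimal sketch and not part of the toolkit; it assumes the returned tuple is ordered (left, right, unidentified), matching the (L, R, U) convention of the filament ratio, which the output itself does not confirm.

```python
# Counts copied from the get_chirality_distribution() output shown above.
# ASSUMPTION: the tuple order is (left, right, unidentified).
distribution = (22, 30, 185)
class_names = ("L", "R", "U")

total = sum(distribution)
# Fraction of all annotated filaments that falls in each class.
fractions = {name: count / total for name, count in zip(class_names, distribution)}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")

# The class with the largest share dominates an unbalanced sample.
largest_class = max(fractions, key=fractions.get)
print("largest class:", largest_class)
```

Under that assumption, unidentified filaments make up most of the sample, which is the imbalance that a filament ratio such as (1, 1, 1) is meant to even out.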

Initializing Filament dataset

To initialize the filament dataset class, the following parameters are required:

  • bbso_path - path to the folder of BBSO full-disk H-alpha solar images that comes with petdata.
  • ann_file - a JSON file of manually labelled, H-alpha based filaments that comes with petdata.
  • start_time and end_time - should use the format YYYY-MM-DD HH:MM:SS.
bbso_path = os.path.abspath(os.path.join(os.path.dirname(__file__), 'petdata', '2015'))
dataset = FilamentDataset(bbso_path = bbso_path, ann_file = bbso_json, 
                          start_time = "2015-08-01 17:36:15", end_time = "2015-08-09 17:36:15")
loading annotations into memory...
Done (t=0.05s)
creating index...
index created!
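
The start_time and end_time strings used above follow the YYYY-MM-DD HH:MM:SS format. A quick way to check such strings before passing them on is sketched below with the standard library only; parse_time is a hypothetical helper written for this illustration, not a function of the toolkit, and how the toolkit itself reacts to malformed strings is not shown here.

```python
from datetime import datetime

# strptime pattern equivalent to YYYY-MM-DD HH:MM:SS.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_time(text):
    """Hypothetical helper: parse a timestamp or raise ValueError if malformed."""
    return datetime.strptime(text, TIME_FORMAT)


start = parse_time("2015-08-01 17:36:15")
end = parse_time("2015-08-09 17:36:15")

# A valid interval should start before it ends.
if start >= end:
    raise ValueError("start_time must be earlier than end_time")

interval = end - start
print("interval length:", interval)
```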

Setup transformations for data augmentation

The transformation functions are described in the torchvision transforms documentation.

  • Here the transforms variable should be a list of torchvision transform functions, as shown below:
transforms1 = [
    transforms.ColorJitter(brightness=(0.25,1.25), contrast=(0.25,2.00), saturation=(0.25,2.25)),
    transforms.RandomRotation(15,expand=False,fill=110)
]

Initializing data loader

  • dataset = object of the filament dataset class.
  • batch_size = number of filaments to be generated per batch.
  • filament_ratio = tuple of three values, i.e., the ratios of left, right and unidentified chirality to be generated in each batch.
  • n_batchs = number of batches.
  • transforms = list of torchvision transformation functions.
  • image_dim = image dimension; if image_dim is -1, the image is not resized, i.e., the output keeps the original image size.
data_loader = FilamentDataLoader(dataset = dataset,batch_size = 3 , filament_ratio = (1, 1, 1),n_batchs = 10, 
                                 transforms = transforms1, image_dim = 224)

How to generate 3 filament images per batch, with a left : right : unidentified ratio of 1 : 1 : 1, for 10 batches at the original image dimension, and display the images?

data_loader = FilamentDataLoader(dataset = dataset,batch_size = 3 , filament_ratio = (1, 1, 1),
                                 n_batchs = 10, transforms = transforms1, image_dim = -1)

Batch 1: augmented filament images and their labels (1 = R, 0 = U, -1 = L)

import matplotlib.pyplot as plt  # needed for plt.imshow and plt.show

# Display each original/augmented pair of the first batch, then stop.
for original_imgs, transformed_imgs, labels in data_loader:
    for org_img, img, label in zip(original_imgs, transformed_imgs, labels):
        print("Original image")
        plt.imshow(org_img, cmap='gray')
        plt.show()
        print("Transformed image")
        plt.imshow(img, cmap='gray')
        plt.show()
        print("Label", label)
    break

How to generate 12 filament images per batch, with a left : right : unidentified ratio of 2 : 3 : 1, for 5 batches at an image dimension of 224x224?

data_loader = FilamentDataLoader(dataset = dataset,batch_size = 12 , filament_ratio = (2, 3, 1),
                                 n_batchs = 5, transforms = transforms1, image_dim = 224)
for _, imgs, labels in data_loader:
    print("size of images ",imgs.shape)
    print("labels for each batch ",labels)
size of images  torch.Size([12, 224, 224])
labels for each batch  tensor([[-1],
        [ 1],
        [ 1],
        [-1],
        [-1],
        [ 0],
        [ 1],
        [ 1],
        [-1],
        [ 1],
        [ 0],
        [ 1]])
size of images  torch.Size([12, 224, 224])
labels for each batch  tensor([[-1],
        [-1],
        [ 1],
        [ 1],
        [-1],
        [ 1],
        [ 1],
        [-1],
        [ 1],
        [ 0],
        [ 0],
        [ 1]])
size of images  torch.Size([12, 224, 224])
labels for each batch  tensor([[ 1],
        [ 1],
        [-1],
        [ 1],
        [-1],
        [ 1],
        [ 1],
        [-1],
        [-1],
        [ 1],
        [ 0],
        [ 0]])
size of images  torch.Size([12, 224, 224])
labels for each batch  tensor([[ 1],
        [ 1],
        [-1],
        [ 1],
        [-1],
        [ 1],
        [ 0],
        [-1],
        [-1],
        [ 1],
        [ 0],
        [ 1]])
size of images  torch.Size([12, 224, 224])
labels for each batch  tensor([[ 1],
        [-1],
        [ 1],
        [-1],
        [ 1],
        [ 1],
        [ 0],
        [ 1],
        [ 0],
        [-1],
        [ 1],
        [-1]])
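
The label counts printed above are consistent with batch_size being split in proportion to filament_ratio: each batch of 12 holds 4 left, 6 right and 2 unidentified filaments. The arithmetic can be sketched as follows. counts_per_batch is a hypothetical helper written for this illustration, not the toolkit's implementation, and rejecting a batch_size that is not divisible by the ratio sum is an assumption of the sketch only.

```python
def counts_per_batch(batch_size, filament_ratio):
    """Hypothetical helper: split batch_size across (L, R, U) by ratio."""
    ratio_sum = sum(filament_ratio)
    if ratio_sum <= 0:
        raise ValueError("filament_ratio must contain at least one positive entry")
    # ASSUMPTION of this sketch: only exact splits are accepted.
    if batch_size % ratio_sum != 0:
        raise ValueError("batch_size must be a multiple of sum(filament_ratio)")
    unit = batch_size // ratio_sum
    return tuple(part * unit for part in filament_ratio)


left, right, unidentified = counts_per_batch(12, (2, 3, 1))
print("L:", left, "R:", right, "U:", unidentified)
```

A ratio entry of 0 gives a count of 0 for that class, so the same arithmetic covers batches that leave one chirality out.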

How to generate 10 filament images per batch, with left and right chirality only, for 5 batches at an image dimension of 224x224?

  • To remove one type of chirality, set its entry in the filament ratio tuple (L, R, U) to 0:
    • L=0 means left chirality is eliminated. The same applies to the other types.
data_loader = FilamentDataLoader(dataset = dataset,batch_size = 10 , filament_ratio = (1, 1, 0),
                                 n_batchs = 5, transforms = transforms1, image_dim = 224)
for _, imgs, labels in data_loader:
    print("size of images ",imgs.shape)
    print("labels for each batch ",labels)
size of images  torch.Size([10, 224, 224])
labels for each batch  tensor([[-1],
        [ 1],
        [-1],
        [ 1],
        [ 1],
        [-1],
        [-1],
        [-1],
        [ 1],
        [ 1]])
size of images  torch.Size([10, 224, 224])
labels for each batch  tensor([[ 1],
        [-1],
        [-1],
        [ 1],
        [-1],
        [-1],
        [ 1],
        [-1],
        [ 1],
        [ 1]])
size of images  torch.Size([10, 224, 224])
labels for each batch  tensor([[ 1],
        [ 1],
        [-1],
        [-1],
        [ 1],
        [-1],
        [-1],
        [-1],
        [ 1],
        [ 1]])
size of images  torch.Size([10, 224, 224])
labels for each batch  tensor([[ 1],
        [-1],
        [ 1],
        [ 1],
        [-1],
        [ 1],
        [-1],
        [-1],
        [ 1],
        [-1]])
size of images  torch.Size([10, 224, 224])
labels for each batch  tensor([[-1],
        [ 1],
        [-1],
        [ 1],
        [-1],
        [ 1],
        [ 1],
        [-1],
        [ 1],
        [-1]])
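
To confirm that a batch contains only the requested classes, its labels can be tallied. The sketch below uses a plain Python list holding the labels of the first batch printed above, together with the label coding stated earlier (1 = R, 0 = U, -1 = L). Converting a real labels tensor with labels.flatten().tolist() would give such a list; that step is left out so the sketch runs without PyTorch.

```python
from collections import Counter

# Label coding used by the data loader output: 1 = R, 0 = U, -1 = L.
LABEL_NAMES = {-1: "L", 0: "U", 1: "R"}

# Labels of the first batch printed above for filament_ratio = (1, 1, 0).
first_batch_labels = [-1, 1, -1, 1, 1, -1, -1, -1, 1, 1]

# Count how many filaments of each chirality the batch contains.
tally = Counter(LABEL_NAMES[value] for value in first_batch_labels)

for name in ("L", "R", "U"):
    print(name, tally[name])
```

A Counter returns 0 for a key it has never seen, so the unidentified class shows up with a count of 0 instead of raising an error.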

