Radio Galaxy Dataset

This Radio Galaxy Dataset combines several catalogues that are based on the FIRST radio galaxy survey [1]. The following license applies to the images from the FIRST radio galaxy survey:

"Provenance: The FIRST project team: R.J. Becker, D.H. Helfand, R.L. White, M.D. Gregg, S.A. Laurent-Muehleisen. Copyright: 1994, University of California. Permission is granted for publication and reproduction of this material for scholarly, educational, and private non-commercial use. Inquiries for potential commercial uses should be addressed to: Robert Becker, Physics Dept, University of California, Davis, CA 95616"

Further, the following catalogues are included in this dataset:

  • MiraBest [2], Source
  • Gendre [3-4], Supplementary Data: mnras0404-1719-SD1.pdf, data tables CoNFIG-1 to CoNFIG-4
  • Capetti 2017a [5], Table
  • Capetti 2017b [6], Table
  • Baldi 2018 [7], Table
  • Proctor [8], Table, data from Table 1 with label “WAT” and “NAT”

The dataset defines the four classes FRI, FRII, Compact and Bent, which are assigned the following labels:

classes   Label
FRI       0
FRII      1
Compact   2
Bent      3

The dataset has the following total number of samples per class.

classes/split   FRI   FRII   Compact   Bent   Total
total           495   924    391       348    2158

We provide two splitting options for the dataset. The first option (galaxy_data_h5.zip) splits the data into train, valid and test sets with the following number of samples per class.

classes/split   FRI   FRII   Compact   Bent   Total
train           395   824    291       248    1758
valid           50    50     50        50     200
test            50    50     50        50     200
total           495   924    391       348    2158

The second option (galaxy_data_crossvalid_0_h5.zip to galaxy_data_crossvalid_4_h5.zip and galaxy_data_crossvalid_test_h5.zip) provides a 5-fold cross-validation dataset with a larger test set. Each fold has the following number of samples per class.

classes/split        FRI   FRII   Compact   Bent   Total
5-fold cross train   316   659    232       198    1405
5-fold cross valid   79    165    59        50     353
test                 100   100    100       100    400
total                495   924    391       348    2158
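
The per-class counts in the split tables can be cross-checked with a few lines of Python. The following is only a sketch: the numbers are copied from the tables above, while the names CLASSES, TOTAL, SPLITS, CROSSVALID and column_sums are chosen here for illustration and are not part of the package.

```python
# Per-class sample counts copied from the tables above.
CLASSES = ("FRI", "FRII", "Compact", "Bent")
TOTAL = (495, 924, 391, 348)

# First splitting option (galaxy_data_h5.zip).
SPLITS = {
    "train": (395, 824, 291, 248),
    "valid": (50, 50, 50, 50),
    "test": (50, 50, 50, 50),
}

# Second splitting option (one cross-validation fold plus the larger test set).
CROSSVALID = {
    "5-fold cross train": (316, 659, 232, 198),
    "5-fold cross valid": (79, 165, 59, 50),
    "test": (100, 100, 100, 100),
}


def column_sums(rows):
    """Sum the counts of several splits class by class."""
    return tuple(sum(column) for column in zip(*rows))


# Both splitting options add up to the same per-class totals.
assert column_sums(SPLITS.values()) == TOTAL
assert column_sums(CROSSVALID.values()) == TOTAL

# The train and valid parts of a fold together match the train split
# of the first option, class by class.
fold = column_sums([CROSSVALID["5-fold cross train"], CROSSVALID["5-fold cross valid"]])
assert fold == SPLITS["train"]

print(dict(zip(CLASSES, TOTAL)), sum(TOTAL))
```

The train split is clearly imbalanced (824 FRII versus 248 Bent samples), which may be worth keeping in mind when choosing a loss function or sampling strategy.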

Installation for usage with pytorch

If you want to use the dataset via the pytorch dataset class FIRSTGalaxyData, first install the necessary packages with

pip3 install -r requirements.txt

Otherwise, you can use the dataset without pytorch, either

  • directly with *.png files on disk or
  • load the dataset directly from the HDF5 file.

Both options are described further below.

Usage with pytorch

from firstgalaxydata import FIRSTGalaxyData
import torchvision.transforms as transforms

# Convert the images to tensors and normalize each of the three channels.
transformRGB = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

# Select the training split stored in galaxy_data_h5.h5.
data = FIRSTGalaxyData(root="./", selected_split="train",
                       input_data_list=["galaxy_data_h5.h5"],
                       is_PIL=True, is_RGB=True, transform=transformRGB)

print(data)

This will print out the following output:

    Selected classes: dict_values(['FRI', 'FRII', 'Compact', 'Bent'])
    Number of datapoints in total: 1758
    Number of datapoint in class FRI: 395
    Number of datapoint in class FRII: 824
    Number of datapoint in class Compact: 291
    Number of datapoint in class Bent: 248
    Split: train
    Root Location: ./
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
                         )
    Target Transforms (if any): None

Options

With selected_split the data split is selected. Choose either "train" or "valid" or "test".

With selected_classes only data containing the chosen classes is returned, e.g. ["FRI", "FRII"] returns only FRI and FRII images.

With selected_catalogues the dataset uses only the selected catalogues. All possible catalogues are listed here:

selected_catalogues= ["Gendre", "MiraBest", "Capetti2017a", "Capetti2017b", "Baldi2018", "Proctor_Tab1"]

data = FIRSTGalaxyData(root="./", selected_split="train", input_data_list=["galaxy_data_h5.h5"], selected_catalogues=selected_catalogues, is_PIL=True, is_RGB=True, transform=transformRGB)

Basic usage with files on disk

You will also find the dataset in the 'galaxy_data' folder after unzipping galaxy_data.zip. It contains the following folder structure with *.png images. The most important information is also part of the file name, separated by underscores: RA_DEC_Label_Source.png, e.g. 14.084_-9.608_3_MiraBest.png

galaxy_data
│
├───all
│   ├───Bent
│   │       *.png
│   ├───Compact
│   │       *.png
│   ├───FRI
│   │       *.png
│   └───FRII
│           *.png
│
├───test
│   ├───Bent
│   │       *.png
│   ├───Compact
│   │       *.png
│   ├───FRI
│   │       *.png
│   └───FRII
│           *.png
│
├───train
│   ├───Bent
│   │       *.png
│   ├───Compact
│   │       *.png
│   ├───FRI
│   │       *.png
│   └───FRII
│           *.png
│
└───valid
    ├───Bent
    │       *.png
    ├───Compact
    │       *.png
    ├───FRI
    │       *.png
    └───FRII
            *.png
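
The file name convention RA_DEC_Label_Source.png can be decoded with the standard library alone. The following is a minimal sketch, not part of the package: the names GalaxyFile, parse_filename and LABEL_NAMES are hypothetical, and it assumes that RA and DEC are written as plain decimal numbers and the label as an integer, as in the example file name above.

```python
from pathlib import Path
from typing import NamedTuple

# Label values as given in the class table of this README.
LABEL_NAMES = {0: "FRI", 1: "FRII", 2: "Compact", 3: "Bent"}


class GalaxyFile(NamedTuple):
    ra: float         # right ascension taken from the file name
    dec: float        # declination taken from the file name
    label: int        # integer class label
    class_name: str   # human-readable class name
    source: str       # catalogue the sample comes from


def parse_filename(path):
    """Split 'RA_DEC_Label_Source.png' into its four components."""
    stem = Path(path).stem
    # Split at most three times, because a source name such as
    # "Proctor_Tab1" contains an underscore itself.
    ra, dec, label, source = stem.split("_", 3)
    label = int(label)
    return GalaxyFile(float(ra), float(dec), label, LABEL_NAMES[label], source)


info = parse_filename("galaxy_data/train/Bent/14.084_-9.608_3_MiraBest.png")
print(info.class_name, info.source)
```

To index a whole split, the same function could be applied to every path returned by Path("galaxy_data/train").glob("*/*.png").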

Basic usage with HDF5 file

The dataset can also be accessed via the HDF5 file galaxy_data_h5.h5. Every data entry consists of a group named data_$(i) with i=1...n where n is the total number of data entries. Each group consists of the following data:

  • Img: two-dimensional uint8 array of shape (300, 300), with the following attributes:
    • RA: right ascension, equatorial coordinate system (J2000): double
    • DEC: declination, equatorial coordinate system (J2000): double
    • Source: string, ["Gendre", "MiraBest", "Capetti2017a", "Capetti2017b", "Baldi2018", "Proctor_Tab1"]
    • Filepath_literature: string, relative path to the *.png file in the folder galaxy_data
  • Label_literature: double scalar, 0: "FRI", 1: "FRII", 2: "Compact", 3: "Bent"
  • Split_literature: string, ["train", "test", "valid"]
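
The layout described above could be read along the following lines. This is a hedged sketch rather than the package's own loader: read_entry, iter_split, as_text and LABEL_NAMES are hypothetical names, and it assumes that Label_literature and Split_literature are scalar datasets inside each group and that RA, DEC, Source and Filepath_literature are attributes of Img. If the file stores them differently, the access paths need to be adapted.

```python
# Label values as given in the list above.
LABEL_NAMES = {0: "FRI", 1: "FRII", 2: "Compact", 3: "Bent"}


def as_text(value):
    """Return a string, whether the HDF5 library hands back bytes or str."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def read_entry(group):
    """Collect one data_i group into a plain dictionary."""
    img = group["Img"]
    # The label is stored as a double, so convert it to an int for lookups.
    label = int(group["Label_literature"][()])
    return {
        "image": img[()],  # full (300, 300) uint8 array
        "ra": float(img.attrs["RA"]),
        "dec": float(img.attrs["DEC"]),
        "source": as_text(img.attrs["Source"]),
        "filepath": as_text(img.attrs["Filepath_literature"]),
        "label": label,
        "class_name": LABEL_NAMES[label],
        "split": as_text(group["Split_literature"][()]),
    }


def iter_split(h5file, split):
    """Yield all entries of an open HDF5 file that belong to one split."""
    for name in h5file:
        entry = read_entry(h5file[name])
        if entry["split"] == split:
            yield entry


# Usage sketch (requires the third-party h5py package):
#   import h5py
#   with h5py.File("galaxy_data_h5.h5", "r") as f:
#       train = list(iter_split(f, "train"))
#       print(len(train), train[0]["class_name"])
```

The helpers only rely on item access and the attrs mapping, so they do not import h5py themselves. Every image is read into memory eagerly, which is simple but may be wasteful if only the metadata is needed.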

References

[1] R. H. Becker, R. L. White, D. J. Helfand, The FIRST Survey: Faint Images of the Radio Sky at Twenty Centimeters, The Astrophysical Journal 450 (1995) 559.

[2] H. Miraghaei, P. N. Best, The nuclear properties and extended morphologies of powerful radio galaxies: the roles of host galaxy and environment, Monthly Notices of the Royal Astronomical Society (2017) stx007.

[3] M. A. Gendre, P. N. Best, J. V. Wall, The Combined NVSS-FIRST Galaxies (CoNFIG) sample - II. Comparison of space densities in the Fanaroff-Riley dichotomy, Monthly Notices of the Royal Astronomical Society (2010).

[4] M. A. Gendre, J. V. Wall, The Combined NVSS-FIRST Galaxies (CoNFIG) sample - I. Sample definition, classification and evolution, Monthly Notices of the Royal Astronomical Society (2008).

[5] A. Capetti, F. Massaro, R. D. Baldi, FRICAT: A FIRST catalog of FR I radio galaxies, Astronomy & Astrophysics 598 (2017) A49.

[6] A. Capetti, F. Massaro, R. D. Baldi, FRIICAT: A FIRST catalog of FR II radio galaxies, Astronomy & Astrophysics 601 (2017) A81.

[7] R. D. Baldi, A. Capetti, F. Massaro, FR0CAT: A FIRST catalog of FR 0 radio galaxies, Astronomy & Astrophysics 609 (2018) A1.

[8] D. D. Proctor, Morphological annotations for groups in the FIRST database, The Astrophysical Journal Supplement Series 194 (2011) 31.
