Skip to main content

CASCADe : Calibration of trAnsit Spectroscopy using CAusal Data

Project description

pipeline status

CASCADe : Calibration of trAnsit Spectroscopy using CAusal Data

At present several thousand transiting exoplanet systems have been discovered. For relatively few systems, however, a spectro-photometric characterization of the planetary atmospheres could be performed due to the tiny photometric signatures of the atmospheres and the large systematic noise introduced by the used instruments or the earth atmosphere. Several methods have been developed to deal with instrument and atmospheric noise. These methods include high precision calibration and modeling of the instruments, modeling of the noise using methods like principle component analysis or Gaussian processes and the simultaneous observations of many reference stars. Though significant progress has been made, most of these methods have drawbacks as they either have to make too many assumptions or do not fully utilize all information available in the data to negate the noise terms.

The CASCADe package, developed within the EC Horizons 2020 project Exoplanets A , implements a full spectroscopic pipeline for HST/WFC3 and Spitzer/IRS spectroscopic timeseries observations as well as lightcurve calibration and fitting functionality. The CASCADe project implements a novel “data driven” method, pioneered by Schoelkopf et al (2016) utilizing the causal connections within a data set, and uses this to calibrate the spectral timeseries data of a transiting systems, observed in a single object mode. The current code has been tested successfully on spectroscopic data obtained with the Spitzer and HST observatories, as well as JWST MIRI simulations.

Installing CASCADe

The easiest way to install the CASCADe package is to create an Anaconda environment, download the distribution from PyPi, and install the package in the designated Anaconda environment with the following commands:

conda create --name cascade python=3.9 ipython
conda activate cascade
pip install CASCADe-spectroscopy

This will install all code and scripts you need for the package to work.

NOTE: CASCADe The batman package is only guaranteed to work when using numpy version 1.22.1, and with this numpy version one should install version 0.53.1 of the numba package.

To update an existing installation to the latest release, simply use the -U option with pip:

conda activate cascade
pip install CASCADe-spectroscopy -U

Installing the required CASCADe data and examples

All necessary data needed by CASCADe for it to work properly, such as calibration files for the different spectroscopic instruments of HST and Spitzer and configuration templates, need be downloaded from the GitLab repository before running the code. To initialize the data download and setup of the CASCADe data storage one can use the following bash command:

setup_cascade.py

or alternatively from within the python interpreter:

from cascade.initialize import initialize_cascade
initialize_cascade()

The additional downloaded data also includes examples and observational data to try out the CASCADe package, which are explained below.

NOTE: The data files will be downloaded by default to a CASCADeSTORAGE/ directory in the users home directory. If a different location is preferred, please read the section on how to set the CASCADe environment variables first.

Installing alternatives for the CASCADe package

The CASCADe code can also be downloaded from GitLab directly by either using git or pip. To download and install with a single command using pip, type in a terminal the following command

pip install git+git://gitlab.com/jbouwman/CASCADe.git@master

which will download the latest version. For other releases replace the master branch with one of the available releases on GitLab. Alternatively, one can first clone the repository and then install, either using the HTTPS protocal:

git clone https://gitlab.com/jbouwman/CASCADe.git

or clone using SSH:

git clone git@gitlab.com:jbouwman/CASCADe.git

Both commands will download a copy of the files in a folder named after the project's name. You can then navigate to the directory and start working on it locally. After accessing the root folder from terminal, type

pip install .

to install the package.

In case one is installing CASCADe directly from GitLab, and one is using Anaconda, make sure a cascade environment is created and activated before using our package. For convenience, in the CASCADe main package directory an environment.yml can be found. You can use this yml file to create or update the cascade Anaconda environment. If you not already had created an cascade environment execute the following command:

conda env create -f environment.yml

In case you already have an cascade environment, you can update the necessary packages with the following command (also use this after updating CASCADe itself):

conda env update -f environment.yml

Make sure the CASCADe package is in your path. You can either set a PYTHONPATH environment variable pointing to the location of the CASCADe package on your system, or when using anaconda with the following command:

conda develop <path_to_the_CASCADe_package_on_your_system>/CASCADe

Using CASCADe

The CASCADe distribution comes with a few working examples and data sets which can be found in the examples directory of the CASCADe distribution on GitLab, and which should have been copied to the storage directory specified by the CASCADE_STORAGE_PATH environment variable. All needed parameters for running the code are defined by initialization files and a few environment variables to be set by the user.

CASCADe Environment Variables

CASCADe uses the following environment variables:


    CASCADE_STORAGE_PATH
        Default path to all functional data needed by the CASCADe package.
    CASCADE_DATA_PATH
        Default path to all observational data to be analyzed with CASCADe.
    CASCADE_SAVE_PATH
        Default path to where CASCADe saves all pipeline output.
    CASCADE_INITIALIZATION_FILE_PATH:
        Default path to CASCADe pipeline initialization files.
    CASCADE_SCRIPTS_PATH
        Default path to the CASCADe data reduction pipeline scripts for each
        observation.
    CASCADE_LOG_PATH:
        Default path to the saved CASCADe log files.
    CASCADE_WARNINGS
        Switch to show or not show warnings. Can either be 'on' or 'off'

These environment variables control where CASCADe searches and stores all required data. In case the environment variables are not set by the user, CASCADe uses default values defined in the initialize module. The main environment variable is CASCADE_STORAGE_PATH. This environment variable, if not set by the user has a default value of <user_home_directory>/CASCADeSTORAGE/. All other path settings stored in the other environment variables are set relative to this, so in principle the user only has to set the CASCADE_STORAGE_PATH variable to the preferred location, such as a directory on a larger file system rather then in the user home directory. The CASCADE_DATA_PATH, CASCADE_SAVE_PATH, CASCADE_INITIALIZATION_FILE_PATH, CASCADE_SCRIPTS_PATH and CASCADE_LOG_PATH are set by default to data/, results/, init_files/, scripts/ and logs/, respectively, all relative to the path defined by CASCADE_STORAGE_PATH. The CASCADE_WARNINGS variable is by default set to "on" to show possible warnings generated by the CASCADe code.

CASCADe Script Examples

The distribution comes with three examples demonstrating the use of the CASCADe package. These use cases, together with the observational data can be found in the 'examples/' sub-directory of the GitLab repository, and should be installed (see above) in the directory defined by the CASCADE_STORAGE_PATH and other environment variables. The examples cover both HST and Spitzer data as well as an example for a Generic spectroscopic dataset, in this case an observation with the GMOS instrument on Gemini.

Example 1a: Extracting a Spectroscopic Timeseries from HST WFC3 spectral images.

In this example we demonstrate how extract a spectral timeseries from spectral images. For this we use observations of WASP-19b with the Wide Field Camera 3 (WFC3) on board the HST observatory. As these observations were made in "Staring Mode", we use as a start the flt data product. The spectral images can be found in the HST/WFC3/WASP-19b_ibh715/SPECTRAL_IMAGES sub-directory of the main data directory specified by the CASCADE_DATA_PATH environment variable.

The pipeline script for this example can be found in the HST/WFC3/ sub-directory of the scripts directory as specified by the CASCADE_SCRIPTS_PATH environment variable. To run this example, execute the following script:

python3 run_CASCADe_WASP19b_extract_timeseries.py

The individual steps in this script are commented and a detailed explanation on the reduction steps can be found in the CASCADe documentation (see below). Note that while running the script, interactive plots are opened which the user needs to close before the script continues. One can prevent plots from opening by un-commenting the line matplotlib.use('AGG') in the python script.

The initialization files for this example can be found in the HST/WFC3/WASP-19b_ibh715_example sub-directory of the directory specified by the CASCADE_INITIALIZATION_FILE_PATH environment variable. The cascade_WASP19b_object.ini contains all parameters specifying the target star and planet. The ephemeris and orbital period given in the .ini file are used to calculated an orbital phase which is attached to the individual spectra in the final timeseries. In case of HST WFC3 spectra, as is the case here, the stellar parameters are used to calculated an expected spectrum, which is then used to determine a global shift of the spectrum in the wavelength direction. It is, therefore, important that the effective temperature, logg and metallicity parameters are close as possible to the correct values of the system which the observations are being analyzed. All other parameters controlling the behavior of CASCADe are given in the cascade_WASP19b_extract_timeseries.ini initialization file. The parameters are grouped different sections in a logical way, such as parameters controlling the processing steps, describing the observations and data and overall behavior of the code. Detailed information on each of these parameters can be found in the documentation (see below.)

Executing the script results in the extraction of the individual (1d) spectra. These are stored as fits files in the sub-directory SPECTRA at the same level as the SPECTRAL_IMAGES directory containing the spectral images. Two types of files are written: COE, which are the CASCADe Optimal Extraction spectra and CAE, which are the CASCADe Aperture Extraction spectra, produced using respectively, optimal extraction or an extraction aperture. For further details we refer to the documentation. Next to the spectral fits files, also several diagnostic plots are produced which can be found in the HST/WFC3/WASP-19b_ibh715_transit_output_from_extract_timeseries sub-directory of the directory specified by the CASCADE_SAVE_PATH environment variable.

Example 1b: Calibrating a HST WFC3 Spectroscopic Timeseries and extracting a transit spectrum.

After creating a spectral timeseries from the spectral images in Example 1a, we can proceed with the calibration of the spectral lightcurves and deriving the planetary spectrum. To demonstrate how to use CASCADe for this, we will take the extracted COE spectra of the WASP 19b observation from Example 1a and proceed to characterize the systematics and extract the transit spectrum. To run Example 1b, execute the following script:

python3 run_CASCADe_WASP19b_calibrate_planet_spectrum.py

As with the first example, the individual steps in this script are commented and a detailed explanation on the reduction steps can be found in the CASCADe documentation (see below). The initialization files for this example can be found in the same directory as the initialization files for example 1a. We use again the cascade_WASP19b_object.ini file to specify the target star and orbital parameters. In contrast to the previous example, all system parameters specified in this initialization file are relevant.

NOTE: The current CASCADe version only fits for the transit depth. All other system parameters such as the ephemeris, period, inclination and semi-major axis are fixed to the values specified in the initialization file.

All other parameters controlling the behavior of the CASCADe pipline are given in the cascade_WASP19b_calibrate_planet_spectrum.ini initialization file. A detailed explanation of the control parameters is given in the documentation.

The resulting transit spectrum and diagnostic plots are stored in the HST/WFC3/WASP-19b_ibh715_transit_from_hst_wfc3_spectra sub-directory of the directory specified by CASCADE_SAVE_PATH environment variable. The calibrated spectrum of WASP 19b is stored in the WASP-19b_ibh715_bootstrapped_exoplanet_spectrum.fits and the derived systematics for this dataset in WASP-19b_ibh715_bootstrapped_systematics_model.fits

Example 2: Calibrating a Spitzer/IRS Spectroscopic Timeseries and extracting a transit spectrum.

The CASCADe package can not only calibrate observations with the WFC3 instrument onboard HST, but can also handle transit spectroscopy observations with the IRS instrument onboard the Spitzer Space Observatory. As an example, we analyze Spitzer/IRS observations of an eclipse of HD189733b, using the with the CASCADe package pre-extracted COE spectral data product. The data can be found in the SPITZER/IRS/HD189733b_AOR23439616/SPECTRA/ sub-directory of the main data directory specified by the CASCADE_DATA_PATH environment variable. The pipeline script for this example can be found in the SPITZER/IRS/ sub-directory of the scripts directory as specified by the CASCADE_SCRIPTS_PATH environment variable. To run Example 2, execute the following script:

python3 run_CASCADe_HD189733b_calibrate_planet_spectrum.py

The pipeline steps used in this example are identical to the ones of Example 1b. The initialization files for example 2 can be found in the SPITZER/IRS/HD189733b_AOR23439616_example sub-directory of the directory specified by the CASCADE_INITIALIZATION_FILE_PATH environment variable. Similar to the first example, the cascade_HD189733b_object.ini file contains all parameters specifying the target star and orbital parameters, while the cascade_HD189733b_calibrate_planet_spectrum.ini initialization file specifies all other parameters controlling the behavior of the CASCADe pipeline. The HD189733b eclipse spectrum and diagnostic plots are stored in the SPITZER/IRS/HD189733b_AOR23439616_eclipse_from_spitzer_irs_spectra sub-directory of the directory specified by the CASCADE_SAVE_PATH environment variable.

Example 3: Calibrating a GEMINI/GMOS Spectroscopic Timeseries and extracting a transit spectrum.

As a final example we show how to use CASCADe for spectral timeseries extracted with another software package for a generic instrument. Though spectral extraction from spectral images or cubes is currently only implemented for the WFC3 instrument of HST and the IRS instrument of Spitzer, the calibration of spectral lightcurves and derivation of the planetary spectrum can be performed for any generic spectroscopic timeseries. The previous examples showed how to use CASCADe with HST and Spitzer observations. In this example we use an observation of WASP-103b with the GMOS instrument installed at the Gemini telescope (See Lendl et al 2017, A&A 606).

The spectral timeseries data for this example is located in the Generic/Gemini/GMOS/WASP103b/SPECTRA/ sub-directory of the main data directory. To be able to run this example we stored the GMOS spectra as fits files with an identical format as the spectral fits files created by CASCADe . The pipeline script is located in the Generic/Gemini/GMOS/ sub-directory in the scripts directory.

To run this example, execute the following script:

python3 run_CASCADe_WASP103b_calibrate_planet_spectrum.py

The initialization files for example 2 can be found in the Generic/Gemini/GMOS/WASP-103b_example/ sub-directory of the directory specified by the CASCADE_INITIALIZATION_FILE_PATH environment variable. Similar to the other examples, the cascade_WASP103b_object.ini initialization file contains all parameters defining the system, and the cascade_WASP103b_calibrate_planet_spectrum.ini file contains all other parameters needed by the CASCADe pipeline. The WASP-103 b transit spectrum and diagnostic plots are stored in the Generic/Gemini/GMOS/WASP103b_transit_from_generic_instrument/ sub-directory of the directory specified by the CASCADE_SAVE_PATH environment variable.

Documentation

The full documentation can be found online at:


https://jbouwman.gitlab.io/CASCADe/

The full documentation includes further descriptions of the pipeline, initialization files and the CASCADe modules.

Alternatively, the documentation can be found in the docs directory of the CASCADe GitLab repository. After cloning the git repository, the full documentation can be generated by executing in the in the docs directory the following commands:

make html
make latexpdf

The generated HTML and PDF files will be located in the build/html and build/latex sub-directories of the main documentation directory, respectively.

Acknowledgments

The CASCADe code was developed by Jeroen Bouwman, with contributions from the following collaborators:

Fred Lahuis (SRON)
Rene Gastaud (CEA)
Raphael Peralta (CEA)
Matthias Samland (MPIA)

This work was supported by the European Unions Horizon 2020 Research and Innovation Programme, under Grant Agreement N 776403.

Publications

https://ui.adsabs.harvard.edu/abs/2021AJ....161..284M/abstract

https://ui.adsabs.harvard.edu/abs/2021A%26A...646A.168C/abstract

https://ui.adsabs.harvard.edu/abs/2020ASPC..527..179L/abstract

https://exoplanet-talks.org/talk/271

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

CASCADe-spectroscopy-1.2.9.tar.gz (193.8 kB view hashes)

Uploaded Source

Built Distribution

CASCADe_spectroscopy-1.2.9-py3-none-any.whl (224.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page