CASCADe : Calibration of trAnsit Spectroscopy using CAusal Data
Project description
CASCADe : Calibration of trAnsit Spectroscopy using CAusal Data
At present several thousand transiting exoplanet systems have been discovered. For relatively few systems, however, a spectro-photometric characterization of the planetary atmospheres could be performed due to the tiny photometric signatures of the atmospheres and the large systematic noise introduced by the used instruments or the earth atmosphere. Several methods have been developed to deal with instrument and atmospheric noise. These methods include high precision calibration and modeling of the instruments, modeling of the noise using methods like principle component analysis or Gaussian processes and the simultaneous observations of many reference stars. Though significant progress has been made, most of these methods have drawbacks as they either have to make too many assumptions or do not fully utilize all information available in the data to negate the noise terms.
The CASCADe package, developed within the EC Horizons 2020 project Exoplanets A , implements a full spectroscopic pipeline for HST/WFC3 and Spitzer/IRS spectroscopic timeseries observations as well as lightcurve calibration and fitting functionality. The CASCADe project implements a novel “data driven” method, pioneered by Schoelkopf et al (2016) utilizing the causal connections within a data set, and uses this to calibrate the spectral timeseries data of a transiting systems, observed in a single object mode. The current code has been tested successfully on spectroscopic data obtained with the Spitzer and HST observatories, as well as JWST MIRI simulations.
Installing CASCADe
The easiest way to install the CASCADe package is to create an Anaconda environment, download the distribution from PyPi, and install the package in the designated Anaconda environment with the following commands:
conda create --name cascade python=3.9 ipython
conda activate cascade
pip install CASCADe-spectroscopy
This will install all code and scripts you need for the package to work.
NOTE: CASCADe The batman package is only guaranteed to work when using numpy version 1.22.1, and with this numpy version one should install version 0.53.1 of the numba package.
To update an existing installation to the latest release, simply use the -U option with pip:
conda activate cascade
pip install CASCADe-spectroscopy -U
Installing the required CASCADe data and examples
All necessary data needed by CASCADe for it to work properly, such as calibration files for the different spectroscopic instruments of HST and Spitzer and configuration templates, need be downloaded from the GitLab repository before running the code. To initialize the data download and setup of the CASCADe data storage one can use the following bash command:
setup_cascade.py
or alternatively from within the python interpreter:
from cascade.initialize import initialize_cascade
initialize_cascade()
The additional downloaded data also includes examples and observational data to try out the CASCADe package, which are explained below.
NOTE: The data files will be downloaded by default to a
CASCADeSTORAGE/
directory in the users home directory. If a different location is preferred, please read the section on how to set the CASCADe environment variables first.
Installing alternatives for the CASCADe package
The CASCADe code can also be downloaded from GitLab directly by either using git or pip. To download and install with a single command using pip, type in a terminal the following command
pip install git+git://gitlab.com/jbouwman/CASCADe.git@master
which will download the latest version. For other releases replace the master
branch with one of the available releases on GitLab. Alternatively, one can first
clone the repository and then install, either using the HTTPS protocal:
git clone https://gitlab.com/jbouwman/CASCADe.git
or clone using SSH:
git clone git@gitlab.com:jbouwman/CASCADe.git
Both commands will download a copy of the files in a folder named after the project's name. You can then navigate to the directory and start working on it locally. After accessing the root folder from terminal, type
pip install .
to install the package.
In case one is installing CASCADe directly from GitLab, and one is using Anaconda, make sure a cascade environment is created and activated before using our package. For convenience, in the CASCADe main package directory an environment.yml can be found. You can use this yml file to create or update the cascade Anaconda environment. If you not already had created an cascade environment execute the following command:
conda env create -f environment.yml
In case you already have an cascade environment, you can update the necessary packages with the following command (also use this after updating CASCADe itself):
conda env update -f environment.yml
Make sure the CASCADe package is in your path.
You can either set a PYTHONPATH
environment variable pointing to the location of the
CASCADe package on your system, or when using
anaconda with the following command:
conda develop <path_to_the_CASCADe_package_on_your_system>/CASCADe
Using CASCADe
The CASCADe distribution comes with a few
working examples and data sets which can be found in the examples directory of
the CASCADe distribution on GitLab, and which
should have been copied to the storage directory specified by the
CASCADE_STORAGE_PATH
environment variable. All needed parameters for running
the code are defined by initialization files and a few environment variables to
be set by the user.
CASCADe Environment Variables
CASCADe uses the following environment variables:
CASCADE_STORAGE_PATH
Default path to all functional data needed by the CASCADe package.
CASCADE_DATA_PATH
Default path to all observational data to be analyzed with CASCADe.
CASCADE_SAVE_PATH
Default path to where CASCADe saves all pipeline output.
CASCADE_INITIALIZATION_FILE_PATH:
Default path to CASCADe pipeline initialization files.
CASCADE_SCRIPTS_PATH
Default path to the CASCADe data reduction pipeline scripts for each
observation.
CASCADE_LOG_PATH:
Default path to the saved CASCADe log files.
CASCADE_WARNINGS
Switch to show or not show warnings. Can either be 'on' or 'off'
These environment variables control where CASCADe
searches and stores all required data. In case the environment variables are not
set by the user, CASCADe uses default values
defined in the initialize
module. The main environment variable is
CASCADE_STORAGE_PATH
. This environment variable, if not set by the user has a
default value of <user_home_directory>/CASCADeSTORAGE/
. All other path settings
stored in the other environment variables are set relative to this, so in principle
the user only has to set the CASCADE_STORAGE_PATH
variable to the preferred location,
such as a directory on a larger file system rather then in the user home directory.
The CASCADE_DATA_PATH
, CASCADE_SAVE_PATH
, CASCADE_INITIALIZATION_FILE_PATH
,
CASCADE_SCRIPTS_PATH
and CASCADE_LOG_PATH
are set by default to
data/
, results/
, init_files/
, scripts/
and logs/
, respectively, all
relative to the path defined by CASCADE_STORAGE_PATH
.
The CASCADE_WARNINGS
variable is by default set to "on"
to show possible
warnings generated by the CASCADe code.
CASCADe Script Examples
The distribution comes with three examples demonstrating the use of the
CASCADe package. These use cases, together
with the observational data can be found in the 'examples/' sub-directory of the
GitLab repository, and should be installed (see above) in the directory defined
by the CASCADE_STORAGE_PATH
and other environment variables. The examples cover
both HST and Spitzer data as well as an example for a Generic spectroscopic
dataset, in this case an observation with the GMOS instrument on Gemini.
Example 1a: Extracting a Spectroscopic Timeseries from HST WFC3 spectral images.
In this example we demonstrate how extract a spectral timeseries from spectral images.
For this we use observations of WASP-19b with the Wide Field Camera 3 (WFC3) on
board the HST observatory. As these observations were made in "Staring Mode"
,
we use as a start the flt
data product. The spectral images can be found in the
HST/WFC3/WASP-19b_ibh715/SPECTRAL_IMAGES
sub-directory of the main data directory
specified by the CASCADE_DATA_PATH
environment variable.
The pipeline script for this example can be found in the HST/WFC3/
sub-directory
of the scripts directory as specified by the CASCADE_SCRIPTS_PATH
environment
variable. To run this example, execute the following script:
python3 run_CASCADe_WASP19b_extract_timeseries.py
The individual steps in this script are commented and a detailed explanation on the
reduction steps can be found in the CASCADe
documentation (see below). Note that while running the script, interactive plots are
opened which the user needs to close before the script continues. One can prevent plots
from opening by un-commenting the line matplotlib.use('AGG')
in the python script.
The initialization files for this example can be found in the
HST/WFC3/WASP-19b_ibh715_example
sub-directory of the directory specified by the
CASCADE_INITIALIZATION_FILE_PATH
environment variable.
The cascade_WASP19b_object.ini
contains all parameters specifying the target star
and planet. The ephemeris and orbital period given in the .ini
file are used
to calculated an orbital phase which is attached to the individual spectra in the
final timeseries. In case of HST WFC3 spectra, as is the case here, the stellar
parameters are used to calculated an expected spectrum, which is then used to
determine a global shift of the spectrum in the wavelength direction. It is,
therefore, important that the effective temperature
, logg
and metallicity
parameters are close as possible to the correct values of the system which the
observations are being analyzed. All other parameters controlling the behavior of
CASCADe are given in the
cascade_WASP19b_extract_timeseries.ini
initialization file. The parameters are
grouped different sections in a logical way, such as parameters controlling the
processing steps, describing the observations and data and overall behavior of
the code. Detailed information on each of these parameters can be found in the
documentation (see below.)
Executing the script results in the extraction of the individual (1d) spectra.
These are stored as fits files in the sub-directory SPECTRA
at the same level
as the SPECTRAL_IMAGES
directory containing the spectral images. Two types of
files are written: COE
, which are the CASCADe
Optimal Extraction
spectra and CAE
, which are the
CASCADe Aperture Extraction
spectra, produced
using respectively, optimal extraction or an extraction aperture. For further details
we refer to the documentation. Next to the spectral fits files, also several
diagnostic plots are produced which can be found in the
HST/WFC3/WASP-19b_ibh715_transit_output_from_extract_timeseries
sub-directory
of the directory specified by the CASCADE_SAVE_PATH
environment variable.
Example 1b: Calibrating a HST WFC3 Spectroscopic Timeseries and extracting a transit spectrum.
After creating a spectral timeseries from the spectral images in Example 1a,
we can proceed with the calibration of the spectral lightcurves and deriving the
planetary spectrum. To demonstrate how to use
CASCADe for this, we will take the extracted
COE
spectra of the WASP 19b observation from Example 1a and proceed to
characterize the systematics and extract the transit spectrum. To run Example 1b,
execute the following script:
python3 run_CASCADe_WASP19b_calibrate_planet_spectrum.py
As with the first example, the individual steps in this script are commented and
a detailed explanation on the reduction steps can be found in the
CASCADe documentation (see below).
The initialization files for this example can be found in the same directory as
the initialization files for example 1a. We use again the
cascade_WASP19b_object.ini
file to specify the target star and orbital
parameters. In contrast to the previous example, all system parameters specified
in this initialization file are relevant.
NOTE: The current CASCADe version only fits for the transit depth. All other system parameters such as the ephemeris, period, inclination and semi-major axis are fixed to the values specified in the initialization file.
All other parameters controlling the behavior of
the CASCADe pipline are given in the
cascade_WASP19b_calibrate_planet_spectrum.ini
initialization file. A detailed
explanation of the control parameters is given in the documentation.
The resulting transit spectrum and diagnostic plots are stored in the
HST/WFC3/WASP-19b_ibh715_transit_from_hst_wfc3_spectra
sub-directory of the
directory specified by CASCADE_SAVE_PATH
environment variable. The calibrated spectrum
of WASP 19b is stored in the WASP-19b_ibh715_bootstrapped_exoplanet_spectrum.fits
and
the derived systematics for this dataset in WASP-19b_ibh715_bootstrapped_systematics_model.fits
Example 2: Calibrating a Spitzer/IRS Spectroscopic Timeseries and extracting a transit spectrum.
The CASCADe package can not only calibrate
observations with the WFC3 instrument onboard HST, but can also handle transit
spectroscopy observations with the IRS instrument onboard the Spitzer Space
Observatory. As an example, we analyze Spitzer/IRS observations of an eclipse
of HD189733b, using the with the CASCADe
package pre-extracted COE
spectral data product. The data can be found
in the SPITZER/IRS/HD189733b_AOR23439616/SPECTRA/
sub-directory of the main
data directory specified by the CASCADE_DATA_PATH
environment variable.
The pipeline script for this example can be found in the SPITZER/IRS/
sub-directory
of the scripts directory as specified by the CASCADE_SCRIPTS_PATH
environment
variable. To run Example 2, execute the following script:
python3 run_CASCADe_HD189733b_calibrate_planet_spectrum.py
The pipeline steps used in this example are identical to the ones of Example 1b.
The initialization files for example 2 can be found in the
SPITZER/IRS/HD189733b_AOR23439616_example
sub-directory of the directory
specified by the CASCADE_INITIALIZATION_FILE_PATH
environment variable.
Similar to the first example, the cascade_HD189733b_object.ini
file contains
all parameters specifying the target star and orbital parameters, while the
cascade_HD189733b_calibrate_planet_spectrum.ini
initialization file specifies
all other parameters controlling the behavior of the
CASCADe pipeline. The HD189733b eclipse
spectrum and diagnostic plots are stored in the
SPITZER/IRS/HD189733b_AOR23439616_eclipse_from_spitzer_irs_spectra
sub-directory
of the directory specified by the CASCADE_SAVE_PATH
environment variable.
Example 3: Calibrating a GEMINI/GMOS Spectroscopic Timeseries and extracting a transit spectrum.
As a final example we show how to use CASCADe for spectral timeseries extracted with another software package for a generic instrument. Though spectral extraction from spectral images or cubes is currently only implemented for the WFC3 instrument of HST and the IRS instrument of Spitzer, the calibration of spectral lightcurves and derivation of the planetary spectrum can be performed for any generic spectroscopic timeseries. The previous examples showed how to use CASCADe with HST and Spitzer observations. In this example we use an observation of WASP-103b with the GMOS instrument installed at the Gemini telescope (See Lendl et al 2017, A&A 606).
The spectral timeseries data for this example is located in the
Generic/Gemini/GMOS/WASP103b/SPECTRA/
sub-directory of the main data directory.
To be able to run this example we stored the GMOS spectra as fits files with an identical format as the spectral fits files created by CASCADe .
The pipeline script is located in the Generic/Gemini/GMOS/
sub-directory in
the scripts directory.
To run this example, execute the following script:
python3 run_CASCADe_WASP103b_calibrate_planet_spectrum.py
The initialization files for example 2 can be found in the
Generic/Gemini/GMOS/WASP-103b_example/
sub-directory of the directory specified
by the CASCADE_INITIALIZATION_FILE_PATH
environment variable. Similar to
the other examples, the cascade_WASP103b_object.ini
initialization file contains
all parameters defining the system, and the cascade_WASP103b_calibrate_planet_spectrum.ini
file contains all other
parameters needed by the CASCADe pipeline.
The WASP-103 b transit spectrum and diagnostic plots are stored in the
Generic/Gemini/GMOS/WASP103b_transit_from_generic_instrument/
sub-directory of the directory specified by the CASCADE_SAVE_PATH
environment
variable.
Documentation
The full documentation can be found online at:
https://jbouwman.gitlab.io/CASCADe/
The full documentation includes further descriptions of the pipeline, initialization files and the CASCADe modules.
Alternatively, the documentation can be found in the docs
directory of the
CASCADe GitLab repository.
After cloning the git repository, the full documentation can be generated
by executing in the in the docs
directory the following commands:
make html
make latexpdf
The generated HTML and PDF files will be located in the build/html
and
build/latex
sub-directories of the main documentation directory, respectively.
Acknowledgments
The CASCADe code was developed by Jeroen Bouwman, with contributions from the following collaborators:
Fred Lahuis (SRON)
Rene Gastaud (CEA)
Raphael Peralta (CEA)
Matthias Samland (MPIA)
This work was supported by the European Unions Horizon 2020 Research and Innovation Programme, under Grant Agreement N 776403.
Publications
https://ui.adsabs.harvard.edu/abs/2021AJ....161..284M/abstract
https://ui.adsabs.harvard.edu/abs/2021A%26A...646A.168C/abstract
https://ui.adsabs.harvard.edu/abs/2020ASPC..527..179L/abstract
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file CASCADe-spectroscopy-1.1.17.tar.gz
.
File metadata
- Download URL: CASCADe-spectroscopy-1.1.17.tar.gz
- Upload date:
- Size: 190.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 06586c82e951c185a2c249cfa638b52ed062eb391604570401f513797f36b0e3 |
|
MD5 | 111d3ca1b4a19221677df578d936b0d1 |
|
BLAKE2b-256 | 2db9c5590df2bd810458e0eba6030d8a211f826d0926c8d5617102449df4e3bd |
File details
Details for the file CASCADe_spectroscopy-1.1.17-py3-none-any.whl
.
File metadata
- Download URL: CASCADe_spectroscopy-1.1.17-py3-none-any.whl
- Upload date:
- Size: 222.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 35f268021661824f5fa555912a7628264ad10a98ff80f3695de9926e2bcb43f2 |
|
MD5 | 743960c4beae7966ae4410028d701912 |
|
BLAKE2b-256 | 780659f9729b0edcbb686a2c2d018e85d6d5f26a50dfe12ba5492fc5f099193b |