Search for Hints of Exoplanets fRom Lightcurves Of spaCe based seeKers

The SHERLOCK (Searching for Hints of Exoplanets fRom Lightcurves Of spaCe-based seeKers) PIPEline is a user-friendly pipeline that keeps user interaction to a minimum when working with data from the Kepler or TESS missions. SHERLOCK relies on well-known, well-tested codes that allow the exoplanet community to explore public data from space-based missions without deep knowledge of how the data are built and stored. In most cases the user only needs to provide a KOI-ID, EPIC-ID, TIC-ID or the coordinates of the host star to be searched for exoplanets.

Main Developers

Active: F.J. Pozuelos, M. Dévora

Additional contributors

A. Thuillier & L. García

Installation

The package can be installed from the PyPI repository:

python3 -m pip install sherlockpipe

Launch

You can run SHERLOCK PIPEline as a standalone package by using:

python3 -m sherlockpipe --properties my_properties.yaml

You only need to provide a YAML file containing any of the properties defined in the internal properties.yaml shipped with the pipeline. The most important keys to define in your YAML file are those under the GLOBAL OBJECTS RUN SETUP and SECTOR OBJECTS RUN SETUP sections, because they contain the object IDs or files to be analysed in the execution. You need to fill in at least one of those keys for the pipeline to do anything. If you still have doubts, please refer to the examples/properties directory.

Updates

SHERLOCK uses third-party data to identify TOIs, KOIs and EPICs and to handle FFIs and the vetting process. These data are frequently updated by the active missions, so SHERLOCK will perform better if the metadata is refreshed. You can simply run:

python3 -m sherlockpipe.update

and SHERLOCK will download the dependencies. It stores a timestamp of the last refresh to avoid unnecessary repeated calls. However, if you know there are newer updates and you need them right away, you can call:

python3 -m sherlockpipe.update --force

and SHERLOCK will ignore the timestamp and perform the update. In addition, you may want to wipe all the metadata and build it again. To do so, execute:

python3 -m sherlockpipe.update --clean

This last command implies --force, so the last execution time is ignored as well.
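
The refresh rules described above can be summarised in a few lines. The following is only a sketch of the decision logic, not SHERLOCK's actual code: the function name `should_refresh`, its parameters and the one-day default interval are assumptions made for illustration.

```python
def should_refresh(last_update, now, min_interval=86400.0, force=False, clean=False):
    """Decide whether the metadata should be downloaded again.

    last_update: epoch seconds of the last refresh, or None if never refreshed.
    now: current time in epoch seconds.
    min_interval: hypothetical minimum number of seconds between refreshes.
    """
    # --clean implies --force, and --force ignores the stored timestamp.
    if clean or force:
        return True
    # No timestamp stored yet, so nothing has been downloaded so far.
    if last_update is None:
        return True
    # Otherwise refresh only when the stored timestamp is old enough.
    return (now - last_update) >= min_interval
```

With these assumptions, a call made one hour after the last refresh is skipped unless `force` or `clean` is set.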

You can additionally let SHERLOCK refresh the OIs list before running your current execution by adding the following line to the YAML file:

UPDATE_OIS: True

Vetting

SHERLOCK PIPEline comes with a submodule to examine the most promising transit candidates found by any of its executions. This is done via LATTE. Please note that this feature is only enabled for TESS candidates. You can run the vetting by calling:

python3 -m sherlockpipe.vet --properties my_properties.yaml

This command runs the vetting process for the parameters given in your YAML file. The generated results are written to the $your_sherlock_object_results_dir/vetting directory. Please see examples/vetting/ to learn how to set the proper properties for the vetting process.

There is also a simplified way to run the vetting. If you are sure that one of the candidates in the SHERLOCK results matches your desired parameters, you can run

python3 -m sherlockpipe.vet --candidate ${theCandidateNumber}

from the SHERLOCK results directory. This execution automatically reads the transit parameters from the files generated by SHERLOCK.

Fitting

SHERLOCK PIPEline comes with another submodule to fit the most promising transit candidates found by any of its executions. The fit is done via the ALLESFITTER code. By calling:

python3 -m sherlockpipe.fit --properties my_properties.yaml

you will run the fitting process for the parameters given in your YAML file. The generated results are written to the $your_sherlock_object_results_dir/fit directory. Please see examples/fitting/ to learn how to set the proper properties for the fitting process.

There is also a simplified way to run the fit. If you are sure that one of the candidates in the SHERLOCK results matches your desired parameters, you can run

python3 -m sherlockpipe.fit --candidate ${theCandidateNumber}

from the SHERLOCK results directory. This execution automatically reads the transit and star parameters from the files generated by SHERLOCK.

SHERLOCK PIPEline Workflow

It is important to note that SHERLOCK PIPEline uses some CSV files with TOIs, KOIs and EPIC IDs from the TESS, Kepler and K2 missions. Therefore, your first execution of the pipeline might take longer because it needs to download this information.

Provisioning of light curve

The light curve for every input object needs to be obtained from its mission database. For this we use the high-level API of Lightkurve, which enables the download of the desired light curves for the TESS, Kepler and K2 missions. We also include Full Frame Images from the TESS mission by means of ELEANOR. We always use the PDCSAP signal from those provided by either of these two packages.

Pre-processing of light curve

In many cases we will find light curves that contain several systematics, such as noise, high dispersion near the borders, or intense periodicities caused by pulsators, fast rotators, etc. SHERLOCK PIPEline provides some methods to reduce the most important of these systematics.

Local noise reduction

For local noise, where very close measurements show high deviation from the local trend, we apply a Savitzky-Golay filter. This has proved to greatly increase the SNR of the transits found. This feature can be disabled with a flag.
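
To illustrate what such a filter does, here is a minimal pure-Python sketch of Savitzky-Golay smoothing with a 5-point window and a quadratic polynomial, using the standard convolution coefficients (-3, 12, 17, 12, -3)/35. The window size, the polynomial order and the handling of the edges are assumptions for the example; the settings used internally by SHERLOCK are not stated here.

```python
# Standard Savitzky-Golay coefficients for a 5-point window, polynomial order 2.
SG_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM = 35.0


def savgol_smooth(flux):
    """Smooth a list of flux values; the two samples at each edge are left untouched."""
    smoothed = list(flux)
    for i in range(2, len(flux) - 2):
        window = flux[i - 2:i + 3]
        # Weighted sum equivalent to a local least-squares quadratic fit.
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


# A single outlier is pulled back towards the local trend.
noisy = [1.0, 1.0, 1.0, 1.07, 1.0, 1.0, 1.0]
print(savgol_smooth(noisy)[3])
```

Because the filter is a local polynomial fit, smooth quadratic trends pass through unchanged while isolated deviations are damped.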

High RMS areas masking

Sometimes the spacecraft has to perform reaction-wheel momentum dumps by firing thrusters, sometimes there is high light scattering, and sometimes the spacecraft introduces some jitter into the signal. For all of these systematics we found that in many cases the data from the affected regions should be discarded. Thus, SHERLOCK PIPEline includes a binned RMS computation in which bins whose RMS value is higher than a configurable factor multiplied by the median are automatically masked. This feature can be disabled with a flag.
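
The masking rule can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline's implementation: the function name `high_rms_mask`, the fixed-width time bins, and the choice of computing the RMS as the scatter around each bin's mean are all hypothetical.

```python
import math
import statistics


def high_rms_mask(time, flux, bin_width, threshold):
    """Return a list of booleans, True for samples inside a high-RMS bin."""
    t0 = min(time)
    # Group sample indices into consecutive time bins of equal width.
    bins = {}
    for i, t in enumerate(time):
        bins.setdefault(int((t - t0) // bin_width), []).append(i)
    # RMS of the flux around the bin mean, one value per bin.
    rms = {}
    for b, idx in bins.items():
        mean = statistics.fmean(flux[i] for i in idx)
        rms[b] = math.sqrt(statistics.fmean((flux[i] - mean) ** 2 for i in idx))
    # Bins whose RMS exceeds threshold * median(RMS) are masked.
    limit = threshold * statistics.median(rms.values())
    mask = [False] * len(time)
    for b, idx in bins.items():
        if rms[b] > limit:
            for i in idx:
                mask[i] = True
    return mask
```

Using the median of the bin RMS values as the reference keeps a few very noisy bins from raising the masking limit.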

Input time ranges masking

If enabled, this feature automatically disables High RMS areas masking for the assigned object. The user can input an array of time ranges to be masked in the original signal.
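
As a rough illustration, masking a set of time ranges amounts to dropping every sample whose timestamp falls inside any of them. The helper below is hypothetical (its name, the inclusive bounds and the list-of-pairs input are assumptions) and is not taken from the SHERLOCK code base.

```python
def mask_time_ranges(time, flux, ranges):
    """Drop the samples whose time lies inside any (start, end) range, bounds included."""
    kept = [
        (t, f)
        for t, f in zip(time, flux)
        if not any(start <= t <= end for start, end in ranges)
    ]
    return [t for t, _ in kept], [f for _, f in kept]


# Remove two intervals from a short synthetic light curve.
time = [1816.0, 1816.5, 1817.0, 1817.5, 1818.0, 1818.5]
flux = [1.00, 1.01, 0.99, 1.20, 1.00, 0.80]
clean_time, clean_flux = mask_time_ranges(time, flux, [(1817.4, 1817.6), (1818.4, 1819.0)])
print(clean_time)
```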

Detrend of intense periodicities

The most common sources of intense periodicities are fast rotators, which induce a strong sinusoidal-like trend in the PDCSAP signal. This is why SHERLOCK PIPEline includes automatic detection and detrending of intense periodicities during its preparation stage. This feature can be disabled with a flag.
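
One common way to detect such a periodicity is a Lomb-Scargle periodogram, which the reporting stage also plots. The sketch below evaluates the classical Lomb-Scargle power on a grid of trial periods in pure Python. It is an assumption-laden illustration: the function names and the trial grid are hypothetical, and the detection criterion actually used by SHERLOCK is not described here.

```python
import math


def lomb_scargle_power(time, flux, period):
    """Classical (unnormalised) Lomb-Scargle power at one trial period."""
    mean = sum(flux) / len(flux)
    y = [f - mean for f in flux]
    omega = 2.0 * math.pi / period
    # Time offset that makes the sine and cosine terms orthogonal.
    tau = math.atan2(
        sum(math.sin(2.0 * omega * t) for t in time),
        sum(math.cos(2.0 * omega * t) for t in time),
    ) / (2.0 * omega)
    c = [math.cos(omega * (t - tau)) for t in time]
    s = [math.sin(omega * (t - tau)) for t in time]
    yc = sum(a * b for a, b in zip(y, c))
    ys = sum(a * b for a, b in zip(y, s))
    return 0.5 * (yc * yc / sum(v * v for v in c) + ys * ys / sum(v * v for v in s))


def strongest_period(time, flux, trial_periods):
    """Return the trial period with the highest Lomb-Scargle power."""
    return max(trial_periods, key=lambda p: lomb_scargle_power(time, flux, p))
```

The trial periods should stay above twice the sampling interval, since shorter periods are not resolved by the data.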

Input period detrend

If enabled, this feature automatically disables Detrend of intense periodicities for the assigned object. The user can input a period to be used for an initial detrend of the original signal.
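
A simple way to picture a detrend at a known period is to phase-fold the light curve, estimate the periodic trend as the median flux in each phase bin, and divide it out. The sketch below does exactly that. It is a stand-in technique chosen for brevity, not the method SHERLOCK applies, and the function name and the number of phase bins are assumptions.

```python
import statistics


def detrend_by_period(time, flux, period, n_bins=8):
    """Divide the flux by the median of its phase bin after folding at `period`."""
    # Phase bin of every sample, in the range 0 .. n_bins - 1.
    bin_of = [int(((t % period) / period) * n_bins) % n_bins for t in time]
    per_bin = {}
    for b, f in zip(bin_of, flux):
        per_bin.setdefault(b, []).append(f)
    # The median is robust against a few in-transit points inside a bin.
    trend = {b: statistics.median(values) for b, values in per_bin.items()}
    return [f / trend[b] for b, f in zip(bin_of, flux)]
```

Because the trend is a median over many cycles, a shallow dip that appears in only one cycle survives the detrend instead of being absorbed into the trend.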

Custom curve preparation

You can even inject your own Python code to perform a custom signal preparation task by implementing the CurvePreparer class that we provide. Then point the CUSTOM_PREPARER property to your Python file and let SHERLOCK use your code (see example)!

Main execution (run)

After the preparation stage, the SHERLOCK PIPEline will execute what we call runs iteratively:

  • Several detrended fluxes with increasing window sizes will be extracted from the original PDCSAP light curve by using wotan.
  • For each detrended flux, the TransitLeastSquares utility will be executed to find the most prominent transit.
  • The best transit is chosen from all the ones found in the detrended fluxes. Three built-in algorithms are available for the selection, plus a custom option:
    • Basic: Selects the best transit signal based only on the highest SNR value.
    • Border-correct: Selects the best transit signal based on a corrected SNR value. This correction is applied with a border-score factor, which is calculated from the found transits that overlap or are very close to areas of the signal with no measurements.
    • Quorum: Including the same correction as the border-correct algorithm, quorum also increases the SNR values when several detrended fluxes 'agree' on their transit selection (same ephemerides). The more detrended fluxes agree, the more SNR they get. This algorithm can be slightly tuned by changing the strength or weight of every detrend vote. It is currently in a testing stage and has not been used intensively.
    • Custom: You can also inject your own signal selection algorithm by implementing the SignalSelector class. See the example.
  • Measurements matching the chosen transit are masked in the original PDCSAP signal so they will not be found by subsequent runs.
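
The three built-in selection strategies can be compared with a small sketch. Everything here is illustrative: the `Candidate` class, the multiplication of the SNR by the border score, the tolerance used to decide that two ephemerides match, and the linear vote boost are assumptions, not the formulas implemented in SHERLOCK.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    detrend: int         # index of the detrended flux that produced the signal
    period: float        # days
    t0: float            # epoch of the first transit
    snr: float
    border_score: float  # 1.0 means no transit lies near a gap in the data


def corrected_snr(c):
    # Assumption: the border score acts as a multiplicative penalty.
    return c.snr * c.border_score


def same_ephemeris(a, b, rel_tol=0.01):
    # Assumption: period and t0 must agree within 1% of the period.
    return abs(a.period - b.period) <= rel_tol * a.period and abs(a.t0 - b.t0) <= rel_tol * a.period


def select_basic(cands):
    return max(cands, key=lambda c: c.snr)


def select_border_correct(cands):
    return max(cands, key=corrected_snr)


def select_quorum(cands, strength=0.2):
    def score(c):
        # Every detrend with a matching ephemeris counts as one vote (itself included).
        votes = sum(same_ephemeris(c, other) for other in cands)
        return corrected_snr(c) * (1.0 + strength * (votes - 1))
    return max(cands, key=score)


cands = [
    Candidate(1, 2.50, 1816.69, 9.0, 1.00),
    Candidate(2, 2.50, 1816.69, 8.5, 1.00),
    Candidate(3, 2.51, 1816.70, 8.8, 1.00),
    Candidate(4, 0.81, 1816.10, 12.0, 0.70),
    Candidate(5, 1.01, 1817.05, 10.0, 0.95),
]
print(select_basic(cands).detrend, select_border_correct(cands).detrend, select_quorum(cands).detrend)
```

In this made-up example the three strategies disagree: the raw SNR favours a signal close to a data gap, the border correction demotes it, and the quorum vote promotes the signal that three detrends recover independently.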

Reporting

SHERLOCK PIPEline produces several information items under a new directory for every analysed object:

  • Object report log: The entire log of the object run is written here.
  • Most Promising Candidates log: A summary of the parameters of the best transits found for each run is written at the end of the object execution. Example content:
Listing most promising candidates for ID MIS_TIC 470381900_all:
Detrend no. Period  Duration  T0      SNR     SDE     FAP       Border_score  Matching OI   Semi-major axis   Habitability Zone   
1           2.5013  50.34     1816.69 13.30   14.95   0.000080  1.00          TOI 1696.01   0.02365           I                   
4           0.5245  29.65     1816.56 8.34    6.26    0.036255  1.00          nan           0.00835           I                   
5           0.6193  29.19     1816.43 8.76    6.57    0.019688  1.00          nan           0.00933           I                   
1           0.8111  29.04     1816.10 9.08    5.88    0.068667  0.88          nan           0.01116           I                   
2           1.0093  32.41     1817.05 8.80    5.59    nan       0.90          nan           0.01291           I                   
6           3.4035  45.05     1819.35 6.68    5.97    0.059784  1.00          nan           0.02904           I      
  • Runs directories: Containing PNG images of the detrended fluxes and their suggested transits.
  • Light curve csv file: The original (before pre-processing) PDCSAP signal stored in three columns: #time, flux and flux_err. Example content:
#time,flux,flux_err
1816.0895073542242,0.9916135,0.024114653
1816.0908962630185,1.0232307,0.024185425
1816.0922851713472,1.0293404,0.024151148
1816.0936740796774,1.000998,0.024186047
1816.0950629880074,1.0168158,0.02415397
1816.0964518968017,1.0344968,0.024141008
1816.0978408051305,1.0061758,0.024101004
...
  • Candidates csv file: Containing the same information as the Most Promising Candidates log but in CSV format, so it can be read by future additions to the pipeline such as vetting or fitting endpoints.
  • Lomb-Scargle periodogram plot: Showing the period strengths.
  • RMS masking plot: In case the High RMS masking pre-processing is enabled.
  • Phase-folded period plot: In case auto-detrend or manual period detrend is enabled.
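
The light curve CSV file listed above can be loaded with the standard library alone. The reader below is a minimal sketch that relies only on the three-column layout shown in the example content (#time, flux, flux_err). The function name is hypothetical, and the sample rows are copied from that example.

```python
import csv
import io


def read_light_curve(handle):
    """Read a '#time,flux,flux_err' CSV into a dict of float lists keyed by column name."""
    reader = csv.reader(handle)
    header = next(reader)
    # The first column name carries a leading '#'.
    header[0] = header[0].lstrip("#")
    columns = {name: [] for name in header}
    for row in reader:
        if not row:
            continue  # skip blank lines
        for name, value in zip(header, row):
            columns[name].append(float(value))
    return columns


sample = io.StringIO(
    "#time,flux,flux_err\n"
    "1816.0895073542242,0.9916135,0.024114653\n"
    "1816.0908962630185,1.0232307,0.024185425\n"
)
lc = read_light_curve(sample)
print(lc["time"][0], lc["flux"][0], lc["flux_err"][0])
```

To read a real file, pass an open file handle instead of the in-memory sample.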

Dependencies

All the needed dependencies should be installed by your pip installation of SHERLOCK. These are the Python libraries required to run SHERLOCK:

  • numpy: If you run into problems when installing numpy, it might help to install the following packages (if you are on an Ubuntu distribution):
    • sudo apt-get install libblas-dev liblapack-dev
    • sudo apt-get install gfortran
  • cython (for lightkurve and pandas dependencies)
  • pandas
  • lightkurve
  • transitleastsquares
  • eleanor
  • wotan
  • matplotlib

The following libraries are required to run SHERLOCK Explorer:

  • plotly
  • colorama

Testing

SHERLOCK PIPEline comes with a lightweight automated test suite, which can be executed with: python3 -m unittest sherlock_tests.py. This suite tests several points of the pipeline:

  • The construction of the Sherlock object.
  • The parameters setup of the Sherlock object.
  • The provisioning of objects of interest files.
  • Loading and filtering of objects of interest.
  • Different kinds of short Sherlock executions.

If you want to test the entire SHERLOCK PIPEline functionality, we encourage you to run some (or all) of the manual examples. If so, please read the instructions provided there to execute them.

Integration

SHERLOCK integrates with several third party services. Some of them are listed below:
