Search for Hints of Exoplanets fRom Lightcurves Of spaCe based seeKers

Project description

The SHERLOCK (Searching for Hints of Exoplanets fRom Lightcurves Of spaCe-based seeKers) PIPEline is a user-friendly pipeline that keeps user interaction to a minimum when working with data from the Kepler or TESS missions. SHERLOCK builds on well-known and well-tested codes, which allows the exoplanet community to explore public data from space-based missions without needing deep knowledge of how the data are built and stored. In most cases the user only needs to provide a KOI-ID, EPIC-ID, TIC-ID or the coordinates of the host star to be searched for exoplanets.

Main Developers

Active: F.J. Pozuelos, M. Dévora

Additional contributors

A. Thuillier & L. García

Installation

The package can be installed from the PyPI repositories:

python3 -m pip install sherlockpipe

Launch

You can run SHERLOCK PIPEline as a standalone package by using:

python3 -m sherlockpipe --properties my_properties.yaml

You only need to provide a YAML file with any of the properties contained in the internal properties.yaml provided by the pipeline. The most important keys to define in your YAML file are those under GLOBAL OBJECTS RUN SETUP and SECTOR OBJECTS RUN SETUP, because they contain the object ids or files to be analysed in the execution. You need to fill in at least one of those keys for the pipeline to do anything. If you still have doubts, please refer to the examples/properties directory.

SHERLOCK PIPEline Workflow

It is important to note that SHERLOCK PIPEline uses some csv files with TOIs, KOIs and EPIC IDs from the TESS, Kepler and K2 missions. Therefore your first execution of the pipeline might take longer because it will download the information.

Provisioning of light curve

The light curve for every input object needs to be obtained from its mission database. For this we use the high-level API of Lightkurve, which enables the download of the desired light curves for the TESS, Kepler and K2 missions. We also include Full Frame Images from the TESS mission by using ELEANOR. In every case we use the PDCSAP signal provided by either of those two packages.

Pre-processing of light curve

In many cases we will find light curves that contain several systematics such as noise, high dispersion near the borders, or intense periodicities caused by pulsators, fast rotators, etc. SHERLOCK PIPEline provides some methods to reduce the most important of these systematics.

Local noise reduction

For local noise, where very close measurements show high deviation from the local trend, we apply a Savitzky-Golay filter. This has been shown to considerably increase the SNR of the transits found. This feature can be disabled with a flag.
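
As a rough illustration of what such a filter does, the following sketch applies a 5-point quadratic Savitzky-Golay smoothing in plain Python. The window length, the polynomial order and the function name savgol_5pt are assumptions chosen for brevity; the settings SHERLOCK actually uses are not stated here.

```python
# Minimal Savitzky-Golay smoother: 5-point window, quadratic polynomial.
# Window size and order are assumptions made for illustration only.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_5pt(flux):
    """Return a smoothed copy of flux; the two points at each edge are kept as-is."""
    half = len(SG_COEFFS) // 2
    smoothed = list(flux)
    for i in range(half, len(flux) - half):
        window = flux[i - half:i + half + 1]
        # Weighted sum equals the value of the local quadratic fit at the centre
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window))
    return smoothed


# Made-up flux values with some point-to-point noise
flux = [1.000, 1.002, 0.985, 1.001, 0.999, 1.015, 1.000, 0.998]
print(savgol_5pt(flux))
```

For real light curves, scipy.signal.savgol_filter provides the same kind of filter with a configurable window length and polynomial order.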

High RMS areas masking

Sometimes the spacecraft has to perform reaction wheel momentum dumps by firing thrusters, sometimes there is high light scattering, and sometimes the spacecraft introduces some jitter into the signal. For all of these systematics we found that in many cases the data from the affected regions should be discarded. Thus, SHERLOCK PIPEline includes a binned RMS computation in which bins whose RMS value is higher than a configurable factor multiplied by the median are automatically masked. This feature can be disabled with a flag.
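
The binned RMS idea can be sketched as follows. This is only an illustration under assumptions: the function name high_rms_mask, the bin width, the factor and the choice of measuring RMS as the scatter around each bin mean are ours, not necessarily what SHERLOCK does internally.

```python
from statistics import median, pstdev


def high_rms_mask(time, flux, bin_width, factor):
    """Return a list of booleans, True where a point lies in a high-RMS bin.

    Assumes time is sorted in ascending order.
    """
    start = time[0]
    # Group point indices by bin number
    bins = {}
    for i, t in enumerate(time):
        bins.setdefault(int((t - start) // bin_width), []).append(i)
    # RMS of each bin, taken here as the scatter around the bin mean
    rms = {b: pstdev(flux[i] for i in idx) for b, idx in bins.items()}
    threshold = factor * median(rms.values())
    mask = [False] * len(time)
    for b, idx in bins.items():
        if rms[b] > threshold:
            for i in idx:
                mask[i] = True
    return mask


# Synthetic signal: four bins of ten points, the third one ten times noisier
time = [float(i) for i in range(40)]
flux = [
    1 + (0.01 if 20 <= i < 30 else 0.001) * (1 if i % 2 == 0 else -1)
    for i in range(40)
]
mask = high_rms_mask(time, flux, bin_width=10.0, factor=2.0)
print([i for i, masked in enumerate(mask) if masked])
```

Comparing against the median of the bin RMS values, rather than the mean, keeps a few very noisy bins from raising the threshold.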

Input time ranges masking

If enabled, this feature automatically disables High RMS areas masking for the assigned object. The user can input an array of time ranges to be masked in the original signal.
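
A minimal sketch of masking user-provided time ranges is shown below. The function name mask_time_ranges and the inclusive [start, end] convention are assumptions for illustration; the time and flux values are made up.

```python
def mask_time_ranges(time, flux, ranges):
    """Drop every measurement whose time falls inside any [start, end] range."""
    kept = [
        (t, f) for t, f in zip(time, flux)
        if not any(start <= t <= end for start, end in ranges)
    ]
    kept_time = [t for t, _ in kept]
    kept_flux = [f for _, f in kept]
    return kept_time, kept_flux


# Made-up measurements and two ranges to be masked
time = [1816.0, 1816.5, 1817.0, 1817.5, 1818.0, 1818.5, 1819.0]
flux = [1.00, 1.01, 0.99, 1.20, 1.18, 1.00, 0.99]
ranges = [[1817.4, 1818.1], [1819.0, 1819.5]]
masked_time, masked_flux = mask_time_ranges(time, flux, ranges)
print(masked_time)
```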

Detrend of intense periodicities

Our most common foes with high periodicities are fast rotators, which induce a strong sinusoidal-like trend in the PDCSAP signal. This is why SHERLOCK PIPEline includes automatic detection and detrending of intense periodicities during its preparation stage. This feature can be disabled with a flag.
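
One way to detect such a periodicity is a Lomb-Scargle periodogram, which SHERLOCK also plots in its reports. The sketch below implements the classical Lomb-Scargle power in plain Python and picks the strongest period from a grid. The function names, the period grid and the synthetic signal are assumptions for illustration, not the pipeline's actual detection code.

```python
import math


def lomb_scargle_power(time, flux, period):
    """Classical Lomb-Scargle power of the mean-subtracted flux at one period."""
    omega = 2 * math.pi / period
    mean = sum(flux) / len(flux)
    y = [f - mean for f in flux]
    # The time offset tau makes the sine and cosine terms orthogonal
    tau = math.atan2(
        sum(math.sin(2 * omega * t) for t in time),
        sum(math.cos(2 * omega * t) for t in time),
    ) / (2 * omega)
    cos_terms = [math.cos(omega * (t - tau)) for t in time]
    sin_terms = [math.sin(omega * (t - tau)) for t in time]
    yc = sum(v * c for v, c in zip(y, cos_terms))
    ys = sum(v * s for v, s in zip(y, sin_terms))
    cc = sum(c * c for c in cos_terms)
    ss = sum(s * s for s in sin_terms)
    return 0.5 * (yc * yc / cc + ys * ys / ss)


def strongest_period(time, flux, periods):
    """Return the trial period with the highest Lomb-Scargle power."""
    return max(periods, key=lambda p: lomb_scargle_power(time, flux, p))


# Synthetic fast rotator: 1% sinusoidal modulation with a 2-day period
time = [0.05 * i for i in range(400)]
flux = [1 + 0.01 * math.sin(2 * math.pi * t / 2.0) for t in time]
periods = [0.5 + 0.05 * k for k in range(91)]  # 0.5 to 5.0 days
best_period = strongest_period(time, flux, periods)
print(best_period)
```

For real data, a library implementation such as astropy.timeseries.LombScargle is faster and handles measurement errors.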

Input period detrend

If enabled, this feature automatically disables Detrend of intense periodicities for the assigned object. The user can input a period to be used for an initial detrend of the original signal.
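
To give an idea of what a detrend with a known period can look like, the sketch below phase-folds the signal at the given period and divides by the median profile in each phase bin. This particular technique, the function name detrend_with_period and the number of bins are assumptions for illustration; the method SHERLOCK applies internally may differ.

```python
import math
from statistics import median


def detrend_with_period(time, flux, period, n_bins=20):
    """Divide flux by the median phase-folded profile at the given period."""
    phases = [(t % period) / period for t in time]
    bin_of = [min(int(p * n_bins), n_bins - 1) for p in phases]
    # Collect the flux values of every phase bin and take their median
    grouped = {}
    for b, f in zip(bin_of, flux):
        grouped.setdefault(b, []).append(f)
    profile = {b: median(values) for b, values in grouped.items()}
    return [f / profile[b] for b, f in zip(bin_of, flux)]


# Synthetic signal: 1% sinusoidal modulation with a 2-day period
time = [0.05 * i for i in range(400)]
flux = [1 + 0.01 * math.sin(2 * math.pi * t / 2.0) for t in time]
detrended = detrend_with_period(time, flux, period=2.0)
print(max(flux) - min(flux), max(detrended) - min(detrended))
```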

Main execution (run)

After the preparation stage, the SHERLOCK PIPEline will execute what we call runs iteratively:

  • Several detrended fluxes with increasing window sizes are extracted from the original PDCSAP light curve by using wotan.
  • For each detrended flux, the TransitLeastSquares utility is executed to find the most prominent transit.
  • The best transit is chosen from all the ones found in the detrended fluxes. Here we have three different algorithms for the selection:
    • Basic: Selects the best transit signal based only on the highest SNR value.
    • Border-correct: Selects the best transit signal based on a corrected SNR value. The correction applies a border-score factor, which is calculated from the found transits that overlap or are very close to empty-measurement areas in the signal.
    • Quorum: Including the same correction as the border-correct algorithm, quorum also increases the SNR values when several detrended fluxes 'agree' on their transit selection (same ephemerides). The more detrended fluxes agree, the more SNR they get. This algorithm can be slightly tuned by changing the strength or weight of every detrend vote. It is currently in testing stage and hasn't been used intensively.
  • Measurements matching the chosen transit are masked in the original PDCSAP signal so they will not be found by subsequent runs.
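
The three selection algorithms above can be sketched roughly as follows. Everything here is hypothetical: the dictionary keys, the multiplicative border correction, the vote formula, the tolerances and the candidate values are assumptions made to illustrate the idea, not the scoring SHERLOCK really applies.

```python
def select_basic(candidates):
    """Basic: the highest raw SNR wins."""
    return max(candidates, key=lambda c: c["snr"])


def select_border_correct(candidates):
    """Border-correct: SNR scaled by the border score (assumed multiplicative)."""
    return max(candidates, key=lambda c: c["snr"] * c["border_score"])


def select_quorum(candidates, vote_weight=0.1, period_tol=0.01, t0_tol=0.05):
    """Quorum: border-corrected SNR boosted per agreeing detrend (assumed formula)."""
    def score(c):
        # A vote is another detrend that found the same period and epoch
        votes = sum(
            1 for other in candidates
            if other is not c
            and abs(other["period"] - c["period"]) <= period_tol
            and abs(other["t0"] - c["t0"]) <= t0_tol
        )
        return c["snr"] * c["border_score"] * (1 + vote_weight * votes)
    return max(candidates, key=score)


# Made-up best transits, one per detrended flux
candidates = [
    {"detrend": 1, "period": 0.81, "t0": 1816.10, "snr": 14.0, "border_score": 0.8},
    {"detrend": 2, "period": 3.40, "t0": 1819.35, "snr": 12.5, "border_score": 1.0},
    {"detrend": 3, "period": 2.50, "t0": 1816.69, "snr": 12.0, "border_score": 1.0},
    {"detrend": 4, "period": 2.50, "t0": 1816.69, "snr": 11.5, "border_score": 1.0},
]
print(select_basic(candidates)["detrend"])
print(select_border_correct(candidates)["detrend"])
print(select_quorum(candidates)["detrend"])
```

With these made-up values each algorithm picks a different detrend: the raw SNR favours detrend 1, the border penalty moves the choice to detrend 2, and the agreement between detrends 3 and 4 lets detrend 3 win under quorum.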

Reporting

SHERLOCK PIPEline produces several information items under a new directory for every analysed object:

  • Object report log: The entire log of the object run is written here.
  • Most Promising Candidates log: A summary of the parameters of the best transits found for each run is written at the end of the object execution. Example content:
Listing most promising candidates for ID MIS_TIC 470381900_all:
Detrend no. Period  Duration  T0      SNR     SDE     FAP       Border_score  Matching OI   Semi-major axis   Habitability Zone   
1           2.5013  50.34     1816.69 13.30   14.95   0.000080  1.00          TOI 1696.01   0.02365           I                   
4           0.5245  29.65     1816.56 8.34    6.26    0.036255  1.00          nan           0.00835           I                   
5           0.6193  29.19     1816.43 8.76    6.57    0.019688  1.00          nan           0.00933           I                   
1           0.8111  29.04     1816.10 9.08    5.88    0.068667  0.88          nan           0.01116           I                   
2           1.0093  32.41     1817.05 8.80    5.59    nan       0.90          nan           0.01291           I                   
6           3.4035  45.05     1819.35 6.68    5.97    0.059784  1.00          nan           0.02904           I      
  • Runs directories: Containing png images of the detrended fluxes and their suggested transits.
  • Light curve csv file: The original (before pre-processing) PDCSAP signal stored in three columns: #TBJD, flux and flux_err. Example content:
#TBJD,flux,flux_err
1816.0895073542242,0.9916135,0.024114653
1816.0908962630185,1.0232307,0.024185425
1816.0922851713472,1.0293404,0.024151148
1816.0936740796774,1.000998,0.024186047
1816.0950629880074,1.0168158,0.02415397
1816.0964518968017,1.0344968,0.024141008
1816.0978408051305,1.0061758,0.024101004
...
  • Candidates csv file: Containing the same information as the Most Promising Candidates log but in csv format, so it can be read by future additions to the pipeline like vetting or fitting endpoints.
  • Lomb-Scargle periodogram plot: Showing the period strengths.
  • RMS masking plot: In case the High RMS masking pre-processing is enabled.
  • Phase-folded period plot: In case auto-detrend or manual period detrend is enabled.
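
The light curve csv file can be loaded with the standard library alone. The sketch below parses the first rows of the example content shown above from an in-memory string; the function name read_light_curve is our own, and with a real file you would pass an open file handle instead.

```python
import csv
import io

# First rows of the light curve csv example content
EXAMPLE_CSV = """#TBJD,flux,flux_err
1816.0895073542242,0.9916135,0.024114653
1816.0908962630185,1.0232307,0.024185425
1816.0922851713472,1.0293404,0.024151148
"""


def read_light_curve(handle):
    """Parse a light curve csv with #TBJD, flux and flux_err columns."""
    time, flux, flux_err = [], [], []
    for row in csv.DictReader(handle):
        time.append(float(row["#TBJD"]))
        flux.append(float(row["flux"]))
        flux_err.append(float(row["flux_err"]))
    return time, flux, flux_err


time, flux, flux_err = read_light_curve(io.StringIO(EXAMPLE_CSV))
print(len(time), time[0], flux[0], flux_err[0])
```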

Dependencies

All the needed dependencies should be included by your pip installation of SHERLOCK. These are the Python libraries which are required for SHERLOCK to be run:

  • numpy: If you run into problems while installing numpy, it might be helpful to install the following packages (if you're on an Ubuntu distribution):
    • sudo apt-get install libblas-dev liblapack-dev
    • sudo apt-get install gfortran
  • cython (for lightkurve and pandas dependencies)
  • pandas
  • lightkurve
  • transitleastsquares
  • eleanor
  • wotan
  • matplotlib

The following libraries are required for SHERLOCK Explorer to be run:

  • plotly
  • colorama

Testing

SHERLOCK PIPEline comes with a lightweight automated test suite which can be executed with: python3 -m unittest sherlock_tests.py. This suite tests several points of the pipeline:

  • The construction of the Sherlock object.
  • The parameters setup of the Sherlock object.
  • The provisioning of objects of interest files.
  • Loading and filtering of objects of interest.
  • Different kinds of short Sherlock executions.

In case you want to test the entire SHERLOCK PIPEline functionality, we encourage you to run some (or all) of the manual examples. If so, please read the instructions provided there to execute them.

Integration

SHERLOCK integrates with several third party services. Some of them are listed below:

This version

0.9.5
