A comprehensive toolkit for downloading and filtering Kepler DR25 FITS files from NASA's MAST archive

Maintainers

akira921x

Kepler-Downloader-DR25

Research-Oriented Toolkit for Kepler Data Analysis

This toolkit is designed for researchers studying NASA Kepler Space Telescope data. It follows the data structure requirements of machine learning frameworks like ExoMiner and AstroNet while remaining flexible for diverse research applications beyond these specific frameworks.

Key Design Philosophy

Research-First Approach: Built to support various astronomical research workflows
ML-Ready: Compatible with ExoMiner, AstroNet, and other machine learning pipelines
Flexible Architecture: Not limited to specific frameworks - adaptable to custom research needs
Scientific Rigor: Maintains data integrity and validation standards required for publication-quality research

Overview

A comprehensive toolkit for downloading and filtering Kepler DR25 FITS files from NASA's MAST archive with intelligent mode detection, DVT validation, and universal filtering capabilities.

This project provides two main scripts and utility tools:

Main Scripts:

get-kepler-dr25.py - Main downloader with DVT filtering for ExoMiner mode and retry capabilities.
filter-get-kepler-dr25.py - Universal filter with mode detection, conversion, and download of missing files.

Utility Tools (in util/ folder):

util/rebuild_database.py - Rebuild SQLite database from existing filesystem (useful for recovery)
util/check_missing_kics.py - Compare CSV with downloaded KICs to find missing ones
util/generate_stats.py - Generate comprehensive statistics for completed jobs
util/test_health_report.py - Diagnostic tool to verify database and health report contents

Why This Toolkit?

The Challenge with Kepler Data:

Full TCE dataset: ~400+ GB disk space required
KOI dataset: ~200+ GB disk space required
Common problems without proper tooling:
- Missing KICs due to network timeouts
- Corrupted FITS files from incomplete downloads
- Database inconsistencies from concurrent writes
- No recovery mechanism for partial failures
- Manual tracking of thousands of files

What This Toolkit Solves:

Researchers studying exoplanets, stellar variability, or other phenomena in Kepler data need efficient, reliable tools that:

Prevent data loss: Redis buffering ensures zero database corruption even with network failures
Handle scale: Successfully manages datasets with 17,000+ KICs (tested with full TCE catalog)
Ensure completeness: Automatic retry mechanism and missing KIC detection
Support modern ML workflows: Compatible with ExoMiner, AstroNet, custom models
Provide verification: Health reports confirm data integrity and completeness

Key Features & Performance Metrics

Proven Performance:

5.5x faster than traditional bulk query approaches
99.9% success rate on 17,000+ KIC downloads (full TCE dataset)
Zero database corruption with Redis write-ahead buffering
Automatic recovery from network failures and timeouts

Core Capabilities:

Research-Ready Formats - Supports both ExoMiner/AstroNet structure and MAST standard
Universal filtering - Extract the KOI subset from ~400 GB of TCE data without re-downloading
Mode detection - Automatically detects and converts between formats
DVT validation - Ensures ML model compatibility with DVT file checking
Job-based organization - Each run creates a timestamped job directory for reproducibility
Parallel processing - 4-8 workers handle concurrent downloads efficiently
Health reporting - Comprehensive analysis confirms data completeness
Smart recovery - Automatically retries failed downloads and detects missing KICs

The standard Kepler light curve products available on the MAST archive are from the final Data Release 25 (DR25) processing.

Data Source and Attribution

NASA Kepler Mission Data

This toolkit downloads data from NASA's Kepler Space Telescope mission, which operated from 2009 to 2018 and revolutionized exoplanet science by discovering thousands of exoplanets. The data is hosted and distributed by:

MAST (Mikulski Archive for Space Telescopes) - NASA's data archive hosted at the Space Telescope Science Institute (STScI)
Data Release: DR25 (Data Release 25) - The final and most complete processing of Kepler data
Archive URL: https://archive.stsci.edu/kepler/

Data Usage and Citation

When using Kepler data downloaded with this toolkit, please:

Acknowledge NASA and the Kepler mission in your publications
Cite the appropriate Kepler papers:
- Kepler Mission: Borucki et al. (2010) Science, 327, 977
- Kepler Data Characteristics: Thompson et al. (2016) Kepler Data Release Notes (KSCI-19044-005)
- DR25 Release: Available at MAST Kepler archive
Include the standard acknowledgment:

"This research has made use of the NASA Exoplanet Archive, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program."

Data Products

The toolkit downloads the following NASA Kepler data products:

Light Curve Files (*_llc.fits) - Time series photometry data
Data Validation Files (*_dvt.fits) - Transit model fits and diagnostics
Target Pixel Files (*_tpf.fits) - Pixel-level photometry (when requested)

Terms of Use

All Kepler data is in the public domain. There are no restrictions on the use of Kepler data. However, proper attribution and citations are expected as a matter of professional courtesy and scientific integrity.

For more information about Kepler data:

MAST Kepler Archive: https://archive.stsci.edu/kepler/
NASA Exoplanet Archive: https://exoplanetarchive.ipac.caltech.edu/
Kepler Mission Page: https://www.nasa.gov/mission_pages/kepler/main/index.html

Directory Structure

Kepler-Downloader-DR25/
├── input/
│   └── your_targets.csv
├── input_samples/
│   ├── cumulative_koi_2025.09.06_13.27.56.csv  # Kepler Objects of Interest
│   └── q1_q17_dr25_tce_2025.09.06_13.29.19.csv # Threshold Crossing Events
├── kepler_downloads/                # Default output directory
│   └── job-YYYYMMDD_HHMMSS/        # Timestamped job directory
│       ├── download_records.db      # SQLite database with all records
│       ├── health_check_report.txt  # Post-download health analysis
│       ├── reports/                  # DVT filtering and other reports
│       └── [Data directories based on mode]
├── util/                            # Utility scripts
│   ├── rebuild_database.py          # Rebuild database from filesystem
│   ├── check_missing_kics.py        # Find missing KICs
│   ├── generate_stats.py            # Generate job statistics
│   └── test_health_report.py        # Diagnostic tool
├── get-kepler-dr25.py              # Main downloader with DVT filtering
├── filter-get-kepler-dr25.py       # Universal filter with mode detection
├── setup.py                         # Package installation script
├── requirements.txt                 # Python dependencies
├── README.md                        # This file
├── QUICKSTART.md                   # Quick start guide
├── CHANGELOG.md                     # Version history
└── LICENSE                          # Apache 2.0 license

Output Formats

ExoMiner Format (Default)

kepler_downloads/job-*/
└── Kepler/
    └── XXXX/                  # First 4 digits of 9-digit KIC
        └── XXXXXXXXX/         # Full 9-digit KIC
            ├── *_llc.fits     # Light curve files
            └── *_dvt.fits     # Data validation files (required)
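
As a minimal sketch of the layout above, the function below builds the expected folder for a KIC ID by zero-padding it to nine digits and using the first four digits as the parent folder. The function name `exominer_kic_dir` and its `job_dir` argument are illustrative assumptions, not part of the toolkit's API.

```python
from pathlib import Path


def exominer_kic_dir(job_dir: str, kic_id: int) -> Path:
    """Return Kepler/XXXX/XXXXXXXXX under job_dir for the given KIC ID."""
    kic = f"{int(kic_id):09d}"          # zero-pad to the 9-digit form
    return Path(job_dir) / "Kepler" / kic[:4] / kic


print(exominer_kic_dir("kepler_downloads/job-20250906_020543", 6922244).as_posix())
# kepler_downloads/job-20250906_020543/Kepler/0069/006922244
```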

Standard MAST Format (`--no-exominer` flag)

kepler_downloads/job-*/
└── mastDownload/
    └── Kepler/
        ├── kplrXXXXXXXXX_lc/  # Light curve files
        └── kplrXXXXXXXXX_dv/  # Data validation and report files

Real-World Use Cases

Example: Filtering KOI from TCE Dataset

Problem: You downloaded the full TCE dataset (400+ GB, 17,230 KICs) but only need KOI data (8,214 KICs)

Traditional Approach: Re-download 200+ GB of KOI data, taking hours and risking incomplete downloads

With This Toolkit:

# Filter existing TCE data to extract KOI subset
python filter-get-kepler-dr25.py \
  --input-csv input_samples/cumulative_koi_2025.09.06_13.27.56.csv \
  --source-job kepler_downloads/job-with-tce-data

# Result: 7,141 KICs copied, 1,073 missing KICs automatically downloaded
# Time saved: Hours of redundant downloads
# Storage saved: 200+ GB of duplicate data

Security & Trust

This package implements comprehensive security measures:

Trusted Publishing: Cryptographically verified releases via GitHub OIDC
Attestations: PEP 740 compliant package attestations
Signed Packages: Sigstore keyless signing for supply chain security
SBOM: Software Bill of Materials for dependency transparency
Security Scanning: Automated vulnerability scanning in CI/CD

For detailed security information, see SECURITY.md.

Installation

Prerequisites

Python 3.7+ with pip installed

Redis Server (required for reliable database operations)

Install Redis:

# macOS
brew install redis && brew services start redis

# Ubuntu/Debian
sudo apt install redis-server && sudo systemctl start redis

# Docker
docker run -d -p 6379:6379 --name redis-kepler redis:latest

Installation Options

Option 1: Install from PyPI (Recommended)

# Install the package
pip install kepler-downloader-dr25

# Use command-line tools
kepler-download input/your_targets.csv
kepler-filter --input-csv input/kics.csv --source-job kepler_downloads/job-XXX
kepler-stats kepler_downloads/job-XXX

Option 2: Install from GitHub

# Clone and install
git clone https://github.com/akira921x/Kepler-Downloader-DR25.git
cd Kepler-Downloader-DR25
pip install -r requirements.txt

# Use scripts directly
python get-kepler-dr25.py input/your_targets.csv

Python Dependencies

Required packages (automatically installed with pip):

pandas - Data processing
astroquery - MAST archive interface
redis - Redis client for buffering
requests - HTTP requests
numpy - Numerical operations
tqdm - Progress bars

Quick Start

# Install from PyPI
pip install kepler-downloader-dr25

# Quick test with 3 targets
echo "006922244,007799349,011446443" > test.csv
kepler-download test.csv

# Download real datasets
kepler-download cumulative_koi.csv    # ~8,200 KICs, ~200 GB
kepler-download q1_q17_dr25_tce.csv   # ~17,000 KICs, ~400 GB

# Filter existing data (save time & storage)
kepler-filter --input-csv koi.csv --source-job kepler_downloads/job-XXX

See QUICKSTART.md for detailed quick start guide.

Usage

1. Downloading Data

Basic Download (ExoMiner Format - Default)

# Download from CSV files in input/ folder (ExoMiner format by default)
python get-kepler-dr25.py input/your_targets.csv

# Verbose mode (detailed output)
python get-kepler-dr25.py input/your_targets.csv --verbose

Standard MAST Format

# Download with Standard MAST structure (no DVT requirement)
python get-kepler-dr25.py input/your_targets.csv --no-exominer

# Strict DVT mode: skip KICs without DVT immediately (applies to the default ExoMiner mode, not to --no-exominer)
python get-kepler-dr25.py input/your_targets.csv --strict-dvt

# Back up KICs without DVT instead of deleting them (applies to the default ExoMiner mode)
python get-kepler-dr25.py input/your_targets.csv --backup-no-dvt

Advanced Options

# Custom configuration
python get-kepler-dr25.py input/your_targets.csv \
  --workers 8 \
  --batch-size 50 \
  --output-dir custom_output

# Retry failed downloads from a previous run
python get-kepler-dr25.py input/your_targets.csv --retry-failed
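
To illustrate how `--workers` and `--batch-size` interact, here is a rough sketch that splits a KIC list into batches and spreads each batch over a thread pool. It is not the toolkit's implementation: `batched`, `run`, and `download_kic` are hypothetical names, and `download_kic` is a placeholder that performs no network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List


def batched(kics: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size KIC IDs."""
    for start in range(0, len(kics), batch_size):
        yield kics[start:start + batch_size]


def download_kic(kic: str) -> str:
    """Hypothetical stand-in for a real download; performs no network access."""
    return f"kplr{kic} ok"


def run(kics: List[str], workers: int = 4, batch_size: int = 50) -> List[str]:
    """Process every batch with a shared pool of worker threads."""
    results: List[str] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(kics, batch_size):
            # pool.map preserves input order within each batch
            results.extend(pool.map(download_kic, batch))
    return results


ids = [f"{n:09d}" for n in range(1, 121)]
print(len(list(batched(ids, 50))))   # 3 batches: 50 + 50 + 20
print(len(run(ids, workers=8)))      # 120
```

Larger batches mean fewer round trips through the loop, while more workers raise the number of simultaneous requests; both are bounded in practice by how quickly MAST responds.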

2. Filtering Existing Data

The universal filter script can process any CSV with KIC IDs and intelligently handle mode conversions.

Basic Filtering

# Filter existing job with KOI data
python filter-get-kepler-dr25.py \
  --input-csv input_samples/cumulative_koi_2025.09.06_13.27.56.csv \
  --source-job kepler_downloads/job-20250906_020543

# Use Standard format instead of default ExoMiner
python filter-get-kepler-dr25.py \
  --input-csv input/custom_kics.csv \
  --source-job kepler_downloads/job-20250906_020543 \
  --no-exominer

Mode Conversion

# Convert from Standard to ExoMiner format (ExoMiner is default target)
python filter-get-kepler-dr25.py \
  --input-csv input/kics.csv \
  --source-job kepler_downloads/standard_job \
  --force-mode  # Required when modes don't match

# Disable DVT validation for ExoMiner
python filter-get-kepler-dr25.py \
  --input-csv input/kics.csv \
  --source-job kepler_downloads/job-20250906 \
  --no-validate-dvt

Download Missing KICs

# Filter and download missing KICs from MAST
python filter-get-kepler-dr25.py \
  --input-csv input/kics.csv \
  --source-job kepler_downloads/job-20250906

# Skip downloading missing KICs
python filter-get-kepler-dr25.py \
  --input-csv input/kics.csv \
  --source-job kepler_downloads/job-20250906 \
  --no-download-missing

Command-Line Options

get-kepler-dr25.py

Option	Description	Required	Default
`csv_file`	Input CSV file with KIC IDs	Yes	-
`--output-dir`	Output directory	No	`kepler_downloads`
`--workers`	Number of parallel workers	No	`4`
`--batch-size`	KICs per batch	No	`50`
`--no-exominer`	Use Standard MAST format instead of ExoMiner	No	`False` (ExoMiner enabled)
`--strict-dvt`	Skip KICs without DVT immediately (ExoMiner mode)	No	`False`
`--backup-no-dvt`	Back up no-DVT KICs instead of deleting them (ExoMiner mode)	No	`False`
`--retry-failed`	Retry failed downloads from a previous run	No	`False`
`--verbose`	Enable verbose logging	No	`False`

filter-get-kepler-dr25.py

Option	Description	Required	Default
`--input-csv`	Input CSV with KIC IDs	Yes	-
`--source-job`	Source job folder	Yes	-
`--no-exominer`	Use Standard MAST format instead of ExoMiner	No	`False` (ExoMiner enabled)
`--output-dir`	Output directory	No	Auto-generated timestamp
`--force-mode`	Force mode conversion even if incompatible	No	`False`
`--no-validate-dvt`	Disable DVT validation for ExoMiner mode	No	`False`
`--no-download-missing`	Skip downloading missing KICs from MAST	No	`False` (download enabled)
`--workers`	Number of parallel workers for downloads	No	`4`
`--batch-size`	Batch size for downloads	No	`50`
`--verbose`	Enable verbose logging	No	`False`

Mode Detection and Compatibility

The filter script automatically detects job modes:

ExoMiner/AstroNet Mode (Default)

Structure: Kepler/XXXX/XXXXXXXXX/
Requires DVT files for each KIC
Optimized for machine learning workflows
Compatible with ExoMiner, AstroNet, and similar ML frameworks
Supports custom research pipelines requiring structured data organization

Standard Mode

Structure: mastDownload/Kepler/kplr*_lc/
MAST's default organization
No DVT requirement
Compatible with traditional analysis tools

Mode Compatibility Rules

Same mode: Direct copy, no conversion needed
Different modes: Requires --force-mode flag
ExoMiner target: DVT validation enabled by default
Mode mismatch: Detailed report explains incompatibility
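
The detection and compatibility rules above can be sketched from the two folder structures alone. This is an assumption-laden illustration: `detect_mode` and `needs_force_mode` are hypothetical helpers, and the real filter script may apply additional checks.

```python
from pathlib import Path


def detect_mode(job_dir: str) -> str:
    """Guess a job's layout from its top-level folders."""
    job = Path(job_dir)
    if (job / "mastDownload" / "Kepler").is_dir():
        return "standard"      # mastDownload/Kepler/kplr*_lc/
    if (job / "Kepler").is_dir():
        return "exominer"      # Kepler/XXXX/XXXXXXXXX/
    return "unknown"


def needs_force_mode(source_mode: str, target_mode: str) -> bool:
    """Different modes require --force-mode; the same mode is a direct copy."""
    return source_mode != target_mode
```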

Health Reports

Both scripts generate comprehensive health reports:

Download Health Report

Download statistics (success/failure rates)
DVT coverage for ExoMiner mode
File inventory by type
Performance metrics
Failed KIC list with errors

Filter Health Report

Source job analysis (mode, structure, statistics)
Mode compatibility assessment
Processing statistics
DVT validation results (ExoMiner)
Recommendations for issues

DVT Filtering (ExoMiner Mode)

When using ExoMiner mode, the system handles DVT (Data Validation) files:

During Download (get-kepler-dr25.py; ExoMiner mode is the default, so no extra flag is needed):
- Downloads both LLC and DVT files
- Tracks DVT availability
- Post-download filtering removes KICs without DVT
- Optional backup with --backup-no-dvt
During Filtering (filter-get-kepler-dr25.py):
- Validates DVT presence for ExoMiner target
- Moves no-DVT KICs to backup
- Reports DVT coverage statistics
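
A DVT presence check of this kind can be approximated with a directory scan. The sketch below assumes the ExoMiner folder layout and the `*_dvt.fits` naming described earlier; `kics_without_dvt` is a hypothetical helper that only reports and does not move or delete anything.

```python
from pathlib import Path
from typing import List


def kics_without_dvt(job_dir: str) -> List[str]:
    """List KIC folders under Kepler/XXXX/ that contain no *_dvt.fits file."""
    missing = []
    for kic_dir in sorted(Path(job_dir).glob("Kepler/*/*")):
        if kic_dir.is_dir() and not any(kic_dir.glob("*_dvt.fits")):
            missing.append(kic_dir.name)
    return missing
```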

Performance

Expected Performance

Download Performance (with 4 workers, default settings):

Processing rate: ~50-60 KICs/minute
Small datasets (< 100 KICs): 2-5 minutes
Medium datasets (1,000 KICs): 20-30 minutes
KOI dataset (~8,200 KICs): 2.5-3 hours (~200 GB)
Full TCE dataset (~17,000 KICs): 5-6 hours (~400+ GB)
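
The dataset timings follow roughly from the quoted processing rate. As a back-of-the-envelope sketch (the helper name and the 55 KICs/minute default are assumptions taken from the middle of the quoted range):

```python
def estimated_hours(n_kics: int, kics_per_minute: float = 55.0) -> float:
    """Rough wall-clock estimate: number of KICs divided by processing rate."""
    return n_kics / kics_per_minute / 60.0


for label, n in [("KOI", 8_200), ("TCE", 17_000)]:
    low = estimated_hours(n, 60)   # fast end of the quoted rate
    high = estimated_hours(n, 50)  # slow end of the quoted rate
    print(f"{label}: {low:.1f}-{high:.1f} h")
# KOI: 2.3-2.7 h
# TCE: 4.7-5.7 h
```

These figures sit slightly below the ranges listed above, which presumably also cover retries and post-download checks.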

Why Traditional Methods Fail at Scale:

No automatic retry for network timeouts
Database corruption from concurrent writes
No progress tracking or recovery mechanism
Missing KIC detection requires manual verification
Network interruptions cause incomplete FITS files

Optimization Tips

Increase workers for faster downloads: --workers 8
Use larger batches: --batch-size 100
Ensure Redis is running for optimal performance
Use --strict-dvt to skip no-DVT KICs early
Run during off-peak hours for better MAST response

Database Features

Tables Created

download_records

KIC ID and success status
File counts (LLC and DVT)
DVT presence flag
Error messages
Removal reasons (ExoMiner mode)

file_inventory

Complete file catalog
File types and sizes
Download timestamps

removed_kics (ExoMiner mode)

KICs removed for lacking DVT
File statistics before removal
Removal timestamps

filter_operations (filter script)

Source and target modes
Operation type (copy/download)
Success status
DVT validation results

Utility Tools

Check Missing KICs

# Compare CSV with downloaded KICs
python util/check_missing_kics.py input/target.csv kepler_downloads/job-20250907_015817

# Output: Creates missing_kics_job-20250907_015817.csv
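
The comparison itself amounts to a set difference. The following sketch is not the utility's code; it assumes the KIC column is named `kepid` (as in NASA Exoplanet Archive KOI and TCE tables), that `#` lines are comments, and that the job uses the ExoMiner layout. All function names are hypothetical.

```python
import csv
from pathlib import Path
from typing import Set


def read_kics(csv_path: str, column: str = "kepid") -> Set[str]:
    """Read KIC IDs from a CSV column, skipping '#' comment lines."""
    with open(csv_path, newline="") as handle:
        rows = csv.DictReader(line for line in handle if not line.startswith("#"))
        return {f"{int(row[column]):09d}" for row in rows if row.get(column)}


def downloaded_kics(job_dir: str) -> Set[str]:
    """Collect 9-digit folder names under Kepler/XXXX/ (ExoMiner layout)."""
    return {p.name for p in Path(job_dir).glob("Kepler/*/*") if p.is_dir()}


def missing_kics(csv_path: str, job_dir: str) -> Set[str]:
    """KICs requested in the CSV that have no folder in the job."""
    return read_kics(csv_path) - downloaded_kics(job_dir)
```

Using a set collapses repeated rows for the same star, which matters for catalogs that list several candidates per KIC.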

Generate Statistics

# Generate comprehensive statistics for a job
python util/generate_stats.py kepler_downloads/job-20250907_015817

# Export statistics to CSV
python util/generate_stats.py kepler_downloads/job-20250907_015817 --export stats.csv

Rebuild Database

# Rebuild database from filesystem (useful for recovery)
python util/rebuild_database.py kepler_downloads/job-20250907_015817

Test Health Report

# Verify database and health report contents
python util/test_health_report.py

Troubleshooting

Redis Connection Issues

# Check Redis status
redis-cli ping  # Should return PONG

# Start Redis
brew services start redis  # macOS
sudo systemctl start redis  # Linux
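
The same check can be scripted before starting a long download. This sketch uses only the standard library and Redis's inline `PING` command; the function name is hypothetical, and the default host and port assume a local, unauthenticated Redis as set up in the installation steps.

```python
import socket


def redis_responds(host: str = "localhost", port: int = 6379,
                   timeout: float = 2.0) -> bool:
    """Send an inline PING and report whether the server answers +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            return conn.recv(64).startswith(b"+PONG")
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts
        return False


print("Redis reachable:", redis_responds())
```

A server that requires authentication replies with an error instead of `+PONG`, so this function would report `False` for it.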

Database Shows All Zeros (Old Downloads)

This was a known bug (missing conn.commit()) that has been fixed. For existing downloads:

# Rebuild the database from filesystem
python util/rebuild_database.py kepler_downloads/job-YYYYMMDD_HHMMSS

# This will scan all FITS files and recreate the database

Mode Incompatibility

Check health report for details
Use --force-mode to override (use cautiously)
Check the target mode's requirements (ExoMiner mode needs a DVT file for each KIC)

DVT Missing (ExoMiner)

Some KICs don't have DVT files in MAST
Use --backup-no-dvt to preserve data
Consider Standard mode (--no-exominer), which has no DVT requirement

Download Failures

Check network connectivity
Verify KIC exists in MAST
Review error messages in health report
Retry with --retry-failed

Related Projects

Machine Learning Frameworks for Exoplanet Detection

ExoMiner - NASA's deep learning model for exoplanet detection
AstroNet - Google's neural network for identifying exoplanets

This toolkit provides data in formats compatible with these frameworks while maintaining flexibility for custom research applications.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Setup

# Clone the repository
git clone https://github.com/akira921x/Kepler-Downloader-DR25.git
cd Kepler-Downloader-DR25

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e .

# Run tests (if available)
python -m pytest tests/

Version History

See CHANGELOG.md for a detailed version history.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

NASA Kepler Mission Team
MAST Archive at STScI
ExoMiner and AstroNet teams for ML framework specifications
Open source community for invaluable tools and libraries

Release history

1.5.0 - Sep 8, 2025
1.4.0 - Sep 7, 2025
1.3.0 - Sep 7, 2025
1.2.7 - Sep 7, 2025
1.2.6 - Sep 7, 2025
1.2.4 - Sep 7, 2025
1.2.3 - Sep 7, 2025
1.2.2 - Sep 7, 2025 (this version)
1.2.1 - Sep 7, 2025
1.2.0 - Sep 7, 2025
1.1.8 - Sep 7, 2025
1.1.7 - Sep 7, 2025

Download files

Source Distribution

kepler_downloader_dr25-1.2.2.tar.gz (3.7 MB)

Uploaded Sep 7, 2025 Source

Built Distribution

kepler_downloader_dr25-1.2.2-py3-none-any.whl (41.4 kB)

Uploaded Sep 7, 2025 Python 3

File details

Details for the file kepler_downloader_dr25-1.2.2.tar.gz.

File metadata

Download URL: kepler_downloader_dr25-1.2.2.tar.gz
Upload date: Sep 7, 2025
Size: 3.7 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kepler_downloader_dr25-1.2.2.tar.gz
Algorithm	Hash digest
SHA256	`791da09dbd51f137f59d85ffc26e33d287eb98eadbb3b83132b8addcf2c26fce`
MD5	`7b027e8c3039ea569ac00ada69e1e0de`
BLAKE2b-256	`a3985da5c8354ed549f438c72137980d2120309c1763223faf307714151a310f`

Provenance

The following attestation bundles were made for kepler_downloader_dr25-1.2.2.tar.gz:

Publisher: release.yml on akira921x/Kepler-Downloader-DR25

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kepler_downloader_dr25-1.2.2.tar.gz
- Subject digest: 791da09dbd51f137f59d85ffc26e33d287eb98eadbb3b83132b8addcf2c26fce
- Sigstore transparency entry: 482445563
- Sigstore integration time: Sep 7, 2025
Source repository:
- Permalink: akira921x/Kepler-Downloader-DR25@994f40ccf1bacad44fa2bcd1f463ea292b29a87e
- Branch / Tag: refs/tags/v1.2.2
- Owner: https://github.com/akira921x
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@994f40ccf1bacad44fa2bcd1f463ea292b29a87e
- Trigger Event: release

File details

Details for the file kepler_downloader_dr25-1.2.2-py3-none-any.whl.

File metadata

Download URL: kepler_downloader_dr25-1.2.2-py3-none-any.whl
Upload date: Sep 7, 2025
Size: 41.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kepler_downloader_dr25-1.2.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ca53bd26f9772e000a704b9e93a45418541aa2a89fb8695fbd8ef02f16e584d8`
MD5	`7e943c4901a7f546373a4e8cf2d17e99`
BLAKE2b-256	`cbb71764d7baaf970e8b5264cea45ddcd97852ee810cb8820f7ab5a98abd232b`

Provenance

The following attestation bundles were made for kepler_downloader_dr25-1.2.2-py3-none-any.whl:

Publisher: release.yml on akira921x/Kepler-Downloader-DR25

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kepler_downloader_dr25-1.2.2-py3-none-any.whl
- Subject digest: ca53bd26f9772e000a704b9e93a45418541aa2a89fb8695fbd8ef02f16e584d8
- Sigstore transparency entry: 482445576
- Sigstore integration time: Sep 7, 2025
Source repository:
- Permalink: akira921x/Kepler-Downloader-DR25@994f40ccf1bacad44fa2bcd1f463ea292b29a87e
- Branch / Tag: refs/tags/v1.2.2
- Owner: https://github.com/akira921x
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@994f40ccf1bacad44fa2bcd1f463ea292b29a87e
- Trigger Event: release
