EIR-auto-GP
EIR-auto-GP: Automated genomic prediction (GP) using deep learning models with EIR.
WARNING: This project is in alpha phase. Expect backwards-incompatible changes, including changes to the API.
Overview
EIR-auto-GP is a comprehensive framework for genomic prediction (GP) tasks, built on top of the EIR deep learning framework. EIR-auto-GP streamlines the process of preparing data, training, and evaluating models on genomic data, automating much of the process from raw input files to results analysis. Key features include:
- Support for .bed/.bim/.fam PLINK files as input data.
- Automated data processing and train/test splitting.
- Takes care of launching a configurable number of deep learning training runs.
- SNP-based feature selection based on GWAS, deep learning-based attributions, and a combination of both.
- Ensemble prediction from multiple training runs.
- Analysis and visualization of results.
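Because the input is a PLINK fileset, all three files (.bed, .bim, .fam) need to share one path prefix. The following is a minimal sketch, not part of EIR-auto-GP itself, of a pre-flight check that reports which members of a fileset are missing. The function name `missing_plink_files` and the example prefix are hypothetical.

```python
from pathlib import Path

# A PLINK 1 binary fileset consists of three files sharing one prefix.
PLINK_SUFFIXES = (".bed", ".bim", ".fam")


def missing_plink_files(prefix: str) -> list[str]:
    """Return the fileset members that do not exist for the given prefix.

    Hypothetical helper for illustration; an empty list means the
    .bed/.bim/.fam triple is complete.
    """
    return [
        f"{prefix}{suffix}"
        for suffix in PLINK_SUFFIXES
        if not Path(f"{prefix}{suffix}").is_file()
    ]
```

For example, `missing_plink_files("data/genotypes")` returns an empty list only if `data/genotypes.bed`, `data/genotypes.bim`, and `data/genotypes.fam` all exist.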
Installation
First, ensure that plink2 is installed and available in your PATH.
Then, install EIR-auto-GP using pip:
pip install eir-auto-gp
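One way to confirm the plink2 prerequisite before running anything is to look the executable up on the PATH from Python. This is a small sketch using only the standard library; the helper name `tool_on_path` is hypothetical and is not part of EIR-auto-GP.

```python
import shutil


def tool_on_path(name: str = "plink2") -> bool:
    """Return True if an executable called `name` can be found on the PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which(name) is not None
```

If `tool_on_path()` returns `False`, plink2 is either not installed or its directory has not been added to the PATH.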
Usage
Please refer to the Documentation for examples and information.
Workflow
The workflow consists roughly of the following steps:
- Data processing: EIR-auto-GP processes the input .bed/.bim/.fam PLINK files and .csv label file, preparing the data for model training and evaluation.
- Train/test split: The processed data is automatically split into training and testing sets, with the option of manually specifying splits.
- Training: A configurable number of training runs is set up and executed using EIR's deep learning models.
- SNP feature selection: GWAS-based feature selection, deep learning-based feature selection with Bayesian optimization, and mixed strategies are supported.
- Test set prediction: Predictions are made on the test set using all training run folds.
- Ensemble prediction: An ensemble prediction is created from the individual predictions.
- Results analysis: Performance metrics, visualizations, and analysis are generated to assess the model's performance.
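To make the train/test split and ensemble steps above concrete, here is a conceptual sketch in plain Python. It is not EIR-auto-GP's implementation: the function names, the default `test_fraction`, the seeded shuffle, and the use of a simple mean across runs are all assumptions chosen for illustration, since this README does not specify how the split or the ensemble is computed.

```python
import random
from statistics import fmean


def split_train_test(sample_ids, test_fraction=0.1, seed=0):
    """Reproducibly hold out a fraction of sample IDs for testing.

    Hypothetical helper; test_fraction and seed are illustrative defaults.
    Returns (train_ids, test_ids).
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    ids = sorted(sample_ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


def ensemble_mean(run_predictions):
    """Average per-sample predictions over several training runs.

    `run_predictions` is a list of {sample_id: prediction} dicts, one per run.
    Only samples predicted by every run are kept. Averaging is an assumed
    ensembling rule, used here purely as an example.
    """
    if not run_predictions:
        raise ValueError("need predictions from at least one run")
    shared = set.intersection(*(set(preds) for preds in run_predictions))
    return {
        sample_id: fmean(preds[sample_id] for preds in run_predictions)
        for sample_id in sorted(shared)
    }
```

Fixing the seed keeps the held-out set identical across runs, so the per-run test predictions refer to the same samples and can be combined directly.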
Citation
If you use EIR-auto-GP in a scientific publication, we would appreciate it if you could use the following citation:
@article{sigurdsson2021deep,
title={Deep integrative models for large-scale human genomics},
author={Sigurdsson, Arnor Ingi and Westergaard, David and Winther, Ole and Lund, Ole and Brunak, S{\o}ren and Vilhjalmsson, Bjarni J and Rasmussen, Simon},
journal={bioRxiv},
year={2021},
publisher={Cold Spring Harbor Laboratory}
}