GENetLib: A Python Library for Gene–environment Interaction Analysis via Deep Learning
GENetLib is a Python library for gene-environment (G-E) interaction analysis via neural networks, addressing the analytical challenges in complex disease research.
This package is capable of handling a variety of input data types:
- Scalar input data
- Functional input data (or densely measured data)
This package also supports diverse output requirements:
- Continuous output data
- Binary output data
- Survival output data
By integrating the minimax concave penalty (MCP) and $L_2$-norm regularization within a neural network estimation framework, GENetLib offers an innovative solution for high-dimensional genetic data analysis. The framework is shown below.
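
For readers unfamiliar with the penalty named above, here is a minimal sketch of the standard scalar MCP definition. The function name and the parameter names `lam` and `gamma` are illustrative assumptions, not GENetLib's API, and the sketch does not show how the library applies the penalty inside its network.

```python
def mcp_penalty(t, lam, gamma):
    """Standard minimax concave penalty for a single coefficient t.

    lam (> 0) is the regularization strength and gamma (> 1) controls
    concavity; both names are hypothetical, chosen for this sketch only.
    """
    a = abs(t)
    if a <= gamma * lam:
        # Near zero the penalty behaves like a lasso term with a quadratic correction.
        return lam * a - a * a / (2.0 * gamma)
    # Beyond gamma * lam the penalty is constant, so large coefficients are not shrunk further.
    return 0.5 * gamma * lam * lam


def l2_penalty(weights, lam):
    """Squared L2-norm penalty on a list of weights, scaled by lam."""
    return lam * sum(w * w for w in weights)
```

The penalty is continuous at `|t| = gamma * lam`, where both branches equal `gamma * lam**2 / 2`.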
We provide web-based documentation that explains the function parameters, describes how to use each function, gives detailed information about the methods, and includes examples for each. The web page is available at documentations. This package is published on PyPI, including previous versions, and the project page is available at PyPI package. Users can also check releases to get historical versions.
Features
GENetLib has the following features:
- Comprehensiveness: Supports a variety of input and output formats, enabling the construction of comprehensive neural network models for G-E interaction analysis.
- Flexibility: Offers a multitude of parameters allowing users to build models flexibly according to their specific needs.
- Functional data compatibility: Implements methods for functional data analysis (FDA) in Python, facilitating the processing of functional data with Python.
- Scalability: New methods for G-E interaction analysis via deep learning can be easily integrated into the system.
Installation
It is recommended to use pip for installation:
pip install GENetLib
For further information about installation and dependencies, see the installation instructions.
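
One way to confirm that the installation succeeded is to query the installed distribution's version with the standard library. This is a small sketch; the helper name `installed_version` is hypothetical and is not part of GENetLib.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if GENetLib is installed, otherwise None.
print(installed_version("GENetLib"))
```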
Quick Start
We start with the two basic functions scalar_ge and func_ge.
scalar_ge
scalar_ge performs G-E interaction analysis via deep learning when the input is scalar data.
from GENetLib.sim_data_scalar import sim_data_scalar
from GENetLib.scalar_ge import scalar_ge
# Get example data where input is scalar data and output is survival data
scalar_survival_linear = sim_data_scalar(rho_G = 0.25, rho_E = 0.3, dim_G = 500, dim_E = 5, n = 1500,
                                         dim_E_Sparse = 2, ytype = 'Survival', n_inter = 30)
# Set up the ScalerGE model
scalar_ge_res = scalar_ge(scalar_survival_linear['data'], ytype = 'Survival', dim_G = 500, dim_E = 5,
                          haveGE = True, num_hidden_layers = 2, nodes_hidden_layer = [1000, 100],
                          Learning_Rate2 = 0.035, L2 = 0.1, Learning_Rate1 = 0.06, L = 0.09, Num_Epochs = 100,
                          t = 0.01, split_type = 0, ratio = [7, 3], important_feature = True, plot = True)
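
To give a sense of scale for the call above, the sketch below builds G-E interaction terms for one sample. It assumes the common convention that each interaction term is the product of one genetic and one environmental variable; it is an illustration only, not GENetLib's internal code, and the function name is hypothetical.

```python
def interaction_terms(g, e):
    """Return all pairwise products g[j] * e[k] for one sample.

    g: list of genetic measurements (length dim_G)
    e: list of environmental measurements (length dim_E)
    The result has length dim_G * dim_E, ordered gene by gene.
    """
    return [gj * ek for gj in g for ek in e]


# Under this convention, dim_G = 500 and dim_E = 5 give 2500 candidate interactions.
print(len(interaction_terms([0.0] * 500, [0.0] * 5)))
```

With this many candidate terms relative to the sample size, some form of sparsity-inducing penalty is typically needed to select the important ones.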
func_ge
func_ge performs G-E interaction analysis via deep learning when the input is functional data.
from GENetLib.sim_data_func import sim_data_func
from GENetLib.func_ge import func_ge
# Get example data where input is densely measured functional data and output is continuous data
func_continuous = sim_data_func(n = 1500, m = 30, ytype = 'Continuous', seed = 123)
y = func_continuous['y']
z = func_continuous['z']
location = func_continuous['location']
X = func_continuous['X']
# Set up the FuncGE model
func_ge_res = func_ge(y, z, location, X, ytype = 'Continuous', btype = 'Bspline',
                      num_hidden_layers = 2, nodes_hidden_layer = [100, 10], Learning_Rate2 = 0.035, L2 = 0.01,
                      Learning_Rate1 = 0.02, L = 0.01, Num_Epochs = 50, nbasis1 = 5, params1 = 4,
                      Bsplines = 5, norder1 = 4, model = None, split_type = 1, ratio = [3, 1, 1], plot_res = True)
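
The func_ge call above requests a 'Bspline' basis with nbasis1 = 5 and norder1 = 4. The sketch below evaluates such a basis with the Cox-de Boor recursion, assuming these two arguments follow the usual functional data analysis convention (number of basis functions and spline order, with order 4 meaning cubic). The function and the knot layout are illustrative assumptions, not GENetLib's implementation.

```python
def bspline_basis(x, knots, order):
    """Evaluate all B-spline basis functions of the given order at x.

    knots must be non-decreasing; x must lie in [knots[0], knots[-1]).
    Returns a list of len(knots) - order values.
    """
    # Order 1: piecewise-constant indicators of each half-open knot span.
    b = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    # Raise the order one step at a time (Cox-de Boor recursion).
    for k in range(2, order + 1):
        nxt = []
        for i in range(len(knots) - k):
            left_den = knots[i + k - 1] - knots[i]
            right_den = knots[i + k] - knots[i + 1]
            # Terms with a zero denominator (repeated knots) are treated as zero.
            left = (x - knots[i]) / left_den * b[i] if left_den > 0 else 0.0
            right = (knots[i + k] - x) / right_den * b[i + 1] if right_den > 0 else 0.0
            nxt.append(left + right)
        b = nxt
    return b


# Clamped knot vector on [0, 1] with one interior knot: 9 knots - order 4 = 5 basis functions.
knots = [0.0] * 4 + [0.5] + [1.0] * 4
values = bspline_basis(0.25, knots, 4)
print(len(values))
```

Inside the knot range the basis values sum to one, which gives a quick sanity check on any basis evaluation.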
For more information about the functions and methods, see main functions.
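
Both examples pass a ratio argument: [7, 3] with split_type = 0 and [3, 1, 1] with split_type = 1. The sketch below shows how such ratios could translate into subset sizes, assuming that a two-part ratio means train/test and a three-part ratio means train/validation/test. That reading is inferred from the argument lengths, and the helper is hypothetical rather than the library's own splitting logic.

```python
def split_sizes(n, ratio):
    """Divide n samples into consecutive subsets proportional to ratio.

    Any remainder left by integer division is assigned to the first subset.
    """
    total = sum(ratio)
    sizes = [n * r // total for r in ratio]
    sizes[0] += n - sum(sizes)
    return sizes


print(split_sizes(1500, [7, 3]))     # [1050, 450]
print(split_sizes(1500, [3, 1, 1]))  # [900, 300, 300]
```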
Reference
The main referenced papers are:
- Wu, S., Xu, Y., Zhang, Q., & Ma, S. (2023). Gene–environment interaction analysis via deep learning. Genetic Epidemiology, 1–26. https://doi.org/10.1002/gepi.22518
- Ren, R., Fang, K., Zhang, Q., & Ma, S. (2023). FunctanSNP: an R package for functional analysis of dense SNP data (with interactions). Bioinformatics, 39(12), btad741. https://doi.org/10.1093/bioinformatics/btad741
Other referenced papers are listed in references.
License
GENetLib is licensed under the MIT License. See LICENSE for details.
Feedback
- Issues and pull requests are welcome.
- To contact us, send an email to Barry57@163.com.
- Thanks for all the support! 👏