Generalizable Gene Self-Expressive Networks
GXN: Generalizable Gene Self-Expressive Networks
Description
In this work we introduce Generalizable Gene Self-Expressive Networks as a new simple, interpretable, and predictive formalism for modeling gene networks. This package contains two methods, based respectively on the ElasticNet and Orthogonal Matching Pursuit regression algorithms, for inferring, assessing, and tuning Generalizable Gene Self-Expressive Networks. It also contains several tutorials that help evaluate the generalization capabilities of these new approaches using a new internal measure on three RNA-seq datasets from complex eukaryotes, namely C. familiaris, R. norvegicus and H. sapiens.
GXN•OMP
GXN•OMP relies on the well-known Orthogonal Matching Pursuit (OMP) algorithm, which solves a linear regression task subject to a sparsity constraint ensuring that at most $d_0$ nonzero coefficients are used. More formally, GXN•OMP solves the following optimization problem:
$$C_{\star,g}^* = ArgMin_{C_{\star,g}} || X_{\star,g} - X\cdot C_{\star,g} ||^2_2$$
Subject to:
$$|C_{\star,g}|_0 \leq d_0,$$
$$C_{g,g} = 0 \quad \forall g \in \{1, \dots, N\},$$
$$C_{j,g} = 0 \quad \forall j \notin \Psi$$
To solve this task, OMP relies on a greedy forward feature selection method. At each step, the method selects the feature with the highest correlation with the current residual, then updates the regression coefficients and recomputes the residual using an orthogonal projection onto the subspace spanned by the previously selected features. Moreover, an inner cross-validation step selects the parameter $d_0$ in a range between 0 and the hyper-parameter $d_0^{max}$, which defines the maximal number of features. In practice, $d_0^{max} = \min(\delta \times |\Psi|, \mathrm{rank}(X_{\star,\Psi}))$ is set as a fraction $\delta$ of the number of regulators $|\Psi|$ (or as the rank of the matrix $X_{\star,\Psi}$, whenever this value is lower). Here we set $d_0^{max}=30$.
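The greedy procedure described above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the GXN implementation or its API: the function names `d0_max` and `omp_gene`, the index list `regulators` (standing in for $\Psi$), and the toy data are all hypothetical. The sketch uses the absolute inner product with the residual as the correlation score, which assumes columns on comparable scales, and it omits the inner cross-validation over $d_0$.

```python
import numpy as np


def d0_max(X, regulators, delta):
    """min(delta * |Psi|, rank(X[:, Psi])), following the rule stated above."""
    rank = np.linalg.matrix_rank(X[:, regulators])
    return int(min(delta * len(regulators), rank))


def omp_gene(X, g, regulators, d0):
    """Greedy OMP sketch for target gene g with at most d0 nonzero coefficients."""
    y = X[:, g]
    # Candidate set: regulators only (C[j, g] = 0 outside Psi), never g itself (C[g, g] = 0).
    remaining = [j for j in regulators if j != g]
    selected, beta = [], np.zeros(0)
    residual = y.copy()
    for _ in range(min(d0, len(remaining))):
        # Score each remaining candidate by its absolute inner product with the residual.
        scores = np.abs(X[:, remaining].T @ residual)
        if scores.max() < 1e-12:  # residual already fully explained
            break
        selected.append(remaining.pop(int(np.argmax(scores))))
        # Orthogonal projection: refit least squares on all selected features.
        beta, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ beta
    coef = np.zeros(X.shape[1])
    if selected:
        coef[selected] = beta
    return coef


# Toy data (hypothetical): gene 0 is an exact combination of genes 1 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:, 0] = 2.0 * X[:, 1] - 3.0 * X[:, 2]
regulators = [0, 1, 2, 3]  # hypothetical index set standing in for Psi
coef = omp_gene(X, g=0, regulators=regulators, d0=d0_max(X, regulators, delta=0.5))
```

Restricting the candidate list up front is one simple way to enforce both equality constraints: excluded genes are never scored, so their coefficients stay at zero.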
GXN•EN
GXN•EN relies on the ElasticNet regression technique, which addresses the linear regression task using $\ell_1$ and $\ell_2$ regularization simultaneously. More formally, GXN•EN solves the following optimization problem:
$$C_{\star,g}^* = ArgMin_{C_{\star,g}} \frac{1}{2D} || X_{\star,g} - X\cdot C_{\star,g} ||^2_2 + \alpha \rho || C_{\star,g} ||_1 + \frac{\alpha (1-\rho)}{2} || C_{\star,g} ||^2_2$$
Subject to:
$$C_{g,g} = 0 \quad \forall g \in \{1, \dots, N\},$$
$$C_{j,g} = 0 \quad \forall j \notin \Psi$$
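As an illustration of this objective, the following NumPy sketch minimizes it for a single target gene by cyclic coordinate descent with soft-thresholding, a standard technique for ElasticNet problems. It is an assumption-laden sketch rather than the GXN code: the names `elastic_net_gene`, `objective`, and `regulators` (an index list standing in for $\Psi$), the fixed iteration count, and the toy data are hypothetical, and $D$ is taken to be the number of rows of $X$.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator associated with the l1 penalty."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def objective(X, g, coef, alpha, rho):
    """Value of the ElasticNet objective above for target gene g."""
    D = X.shape[0]
    r = X[:, g] - X @ coef
    return (r @ r) / (2 * D) + alpha * rho * np.abs(coef).sum() \
        + 0.5 * alpha * (1 - rho) * (coef @ coef)


def elastic_net_gene(X, g, regulators, alpha, rho, n_iter=200):
    """Cyclic coordinate descent sketch; coefficients outside Psi and C[g, g] stay 0."""
    D, N = X.shape
    y = X[:, g]
    active = [j for j in regulators if j != g]
    coef = np.zeros(N)
    residual = y.copy()                  # equals y - X @ coef while coef is 0
    col_sq = (X ** 2).sum(axis=0) / D    # ||X_j||^2 / D for every column
    for _ in range(n_iter):
        for j in active:
            if col_sq[j] == 0.0:         # skip all-zero columns
                continue
            # Correlation of column j with the partial residual (column j left out).
            z = X[:, j] @ residual / D + col_sq[j] * coef[j]
            new = soft_threshold(z, alpha * rho) / (col_sq[j] + alpha * (1 - rho))
            residual += X[:, j] * (coef[j] - new)
            coef[j] = new
    return coef


# Toy data (hypothetical): gene 0 depends mostly on genes 1 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 0] = X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=100)
regulators = [0, 1, 2, 3]  # hypothetical index set standing in for Psi
coef = elastic_net_gene(X, g=0, regulators=regulators, alpha=0.1, rho=0.9)
```

A production solver would stop on a convergence criterion instead of a fixed number of sweeps; the fixed `n_iter` only keeps the sketch short.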
- $X$ denotes the gene expression matrix and $D$ the number of samples.
- Internally, the method evaluates $\rho \in \{0.8, 0.9, 0.99, 1\}$.
- $K_{\alpha} = 1/\epsilon$ defines the number of $\alpha$ values tested between $\alpha_{max} = \frac{\max_{i\neq j} (| X_{\star,i}^\intercal \cdot X_{\star,j}| )}{D\rho}$ (for which the coefficient vector is null) and $\alpha_{min} = \epsilon \alpha_{max}$, with $0 < \epsilon < 1$.
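The hyper-parameter grid described in this list can be sketched as follows. The function name `alpha_grid` and the constant `RHO_VALUES` are hypothetical, the number of samples is taken to be the number of rows of $X$, and the geometric spacing of the $\alpha$ values is an assumption (it is the convention used by scikit-learn's ElasticNet path, but the text above only fixes the two endpoints and the number of values).

```python
import numpy as np

RHO_VALUES = (0.8, 0.9, 0.99, 1.0)  # values of rho listed above


def alpha_grid(X, rho, eps):
    """Return K_alpha = 1/eps values from alpha_max down to eps * alpha_max."""
    n_samples = X.shape[0]
    gram = np.abs(X.T @ X)
    np.fill_diagonal(gram, 0.0)  # keep only pairs of distinct genes (i != j)
    alpha_max = gram.max() / (n_samples * rho)
    k_alpha = int(round(1.0 / eps))
    # Assumption: geometric (log-uniform) spacing between the two endpoints.
    return np.geomspace(alpha_max, eps * alpha_max, num=k_alpha)


# Toy data (hypothetical): 50 samples, 4 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
grids = {rho: alpha_grid(X, rho, eps=0.1) for rho in RHO_VALUES}
```

Each $\rho$ gets its own grid because $\alpha_{max}$ scales with $1/\rho$; with `eps=0.1` every grid holds 10 values.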
Installation
pip install GXN
Authors
Sergio Peignier
License
MIT License
File details
Details for the file GXN-0.0.30.tar.gz.
File metadata
- Download URL: GXN-0.0.30.tar.gz
- Upload date:
- Size: 54.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6b3f5f24cfec8fcc47a4431794285cc665eda45a82df7a20efa5c26c4b1ed660
MD5 | 0426782a9c56af69f95c61b2da24e78c
BLAKE2b-256 | bf8aa0d6834c7bd8d4e5d59b8cdbdc4fd574117e3f1a585bfb3953cf4fabb9b5
File details
Details for the file GXN-0.0.30-py3-none-any.whl.
File metadata
- Download URL: GXN-0.0.30-py3-none-any.whl
- Upload date:
- Size: 55.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 95480530d348963a0b40540b4021a865068ae08727bf1e19c37fee13216048a1
MD5 | d90d377e72408fd4c6b87dd87068fc55
BLAKE2b-256 | b87e6ec8d5f291f62ae62dc2f518d92db9f4f8dc234710d6af6aab8c8fefe30b