Protein & Interactomic Graph Construction for Machine Learning
Documentation | Paper | Tutorials
Protein & Interactomic Graph Library
This package provides functionality for producing geometric representations of protein and RNA structures, and biological interaction networks. We provide compatibility with standard PyData formats, as well as graph objects designed for ease of use with popular deep learning libraries.
What's New?
- Protein Graph Creation from AlphaFold2!
- Protein Graph Visualisation!
- RNA Graph Construction from Dotbracket notation
- Protein-Protein Interaction Network Support & Structural Interactomics (Using AlphaFold2!)
- High and Low-level API for massive flexibility - create your own bespoke workflows!
Example usage
Creating a Protein Graph
Tutorial (Residue-level) | Tutorial - Atomic | Docs
from graphein.protein.config import ProteinGraphConfig
from graphein.protein.graphs import construct_graph
config = ProteinGraphConfig()
g = construct_graph(config=config, pdb_code="3eiy")
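For intuition about what a residue-level graph encodes, here is a minimal standard-library sketch of one common edge rule: connect residues whose C-alpha atoms lie within a distance cutoff. It is not Graphein's implementation and not necessarily what the default `ProteinGraphConfig` does; the coordinates, the node ID format and the `contact_edges` helper are all made up for illustration.

```python
import math
from itertools import combinations

# Toy C-alpha coordinates in angstroms, keyed by hypothetical residue IDs.
coords = {
    "A:MET:1": (0.0, 0.0, 0.0),
    "A:LYS:2": (3.8, 0.0, 0.0),
    "A:VAL:3": (7.6, 0.0, 0.0),
    "A:GLU:4": (3.8, 5.0, 0.0),
}


def contact_edges(coords, threshold=6.0):
    """Return pairs of residues whose distance is at most `threshold`."""
    edges = []
    for (a, xyz_a), (b, xyz_b) in combinations(coords.items(), 2):
        if math.dist(xyz_a, xyz_b) <= threshold:
            edges.append((a, b))
    return edges


edges = contact_edges(coords)
for a, b in edges:
    print(a, "--", b)
```

Raising or lowering `threshold` trades graph density against locality, which is the kind of choice a graph config is meant to capture.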
Creating a Protein Graph from the AlphaFold Protein Structure Database
from graphein.protein.config import ProteinGraphConfig
from graphein.protein.graphs import construct_graph
from graphein.protein.utils import download_alphafold_structure
config = ProteinGraphConfig()
fp = download_alphafold_structure("Q5VSL9", aligned_score=False)
g = construct_graph(config=config, pdb_path=fp)
Creating a Protein Mesh
from graphein.protein.config import ProteinMeshConfig
from graphein.protein.meshes import create_mesh
# Instantiate the mesh config before passing it to create_mesh
config = ProteinMeshConfig()
verts, faces, aux = create_mesh(pdb_code="3eiy", config=config)
Creating an RNA Graph
Tutorial | Docs
from graphein.rna.graphs import construct_rna_graph
# Build the graph from a dotbracket & optional sequence
rna = construct_rna_graph(dotbracket='..(((((..(((...)))..)))))...',
                          sequence='UUGGAGUACACAACCUGUACACUCUUUC')
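To make the dotbracket input concrete, the sketch below shows how a simple dotbracket string can be decoded into base-pair and backbone edges with a stack. It is a plain-Python illustration, not the code behind `construct_rna_graph`; the helper name `dotbracket_to_pairs` is hypothetical, and only `(`, `)` and `.` are handled (no pseudoknot brackets).

```python
def dotbracket_to_pairs(dotbracket):
    """Return sorted (i, j) index pairs for matched brackets."""
    stack = []
    pairs = []
    for i, ch in enumerate(dotbracket):
        if ch == "(":
            stack.append(i)  # remember where the pair opens
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))  # close the most recent open base
        elif ch != ".":
            raise ValueError(f"unsupported character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


dotbracket = "..(((((..(((...)))..)))))..."
sequence = "UUGGAGUACACAACCUGUACACUCUUUC"
assert len(dotbracket) == len(sequence)

base_pairs = dotbracket_to_pairs(dotbracket)
# Backbone edges join consecutive nucleotides along the chain.
backbone = [(i, i + 1) for i in range(len(sequence) - 1)]

for i, j in base_pairs:
    print(f"{sequence[i]}{i} pairs with {sequence[j]}{j}")
```

Each unpaired position (`.`) contributes only backbone edges, while each bracket pair adds one base-pairing edge on top of them.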
Creating a Protein-Protein Interaction Graph
from graphein.ppi.config import PPIGraphConfig
from graphein.ppi.graphs import compute_ppi_graph
from graphein.ppi.edges import add_string_edges, add_biogrid_edges
config = PPIGraphConfig()
protein_list = ["CDC42", "CDK1", "KIF23", "PLK1", "RAC2", "RACGAP1", "RHOA", "RHOB"]
g = compute_ppi_graph(config=config,
                      protein_list=protein_list,
                      edge_construction_funcs=[add_string_edges, add_biogrid_edges]
                      )
Creating a Gene Regulatory Network Graph
from functools import partial
from graphein.grn.config import GRNGraphConfig
from graphein.grn.graphs import compute_grn_graph
from graphein.grn.edges import add_regnetwork_edges, add_trrust_edges
config = GRNGraphConfig()
gene_list = ["AATF", "MYC", "USF1", "SP1", "TP53", "DUSP1"]
# partial() pre-binds each source's filtering functions to its edge function
g = compute_grn_graph(
    gene_list=gene_list,
    edge_construction_funcs=[
        partial(add_trrust_edges, trrust_filtering_funcs=config.trrust_config.filtering_functions),
        partial(add_regnetwork_edges, regnetwork_filtering_funcs=config.regnetwork_config.filtering_functions),
    ],
)
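The GRN example relies on `functools.partial` to pre-bind source-specific arguments to each edge function. The standalone sketch below illustrates that pattern with toy stand-ins; `add_toy_edges`, `build_graph`, `high_confidence` and the table are hypothetical, and the assumption that a builder calls every edge function with the graph as its only argument is ours, not something confirmed about Graphein's internals.

```python
from functools import partial

# Made-up rows of (regulator, target, confidence); not real regulatory data.
TOY_TABLE = [("A", "B", 0.9), ("B", "C", 0.2), ("C", "D", 0.7)]


def add_toy_edges(graph, table, filtering_funcs=None):
    """Hypothetical edge function: filter the table, then add surviving rows."""
    for keep in filtering_funcs or []:
        table = [row for row in table if keep(row)]
    for regulator, target, _score in table:
        graph.setdefault(regulator, set()).add(target)
    return graph


def build_graph(edge_construction_funcs):
    """Call each edge function with the graph as its only argument."""
    graph = {}
    for func in edge_construction_funcs:
        graph = func(graph)
    return graph


def high_confidence(row):
    """Keep rows whose confidence score is at least 0.5."""
    return row[2] >= 0.5


graph = build_graph([
    partial(add_toy_edges, table=TOY_TABLE, filtering_funcs=[high_confidence]),
])
print(graph)
```

Because the extra arguments are bound up front, functions with different keyword parameters can sit in the same list and be invoked uniformly.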
Installation
The dev environment includes GPU builds (CUDA 11.1) of each deep learning library integrated into Graphein.
git clone https://www.github.com/a-r-j/graphein
cd graphein
conda env create -f environment-dev.yml
pip install -e .
A lighter install can be performed with:
git clone https://www.github.com/a-r-j/graphein
cd graphein
conda env create -f environment.yml
pip install -e .
Citing Graphein
Please consider citing graphein if it proves useful in your work.
@article{Jamasb2020,
  doi = {10.1101/2020.07.15.204701},
  url = {https://doi.org/10.1101/2020.07.15.204701},
  year = {2020},
  month = jul,
  publisher = {Cold Spring Harbor Laboratory},
  author = {Arian Rokkum Jamasb and Pietro Lio and Tom Blundell},
  title = {Graphein - a Python Library for Geometric Deep Learning and Network Analysis on Protein Structures}
}