GIP -- Gaussian Interaction Profiler
About
We introduce the Gaussian Interaction Profiler (GIP), a Gaussian mixture modeling-based clustering workflow for complexome profiling data. GIP assigns proteins to a set number of clusters by modeling the migration profile of each cluster. Using bootstrapping, GIP offers a way to prioritize actual interactors over spuriously comigrating proteins.
For a more complete description of the software and its applications, please refer to the manuscript (see here).
Installation
GIP is implemented as a Python package. The package and its dependencies can be installed from the Python Package Index (PyPI) using pip.
dependencies
pip
Installation with pip ensures that the dependencies are automatically installed alongside GIP.
pip install gip-bio
Installation from repository
git clone git@github.com:joerivstrien/gip-bio.git
cd gip-bio
pip install .
Usage
Input Data
GIP takes as input a single complexome profiling dataset, consisting of a series of abundance values across fractions that represent the migration pattern of each detected protein, and optionally a protein annotation file:
- complexome profile
- protein annotation file
complexome profiling data
A single complexome profiling dataset, consisting of a matrix of abundance data. These data should be provided as a pandas DataFrame, with the index containing protein identifiers. The GIP package contains a function (process_normalise.parse_profile) to load a complexome profile from a tab-separated text (tsv) file. An example of a file containing a complexome profile is available here.
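As an illustration of the expected shape, the sketch below reads a small tab-separated profile with plain pandas. The protein identifiers, the fraction column names and the assumption that each column holds one fraction are hypothetical; the `parse_profile` function shipped with GIP may apply additional processing that is not shown here.

```python
import io

import pandas as pd

# Hypothetical toy profile: 3 proteins (rows) by 5 fractions (columns).
tsv_text = (
    "protein_id\tfrac_1\tfrac_2\tfrac_3\tfrac_4\tfrac_5\n"
    "P00001\t0.0\t1.2\t8.5\t1.1\t0.0\n"
    "P00002\t0.0\t0.9\t7.8\t1.4\t0.1\n"
    "P00003\t5.1\t0.3\t0.0\t0.0\t0.0\n"
)

# The first column becomes the index, so the index holds protein identifiers.
prof = pd.read_csv(io.StringIO(tsv_text), sep="\t", index_col=0)

# Basic sanity checks before handing the profile to an analysis.
assert prof.index.is_unique, "protein identifiers should not repeat"
assert all(pd.api.types.is_numeric_dtype(dt) for dt in prof.dtypes)

print(prof.shape)  # (3, 5)
```

Checking for duplicate identifiers and non-numeric columns up front tends to catch formatting problems in the input file early.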
protein annotation file
To add protein annotations, beyond their identifiers, to the output tables containing the GIP analysis results, a table with these annotations can be provided. This table should be provided as a pandas DataFrame whose index contains protein identifiers that match those in the provided complexome profile. An example annotation file in tab-separated (tsv) format is available here.
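Because the annotation table is matched to the profile through its index, it can help to check the overlap between the two before running an analysis. The following is a minimal sketch using pandas only; the identifiers and the `gene_name` column are made up for the example and are not required by GIP.

```python
import pandas as pd

# Hypothetical profile and annotation tables sharing some identifiers.
prof = pd.DataFrame(
    {"frac_1": [0.0, 0.0, 5.1], "frac_2": [1.2, 0.9, 0.3]},
    index=["P00001", "P00002", "P00003"],
)
annot = pd.DataFrame(
    {"gene_name": ["GENE_A", "GENE_B", "GENE_X"]},
    index=["P00001", "P00002", "P99999"],
)

# Identifiers present in both tables, and profiled proteins lacking annotation.
shared = prof.index.intersection(annot.index)
missing = prof.index.difference(annot.index)

print(f"{len(shared)} of {len(prof)} profiled proteins have annotations")
print("without annotation:", list(missing))
```

A low overlap usually points to a mismatch in identifier type or formatting between the two files.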
Running a complete GIP analysis
from gip.main import main
import gip.process_normalise as prn
import pandas as pd
# parse complexome profile and protein annotation file
prof = prn.parse_profile('path/to/profile.tsv')
annot = pd.read_csv('path/to/annot_fn.tsv', sep='\t', index_col=0)
# set ratio of clusters relative to number of detected proteins
clust_ratio = 0.5
# run a standard analysis, using 4 processes for the bootstrapping
gip_results = main(prof, clust_ratio, annot_df=annot, bs_processes=4)
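To get a feel for what a given `clust_ratio` means for a dataset, the sketch below converts the ratio into an approximate number of clusters. The helper name `approx_n_clusters` is hypothetical, and the rounding and the minimum of one cluster are assumptions; GIP may derive the exact number differently.

```python
def approx_n_clusters(n_proteins: int, clust_ratio: float) -> int:
    """Estimate the cluster count implied by a ratio of clusters to proteins.

    Assumption: the count is the ratio times the number of detected
    proteins, rounded, with at least one cluster.
    """
    if n_proteins < 1:
        raise ValueError("n_proteins must be at least 1")
    if clust_ratio <= 0:
        raise ValueError("clust_ratio must be positive")
    return max(1, round(n_proteins * clust_ratio))


# Example: 1200 detected proteins with the ratio used above.
print(approx_n_clusters(1200, 0.5))
```

With a profile loaded as a DataFrame, the number of detected proteins would be `len(prof)`.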
Output
The main result of a GIP analysis is a set of clusters. The clusters are annotated with a variety of metrics that facilitate easy interpretation and prioritization of clusters likely corresponding to actual protein complexes. An overview of all resulting clusters with these features is provided in the output as a table ('clusttable'), which can optionally be saved to a file, using the "clusttable_fn" parameter.
Similarly, all clustered proteins are annotated with a number of metrics reflecting their assigned cluster, their abundance and the consistency with which they are part of their cluster. All protein members are provided as a separate table ('membertable'), which can also optionally be saved to a file using the "membertable_fn" parameter.
For a complete description of the output from a GIP analysis, please refer to the documentation of the main function here.
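As a sketch of how the two optional output parameters could be used together, the helpers below build file paths for `clusttable_fn` and `membertable_fn` and forward them to `main`. The helper names, the output directory layout and the `.tsv` extension are hypothetical; only `main`, `annot_df`, `bs_processes`, `clusttable_fn` and `membertable_fn` are taken from this README, and the format in which GIP writes the tables should be checked in the documentation of the main function.

```python
from pathlib import Path


def gip_output_kwargs(out_dir, prefix="gip"):
    """Build keyword arguments pointing both output tables into one directory.

    The file names and extension are illustrative choices, not GIP defaults.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    return {
        "clusttable_fn": str(out / f"{prefix}_clusttable.tsv"),
        "membertable_fn": str(out / f"{prefix}_membertable.tsv"),
    }


def run_gip_and_save(prof, clust_ratio, annot=None, out_dir="gip_output", processes=4):
    """Run GIP and ask it to save both tables (requires gip-bio to be installed)."""
    # Imported here so the path helper above can be used without GIP installed.
    from gip.main import main

    kwargs = gip_output_kwargs(out_dir)
    kwargs["bs_processes"] = processes
    if annot is not None:
        kwargs["annot_df"] = annot
    return main(prof, clust_ratio, **kwargs)
```

Calling `run_gip_and_save(prof, 0.5, annot=annot)` would then run the analysis as in the usage example and request both tables to be written under `gip_output/`.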
Licence
GIP -- Gaussian Interaction Profiler
Copyright (C) 2023 Radboud University Medical Center
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Issues
If you have questions or encounter any problems or bugs, please report them in the issue channel.
Citing GIP
Publication Pending