Analyzing and linking genomics and phenomics experiments
Project description
[![Build Status](https://travis-ci.org/combogenomics/DuctApe.png)](https://travis-ci.org/combogenomics/DuctApe)
Analyzing and linking genomics and phenomics experiments
DuctApe is a python library with a series of command-line programs that will help bioinformaticians to analyze genomes AND phenomic experiments. The final purpose of the program is to combine the genomic informations (encoded as KEGG pathways) with the results of phenomic experiments (for now the Phenotype microarrays are supported) and highlight the genes that may be responsible for phenotypic variations. Several scientific python libraries are used to perform this tasks: all the data is stored in a portable database file to allow an easy exchange of results. The most complex computing tasks are speeded up by parallelization. There are three distinct programs with an expressive syntax to ensure flexibility in the analysis, while keeping it simple.
Future releases will also include a web interfaces.
Requirements
Biopython
Numpy
SciPy
matplotlib
scikits.learn
multiprocessing
networkx
PyYaml
Installation
Method 1 (missing dependencies will be downloaded)
python -m pip install DuctApe
Method 2 (the dependencies won’t be checked)
(Download and extract the tarball)
sudo python setup.py install
Method 3 (missing dependencies will be downloaded)
(Download and extract the tarball)
python -m pip install .
Method 4 (missing dependencies will be downloaded)
conda create -n ductape pip numpy scipy scikit-learn matplotlib biopython networkx blast
conda activate ductape (or source activate ductape)
python -m pip install DuctApe
Dependencies installation on Debian/Ubuntu
sudo apt-get install python-numpy python-scipy python-sklearn python-matplotlib python-biopython python-networkx ncbi-blast+
Note: when using pip, some of the dependencies may fail to be installed; use your package manager (such as apt-get in the above snippet) to manually install the problematic packages
Quick how-to
Three command line utilities will be installed
dape
Used to initialize the project and add/remove the organism
Reconstructs the whole KEGG metabolic map
Produces a series of explorable metabolic maps
Performs the combination of genomics and phenomics experiments
dgenome
Handles protein sequences for each organism of the project
For pangenomic studies it can build the pangenome using the BBH algorithm (user-defined pangenomes can be added)
The proteins are mapped to KEGG metabolic pathways
dphenome
Handles phenomic experiments for each organism of the project (Phenotype microarray experiments)
Performs zero-subtraction on the biolog plates
Calculates the growth parameters
Ranks the single experiments with a clusterization over growth parameters
The compounds from the phenomic experiments are mapped to KEGG
Examples
Single organism experiment
dape init (initializes the project)
dape add MyOrg (adds my organism using the ID MyOrg)
dgenome add MyOrg.faa MyOrg (adds the proteome of MyOrg)
dgenome add-ko MyOrg.tab (adds the output of KAAS, a KEGG mapper)
dgenome start (maps the proteome to KEGG)
dgenome stats (statistic and graphics)
dgenome export (exports the genomic data)
dphenome add MyOrg.csv MyOrg (adds the phenomic experiment, BIOLOG data)
dphenome zero (performs control subtraction)
dphenome start -n 4 (calculates the growth parameters and performs the clusterization, using 4 CPUs)
dphenome plot (plots the growth curves)
dphenome purge -d 3 keep-max (removes inconsistent replicas: keep the highest replicas when there is an activity index delta >= 3)
dphenome plot (plots only those curves that are not purged)
dphenome restore (restore the purged replicas)
dphenome stats
dphenome export
dape start (metabolic network analysis)
dape map (produces a series of explorable KEGG maps)
dape export (KEGG data is exported)
Mutant experiment
dape init
dape add MyOrg
dape add-mut -m MyOrg -k deletion MyMut (adds mutant MyMut, a deletion mutant of MyOrg)
dgenome add-dir MyFolder (adds the proteins files found in this directory)
dgenome add-ko MyOrg.tab
dgenome start
dphenome add-dir MyPhenomicFolder (adds the phenomic files found in this directory)
dphenome zero
dphenome start -n 4
dphenome purge -d 3 keep-max
dphenome plot
dape start
dape map MyMut (only the metabolic map of the mutant is produced)
Pangenomic experiment
dape init
dape add MyOrg
dape add MyOrg2
dape add MyOrg3
dgenome add-dir MyFolder
dgenome add-ko MyOrg.tab MyOrg2.tab MyOrg3.tab
dgenome start -n 4 (also performs pangenome creation using 4 CPUs)
dphenome add-dir MyPhenomicFolder
dphenome zero
dphenome start -n 4
dphenome purge -d 3 keep-max
dphenome plot
dape start
dape map (plots the maps for the whole pangenome)
More informations
Each program options and parameters can be queried adding -h
Contacts
This program has been developed by Marco Galardini, Department of evolutionary biology, University of Florence
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file DuctApe-0.18.2.tar.gz
.
File metadata
- Download URL: DuctApe-0.18.2.tar.gz
- Upload date:
- Size: 212.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.39.0 CPython/3.7.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | bfa0c15c66d13d5f81b382a57bee9d2fdbd9efab86543fe66dff7775a4d199de |
|
MD5 | 0dc162c49d7fd916c8dbf94267652659 |
|
BLAKE2b-256 | 39b338ee1269daa54e9051499fdb9882c7fdda89eb18db1cb5f2580fbafab2d8 |