OMAMO: orthology-based model organism selection

workflow diagram

OMAMO is a tool that suggests the best model organism for studying a biological process, based on orthologous relationships between a candidate species and human.

The user can specify several species as potential model organisms. The algorithm ranks them for a given biological process (searched by GO term or GO ID) and reports the result as a data frame.

Dependencies

The following Python packages are needed: numpy, matplotlib and pandas (pickle is also used, but it is part of the Python standard library). In addition, you need to install pyOMA.

Pipeline

Firstly, download the OMA dataset:

wget https://omabrowser.org/All/OmaServer.h5 -O data/OmaServer.h5  # caution: 94 GB

Secondly, use the file data/oma-species.txt to find the five-letter UniProt code of each species of interest. For example, consider the three species Dictyostelium discoideum, Neurospora crassa and Schizosaccharomyces pombe. Their UniProt codes are DICDI, NEUCR and SCHPO, respectively.
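The lookup can also be scripted. The following is a minimal sketch, not part of omamo: it assumes that data/oma-species.txt is tab-separated, that the first column holds the five-letter code, that a later column starts with the scientific name, and that lines beginning with # are comments. The function name find_species_codes and the SAMPLE rows are hypothetical stand-ins for illustration. Check the layout of your copy of the file before relying on it.

```python
def find_species_codes(lines, names):
    """Return {scientific name: five-letter code} for the requested names.

    Assumed layout (verify against your file): tab-separated columns,
    code in the first column, scientific name in a later column,
    comment lines starting with '#'.
    """
    found = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        code, rest = fields[0], fields[1:]
        for name in names:
            if name in found:
                continue  # keep the first match only
            # Prefix match, so entries with a strain suffix still match.
            if any(f.lower().startswith(name.lower()) for f in rest):
                found[name] = code
    return found


# Simplified stand-in for the real file, for illustration only.
SAMPLE = [
    "# code\tscientific name\n",
    "DICDI\tDictyostelium discoideum\n",
    "NEUCR\tNeurospora crassa\n",
    "SCHPO\tSchizosaccharomyces pombe\n",
]

codes = find_species_codes(SAMPLE, ["Neurospora crassa", "Schizosaccharomyces pombe"])
print(codes)
```

With the real file, pass an open file handle instead of SAMPLE, for example open("data/oma-species.txt").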

Install omamo from the git checkout:

pip install <path_to_omamo.git>

Once the package is installed, you should be able to run omamo as a command. Run omamo -h to see the available options:

usage: omamo [-h] --db DB [--query QUERY] [--ic IC] [--h5-out H5_OUT] [--tsv-out TSV_OUT] --models MODELS [MODELS ...]

Run omamo for a set of model organisms

optional arguments:
  -h, --help            show this help message and exit
  --db DB               Path to the HDF5 database
  --query QUERY         Name of the Query species, defaults to HUMAN
  --ic IC               Path to the information content file (tsv format)
  --h5-out H5_OUT       Path to the HDF5 output file. If omitted, not stored in this format
  --tsv-out TSV_OUT     Path to the TSV output file. If omitted, not stored in this format
  --models MODELS [MODELS ...]
                        List of model species, or a path to a txt file with the model species

To create the omamo data for Dictyostelium discoideum, Neurospora crassa and Schizosaccharomyces pombe, run omamo with the following parameters:

omamo --db OmaServer.h5 --query HUMAN --tsv-out omamo_output_df.csv --models  DICDI NEUCR SCHPO
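If the run is driven from a Python script, the same command line can be assembled programmatically. This is a small sketch under stated assumptions: the helper build_omamo_command is hypothetical and not part of omamo, and it uses only the flags listed in the usage text above.

```python
import shlex


def build_omamo_command(db, models, query="HUMAN", tsv_out=None, h5_out=None, ic=None):
    """Assemble an omamo argument list from the documented flags.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if not models:
        raise ValueError("at least one model species is required")
    cmd = ["omamo", "--db", db, "--query", query]
    if ic:
        cmd += ["--ic", ic]
    if h5_out:
        cmd += ["--h5-out", h5_out]
    if tsv_out:
        cmd += ["--tsv-out", tsv_out]
    # --models takes one or more values, so it goes last.
    cmd += ["--models", *models]
    return cmd


cmd = build_omamo_command(
    "OmaServer.h5",
    ["DICDI", "NEUCR", "SCHPO"],
    tsv_out="omamo_output_df.csv",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to execute omamo once the package is installed.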

You might see errors such as OSError: ``OmaServer.h5.idx`` does not exist and pyoma.browser.db.DBConsistencyError: Suffix index for protein sequences is not available. Both can be ignored.

Finally, the output data frame is written to the TSV file omamo_output_df.csv. For example, for the GO ID GO:0000472, "endonucleolytic cleavage to generate mature 5'-end of SSU-rRNA", OMAMO provides the following ranking for potential model organisms:

head -n 1 omamo_output_df.csv > ranked_organisms.csv
awk '$1 == 472'  omamo_output_df.csv >> ranked_organisms.csv
cat ranked_organisms.csv

GOnr  Species  QuerySpeciesGenes  ModelSpeciesGenes   NrOrthologs  FuncSim_Mean  FuncSim_Std  Score
472   DICDI    NOP9;TBL3;ABT1     Q551Y5;Q7KWS8;esf2  3            0.9095        0.1567       2.7286
472   NEUCR    NOP9;TBL3          nop9;pod-5          2            1.0000        0.0000       2.0000
472   SCHPO    NOP9;TBL3          nop9;utp13          2            1.0000        0.0000       2.0000
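The awk filter above can also be written in Python. The sketch below assumes that the output file is tab-separated and uses the column names shown in the example (GOnr, Species, Score, and so on). The function rank_models is a hypothetical helper, and sorting by Score in descending order is an illustrative choice, not a documented behavior of omamo. The sample rows are copied from the example output so that the block runs without the real file.

```python
import csv
import io

# Rows copied from the example output above, joined with tabs.
SAMPLE_TSV = (
    "GOnr\tSpecies\tQuerySpeciesGenes\tModelSpeciesGenes\tNrOrthologs\tFuncSim_Mean\tFuncSim_Std\tScore\n"
    "472\tDICDI\tNOP9;TBL3;ABT1\tQ551Y5;Q7KWS8;esf2\t3\t0.9095\t0.1567\t2.7286\n"
    "472\tNEUCR\tNOP9;TBL3\tnop9;pod-5\t2\t1.0000\t0.0000\t2.0000\n"
    "472\tSCHPO\tNOP9;TBL3\tnop9;utp13\t2\t1.0000\t0.0000\t2.0000\n"
)


def rank_models(handle, go_nr):
    """Return the rows for one GO number, highest Score first."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = [row for row in reader if int(row["GOnr"]) == go_nr]
    # sorted() is stable, so rows with equal scores keep their file order.
    return sorted(rows, key=lambda row: float(row["Score"]), reverse=True)


ranked = rank_models(io.StringIO(SAMPLE_TSV), 472)
for row in ranked:
    print(row["Species"], row["NrOrthologs"], row["Score"])
```

To use it on a real run, replace io.StringIO(SAMPLE_TSV) with an open handle on omamo_output_df.csv.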

OMAMO Website

You can also visit the OMAMO website, where you can browse biological processes to study in 50 unicellular species.

Change log

Version 0.2.1

  • Store IC values in the HDF5 database

Version 0.2.0

  • Overhaul and creation of the pip package

Version 0.0.1

  • Initial release

Citation

Alina Nicheperovich, Adrian M Altenhoff, Christophe Dessimoz, Sina Majidian, "OMAMO: orthology-based model organism selection", submitted to Bioinformatics journal, preprint.

License

OMAMO is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

OMAMO is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with OMAMO. If not, see http://www.gnu.org/licenses/.
