extract orthologous maps (short: orthomap) from OrthoFinder output for query species

oggmap

orthologous maps - evolutionary age index

oggmap is a Python package to extract orthologous maps (short: orthomap; in other words, the evolutionary age of each orthologous group) from OrthoFinder or eggNOG results. The resulting gene ages per orthologous group can then be used to calculate and visualize weighted expression data (the transcriptome evolutionary index, TEI) from scRNA sequencing objects.

Documentation

Online documentation can be found here.

Installing `oggmap`

More installation options can be found here.

oggmap installation using conda and pip

We recommend installing oggmap in a separate conda environment to avoid dependency conflicts. Create a new Python environment for oggmap and install its dependencies there.

If you do not have a working installation of Python 3.8 (or later), consider installing Anaconda or Miniconda.

To create and activate the environment run:

$ git clone https://github.com/kullrich/oggmap.git
$ cd oggmap
$ conda env create --file environment.yml
$ conda activate oggmap_env

Then to install oggmap via PyPI:

$ pip install oggmap

Quick usage

Detailed tutorials on how to use oggmap can be found here.

Update/download local NCBI taxonomy database:

The following command downloads or updates your local copy of the NCBI taxonomy database (~300MB). The database is saved at ~/.etetoolkit/taxa.sqlite.

>>> from oggmap import ncbitax
>>> ncbitax.update_ncbi()
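
Because the download is fairly large, it can be handy to check whether the database file is already present before updating. The following is a minimal sketch using only the standard library; the helper names `taxonomy_db_path` and `taxonomy_db_exists` are hypothetical and not part of oggmap, and the path is the one stated above.

```python
from pathlib import Path


def taxonomy_db_path(home=None):
    """Return the expected location of the local NCBI taxonomy database.

    `home` defaults to the current user's home directory; it is a parameter
    only so the helper can be pointed at another directory (hypothetical).
    """
    base = Path(home) if home is not None else Path.home()
    return base / ".etetoolkit" / "taxa.sqlite"


def taxonomy_db_exists(home=None):
    """Return True if the database file is already present on disk."""
    return taxonomy_db_path(home).is_file()
```

If `taxonomy_db_exists()` returns False, running `ncbitax.update_ncbi()` as shown above should create the file.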

Step 1 - Get query species taxonomic lineage information:

You can query the lineage information of a species by its name or its taxID. For example, Danio rerio with taxID 7955:

>>> from oggmap import qlin
>>> qlin.get_qlin(q = 'Danio rerio')
>>> qlin.get_qlin(qt = '7955')

You can also get the lineage topology of the query species as a tree. For example, for Danio rerio with taxID 7955:

>>> from oggmap import qlin
>>> query_topology = qlin.get_lineage_topo(qt = '7955')
>>> query_topology.write()

Step 2 - Get query species orthomap from OrthoFinder results:

The following code extracts the orthomap for Danio rerio based on pre-calculated OrthoFinder results and Ensembl release-105:

OrthoFinder results (-S diamond_ultra_sens) using translated, longest-isoform coding sequences from Ensembl release-105 have been archived and can be found here.

>>> from oggmap import datasets, of2orthomap
>>> datasets.ensembl105(datapath='.')
>>> query_orthomap = of2orthomap.get_orthomap(
...     seqname='Danio_rerio.GRCz11.cds.longest',
...     qt='7955',
...     sl='ensembl_105_orthofinder_species_list.tsv',
...     oc='ensembl_105_orthofinder_Orthogroups.GeneCount.tsv',
...     og='ensembl_105_orthofinder_Orthogroups.tsv',
...     out=None, quiet=False, continuity=True, overwrite=True)
>>> query_orthomap
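
To make the notion of an orthogroup's evolutionary age concrete, here is a minimal conceptual sketch in plain Python. It assumes the age is the oldest node on the query lineage that is shared with any other species in the orthogroup. This illustrates the idea only and is not oggmap's implementation; the function name `orthogroup_age` and the shortened toy lineages are hypothetical.

```python
def orthogroup_age(query_lineage, other_lineages):
    """Return (index, name) of the oldest query-lineage node shared with
    any other species in the orthogroup.

    query_lineage: node names ordered from root to the query species.
    other_lineages: lineages (same ordering) of the other species that
        have members in the orthogroup.
    """
    # Default: no other species, so the group is query-specific (youngest).
    oldest = len(query_lineage) - 1
    for lineage in other_lineages:
        shared = 0
        # Walk down from the root while both lineages agree.
        for a, b in zip(query_lineage, lineage):
            if a != b:
                break
            shared += 1
        # The last shared node is the common ancestor with this species.
        lca_index = shared - 1
        if 0 <= lca_index < oldest:
            oldest = lca_index
    return oldest, query_lineage[oldest]


# Toy, heavily shortened lineages (assumption, for illustration only).
query = ["cellular organisms", "Eukaryota", "Metazoa", "Vertebrata", "Danio rerio"]
others = [
    ["cellular organisms", "Eukaryota", "Metazoa", "Vertebrata", "Homo sapiens"],
    ["cellular organisms", "Eukaryota", "Fungi", "Saccharomyces cerevisiae"],
]
age = orthogroup_age(query, others)  # (1, "Eukaryota")
```

A smaller index means an older orthogroup: the group above is shared with a fungus, so its age is set at the Eukaryota node.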

Step 3 - Map OrthoFinder gene names and scRNA gene/transcript names:

The following code extracts the gene to transcript table for Danio rerio:

GTF file obtained from here.

>>> from oggmap import datasets, gtf2t2g
>>> gtf_file = datasets.zebrafish_gtf(datapath='.')
>>> query_species_t2g = gtf2t2g.parse_gtf(
...     gtf=gtf_file,
...     g=True, b=True, p=True, v=True, s=True, q=True)
>>> query_species_t2g
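
For readers unfamiliar with GTF files, the sketch below shows in plain Python how a transcript-to-gene mapping can be pulled out of the ninth (attribute) column. It is a simplified illustration, not oggmap's parser; the helper names and the identifiers in the example line are made up.

```python
import re

# GTF attributes look like: key "value"; key "value";
_ATTRIBUTE = re.compile(r'(\S+) "([^"]*)"')


def parse_gtf_attributes(attribute_field):
    """Turn the GTF attribute column into a dict of key -> value."""
    return dict(_ATTRIBUTE.findall(attribute_field))


def transcript_to_gene(gtf_lines):
    """Map transcript_id -> gene_id using only 'transcript' feature rows."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9 or columns[2] != "transcript":
            continue
        attributes = parse_gtf_attributes(columns[8])
        if "transcript_id" in attributes and "gene_id" in attributes:
            mapping[attributes["transcript_id"]] = attributes["gene_id"]
    return mapping


# Made-up example rows (identifiers are placeholders).
example = [
    "#!genome-build example",
    'chr1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "G1";',
    'chr1\tsrc\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
]
t2g = transcript_to_gene(example)  # {"T1": "G1"}
```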

Now import the scRNA dataset of the query species.

Example: Danio rerio - http://tome.gs.washington.edu (Qiu et al. 2022)

AnnData file can be found here.

>>> import scanpy as sc
>>> from oggmap import datasets, orthomap2tei
>>> # download zebrafish scRNA data here: https://doi.org/10.5281/zenodo.7243602
>>> # or download with datasets.qiu22_zebrafish(datapath='.')
>>> zebrafish_data = datasets.qiu22_zebrafish(datapath='.')
>>> zebrafish_data
>>> # check overlap of transcript table <gene_id> and scRNA data <var_names>
>>> orthomap2tei.geneset_overlap(zebrafish_data.var_names, query_species_t2g['gene_id'])
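
The idea behind such an overlap check can be sketched with plain Python sets. This is an assumption about what is being compared, not a description of what `geneset_overlap` returns; the function name `overlap_summary` and its output keys are hypothetical.

```python
def overlap_summary(names_a, names_b):
    """Count unique names in each collection and how many are shared."""
    a, b = set(names_a), set(names_b)
    shared = a & b
    return {
        "n_a": len(a),
        "n_b": len(b),
        "n_shared": len(shared),
        # Fractions are 0.0 for empty inputs to avoid division by zero.
        "frac_a": len(shared) / len(a) if a else 0.0,
        "frac_b": len(shared) / len(b) if b else 0.0,
    }


summary = overlap_summary(["g1", "g2", "g3", "g4"], ["g3", "g4", "g5"])
```

A low shared fraction usually means the two tables use different identifier types (for example versioned versus unversioned IDs), which is what the mapping step that follows addresses.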

The replace_by helper function can be used to add a new column to the orthomap dataframe by matching e.g. gene isoform names and their corresponding gene names.

>>> # convert orthomap transcript IDs into GeneIDs and add them to orthomap
>>> query_orthomap['geneID'] = orthomap2tei.replace_by(
...    x_orig = query_orthomap['seqID'],
...    xmatch = query_species_t2g['transcript_id_version'],
...    xreplace = query_species_t2g['gene_id'])
>>> # check overlap of orthomap <geneID> and scRNA data
>>> orthomap2tei.geneset_overlap(zebrafish_data.var_names, query_orthomap['geneID'])
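
Conceptually, this kind of replacement is a lookup from one identifier to another. The sketch below shows the idea with a plain dictionary; it is not oggmap's `replace_by`, and the handling of unmatched identifiers (the `missing` parameter) is an assumption made for the illustration.

```python
def replace_by_lookup(x_orig, xmatch, xreplace, missing=None):
    """Replace each value in x_orig by its partner from xreplace.

    xmatch and xreplace are parallel sequences: xmatch[i] maps to
    xreplace[i]. Values without a match become `missing` (assumption).
    """
    lookup = dict(zip(xmatch, xreplace))
    return [lookup.get(value, missing) for value in x_orig]


# Placeholder identifiers, for illustration only.
gene_ids = replace_by_lookup(
    x_orig=["T1.1", "T2.1", "T9.1"],
    xmatch=["T1.1", "T2.1", "T3.1"],
    xreplace=["G1", "G1", "G2"],
)  # ["G1", "G1", None]
```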

Step 4 - Get transcriptome evolutionary index (TEI) values and add them to scRNA dataset:

Now that the gene names in the orthomap and in the scRNA adata object correspond to each other, one can calculate the transcriptome evolutionary index (TEI) and add the values to the scRNA dataset (adata object).

>>> # add TEI values to existing adata object
>>> orthomap2tei.get_tei(adata=zebrafish_data,
...    gene_id=query_orthomap['geneID'],
...    gene_age=query_orthomap['PSnum'],
...    keep='min',
...    layer=None,
...    add=True,
...    obs_name='tei',
...    boot=False,
...    bt=10,
...    normalize_total=False,
...    log1p=False,
...    target_sum=1e6)
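
As background, the TEI of a cell is commonly defined as the expression-weighted mean of gene ages: the sum of age times expression over all genes, divided by the sum of expression. The sketch below computes this for a single cell in plain Python. It is a minimal illustration under that assumption and not oggmap's implementation; the helper names are hypothetical, and reading `keep='min'` as "keep the smallest age when a gene ID occurs more than once" is likewise an assumption.

```python
import math


def collapse_gene_ages(gene_ids, gene_ages, keep="min"):
    """Build gene -> age, resolving duplicated gene IDs by min or max age."""
    pick = min if keep == "min" else max
    ages = {}
    for gene, age in zip(gene_ids, gene_ages):
        ages[gene] = pick(ages[gene], age) if gene in ages else age
    return ages


def tei_for_cell(expression, ages):
    """Expression-weighted mean age for one cell.

    expression: dict gene -> expression value in this cell.
    ages: dict gene -> numeric age (e.g. a phylostratum number).
    Genes without an age are ignored; returns NaN if nothing is expressed.
    """
    weighted_sum = 0.0
    total = 0.0
    for gene, value in expression.items():
        if gene not in ages:
            continue
        weighted_sum += ages[gene] * value
        total += value
    return weighted_sum / total if total > 0 else math.nan


ages = collapse_gene_ages(["g1", "g1", "g2"], [3, 1, 5])  # {"g1": 1, "g2": 5}
tei = tei_for_cell({"g1": 10.0, "g2": 30.0}, ages)  # (1*10 + 5*30) / 40 = 4.0
```

With this definition, a higher TEI means the cell's transcriptome is dominated by genes with larger age numbers.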

Step 5 - Downstream analysis

Once the gene age data has been added to the scRNA dataset, one can e.g. plot the corresponding transcriptome evolutionary index (TEI) values by any given observation pre-defined in the scRNA dataset.

Violin plot (with inner boxplot) of TEI per stage:

>>> sc.pl.violin(adata=zebrafish_data,
...              keys=['tei'],
...              groupby='stage',
...              rotation=90,
...              palette='Paired',
...              stripplot=False,
...              inner='box')
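
Alongside the plot, a numeric summary per group can be useful. The following sketch groups values by a label and averages them using only the standard library; the function name `mean_by_group` is hypothetical and the numbers are toy data.

```python
from collections import defaultdict
from statistics import mean


def mean_by_group(values, groups):
    """Return group label -> mean of the values carrying that label."""
    buckets = defaultdict(list)
    for value, group in zip(values, groups):
        buckets[group].append(value)
    return {group: mean(members) for group, members in buckets.items()}


# Toy TEI values and stage labels (illustrative only).
per_stage = mean_by_group([1.0, 3.0, 5.0], ["a", "a", "b"])
```

Assuming the TEI values were stored as an observation column named 'tei' (as suggested by `obs_name='tei'` and `add=True` above), the inputs would presumably be `zebrafish_data.obs['tei']` and `zebrafish_data.obs['stage']`.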

oggmap via Command Line

oggmap can also be used via the command line.

Command line documentation can be found here.

$ oggmap

usage: oggmap <sub-command>

oggmap

optional arguments:
  -h, --help            show this help message and exit

sub-commands:
  {cds2aa,gtf2t2g,ncbitax,of2orthomap,plaza2orthomap,qlin}
                        sub-commands help
    cds2aa              translate CDS to AA and optional retain longest
                        isoform <cds2aa -h>
    gtf2t2g             extracts transcript to gene table from GTF <gtf2t2g
                        -h>
    ncbitax             update local ncbi taxonomy database <ncbitax -h>
    of2orthomap         extract orthomap from OrthoFinder output for query
                        species <of2orthomap -h>
    plaza2orthomap      extract orthomap from PLAZA gene family data for query
                        species <plaza2orthomap -h>
    qlin                get query lineage based on ncbi taxonomy <qlin -h>

To retrieve e.g. the lineage information for Danio rerio run the following command:

$ oggmap qlin -q "Danio rerio"
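
The same command can be driven from a Python script, for example inside a pipeline. The sketch below assumes the `oggmap` executable is on the PATH and uses only the `qlin -q` invocation shown above; the helper names are hypothetical, and nothing is assumed about the format of the command's output.

```python
import shutil
import subprocess


def qlin_command(species):
    """Build the argument list for `oggmap qlin -q <species>`."""
    return ["oggmap", "qlin", "-q", species]


def run_qlin(species):
    """Run the command and return its standard output as text.

    Returns None if the oggmap executable cannot be found on the PATH.
    """
    if shutil.which("oggmap") is None:
        return None
    result = subprocess.run(
        qlin_command(species), capture_output=True, text=True, check=False
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting issues with species names that contain spaces.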

Development Version

To work with the latest version on GitHub, clone the repository and cd into its root directory.

$ git clone https://github.com/kullrich/oggmap.git
$ cd oggmap

Install oggmap into your current python environment:

$ pip install -e .

Testing `oggmap`

oggmap has an extensive test suite that runs each time a new contribution is made to the repository. To run the test suite locally, run:

$ pytest tests

Contributing Code

If you would like to contribute to oggmap, please file an issue so that one can establish a statement of need, avoid redundant work, and track progress on your contribution.

Before you do a pull request, you should always file an issue and make sure that someone from the oggmap developer team agrees that it's a problem, and is happy with your basic proposal for fixing it.

Once an issue has been filed and we've identified how to best orient your contribution with package development as a whole, fork the main repo, branch off a feature branch from master, commit and push your changes to your fork and submit a pull request for oggmap:master.

By contributing to this project, you agree to abide by the Code of Conduct terms.

Bug reports

Please post problems or questions on the GitHub repository issue tracker. Please also check the closed issues, which may already answer your question.

Inquiry for collaboration or discussion

Please e-mail us if you would like to discuss a collaboration or have other inquiries.

Principal code developer: Kristian Ullrich

E-mail address can be found here.

Code of Conduct - Participation guidelines

This repository adheres to the Contributor Covenant code of conduct in all interactions within this project (see Code of Conduct).

See also the policy against sexualized discrimination, harassment and violence for the Max Planck Society Code-of-Conduct.

By contributing to this project, you agree to abide by its terms.

References

see references here

This version

0.0.1

Oct 30, 2023

Download files

Source Distribution

oggmap-0.0.1.tar.gz (58.9 kB view details)

Uploaded Oct 30, 2023 Source

Built Distribution

oggmap-0.0.1-py3-none-any.whl (61.4 kB view details)

Uploaded Oct 30, 2023 Python 3

File details

Details for the file oggmap-0.0.1.tar.gz.

File metadata

Download URL: oggmap-0.0.1.tar.gz
Upload date: Oct 30, 2023
Size: 58.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.8.18

File hashes

Hashes for oggmap-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`48b3eadb8e36a67b95ffa73880ba3425a830b0419ac03ed1cebe883fa7173c78`
MD5	`83b264490473b88649546cc1ba79cdf5`
BLAKE2b-256	`3d18215bc62445380b490f2c79d41722e0fa6782c46941c1caaca071d582177f`
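
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do this; the helper names are hypothetical, and the in-memory demo at the end only shows the call and is not a real download.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary file-like object in chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def file_matches_sha256(path, expected_hex):
    """Return True if the file at `path` has the expected SHA256 digest."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle) == expected_hex.lower()


# SHA256 of oggmap-0.0.1.tar.gz as listed in the table above.
EXPECTED_SDIST_SHA256 = (
    "48b3eadb8e36a67b95ffa73880ba3425a830b0419ac03ed1cebe883fa7173c78"
)

# In-memory demo of the hashing helper (not a real download).
demo_digest = sha256_of_stream(io.BytesIO(b"oggmap"))
```

After downloading the source distribution, `file_matches_sha256("oggmap-0.0.1.tar.gz", EXPECTED_SDIST_SHA256)` should return True for an intact file.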

See more details on using hashes here.

File details

Details for the file oggmap-0.0.1-py3-none-any.whl.

File metadata

Download URL: oggmap-0.0.1-py3-none-any.whl
Upload date: Oct 30, 2023
Size: 61.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.8.18

File hashes

Hashes for oggmap-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`64f5050bd618edc7755778255bbf7ea693dc74ed4831b8284e784a1f25a3bc69`
MD5	`c33bf8e0fabe561bb424379a33f995d0`
BLAKE2b-256	`4aa7aee781b48f2691054487c0205084251c3bb7f39f80dea57b1ef313925829`

See more details on using hashes here.
