metagenompy

Your all-inclusive package for aggregating and visualizing metagenomic BLAST results.

Installation

$ pip install metagenompy

Usage

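NCBI taxonomy as a NetworkX object

The core of metagenompy is the NCBI taxonomy represented as a NetworkX graph. Node identifiers are NCBI taxonomy IDs (as strings), and each node carries attributes such as its rank and scientific name. Because it is a regular NetworkX object, the standard NetworkX graph algorithms work on it right out of the box.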

import metagenompy
import networkx as nx


# load taxonomy
graph = metagenompy.generate_taxonomy_network(auto_download=True)

# print path from human to pineapple
for node in nx.shortest_path(graph.to_undirected(as_view=True), '9606', '4615'):
    print(node, graph.nodes[node])
## 9606 {'rank': 'species', 'authority': 'Homo sapiens Linnaeus, 1758', 'scientific_name': 'Homo sapiens', 'genbank_common_name': 'human', 'common_name': 'man'}
## 9605 {'rank': 'genus', 'authority': 'Homo Linnaeus, 1758', 'scientific_name': 'Homo', 'common_name': 'humans'}
## [..]
## 4614 {'rank': 'genus', 'authority': 'Ananas Mill., 1754', 'scientific_name': 'Ananas'}
## 4615 {'rank': 'species', 'authority': ['Ananas comosus (L.) Merr., 1917', 'Ananas lucidus Mill., 1754'], 'scientific_name': 'Ananas comosus', 'synonym': ['Ananas comosus var. comosus', 'Ananas lucidus'], 'genbank_common_name': 'pineapple'}
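
The same kind of query can be tried without downloading the full taxonomy. The following is a minimal sketch on a hand-built toy graph that uses only NetworkX. The node names, the heavily simplified lineage, and the parent-to-child edge direction are assumptions made for illustration and are not taken from metagenompy.

```python
import networkx as nx

# Toy taxonomy. Edges point from parent to child, which is an
# assumption of this sketch and not a statement about metagenompy.
toy = nx.DiGraph()
toy.add_edges_from([
    ("root", "Eukaryota"),
    ("Eukaryota", "Mammalia"),
    ("Eukaryota", "Viridiplantae"),
    ("Mammalia", "Homo sapiens"),
    ("Mammalia", "Felis catus"),
    ("Viridiplantae", "Ananas comosus"),
])

# Path between two taxa, ignoring edge direction (as in the example above).
path = nx.shortest_path(
    toy.to_undirected(as_view=True), "Homo sapiens", "Ananas comosus"
)

# Lowest common ancestor of two taxa in the directed tree.
lca = nx.lowest_common_ancestor(toy, "Homo sapiens", "Felis catus")

# Full lineage of one taxon, from the root downwards.
lineage = nx.shortest_path(toy, "root", "Homo sapiens")

print(path)
print(lca)
print(lineage)
```

On a tree the undirected path between two taxa is unique, so the shortest path always runs up to their common ancestor and back down.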

Easy transformation and visualization of the taxonomic tree

Extract taxonomic entities of interest and visualize their relations:

import metagenompy
import matplotlib.pyplot as plt


# load and condense taxonomy to relevant ranks
graph = metagenompy.generate_taxonomy_network(auto_download=True)
metagenompy.condense_taxonomy(graph)

# highlight interesting nodes
graph_zoom = metagenompy.highlight_nodes(graph, [
    '9606',  # human
    '9685',  # cat
    '9615',  # dog
    '4615',  # pineapple
    '3747',  # strawberry
    '4113',  # potato
])

# visualize result
fig, ax = plt.subplots(figsize=(10, 10))
metagenompy.plot_network(graph_zoom, ax=ax, labels_kws=dict(font_size=10))
fig.tight_layout()
fig.savefig('taxonomy.pdf')
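
To make the two transformation steps more concrete, here is a rough sketch of what condensing a taxonomy to selected ranks and zooming in on taxa of interest could look like in plain NetworkX. The helper names `condense` and `zoom`, the toy lineage, and the parent-to-child edge direction are all hypothetical. This is not metagenompy's implementation, and unlike the call above it returns new graphs instead of modifying the input.

```python
import networkx as nx

# Heavily simplified toy lineage; edges point from parent to child (assumption).
ranks = {
    "Eukaryota": "superkingdom",
    "Opisthokonta": "clade",
    "Mammalia": "class",
    "Primates": "order",
    "Homo": "genus",
    "Homo sapiens": "species",
    "Carnivora": "order",
    "Felis": "genus",
    "Felis catus": "species",
}
tax = nx.DiGraph()
for name, rank in ranks.items():
    tax.add_node(name, rank=rank)
tax.add_edges_from([
    ("Eukaryota", "Opisthokonta"),
    ("Opisthokonta", "Mammalia"),
    ("Mammalia", "Primates"),
    ("Primates", "Homo"),
    ("Homo", "Homo sapiens"),
    ("Mammalia", "Carnivora"),
    ("Carnivora", "Felis"),
    ("Felis", "Felis catus"),
])


def condense(graph, keep_ranks):
    """Return a copy in which nodes of other ranks are bypassed and removed."""
    g = graph.copy()
    for node in list(g.nodes):
        if g.nodes[node].get("rank") in keep_ranks:
            continue
        parents = list(g.predecessors(node))
        children = list(g.successors(node))
        # Reconnect each parent directly to each child before dropping the node.
        g.add_edges_from((p, c) for p in parents for c in children)
        g.remove_node(node)
    return g


def zoom(graph, targets):
    """Return the subgraph spanned by the targets and all of their ancestors."""
    keep = set(targets)
    for target in targets:
        keep |= nx.ancestors(graph, target)
    return graph.subgraph(keep).copy()


condensed = condense(tax, {"superkingdom", "class", "genus", "species"})
zoomed = zoom(condensed, ["Homo sapiens"])

print(sorted(condensed.edges))
print(sorted(zoomed.nodes))
```

Bypassing a removed node keeps every remaining taxon attached to its nearest retained ancestor, so the condensed graph is still a tree.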

Summary statistics for BLAST results

After blasting your reads against a sequence database, generating summary reports using metagenompy is a blast.

import metagenompy
import pandas as pd


# load example BLAST results (a table with columns 'qseqid' and 'staxids')
# and reshape them to one row per (read, taxid) pair
df_blast = metagenompy.load_example_dataset()
df = (df_blast.set_index('qseqid')['staxids']
              .str.split(';')
              .explode()
              .dropna()
              .reset_index()
              .rename(columns={'staxids': 'taxid'})
)

df.head()
##   qseqid    taxid
## 0  read1  1811693
## 1  read2   327160
## 2  read3      821
## 3  read4  1871047
## 4  read5    69360

# classify taxa at multiple ranks
graph = metagenompy.generate_taxonomy_network(auto_download=True)

rank_list = ['species', 'genus', 'class', 'superkingdom']
df = metagenompy.classify_dataframe(
    graph, df,
    rank_list=rank_list
)

# aggregate read matches
agg_rank = 'genus'
df_agg = metagenompy.aggregate_classifications(df, agg_rank)

df_agg.head()
##            taxid                        species           genus                class superkingdom
## qseqid
## read1    1811693  Pelotomaculum sp. PtaB.Bin104   Pelotomaculum           Clostridia     Bacteria
## read10   2488860         Erythrobacter spongiae   Erythrobacter  Alphaproteobacteria     Bacteria
## read100    78398      Pectobacterium odoriferum  Pectobacterium  Gammaproteobacteria     Bacteria
## read101  1843082           Macromonas sp. BK-30      Macromonas   Betaproteobacteria     Bacteria
## read102  2665644      Paracoccus sp. YIM 132242      Paracoccus  Alphaproteobacteria     Bacteria

# visualize outcome
metagenompy.plot_piechart(df_agg)
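
The pipeline above can be mimicked end to end on toy data with only pandas and NetworkX. The sketch below splits multi-taxid hits into one row per pair, classifies each taxid by walking up a small hand-built taxonomy, and then picks one genus per read. The taxids, the names, the helper `name_at_rank`, and the "most frequent genus wins" rule are all assumptions made for illustration; the rule that metagenompy.aggregate_classifications actually applies may differ.

```python
import networkx as nx
import pandas as pd

# Toy BLAST table: one row per read, several taxids separated by ';'.
df_blast = pd.DataFrame({
    "qseqid": ["read1", "read2", "read3", "read4"],
    "staxids": ["101;102", "201", None, "102"],
})

# One row per (read, taxid) pair; reads without any hit are dropped.
df = (
    df_blast.set_index("qseqid")["staxids"]
    .str.split(";")
    .explode()
    .dropna()
    .reset_index()
    .rename(columns={"staxids": "taxid"})
)

# Hypothetical mini taxonomy; edges point from parent to child (assumption).
tax = nx.DiGraph()
tax.add_node("1", rank="superkingdom", scientific_name="Bacteria")
tax.add_node("10", rank="genus", scientific_name="GenusA")
tax.add_node("20", rank="genus", scientific_name="GenusB")
tax.add_node("101", rank="species", scientific_name="GenusA alpha")
tax.add_node("102", rank="species", scientific_name="GenusA beta")
tax.add_node("201", rank="species", scientific_name="GenusB gamma")
tax.add_edges_from([
    ("1", "10"), ("1", "20"), ("10", "101"), ("10", "102"), ("20", "201"),
])


def name_at_rank(graph, taxid, rank):
    """Walk from taxid towards the root until a node of the given rank is found."""
    node = taxid
    while True:
        if graph.nodes[node].get("rank") == rank:
            return graph.nodes[node]["scientific_name"]
        parents = list(graph.predecessors(node))
        if not parents:
            return None  # reached the root without finding the rank
        node = parents[0]


for rank in ["species", "genus", "superkingdom"]:
    df[rank] = [name_at_rank(tax, taxid, rank) for taxid in df["taxid"]]

# One genus per read: the most frequent one among its hits
# (ties resolve to the alphabetically first name).
genus_per_read = df.groupby("qseqid")["genus"].agg(lambda s: s.mode().iloc[0])

# Number of reads per genus, i.e. the kind of counts a pie chart would show.
genus_counts = genus_per_read.value_counts()

print(df)
print(genus_counts)
```

Aggregating at a higher rank such as genus means that a read hitting several species of the same genus still yields a single unambiguous call.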
