Strain Prediction and Analysis using Representative SEquences (SPARSE)

SPARSE indexes more than 100,000 reference genomes from public databases into hierarchical clusters and uses them to predict the origins of metagenomic reads.

Installation

SPARSE runs on Unix and requires Python 2.7 (Python 3.x support is under development).

System modules (Ubuntu 16.04):

pip
gfortran
llvm
libncurses5-dev
cmake
xvfb-run (for malt, optional)

3rd-party software:

samtools (>=1.2)
mash (>=1.1.1)
bowtie2 (>=2.3.2)
malt (>=0.4.0) (optional)

See requirements.txt for python module dependencies.
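
Before installing, it can help to confirm that the third-party programs listed above are on the PATH. The sketch below is a hypothetical helper (check_tools is not part of SPARSE). It only tests for presence and does not check the minimum versions.

```shell
# Report which of the given executables are missing from PATH.
# Prints one "missing: NAME" line per absent tool and returns
# non-zero if at least one tool could not be found.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (tool names taken from the list above; malt is optional):
# check_tools samtools mash bowtie2
```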

Installation via PIP [Suggested]

pip install meta-sparse

Installation from source code (Ubuntu)

sudo apt-get update
sudo apt-get install gfortran llvm libncurses5-dev cmake python-pip samtools bowtie2
git clone https://github.com/zheminzhou/SPARSE
cd SPARSE/EM && make
pip install -r requirements.txt

Updating SPARSE

You can update to the latest version using pip:

pip install --upgrade meta-sparse

If you installed SPARSE from GitHub, move to the installation directory and pull the latest version:

cd SPARSE
git pull

Quick Start

See http://sparse.readthedocs.io/en/latest/ for full documentation.

Download reference database

We provide a pre-compiled database based on RefSeq (dated 19.05.2018) for download at http://enterobase.warwick.ac.uk/sparse/refseq_20180519.tar.gz . It can be downloaded and unpacked by running:

 curl -o refseq_20180519.tar.gz http://enterobase.warwick.ac.uk/sparse/refseq_20180519.tar.gz
 tar -vxzf refseq_20180519.tar.gz

This pre-compiled database is about 350 GB and contains four default mapping databases, any of which can be specified in the next step: representative, subpopulation, Virus, Eukaryota.
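
Given the size of the database, checking free disk space before downloading may save a failed extraction. This is a minimal sketch, assuming a POSIX df. has_free_space is a hypothetical helper, and the threshold you pass is your own estimate: the archive and the unpacked files coexist for a while, so some headroom beyond the database size is likely needed.

```shell
# Succeed if the filesystem holding DIR has at least NEED_GB gigabytes free.
# Arguments: directory to check, required free space in GB (integer).
has_free_space() {
    dir=$1
    need_gb=$2
    # With -P the output is one header line plus one data line;
    # column 4 is the available space in 1024-byte blocks (-k).
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    need_kb=$((need_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$need_kb" ]
}

# Usage (400 is an arbitrary illustrative threshold, not a documented figure):
# has_free_space . 400 || echo "not enough free space for the database"
```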

To update the database or build a custom database, please refer to the full documentation.

Predict read origins

The following command maps and evaluates all reads in both FASTQ files against the specified mapping databases.

sparse predict --dbname refseq_20180519 --mapDB representative,subpopulation,Virus,Eukaryota --r1 read1.fq.gz --r2 read2.fq.gz --workspace <workspace_name>

For single-end reads, only --r1 needs to be specified. All output files are stored in the respective workspace.
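
The difference between paired-end and single-end invocations can be sketched as a small helper that assembles the command line. predict_cmd is a hypothetical name, not part of SPARSE. It uses only the options shown above, prints the command instead of running it, and assumes file names without spaces.

```shell
# Print (not run) a `sparse predict` command for single- or paired-end input.
# Arguments: database name, workspace name, first read file,
# and optionally a second read file for paired-end data.
predict_cmd() {
    db=$1
    workspace=$2
    r1=$3
    r2=${4:-}
    cmd="sparse predict --dbname $db --mapDB representative,subpopulation,Virus,Eukaryota --r1 $r1"
    # --r2 is added only when a second read file is given.
    if [ -n "$r2" ]; then
        cmd="$cmd --r2 $r2"
    fi
    echo "$cmd --workspace $workspace"
}

# Usage:
# predict_cmd refseq_20180519 my_workspace read1.fq.gz read2.fq.gz
# predict_cmd refseq_20180519 my_workspace read1.fq.gz
```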

Create a report

sparse report <workspace_name>

The report will be stored in <workspace_name>/profile.txt

Extract reference-specific reads

The following command extracts all reads specific to the provided reference IDs, which can be found in the output of step 2.

sparse extract --dbname refseq_20180519 --workspace <workspace_name> --ref_id <comma delimited indices>
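
Since --ref_id expects a single comma-delimited value, a list of IDs collected separately has to be joined first. This is a minimal sketch with a hypothetical helper, join_ids. The IDs in the usage line are placeholders, not real reference indices.

```shell
# Join all arguments into one comma-delimited string.
# The IFS change happens in a subshell, so the caller's IFS is untouched.
join_ids() {
    ( IFS=,; echo "$*" )
}

# Usage (placeholder IDs):
# sparse extract --dbname refseq_20180519 --workspace my_workspace --ref_id "$(join_ids 12 34 56)"
```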

Citation

SPARSE was published in the conference proceedings "Research in Computational Molecular Biology".

Zhemin Zhou, Nina Luhmann, Nabil-Fareed Alikhan, Christopher Quince, Mark Achtman, 'Accurate Reconstruction of Microbial Strains from Metagenomic Sequencing Using Representative Reference Genomes', RECOMB 2018: Research in Computational Molecular Biology, pp 225-240. doi: https://doi.org/10.1007/978-3-319-89929-9_15

A preprint of the manuscript is also available on bioRxiv: https://doi.org/10.1101/215707

Project details

Release history

0.1.12 (Dec 21, 2018)
0.1.11 (Jul 18, 2018)
0.1.10 (Jul 18, 2018)
0.1.9 (Jul 13, 2018)
0.1.8 (Jul 6, 2018)
0.1.7 (Jun 27, 2018)
0.1.6 (Jun 26, 2018)
0.1.5 (May 27, 2018)
0.1.3 (May 23, 2018)
0.1.2 (Apr 18, 2018)
0.1.1 (Apr 17, 2018)
0.1.0 (Apr 16, 2018)

Download files

Source Distribution

meta-sparse-0.1.12.tar.gz (27.5 MB), uploaded Dec 21, 2018

File details

Details for the file meta-sparse-0.1.12.tar.gz.

File metadata

Download URL: meta-sparse-0.1.12.tar.gz
Upload date: Dec 21, 2018
Size: 27.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.18.4 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.23.0 CPython/2.7.9

File hashes

Hashes for meta-sparse-0.1.12.tar.gz
Algorithm	Hash digest
SHA256	`6d5275c499a3b14ab4cd477f9ec4ba0744a38764a81342ef87c6c0199fb83580`
MD5	`ebd8687ae8ad11543f4ba3609b613447`
BLAKE2b-256	`ffd62d8f4caac92bc100859cf0725649fbbbb6822cfe346944b595665bbf4c71`
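
The SHA256 digest listed above can be compared against a downloaded copy of the source distribution before installing from it. This is a minimal sketch, assuming GNU coreutils sha256sum is available (as on Ubuntu). The function name verify_sha256 is illustrative.

```shell
# Compare a file's SHA256 digest with an expected hex string.
# Returns 0 on a match and non-zero otherwise.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    [ "$actual" = "$expected" ]
}

# Usage:
# verify_sha256 meta-sparse-0.1.12.tar.gz 6d5275c499a3b14ab4cd477f9ec4ba0744a38764a81342ef87c6c0199fb83580
```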
