Graphing bacterial genomes for fun and profit

prairiedog

Supports Python 3.6, Python 3.7, and PyPy 3.6 on Linux.

Installation

We recommend following both installation steps below (one for creating a graph, one for querying it), unless you build the graph on one machine and query it on another.

Both steps require you to first install lemongraph.

Clone prairiedog and install lemongraph

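# Clone prairiedog together with its lemongraph submodule
git clone --recursive https://github.com/superphy/prairiedog.git
cd prairiedog/
# Create and activate a virtual environment in the repository root
python3 -m venv venv
. venv/bin/activate
# Build and install lemongraph into the virtual environment
cd lemongraph/
# System packages; run as root or with sudo
apt-get install libffi-dev zlib1g-dev python-dev python-cffi
python setup.py install
# Return to the prairiedog/ root for the steps below
cd ..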

For creating a graph

. venv/bin/activate
pip install -r requirements.txt
pip install snakemake

For querying an existing graph

. venv/bin/activate
python setup.py install

Usage

Docker

docker run -v /abs-path-to/outputs/:/p/outputs/ -v /abs-path-to/samples/:/p/samples/ superphy/prairiedog dgraph

For creating a graph

. venv/bin/activate
snakemake -j 24 --config samples=samples/
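
To give a rough idea of what a k-mer pangenome graph looks like, here is a minimal Python sketch. It is not prairiedog's implementation: the function names, the in-memory dictionary, and the toy genomes are all hypothetical, and k = 11 is assumed only because the k-mers used in the query examples are 11 characters long. Each edge joins two consecutive k-mers and is labelled with the file and contig it came from, mirroring the edge_type and edge_value fields that appear in the query output.

```python
from collections import defaultdict

K = 11  # assumed k-mer length, matching the 11-character k-mers in the examples


def kmers(sequence, k=K):
    """Yield every overlapping k-mer of a sequence, left to right."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def build_graph(genomes, k=K):
    """Build a toy pangenome graph (illustration only).

    genomes maps a file name to {contig header: sequence}.
    Returns {source k-mer: set of (target k-mer, file name, contig header)}.
    """
    graph = defaultdict(set)
    for file_name, contigs in genomes.items():
        for header, sequence in contigs.items():
            ks = list(kmers(sequence, k))
            # Consecutive k-mers in a contig become a labelled edge
            for src, dst in zip(ks, ks[1:]):
                graph[src].add((dst, file_name, header))
    return graph


# Hypothetical toy genomes that share a prefix and differ in the last base
genomes = {
    "a.fasta": {">contig_1": "ATACGACGCCAGT"},
    "b.fasta": {">contig_9": "ATACGACGCCAGA"},
}
graph = build_graph(genomes)
```

In this toy graph the shared k-mer TACGACGCCAG has two outgoing edges, one per genome, which is where the two genomes diverge.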

For querying an existing graph

Via Docker

# Without debug
docker run -v /home/kevin/pdg-test/outputs:/p/outputs/ -v /home/kevin/pdg-test/samples:/p/samples/ superphy/prairiedog:c6ff5c63779a73de02c9b3de0f4225b29564f285 query TCGAGCATTAT GCATAGGCAAC
# With debug
docker run -v /home/kevin/pdg-test/outputs:/p/outputs/ -v /home/kevin/pdg-test/samples:/p/samples/ superphy/prairiedog:c6ff5c63779a73de02c9b3de0f4225b29564f285 --debug query TCGAGCATTAT GCATAGGCAAC

Via virtualenv

. venv/bin/activate
prairiedog ATACGACGCCA CGTCCGGACGT

You should get something like:

prairiedog GGGCGTTAAGT GGCAGGTTGAA
prairiedog[21238] INFO Looking for all strings between GGGCGTTAAGT and GGCAGGTTGAA ...
prairiedog[21238] INFO Found {'string': 'GGGCGTTAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAA', 'edge_type': 'SRR3295769.fasta', 'edge_value': '>SRR3295769.fasta|NODE_75_length_556_cov_349.837_ID_5290_pilon'}
prairiedog[21238] INFO Found {'string': 'GGGCGTTAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAA', 'edge_type': 'SRR3665189.fasta', 'edge_value': '>SRR3665189.fasta|NODE_60_length_523_cov_287.621_ID_4672'}
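
If you want to post-process these results, the "Found" lines can be parsed back into dictionaries. The following is a small sketch that assumes the log format is exactly as shown above (it may differ between versions); parse_found is a hypothetical helper, not part of prairiedog.

```python
import ast
import re

# Matches the dictionary printed after "INFO Found" (format assumed from the sample output)
FOUND = re.compile(r"INFO Found (\{.*\})\s*$")


def parse_found(lines):
    """Return the dictionaries printed on 'Found' log lines, in order."""
    hits = []
    for line in lines:
        match = FOUND.search(line)
        if match:
            # literal_eval safely evaluates the printed Python dict literal
            hits.append(ast.literal_eval(match.group(1)))
    return hits


log = [
    "prairiedog[21238] INFO Looking for all strings between GGGCGTTAAGT and GGCAGGTTGAA ...",
    "prairiedog[21238] INFO Found {'string': 'GGGCGTTAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAA', 'edge_type': 'SRR3295769.fasta', 'edge_value': '>SRR3295769.fasta|NODE_75_length_556_cov_349.837_ID_5290_pilon'}",
    "prairiedog[21238] INFO Found {'string': 'GGGCGTTAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAA', 'edge_type': 'SRR3665189.fasta', 'edge_value': '>SRR3665189.fasta|NODE_60_length_523_cov_287.621_ID_4672'}",
]
hits = parse_found(log)
for hit in hits:
    print(hit["edge_type"], len(hit["string"]))
```

Each hit carries the matched string, the genome file it was found in (edge_type), and the contig header (edge_value). In the sample, the string starts with the first query k-mer and ends with the second.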

Tests & Benchmarks

Test genomes are included in the tests/ folder; genomes for benchmarking should be placed in the samples/ folder. To run the tests and benchmarks:

python3 -m venv venv
. venv/bin/activate
pip install tox
tox -v

History

0.2.0 (2019-07-28)

  • Pangenome graph creation via Dgraph
  • Queries between kmers via Dgraph

0.1.2 (2019-06-21)

  • Supports pangenome graph creation
  • Uses LemonGraph as backend
  • Supports queries between any two kmers

0.1.1 (2019-05-25)

  • Initial Snakefile for creating graphs
  • Still need to add node_labels

0.1.0 (2019-05-08)

  • First release on PyPI.
