Graphing bacterial genomes for fun and profit
prairiedog
Supports Python 3.6, Python 3.7, PyPy 3.6 on Linux
Installation
We recommend following both installation steps (graph creation and graph querying), unless you compute the graph on one machine and query it on another.
Both steps require you to first install lemongraph.
Clone prairiedog and install lemongraph
git clone --recursive https://github.com/superphy/prairiedog.git
cd prairiedog/
python3 -m venv venv
. venv/bin/activate
cd lemongraph/
apt-get install libffi-dev zlib1g-dev python-dev python-cffi
python setup.py install
For creating a graph
. venv/bin/activate
pip install -r requirements.txt
pip install snakemake
For querying an existing graph
. venv/bin/activate
python setup.py install
Usage
Docker
docker run -v /abs-path-to/outputs/:/p/outputs/ -v /abs-path-to/samples/:/p/samples/ superphy/prairiedog dgraph
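If you launch the container from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of prairiedog: the helper name `docker_command` is hypothetical, and only the image name, mount points, and `dgraph` argument come from the example above. Docker's `-v` bind mounts need absolute host paths, so the helper resolves both directories first.

```python
from pathlib import Path


def docker_command(outputs, samples, *args, image="superphy/prairiedog"):
    """Build the `docker run` argument list shown in the example above.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    # Bind mounts need absolute host paths, so resolve relative ones.
    outputs = Path(outputs).resolve()
    samples = Path(samples).resolve()
    return [
        "docker", "run",
        "-v", f"{outputs}:/p/outputs/",
        "-v", f"{samples}:/p/samples/",
        image,
        *args,
    ]


cmd = docker_command("outputs", "samples", "dgraph")
print(" ".join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string avoids shell quoting problems when a path contains spaces.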
For creating a graph
. venv/bin/activate
snakemake -j 24 --config samples=samples/
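This README does not describe the graph's internal layout. As rough intuition only, the sketch below builds a toy k-mer graph under two assumptions: nodes are 11-mers (the k-mer length used in this README's query examples), and each edge joins two consecutive k-mers and remembers which contig it came from. The function names and sequences are made up for illustration, and this is not prairiedog's implementation.

```python
from collections import defaultdict

K = 11  # assumed k-mer length, matching the 11-character k-mers in this README


def kmers(sequence, k=K):
    """Yield every overlapping k-mer of a sequence, left to right."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def build_graph(contigs, k=K):
    """Build a toy k-mer graph.

    `contigs` maps a hypothetical label such as "a.fasta|contig_1" to a
    nucleotide sequence. graph[src][dst] is the set of contig labels in
    which k-mer `dst` directly follows k-mer `src`.
    """
    graph = defaultdict(lambda: defaultdict(set))
    for label, sequence in contigs.items():
        previous = None
        for kmer in kmers(sequence, k):
            if previous is not None:
                graph[previous][kmer].add(label)
            previous = kmer
    return graph


# Made-up sequences that differ only in their last base.
contigs = {
    "a.fasta|contig_1": "ATACGACGCCATTG",
    "b.fasta|contig_1": "ATACGACGCCATTC",
}
graph = build_graph(contigs)

# An edge shared by both contigs carries both labels.
print(sorted(graph["ATACGACGCCA"]["TACGACGCCAT"]))
# Where the contigs diverge, the node has two outgoing edges.
print(sorted(graph["ACGACGCCATT"]))
```

Labelling edges with their source is what lets a shared path be reported once per genome.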
For querying an existing graph
Via docker
# Without debug
docker run -v /home/kevin/pdg-test/outputs:/p/outputs/ -v /home/kevin/pdg-test/samples:/p/samples/ superphy/prairiedog:c6ff5c63779a73de02c9b3de0f4225b29564f285 query TCGAGCATTAT GCATAGGCAAC
# With debug
docker run -v /home/kevin/pdg-test/outputs:/p/outputs/ -v /home/kevin/pdg-test/samples:/p/samples/ superphy/prairiedog:c6ff5c63779a73de02c9b3de0f4225b29564f285 --debug query TCGAGCATTAT GCATAGGCAAC
or virtualenv
. venv/bin/activate
prairiedog ATACGACGCCA CGTCCGGACGT
You should get something like:
prairiedog GGGCGTTAAGT GGCAGGTTGAA
prairiedog[21238] INFO Looking for all strings between GGGCGTTAAGT and GGCAGGTTGAA ...
prairiedog[21238] INFO Found {'string': 'GGGCGTTAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAA', 'edge_type': 'SRR3295769.fasta', 'edge_value': '>SRR3295769.fasta|NODE_75_length_556_cov_349.837_ID_5290_pilon'}
prairiedog[21238] INFO Found {'string': 'GGGCGTTAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAA', 'edge_type': 'SRR3665189.fasta', 'edge_value': '>SRR3665189.fasta|NODE_60_length_523_cov_287.621_ID_4672'}
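To post-process these results, the log lines can be turned into dictionaries. The sketch below is not part of prairiedog and `parse_hits` is a hypothetical helper. It assumes every hit is logged as `INFO Found {...}` with a Python dict literal, as in the sample output above; if the log format differs in your version, the pattern will need adjusting. Capturing the log text itself (for example by redirecting the command's output to a file) is left to you.

```python
import ast
import re

# Assumption: each hit appears as "... INFO Found {<Python dict literal>}".
FOUND = re.compile(r"INFO Found (\{.*\})\s*$")


def parse_hits(log_text):
    """Return one dict per 'Found' line in the captured log text."""
    hits = []
    for line in log_text.splitlines():
        match = FOUND.search(line)
        if match:
            # literal_eval parses the dict without executing arbitrary code.
            hits.append(ast.literal_eval(match.group(1)))
    return hits


# Two lines copied from the sample output above.
sample = (
    "prairiedog[21238] INFO Looking for all strings between "
    "GGGCGTTAAGT and GGCAGGTTGAA ...\n"
    "prairiedog[21238] INFO Found {'string': "
    "'GGGCGTTAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAA', "
    "'edge_type': 'SRR3295769.fasta', 'edge_value': "
    "'>SRR3295769.fasta|NODE_75_length_556_cov_349.837_ID_5290_pilon'}\n"
)

for hit in parse_hits(sample):
    print(hit["edge_type"], hit["edge_value"])
```

In the sample, `edge_type` holds the source FASTA file name and `edge_value` holds the contig header, so grouping hits by `edge_type` shows which genomes contain the queried path.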
Tests & Benchmarks
Test genomes are included in the tests/ folder; genomes for benchmarking should be placed in the samples/ folder. To run the tests and benchmarks:
python3 -m venv venv
. venv/bin/activate
pip install tox
tox -v
History
0.2.0 (2019-07-28)
Pangenome graph creation via Dgraph
Queries between kmers via Dgraph
0.1.2 (2019-06-21)
Supports Pangenome graph creation
Uses LemonGraph as backend
Supports queries between any two kmers
0.1.1 (2019-05-25)
Initial Snakefile for creating graphs
Still need to add node_labels
0.1.0 (2019-05-08)
First release on PyPI.
Hashes for prairiedog-0.2.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 8eed8039e702d6995419c3cbf3a989b220befb75cf33be62ea1e786334b5ac54
MD5 | 27ada24278323056ba1d4bead18f02ea
BLAKE2b-256 | 1ef287776083c03cf648a43e21acef9cf4c61ee9aecb27697544cb4444e4a56e