Internal BFSSI package for assembling prokaryotic genomes from short reads
ProkaryoteAssembly
Two simple scripts to assemble prokaryotic genomes using paired-end reads.
Pipeline Overview
- QC on reads with bbduk.sh (adapter trimming/quality filtering)
- Error-correction of reads with tadpole.sh
- Assembly of reads with skesa
- Alignment of error-corrected reads against draft assembly with bbmap.sh
- Polishing of assembly with pilon
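The data flow between these steps can be sketched as an ordered list of stages, each consuming what an earlier stage produced. This is only an illustration: the artifact labels (`raw_reads`, `draft_assembly`, and so on) are hypothetical names, not files the package writes, and the assumption that skesa assembles the error-corrected reads is inferred from the step order rather than stated by the package.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Stage(NamedTuple):
    tool: str                 # executable named in the overview above
    consumes: FrozenSet[str]  # hypothetical artifact labels read by the stage
    produces: FrozenSet[str]  # hypothetical artifact labels written by the stage


# Order follows the Pipeline Overview; the labels are illustrative only.
PIPELINE: List[Stage] = [
    Stage("bbduk.sh", frozenset({"raw_reads"}), frozenset({"filtered_reads"})),
    Stage("tadpole.sh", frozenset({"filtered_reads"}), frozenset({"corrected_reads"})),
    # Assumption: the assembler is fed the error-corrected reads.
    Stage("skesa", frozenset({"corrected_reads"}), frozenset({"draft_assembly"})),
    Stage("bbmap.sh", frozenset({"corrected_reads", "draft_assembly"}), frozenset({"alignment"})),
    Stage("pilon", frozenset({"draft_assembly", "alignment"}), frozenset({"polished_assembly"})),
]


def check_order(stages: Iterable[Stage], initial: Iterable[str]) -> Set[str]:
    """Walk the stages in order and fail if one needs an artifact not yet produced."""
    available = set(initial)
    for stage in stages:
        missing = stage.consumes - available
        if missing:
            raise ValueError(f"{stage.tool} is missing inputs: {sorted(missing)}")
        available |= stage.produces
    return available
```

Calling `check_order(PIPELINE, {"raw_reads"})` returns the set of all artifacts, including `polished_assembly`; reordering the stages so that pilon runs before bbmap.sh raises `ValueError`.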
Installation
pip install ProkaryoteAssembly
Usage
The first script, prokaryote_assemble.py, operates on a single sample at a time.
Usage: prokaryote_assemble.py [OPTIONS]
Options:
  -1, --fwd_reads PATH  Path to forward reads (R1) (gzipped FASTQ).
                        [required]
  -2, --rev_reads PATH  Path to reverse reads (R2) (gzipped FASTQ).
                        [required]
  -o, --out_dir PATH    Root directory to store all output files. [required]
  -m, --memory TEXT     Amount of memory to allocate to job. e.g. "8g".
                        Defaults to 8g.
  --cleanup             Specify this flag to delete all intermediary files
                        except the resulting FASTA assembly.
  --version             Specify this flag to print the version and exit.
  --help                Show this message and exit.
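When the script is driven from another Python program, the options above can be assembled into an argument list. The helper below is a hypothetical convenience wrapper, not part of the package; it uses only the flags listed in the help text, and the file paths in the usage note are placeholders.

```python
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def build_assemble_command(
    fwd_reads: PathLike,
    rev_reads: PathLike,
    out_dir: PathLike,
    memory: str = "8g",
    cleanup: bool = False,
) -> List[str]:
    """Build an argv list for prokaryote_assemble.py from its documented options."""
    cmd = [
        "prokaryote_assemble.py",
        "-1", str(fwd_reads),
        "-2", str(rev_reads),
        "-o", str(out_dir),
        "-m", memory,
    ]
    if cleanup:
        # Deletes intermediary files, keeping only the FASTA assembly.
        cmd.append("--cleanup")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, for example with `build_assemble_command("sample_R1.fastq.gz", "sample_R2.fastq.gz", "results", memory="16g")`.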
The second script, prokaryote_assemble_dir.py, will detect all *.fastq.gz files in a directory and run the assembly pipeline on each sample it can pair.
Usage: prokaryote_assemble_dir.py [OPTIONS]
Options:
  -i, --input_dir PATH  Directory containing all *.fastq.gz files to
                        assemble. NOTE: Files must be gzipped in order to be
                        detected. [required]
  -o, --out_dir PATH    Root directory to store all output files. [required]
  -f, --fwd_id TEXT     Pattern to detect forward reads. Defaults to "_R1".
  -r, --rev_id TEXT     Pattern to detect reverse reads. Defaults to "_R2".
  -m, --memory TEXT     Memory to allocate to pilon call. Defaults to 8g (i.e.
                        pilon -Xmx8g). May need to provide a large amount of
                        memory for large read sets/assemblies.
  --cleanup             Specify this flag to delete all intermediary files
                        except the resulting FASTA assembly.
  --version             Specify this flag to print the version and exit.
  --help                Show this message and exit.
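To check in advance which samples are likely to be paired, the `--fwd_id` and `--rev_id` patterns can be applied to a list of file names. The function below is a minimal sketch of one plausible pairing rule (the text before the pattern is treated as the sample name); it is an assumption for illustration and may differ from what prokaryote_assemble_dir.py does internally.

```python
from typing import Dict, Iterable, Tuple


def pair_reads(
    filenames: Iterable[str],
    fwd_id: str = "_R1",
    rev_id: str = "_R2",
) -> Dict[str, Tuple[str, str]]:
    """Group *.fastq.gz names into (forward, reverse) pairs keyed by sample name."""
    fwd: Dict[str, str] = {}
    rev: Dict[str, str] = {}
    for name in filenames:
        if not name.endswith(".fastq.gz"):
            continue  # only gzipped FASTQ files are considered
        if fwd_id in name:
            # Assumed rule: everything before the pattern is the sample name.
            fwd[name.split(fwd_id)[0]] = name
        elif rev_id in name:
            rev[name.split(rev_id)[0]] = name
    # Keep only samples that have both a forward and a reverse file.
    return {s: (fwd[s], rev[s]) for s in sorted(fwd) if s in rev}
```

A sample with only one read file is dropped from the result, which mirrors the statement that the pipeline runs on each sample it can pair.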
Python (3.6) Dependencies
- click
External Dependencies
NOTE: All external dependencies must be available via PATH.
Versions confirmed to work are in brackets.
- skesa (SKESA v.2.1-SVN_551987:557549M)
- BBMap (BBMap version 38.22)
- samtools (samtools 1.8 using htslib 1.8)
- pilon (Pilon version 1.22)
Note: Installing pilon via conda is strongly recommended; see https://bioconda.github.io/recipes/pilon/README.html
conda install pilon
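Because every external tool has to be reachable via PATH, it can help to verify that before starting a long run. The snippet below is a small sketch using the standard library's `shutil.which`; the list of executable names is taken from the tools named in this README, and the function itself is hypothetical rather than something the package provides.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables named in this README; adjust if your installation uses other names.
REQUIRED_TOOLS = ("bbduk.sh", "tadpole.sh", "bbmap.sh", "skesa", "samtools", "pilon")


def find_missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be located on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]
```

An empty list from `find_missing_tools()` means all six executables were found; otherwise the returned names indicate what still needs to be installed or added to PATH.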