Internal BFSSI package for assembling prokaryotic genomes from short reads

ProkaryoteAssembly

Two simple scripts for assembling prokaryotic genomes from paired-end short reads.

Pipeline Overview

  1. QC on reads with bbduk.sh (adapter trimming/quality filtering)
  2. Error-correction of reads with tadpole.sh
  3. Assembly of reads with skesa
  4. Alignment of error-corrected reads against draft assembly with bbmap.sh
  5. Polishing of assembly with pilon
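
The five stages above can be sketched as a list of command lines. This is only an illustration of how the tools chain together, not the package's actual invocations: the function name, the intermediate file names, the adapter file path and every tool parameter shown are assumptions, and the samtools sort/index step is inferred from samtools being listed as a dependency. Option names can differ between tool releases, so check each tool's help output before running anything.

```python
from pathlib import Path
from typing import List


def build_pipeline_commands(fwd: str, rev: str, out_dir: str,
                            adapters: str = "adapters.fa") -> List[List[str]]:
    """Return one argv list per stage. Nothing is executed here.

    All file names and tool parameters are illustrative assumptions;
    `adapters` is a hypothetical path to an adapter FASTA file.
    """
    out = Path(out_dir)
    qc1, qc2 = out / "qc_R1.fastq.gz", out / "qc_R2.fastq.gz"
    ec1, ec2 = out / "ec_R1.fastq.gz", out / "ec_R2.fastq.gz"
    draft = out / "draft.fasta"
    bam = out / "aligned.bam"
    sorted_bam = out / "aligned.sorted.bam"

    return [
        # 1. QC: adapter trimming and quality filtering
        ["bbduk.sh", f"in={fwd}", f"in2={rev}", f"out={qc1}", f"out2={qc2}",
         f"ref={adapters}", "ktrim=r", "qtrim=rl", "trimq=10"],
        # 2. Error correction of the filtered reads
        ["tadpole.sh", f"in={qc1}", f"in2={qc2}", f"out={ec1}", f"out2={ec2}",
         "mode=correct"],
        # 3. Assembly (option names vary across SKESA releases)
        ["skesa", "--fastq", f"{ec1},{ec2}", "--contigs_out", str(draft)],
        # 4. Alignment of corrected reads against the draft assembly
        ["bbmap.sh", f"in={ec1}", f"in2={ec2}", f"ref={draft}", f"out={bam}",
         "nodisk"],
        # Assumed helper step: pilon expects a sorted, indexed BAM
        ["samtools", "sort", "-o", str(sorted_bam), str(bam)],
        ["samtools", "index", str(sorted_bam)],
        # 5. Polishing of the draft assembly
        ["pilon", "--genome", str(draft), "--frags", str(sorted_bam),
         "--outdir", str(out), "--output", "polished"],
    ]
```

Each returned list could be passed to subprocess.run(cmd, check=True) in order; building the commands separately from running them makes it easy to print and inspect them first.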

Installation

pip install ProkaryoteAssembly

Usage

The first script, prokaryote_assemble.py, operates on a single sample at a time.

Usage: prokaryote_assemble.py [OPTIONS]

Options:
  -1, --fwd_reads PATH  Path to forward reads (R1).  [required]
  -2, --rev_reads PATH  Path to reverse reads (R2).  [required]
  -o, --out_dir PATH    Root directory to store all output files.  [required]
  --cleanup             Specify this flag to remove everything except the final assembly upon completion.
  --version             Specify this flag to print the version and exit.
  --help                Show this message and exit.
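
To call the single-sample script from another Python program, the options listed above map directly onto an argument list. A minimal sketch, assuming prokaryote_assemble.py is on PATH after installation; the helper name and the example file paths are hypothetical.

```python
import subprocess
from typing import List


def single_sample_argv(fwd_reads: str, rev_reads: str, out_dir: str,
                       cleanup: bool = False) -> List[str]:
    """Build the argument list for prokaryote_assemble.py (hypothetical helper)."""
    argv = ["prokaryote_assemble.py",
            "-1", fwd_reads,   # forward reads (R1)
            "-2", rev_reads,   # reverse reads (R2)
            "-o", out_dir]     # root directory for all output files
    if cleanup:
        # Keep only the final assembly once the run completes
        argv.append("--cleanup")
    return argv


def run_single_sample(fwd_reads: str, rev_reads: str, out_dir: str,
                      cleanup: bool = False) -> None:
    """Run the script and raise CalledProcessError on a non-zero exit."""
    subprocess.run(single_sample_argv(fwd_reads, rev_reads, out_dir, cleanup),
                   check=True)
```

For example, run_single_sample("sample_R1.fastq.gz", "sample_R2.fastq.gz", "results", cleanup=True) would assemble one sample and keep only the final assembly.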

The second script, prokaryote_assemble_dir.py, will detect all *.fastq.gz files in a directory and run the assembly pipeline on each sample it can pair.

Usage: prokaryote_assemble_dir.py [OPTIONS]

Options:
  -i, --input_dir PATH  Directory containing all *.fastq.gz files to assemble.
                        [required]
  -o, --out_dir PATH    Root directory to store all output files.  [required]
  -f, --fwd_id TEXT     Pattern to detect forward reads. Defaults to "_R1".
  -r, --rev_id TEXT     Pattern to detect reverse reads. Defaults to "_R2".
  --cleanup             Specify this flag to remove everything except the final assembly upon completion.
  --help                Show this message and exit.
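
The pairing step can be pictured roughly as follows. This is a sketch of one plausible way to match forward and reverse files using the --fwd_id and --rev_id patterns, not necessarily how prokaryote_assemble_dir.py does it; the function name and the rule for deriving a sample ID (everything before the forward pattern) are assumptions.

```python
from pathlib import Path
from typing import Dict, Tuple


def pair_reads(input_dir: str, fwd_id: str = "_R1",
               rev_id: str = "_R2") -> Dict[str, Tuple[Path, Path]]:
    """Map sample ID -> (forward, reverse) for every pairable *.fastq.gz file."""
    pairs: Dict[str, Tuple[Path, Path]] = {}
    for fwd in sorted(Path(input_dir).glob("*.fastq.gz")):
        if fwd_id not in fwd.name:
            continue  # not a forward-read file
        # Assumed convention: the sample ID is the text before the forward pattern
        sample_id = fwd.name.split(fwd_id)[0]
        # Expected mate: same name with the first forward pattern swapped out
        rev = fwd.with_name(fwd.name.replace(fwd_id, rev_id, 1))
        if rev.exists():
            pairs[sample_id] = (fwd, rev)
        # Forward files without a matching reverse file are skipped
    return pairs
```

With this rule, a directory holding sampleA_R1.fastq.gz and sampleA_R2.fastq.gz yields a single entry keyed "sampleA", while a lone sampleB_R1.fastq.gz is ignored.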

Python (3.6) Dependencies

  • click

External Dependencies

NOTE: All external dependencies must be available via PATH.

Versions confirmed to work are in brackets.

  • skesa (SKESA v.2.1-SVN_551987:557549M)
  • BBMap (BBMap version 38.22)
  • samtools (samtools 1.8 using htslib 1.8)
  • pilon (Pilon version 1.22)
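
Because every external tool has to be reachable via PATH, it can be worth checking up front before starting a long run. A minimal sketch using only the standard library; the executable names in the tuple are assumptions based on the tools named in this README, so adjust them to match your installation.

```python
import shutil
from typing import Iterable, List

# Assumed executable names for the external dependencies
REQUIRED_TOOLS = ("bbduk.sh", "tadpole.sh", "bbmap.sh", "skesa", "samtools", "pilon")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external dependencies were found on PATH.")
```

This only confirms that the executables exist; it does not check that their versions match the ones listed above.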

Note: Installing pilon via conda is strongly recommended, e.g. using the bioconda recipe at https://bioconda.github.io/recipes/pilon/README.html:

conda install pilon

Source distribution: ProkaryoteAssembly-0.1.2.tar.gz (5.6 kB)