MIGHT: MRSN Integrated Genome Handling Tool for bacterial clinical isolates

Introduction
Installation
- Conda Installation
Usage

Introduction

MIGHT was developed as a way to automate many of the standard bioinformatics tasks that the MRSN performs as part of its surveillance mission.

Brief summary of the workflow:

- Run bcl2fastq to demultiplex Illumina paired-end read data from MiSeq/NextSeq runs
- Run Kraken2 to get species ID and identify possible sample contamination
- Preprocess reads using bbduk for short read data and/or filtlong for long read data
- Run the Unicycler assembler (with or without long read data)
- Run QUAST to gather assembly statistics
- Run Andale, a hybrid read/assembly AMR gene identification tool
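
The workflow above can be lined up against the command-line flags documented in the Usage section. The sketch below is only an illustration: the stage names are hypothetical labels, and the pairing of each stage with a flag is inferred from the help text rather than from MIGHT's source. In particular, the help text mentions only bbduk under `--assembly`, and it lists no dedicated flag for QUAST.

```python
# Workflow stages from the summary, paired with the flag that the help text
# documents for each one. Stage names are hypothetical labels for this sketch.
# None means the help text lists no dedicated flag for that stage.
WORKFLOW = [
    ("demultiplex", "bcl2fastq", "--bcl2fastq"),      # AllMight.py only
    ("species_id", "Kraken2", "--kraken2"),
    ("preprocess", "bbduk/filtlong", "--assembly"),   # help text names bbduk only
    ("assemble", "Unicycler", "--assembly"),
    ("assembly_stats", "QUAST", None),
    ("amr", "Andale", "--amr"),
]


def flags_for(stages):
    """Return the distinct flags for the named stages, in workflow order."""
    wanted = set(stages)
    flags = []
    for stage, _tool, flag in WORKFLOW:
        if stage in wanted and flag is not None and flag not in flags:
            flags.append(flag)
    return flags


if __name__ == "__main__":
    print(flags_for(["preprocess", "assemble", "amr"]))
```

Because preprocessing and assembly share one flag in the help text, asking for both yields `--assembly` only once.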

Installation

MIGHT is designed to be installed and run using conda.

Conda Installation

Usage

MIGHT can be run either on a single isolate using Might.py or on all of the samples of an Illumina run using AllMight.py. From an input perspective, the primary difference is that Might.py assumes you are processing a single sample, for which you provide 1) the sample name and 2) the location(s) of the relevant input files. Conversely, AllMight.py takes a user-provided SampleSheet.csv to determine which samples should be included in the run. It then runs the specified analyses on each sample as parallel implementations of the analysis methods found in Might.py.
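
To make the per-run behaviour concrete, here is a minimal sketch of reading sample IDs from an Illumina sample sheet and applying a per-sample function in parallel. It is not AllMight.py's implementation. It assumes the common sample sheet layout with a `[Data]` section whose header row contains a `Sample_ID` column. `analyze_sample` is a hypothetical stand-in for the work Might.py does, and the example sheet contents are made up.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def read_sample_ids(sheet_text):
    """Return the Sample_ID values listed in the [Data] section of a sample sheet."""
    lines = sheet_text.splitlines()
    # Locate the [Data] marker; the lines after it form a CSV table.
    start = next(
        (i for i, line in enumerate(lines) if line.strip().startswith("[Data]")),
        None,
    )
    if start is None:
        raise ValueError("no [Data] section found in sample sheet")
    table = io.StringIO("\n".join(lines[start + 1:]))
    return [row["Sample_ID"] for row in csv.DictReader(table) if row.get("Sample_ID")]


def analyze_sample(sample_id):
    """Hypothetical placeholder for the per-sample analysis."""
    return f"{sample_id}: done"


def analyze_run(sheet_text, cores=1):
    """Apply analyze_sample to every sample using at most `cores` workers."""
    sample_ids = read_sample_ids(sheet_text)
    with ThreadPoolExecutor(max_workers=cores) as pool:
        # map() returns results in input order, whatever the completion order.
        return list(pool.map(analyze_sample, sample_ids))


# Made-up sample sheet used only to exercise the sketch.
EXAMPLE_SHEET = """[Header]
Date,2019-12-05
[Data]
Sample_ID,Sample_Name,index
isolate_A,isolate_A,ATCACG
isolate_B,isolate_B,CGATGT
"""

if __name__ == "__main__":
    print(analyze_run(EXAMPLE_SHEET, cores=2))
```

A thread pool is used here only to keep the sketch short; a real pipeline that launches external tools might equally use processes or a job scheduler.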

For a single isolate:




          .___  ___.  __    _______  __    __  .__________.
          |   \/   | |  |  /  _____||  |  |  | |          |
          |  \  /  | |  | |  |  __  |  |__|  | `---|  |---`
          |  |\/|  | |  | |  | |_ | |   __   |     |  |     
          |  |  |  | |  | |  |__| | |  |  |  |     |  |     
          |__|  |__| |__|  \______| |__|  |__|     |__|     



usage: Might.py --output OUTPUT [--sample-name SAMPLE_NAME] [--fastq FASTQ]
              [--fasta FASTA] [--all] [--kraken2] [--assembly]
              [--amr {combination,reads,contigs,summary}] [--mlst]
              [--plasmidfinder] [--kraken2-database KRAKEN2_DATABASE]
              [--adapter-file ADAPTER_FILE] [--ramdisk RAMDISK] [--update]
              [--force] [--cores CORES] [--verbosity VERBOSITY] [-h]

MIGHT! MRSN Integrated Genome Handling Tool

Required arguments:
--output OUTPUT       path to the directory where output is/will be stored

Input arguments:
--sample-name SAMPLE_NAME
                      Name of the sample to be analyzed.
--fastq FASTQ         path to the directory containing the read files for
                      this sample [output/reads/raw_reads]
--fasta FASTA         path to the directory containing the assembly file for
                      this sample [output/assembly]

Analysis arguments:
--all                 run all analysis options
--kraken2             run Kraken2 on read files to determine species ID and
                      potentially detect contamination
--assembly            trim and filter reads using bbduk, then perform
                      assembly using Unicycler
--amr {combination,reads,contigs,summary}
                      run Andale using one of the four setting choices
--mlst                perform MLST assignments for samples using MLST
--plasmidfinder       run Plasmidfinder on contig files to identify rep gene
                      content

Resource arguments:
--kraken2-database KRAKEN2_DATABASE
                      Path to the kraken2 database. Required for kraken2
                      analysis
--adapter-file ADAPTER_FILE
                      Path to the adapter.fa file required for adapter
                      trimming of Illumina reads
--ramdisk RAMDISK     Path to the ramdisk for speeding up kraken2

Optional arguments:
--update              update AMRFinderPlus and MLST databases
--force               force overwrite of existing data/output related to
                      this sample
--cores CORES         the MAXIMUM number of CPUs to use in the analysis [1]
--verbosity VERBOSITY
                      the level of reporting done to the terminal window [1]

Help:
-h, --help            show this help message and exit
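
As a rough illustration of how the options above combine, the following sketch assembles an argument list for a single-isolate run. It uses only flags shown in the help text. The function name, the assumption that `Might.py` is on the PATH, and every path in the usage example are hypothetical.

```python
import shlex

# The four choices listed for --amr in the help text.
AMR_MODES = ("combination", "reads", "contigs", "summary")


def build_might_command(output, sample_name, fastq=None, kraken2_database=None,
                        assembly=False, adapter_file=None, amr=None, cores=1):
    """Assemble an argument list for Might.py from flags in its help text."""
    if amr is not None and amr not in AMR_MODES:
        raise ValueError(f"--amr must be one of {AMR_MODES}, got {amr!r}")
    cmd = ["Might.py", "--output", output, "--sample-name", sample_name]
    if fastq:
        cmd += ["--fastq", fastq]
    if kraken2_database:
        # The help text marks the database as required for Kraken2 analysis,
        # so this sketch emits the two flags together.
        cmd += ["--kraken2", "--kraken2-database", kraken2_database]
    if assembly:
        cmd.append("--assembly")
        if adapter_file:
            cmd += ["--adapter-file", adapter_file]
    if amr:
        cmd += ["--amr", amr]
    cmd += ["--cores", str(cores)]
    return cmd


if __name__ == "__main__":
    command = build_might_command(
        output="results/isolate_A",          # hypothetical paths throughout
        sample_name="isolate_A",
        fastq="reads/isolate_A",
        kraken2_database="db/kraken2",
        assembly=True,
        amr="combination",
        cores=8,
    )
    # Print a shell-quoted form; pass `command` to subprocess.run to execute it.
    print(shlex.join(command))
```

Building the command as a list rather than a string avoids shell-quoting problems when sample names or paths contain spaces.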

For an Illumina run:



            .___  ___.  __    _______  __    __  .__________.
            |   \/   | |  |  /  _____||  |  |  | |          |
            |  \  /  | |  | |  |  __  |  |__|  | `---|  |---`
            |  |\/|  | |  | |  | |_ | |   __   |     |  |     
            |  |  |  | |  | |  |__| | |  |  |  |     |  |     
            |__|  |__| |__|  \______| |__|  |__|     |__|     



usage: AllMight.py --output OUTPUT [--bcl2fastq]
                   [--run-directory RUN_DIRECTORY]
                   [--sample-sheet SAMPLE_SHEET] [--all] [--kraken2]
                   [--assembly] [--amr {combination,reads,contigs,summary}]
                   [--mlst] [--plasmidfinder]
                   [--kraken2-database KRAKEN2_DATABASE]
                   [--adapter-file ADAPTER_FILE] [--ramdisk RAMDISK]
                   [--update] [--force] [--cores CORES]
                   [--verbosity VERBOSITY] [-h]

MIGHT! MRSN Integrated Genome Handling Tool

Required arguments:
  --output OUTPUT       path to the directory where output is/will be stored

bcl2fastq2 arguments:
  --bcl2fastq           Run bcl2fastq2 to generate demultiplexed fastq files
                        from the bcl files
  --run-directory RUN_DIRECTORY
                        Path to the run directory to be analyzed
  --sample-sheet SAMPLE_SHEET
                        Path to the Illumina sample sheet file for the run
                        being analyzed

Analysis arguments:
  --all                 run all analysis options
  --kraken2             run Kraken2 on read files to determine species ID and
                        potentially detect contamination
  --assembly            trim and filter reads using bbduk, then perform
                        assembly using Unicycler
  --amr {combination,reads,contigs,summary}
                        run Andale using one of the four setting choices
  --mlst                perform MLST assignments for samples using MLST
  --plasmidfinder       run Plasmidfinder on contig files to identify rep gene
                        content

Resource arguments:
  --kraken2-database KRAKEN2_DATABASE
                        Path to the kraken2 database. Required for kraken2
                        analysis
  --adapter-file ADAPTER_FILE
                        Path to the adapter.fa file required for adapter
                        trimming of Illumina reads
  --ramdisk RAMDISK     Path to the ramdisk for speeding up kraken2

Optional arguments:
  --update              update AMRFinderPlus and MLST databases
  --force               force overwrite of existing data/output related to
                        this sample
  --cores CORES         the MAXIMUM number of CPUs to use in the analysis [1]
  --verbosity VERBOSITY
                        the level of reporting done to the terminal window [1]

Help:
  -h, --help            show this help message and exit
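
The grouped layout of this help output (named sections, with `-h` in its own "Help" group) is the kind of output Python's `argparse` produces when the automatic help is disabled and argument groups are declared explicitly. The sketch below reconstructs a subset of the AllMight.py interface that way. It is an assumption about how such a parser could be written, not a copy of MIGHT's code, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring part of the AllMight.py help text."""
    # add_help=False lets -h/--help be placed in a custom "Help" group.
    parser = argparse.ArgumentParser(
        prog="AllMight.py",
        description="MIGHT! MRSN Integrated Genome Handling Tool",
        add_help=False,
    )

    required = parser.add_argument_group("Required arguments")
    required.add_argument("--output", required=True,
                          help="path to the directory where output is/will be stored")

    bcl = parser.add_argument_group("bcl2fastq2 arguments")
    bcl.add_argument("--bcl2fastq", action="store_true",
                     help="Run bcl2fastq2 to generate demultiplexed fastq files")
    bcl.add_argument("--run-directory",
                     help="Path to the run directory to be analyzed")
    bcl.add_argument("--sample-sheet",
                     help="Path to the Illumina sample sheet file for the run")

    analysis = parser.add_argument_group("Analysis arguments")
    analysis.add_argument("--all", action="store_true",
                          help="run all analysis options")
    analysis.add_argument("--kraken2", action="store_true",
                          help="run Kraken2 on read files")
    analysis.add_argument("--assembly", action="store_true",
                          help="trim and filter reads, then assemble")
    analysis.add_argument("--amr",
                          choices=["combination", "reads", "contigs", "summary"],
                          help="run Andale using one of the four setting choices")

    optional = parser.add_argument_group("Optional arguments")
    optional.add_argument("--cores", type=int, default=1,
                          help="the MAXIMUM number of CPUs to use in the analysis [1]")

    help_group = parser.add_argument_group("Help")
    help_group.add_argument("-h", "--help", action="help",
                            help="show this help message and exit")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Declaring `choices` for `--amr` makes argparse reject any value outside the four listed settings before any analysis starts.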

Release history

- 1.0.5 (this version): Dec 9, 2019
- 1.0.4: Dec 6, 2019
- 1.0.3: Dec 5, 2019
- 1.0.3rc0 (pre-release): Dec 6, 2019
- 1.0.3b0 (pre-release): Dec 6, 2019
- 1.0.3a0 (pre-release): Dec 5, 2019
- 1.0.2: Dec 5, 2019
- 1.0.1: Dec 5, 2019
- 1.0.0: Dec 5, 2019

Download files

Source Distribution

mrsn-might-1.0.5.tar.gz (32.1 kB), uploaded Dec 9, 2019, Source

Built Distribution

mrsn_might-1.0.5-py3-none-any.whl (47.1 kB), uploaded Dec 9, 2019, Python 3

File details

Details for the file mrsn-might-1.0.5.tar.gz.

File metadata

Download URL: mrsn-might-1.0.5.tar.gz
Upload date: Dec 9, 2019
Size: 32.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.38.0 CPython/3.7.3

File hashes

Hashes for mrsn-might-1.0.5.tar.gz
Algorithm	Hash digest
SHA256	`18562e03c4407b84a27b03a158114332f996f6ceade06dae6a9c5aad0a03f646`
MD5	`1198409096684778e35d6f4706ddbbb8`
BLAKE2b-256	`0b075d7836d2ddd4e6e529be0cc3bd187fac362b9e3474640bae1bcfdfc3dcbd`

File details

Details for the file mrsn_might-1.0.5-py3-none-any.whl.

File metadata

Download URL: mrsn_might-1.0.5-py3-none-any.whl
Upload date: Dec 9, 2019
Size: 47.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.38.0 CPython/3.7.3

File hashes

Hashes for mrsn_might-1.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d419050369427be3b897a2f03ed9ac4d826bc22e310b3379c58487f9fa729cbc`
MD5	`08d4363cad354f11335ec84e7dd83dc5`
BLAKE2b-256	`69f797f30a84323806f0c85398ff03e8421ad8ec7782dfe13dc1759f8cae7053`
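
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard `hashlib`. The function names are hypothetical, and the local file path in the usage example is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Digest copied from the table for the wheel; the path is hypothetical.
    published = "d419050369427be3b897a2f03ed9ac4d826bc22e310b3379c58487f9fa729cbc"
    print(matches_published_hash("mrsn_might-1.0.5-py3-none-any.whl", published))
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for these small archives but is a safe default.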
