mask-plasmid: Create a quick BED file from an assembled genome

Background

When building phylogenetic trees with microbial genomic data, it is essential to get as close as possible to the clonal frame.

A common technique for identifying the clonal frame is to map reads to a reference genome, and then filter out any sites that are not present in all the samples of interest.

In general, plasmids should not be treated as part of the clonal frame. While it is theoretically possible that they belong to it, with short-read data it is hard to say whether the plasmids are the same in every sample. More importantly, it is effectively impossible to show that the plasmid(s) and the chromosome(s) were vertically inherited together from the most recent common ancestor of the samples.

Thus, it is generally recommended that plasmids be removed from the analyses.

The problem arises with short-read data, where it is often not possible to say with certainty whether a read belongs to a plasmid or to the chromosome. Sometimes plasmids are inserted into chromosomes. Sometimes reads that should map to a plasmid map erroneously to the chromosome because the plasmid was not included in the reference. Thus, when mapping reads to a reference to identify potential variants and the clonal frame, ambiguous reads (i.e., reads that could map either to the chromosome or to a plasmid) should be removed from the pool of reads used to identify variant sites.

If one removes the plasmids from the reference dataset before attempting to map the reads then it is not possible to identify ambiguous reads. Thus, a better strategy might be to keep the plasmids in the reference dataset, map all the reads, identify variable sites, and then mask the plasmid sites.

This can be achieved in Snippy 4 by passing a BED file to the --mask option.

What does this tool do?

This tool produces a BED file listing every locus in a GenBank file. The BED file can easily be edited (for example, to keep only the plasmid loci) and then passed to --mask to mask plasmids when using Snippy 4.
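To make the idea concrete, here is a minimal sketch of turning GenBank LOCUS headers into whole-record BED intervals. It is not the tool's actual implementation: the function names (loci_from_genbank, to_bed, open_genbank) are hypothetical, and it assumes that each LOCUS line has the usual whitespace-separated layout (name, then length) and that each locus should be written as one interval spanning the entire record.

```python
import gzip
from typing import Iterable, Iterator, Tuple


def loci_from_genbank(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (locus_name, length) for each LOCUS header line."""
    for line in lines:
        if line.startswith("LOCUS"):
            fields = line.split()
            # Assumed layout: LOCUS <name> <length> bp ...
            yield fields[1], int(fields[2])


def to_bed(loci: Iterable[Tuple[str, int]]) -> str:
    """Format each locus as a BED interval covering the whole record.

    BED coordinates are 0-based and half-open, so a record of
    length N is written as start 0, end N.
    """
    return "".join(f"{name}\t0\t{length}\n" for name, length in loci)


def open_genbank(path: str):
    """Open a plain or gzip-compressed GenBank file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


# Hypothetical two-record input: one chromosome and one plasmid.
example = [
    "LOCUS       chr1       5000 bp    DNA     circular BCT 01-JAN-2000\n",
    "ORIGIN\n",
    "//\n",
    "LOCUS       pExample   1200 bp    DNA     circular BCT 01-JAN-2000\n",
    "//\n",
]
bed_text = to_bed(loci_from_genbank(example))
print(bed_text, end="")
```

With a real file, the same functions could be combined as to_bed(loci_from_genbank(open_genbank("my_gb.gbk.gz"))). The sketch deliberately avoids a full GenBank parser, so it would break on LOCUS lines where a long name runs into the length field.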

Installation

pip install mask-plasmid

Running

mask-plasmid <my_gb.gbk[.gz]> > plasmids.bed
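Because the output lists every locus, chromosomes included, it needs editing before it is used as a mask. That step can be done by hand in a text editor; the sketch below shows one way to script it, assuming the plasmid locus names are known in advance. The helper keep_loci and the locus names are hypothetical and are not part of mask-plasmid.

```python
from typing import Iterable, List, Set


def keep_loci(bed_lines: Iterable[str], wanted: Set[str]) -> List[str]:
    """Return only the BED lines whose first column is in `wanted`."""
    kept = []
    for line in bed_lines:
        if not line.strip():
            continue  # skip blank lines
        chrom = line.split("\t", 1)[0]
        if chrom in wanted:
            kept.append(line.rstrip("\n"))
    return kept


# Hypothetical BED content with one chromosome and one plasmid.
all_loci = ["chr1\t0\t5000", "pExample\t0\t1200"]
plasmid_only = keep_loci(all_loci, {"pExample"})
print("\n".join(plasmid_only))
```

The filtered lines can then be written back to plasmids.bed so that only plasmid sites are masked.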

Development

Pushing to PyPI

The following command will:

  • run tests
  • clean up the current branch
  • bump the version
  • generate the distributions
  • clean up the current branch
  • tag the commit with the version
  • push to GitHub
  • push to PyPI

git commit -a -m <message>
pipenv run inv deploy <new_version_number> [<patch|minor|major>]
