MGE Masker
This package finds mobile genetic elements (MGEs) based on annotations in a rich sequence file (GenBank or EMBL). There are three subcommands:
- find_mges Search a rich sequence file for features annotated with text that suggests an MGE-associated element
- mask_mges Mask regions in a pseudogenome alignment using the regions listed in a GFF file produced by the find_mges command
- default_matches Show the default regex patterns used when searching for MGEs. These can be overridden by passing a similarly formatted file to the find_mges command with the -m parameter
Default patterns matched
.*\b[tT]ranspos
.*\b[pP]hage
.*\b[rR]epeat
.*\b[rR]eptitive
.*\b[iI]nsertion sequence
.*\bIS
.*\b[tT]n
.*\b[iI]ntegr
.*\b[Cc]onjug
.*\b[Pp]lasmid
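To illustrate how patterns like these behave, here is a minimal sketch that applies them to annotation strings with Python's `re` module. The function names and the use of `re.match` are assumptions made for illustration; the package's own matching code may differ.

```python
import re

# The default patterns, copied verbatim from the list above.
DEFAULT_PATTERNS = [
    r".*\b[tT]ranspos",
    r".*\b[pP]hage",
    r".*\b[rR]epeat",
    r".*\b[rR]eptitive",
    r".*\b[iI]nsertion sequence",
    r".*\bIS",
    r".*\b[tT]n",
    r".*\b[iI]ntegr",
    r".*\b[Cc]onjug",
    r".*\b[Pp]lasmid",
]

COMPILED = [re.compile(pattern) for pattern in DEFAULT_PATTERNS]


def matching_patterns(annotation):
    """Return the patterns that match an annotation string (hypothetical helper).

    Assumption: patterns are applied with re.match, which suits their
    leading '.*'; this is not confirmed against the package source.
    """
    return [regex.pattern for regex in COMPILED if regex.match(annotation)]


def looks_like_mge(annotation):
    """True if at least one pattern matches the annotation (hypothetical helper)."""
    return bool(matching_patterns(annotation))


if __name__ == "__main__":
    for text in ["putative transposase", "DNA polymerase III subunit beta"]:
        print(text, "->", matching_patterns(text))
```

Because every pattern contains `\b`, the keyword has to begin at a word boundary. Under this sketch "phage integrase" matches, while "bacteriophage protein" does not.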
Installation
Python 3 only
pip install MGEmasker
or
pip3 install MGEmasker
Usage
usage: mge_masker [-h] {find_mges,mask_mges,default_matches} ...
A module to find MGEs in a rich sequence file and mask regions corresponding to the MGEs in a pseudogenome alignment.
The find_mges command searches a gbk or embl file for features that have MGE-associated annotations.
It writes a GFF file containing the positions of the matched features.
The mask_mges command takes a GFF file produced using the find_mges command and masks those regions in all sequences of a pseudogenome alignment based on the reference sequence used to find MGEs.
positional arguments:
{find_mges,mask_mges,default_matches}
The following commands are available. Type mge_masker
<COMMAND> -h for more help on a specific commands
find_mges Search a rich sequence file for features annotated
with text that suggests a MGE-associated element
mask_mges Mask regions from a pseudogenome alignment with the
regions in a GFF file produced using the find_mges
command
default_matches Show the default regex patterns used when searching
for MGEs
optional arguments:
-h, --help show this help message and exit
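The two-step workflow (find, then mask) could be scripted from Python roughly as follows. This is only a sketch: the helper names and all file paths are hypothetical. Only the flags documented in the per-command usage are used, and the location of the GFF file written by find_mges is not specified here, so the GFF path has to be supplied by the caller.

```python
import subprocess


def find_mges_command(genome_file_path, file_format=None,
                      merge_interval=None, mge_file_path=None):
    """Build the argument list for 'mge_masker find_mges' (hypothetical helper)."""
    command = ["mge_masker", "find_mges", "-g", genome_file_path]
    if file_format is not None:
        command += ["-f", file_format]          # 'genbank' or 'embl'
    if merge_interval is not None:
        command += ["-i", str(merge_interval)]  # maximum distance in bp
    if mge_file_path is not None:
        command += ["-m", mge_file_path]        # custom regex pattern file
    return command


def mask_mges_command(alignment_path, gff_file_path, masking_character=None):
    """Build the argument list for 'mge_masker mask_mges' (hypothetical helper)."""
    command = ["mge_masker", "mask_mges", "-a", alignment_path, "-g", gff_file_path]
    if masking_character is not None:
        command += ["-m", masking_character]
    return command


def run(command):
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder file names; this only prints the commands.
    print(" ".join(find_mges_command("reference.gbk", file_format="genbank")))
    print(" ".join(mask_mges_command("pseudogenome.fas", "reference.gff")))
```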
find_mges usage
usage: mge_masker find_mges [-h] -g GENOME_FILE_PATH [-f {genbank,embl}]
[-i MERGE_INTERVAL] [-m MGE_FILE_PATH]
optional arguments:
-h, --help show this help message and exit
-g GENOME_FILE_PATH, --genome_file_path GENOME_FILE_PATH
path to a genome file
-f {genbank,embl}, --file_format {genbank,embl}
genome file format
-i MERGE_INTERVAL, --merge_interval MERGE_INTERVAL
The maximum distance between MGEs when performing the
merging step (Default 1000bp)
-m MGE_FILE_PATH, --mge_file_path MGE_FILE_PATH
path to a file containing regex MGE annotations
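The merging step controlled by -i can be pictured with the sketch below, which joins features whose gap is no larger than the merge interval. It assumes that "distance" means the gap between the end of one feature and the start of the next; the function name and the tuple representation are hypothetical, and the package's actual implementation may differ.

```python
def merge_intervals(intervals, merge_interval=1000):
    """Merge (start, end) intervals separated by at most merge_interval bp.

    Assumption: distance is measured from the end of the previous interval
    to the start of the next one.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= merge_interval:
            # Close enough to the previous interval: extend it.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    features = [(100, 500), (1200, 1500), (5000, 5200)]
    print(merge_intervals(features))       # default 1000 bp interval
    print(merge_intervals(features, 500))  # stricter interval
```

With the default of 1000 bp, the first two features (700 bp apart) are merged into one region, while the third remains separate.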
mask_mges usage
usage: mge_masker mask_mges [-h] -a PSEUDOGENOME_ALIGNMENT_PATH -g GFF_FILE_PATH [-m MASKING_CHARACTER]
optional arguments:
-h, --help show this help message and exit
-a PSEUDOGENOME_ALIGNMENT_PATH, --pseudogenome_alignment_path PSEUDOGENOME_ALIGNMENT_PATH
path to a pseudogenome alignment file
-g GFF_FILE_PATH, --gff_file_path GFF_FILE_PATH
path to a gff file containing MGE regions to be masked
-m MASKING_CHARACTER, --masking_character MASKING_CHARACTER
character used to mask (default: N)
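Conceptually, masking replaces the same coordinate ranges in every sequence of the alignment with the masking character. The sketch below shows the idea on plain strings. It assumes GFF-style 1-based inclusive coordinates relative to the reference, and it uses a dict of name to sequence as a stand-in for an alignment file. The function names are hypothetical and this is not the package's implementation.

```python
def mask_sequence(sequence, regions, masking_character="N"):
    """Replace 1-based inclusive (start, end) regions with masking_character."""
    characters = list(sequence)
    for start, end in regions:
        low = max(start - 1, 0)            # convert to 0-based index
        high = min(end, len(characters))   # clip to the sequence length
        if high > low:
            characters[low:high] = masking_character * (high - low)
    return "".join(characters)


def mask_alignment(alignment, regions, masking_character="N"):
    """Apply the same regions to every sequence in a name -> sequence dict."""
    return {
        name: mask_sequence(sequence, regions, masking_character)
        for name, sequence in alignment.items()
    }


if __name__ == "__main__":
    alignment = {"sample_1": "ACGTACGTAC", "sample_2": "ACGTTCGTAC"}
    print(mask_alignment(alignment, [(3, 5)]))
```

Because a pseudogenome alignment shares the reference coordinate system, one set of regions can be applied to all sequences without per-sample adjustment.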