A tool for generating DNA MTase motif testing sequences
metmap
DNA methyltransferase binding motif plasmid assembler
Overall purpose:
- To identify the motifs of multiple DNA methyltransferases (DNA MTase) simultaneously.
Quick start
- This script requires Python 3.6 or newer.
- Install using "pip install metmap"
- This will put a script named "run_metmap.py" in your Python installation's bin folder.
- Linux: You should be able to type "run_metmap.py" from a terminal
- Windows: The script will be placed in the "Scripts" subfolder of your Python folder, and you can run it with "python path\to\python\Scripts\run_metmap.py" from the command line
- Mac: Untested, but it probably works the same way as on Linux
Test
Download tests/test_data.txt and run:
run_metmap.py test_data.txt
This should generate a cas1.fa and cas1.gb file.
Overview:
- You have an organism with multiple identified DNA MTases and you wish to know their individual motifs.
- You use NGS to obtain motifs of all methylated DNA sites.
- Some of those motifs will contain ambiguous bases and some will not
- You submit those motifs to this program
- This program then stitches these motifs together in random order
- Motifs may contain ambiguous bases per the IUPAC nucleotide code
- We can't synthesize lots of ambiguous bases, so we "de-ambigulate" them before putting them into the final construct.
- De-ambigulation happens according to 1 of 2 rules:
- Rule 1:
- Pick L random variants of the motif. E.g. the motif ATGNNTTA has a total of 16 possible concrete sequences. If L<16, the program randomly picks L variants (without duplicates). If L>16, each possible variant is picked at least floor(L/16) times and some are picked 1 more time than that.
- Rule 2:
- Make K copies of each completely "de-ambigulated" variant. E.g. the sequence "SATC" will be treated as 2 sequences, "GATC" and "CATC", that will each appear in K copies.
- We put M N's between each motif
- And the program will output P versions of these cassettes
- You then clone this cassette into a plasmid with 1 DNA MTase in each plasmid.
- You then transform this library into an organism that does not natively methylate DNA.
- Grow, harvest, and sequence the plasmids.
- ?
- Profit!
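The two de-ambigulation rules and the stitching step above can be sketched roughly as follows. This is a minimal illustration, not metmap's actual code: the function names (`apply_rule_1`, `apply_rule_2`, `build_cassette`) are hypothetical, the sketch assumes each motif has already been expanded into its list of concrete variants, and it assumes the N spacer goes only between motifs and not at the ends.

```python
import random

def apply_rule_1(variants, L, rng=random):
    """Rule 1 sketch: pick L variants of one motif.

    Every variant is taken floor(L / len(variants)) times; the remainder
    is drawn at random without duplicates, so some variants appear once more.
    """
    full_rounds, remainder = divmod(L, len(variants))
    picked = list(variants) * full_rounds
    picked += rng.sample(list(variants), remainder)
    return picked

def apply_rule_2(variants, K):
    """Rule 2 sketch: K copies of every concrete variant."""
    return [variant for variant in variants for _ in range(K)]

def build_cassette(sequences, M, rng=random):
    """Shuffle the sequences and join them with M N's between neighbours."""
    shuffled = list(sequences)
    rng.shuffle(shuffled)
    return ("N" * M).join(shuffled)

# Example: "SATC" under rule 2 (K=2) plus an unambiguous motif under rule 1 (L=3).
rng = random.Random(0)  # fixed seed only so the example is repeatable
pieces = apply_rule_2(["CATC", "GATC"], K=2) + apply_rule_1(["ATGC"], L=3, rng=rng)
print(build_cassette(pieces, M=5, rng=rng))
```

Calling `build_cassette` P times on the same pieces would give P differently ordered cassettes; whether metmap also re-draws the rule 1 variants for each version is not stated here.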
Motif file format
- The motifs should be stored in a standard text file
- One motif per line, followed by a comma and then a 1 or a 2 to indicate whether rule 1 or rule 2 should be used for that motif
Example:
ATGCATGCATGC, 1
STGCAGTCATCGTTK, 1
ATCNNNNAAA, 2
CGTAGCANNNATCGATGC, 2
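For illustration, a file in this format could be read along these lines. The helper name `parse_motif_lines` is hypothetical and this is not how metmap itself necessarily parses its input; skipping blank lines and upper-casing motifs are assumptions of the sketch.

```python
def parse_motif_lines(lines):
    """Turn 'MOTIF, RULE' lines into a list of (motif, rule) tuples."""
    motifs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        motif, rule_text = (field.strip() for field in line.split(","))
        rule = int(rule_text)
        if rule not in (1, 2):
            raise ValueError(f"rule must be 1 or 2, got {rule!r} in line {line!r}")
        motifs.append((motif.upper(), rule))
    return motifs

example = """\
ATGCATGCATGC, 1
STGCAGTCATCGTTK, 1
ATCNNNNAAA, 2
CGTAGCANNNATCGATGC, 2
""".splitlines()

for motif, rule in parse_motif_lines(example):
    print(motif, rule)
```

To read a real file, the same function can be given an open file handle, e.g. `parse_motif_lines(open("test_data.txt"))`.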
IUPAC nucleotide code:
Code | Nucleotides |
---|---|
R | A or G |
Y | C or T |
S | G or C |
W | A or T |
K | G or T |
M | A or C |
B | C or G or T |
D | A or G or T |
H | A or C or T |
V | A or C or G |
N | any base |
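As a rough sketch of how the table can be used, the snippet below expands a motif into every concrete sequence it matches. The names `IUPAC` and `expand_motif` are made up for this example and are not taken from metmap's source.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for (table above,
# plus the four unambiguous bases mapping to themselves).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def expand_motif(motif):
    """Return every unambiguous sequence matched by an IUPAC motif."""
    choices = [IUPAC[base] for base in motif.upper()]
    return ["".join(combo) for combo in product(*choices)]

print(expand_motif("SATC"))           # ['CATC', 'GATC']
print(len(expand_motif("ATGNNTTA")))  # 16
```

The number of variants is the product of the number of options at each position, so motifs with many N's grow quickly (4 variants per N).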