cgMLST analysis tool
cvmcgmlst
cvmcgmlst is a core genome MLST (cgMLST) analysis tool built on cvmmlst.
usage: cvmcgmlst -i <genome assemble directory> -o <output_directory>
Author: Qingpo Cui(SZQ Lab, China Agricultural University)
optional arguments:
-h, --help show this help message and exit
-i I <input_path>: the PATH to the directory of assembled genome files. Could not use with -f
-f F <input_file>: the PATH of assembled genome file. Could not use with -i
-db DB <database_path>: path of cgMLST database
-o O <output_directory>: output PATH
-minid MINID <minimum threshold of identity>, default=95
-mincov MINCOV <minimum threshold of coverage>, default=90
-create_db <initialize the reference database>
-t T <number of threads>: default=8
-v, --version Display version
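The help text gives the defaults for -minid and -mincov but does not say how the two thresholds are applied. The sketch below shows one common reading: a hit is kept when its percent identity and its coverage of the reference allele both reach the thresholds. The `Hit` class, the coverage formula (alignment length divided by allele length) and the use of `>=` are assumptions for illustration, not taken from the cvmcgmlst source.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical container for one BLAST hit against a reference allele."""
    locus: str
    allele: int
    identity: float   # percent identity of the alignment
    align_len: int    # aligned length on the reference allele
    allele_len: int   # full length of the reference allele


def coverage(hit: Hit) -> float:
    """Percent of the reference allele covered by the alignment (assumed definition)."""
    return 100.0 * hit.align_len / hit.allele_len


def passes(hit: Hit, minid: float = 95.0, mincov: float = 90.0) -> bool:
    """Keep a hit only if identity and coverage both reach the thresholds.

    The defaults mirror the documented -minid and -mincov defaults;
    the inclusive comparison is an assumption.
    """
    return hit.identity >= minid and coverage(hit) >= mincov
```

Under these assumptions, a hit with 99% identity that aligns over 950 bases of a 1000-base allele passes the defaults, while the same hit aligned over only 800 bases does not.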
Installation
Using pip
pip3 install cvmcgmlst
Using conda
Coming soon...
Dependency
- BLAST+ >2.7.0
BLAST must be on your PATH.
Blast installation
Windows
Follow this tutorial: Add blast into your windows PATH
Linux/Mac
The easiest way to install BLAST is:
conda install -c bioconda blast
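Once BLAST is installed, a quick check that it is on your PATH and newer than 2.7.0 can save a failed run. The following is a minimal sketch, not part of cvmcgmlst: the function names are hypothetical, and it assumes that `blastn -version` prints a line such as `blastn: 2.12.0+`.

```python
import re
import shutil
import subprocess


def parse_blast_version(text: str) -> tuple:
    """Extract the first major.minor.patch triple from `blastn -version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return tuple(int(part) for part in match.groups())


def blast_is_usable(minimum: tuple = (2, 7, 0)) -> bool:
    """Return True if blastn is on PATH and its version is greater than `minimum`."""
    exe = shutil.which("blastn")
    if exe is None:
        return False  # blastn is not on PATH
    result = subprocess.run(
        [exe, "-version"], capture_output=True, text=True, check=True
    )
    return parse_blast_version(result.stdout) > minimum
```

Calling `blast_is_usable()` before launching an analysis returns False when blastn is missing or too old.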
Usage
Making your own database
Users can create their own core genome database. All you need is a FASTA file of nucleotide sequences. Each sequence ID should have the format >locus_allelenumber, where locus is the locus name and allelenumber is the number of that allele. A curated core genome FASTA file should look like this:
>GBAA_RS00015_1
TTGGAAAACATCTCTGATTTATGGAACAGCGCCTTAAAAGAACTCGAAAAAAAGGTCAGT
AAACCAAGTTATGAAACATGGTTAAAATCAACAACCGCACATAATTTAAAGAAAGATGTA
TTAACAATTACGGCTCCAAATGAATTCGCCCGTGATTGGTTAGAATCTCATTATTCAGAG
CTAATTTCGGAAACACTTTATGATTTAACGGGGGCAAAATTAGCTATTCGCTTTATTATT
CCCCAAAGTCAAGCTGAAGAGGAGATTGATCTTCCTCCTGCTAAACCAAATGCAGCACAA
GATGATTCTAATCATTTACCACAGAGTATGCTAAACCCAAAATATACGTTTGATACATTT
GTTATTGGCTCTGGTAACCGTTTTGCTCACGCTGCTTCATTGGCCGTAGCCGAAGCGCCA
GCTAAAGCATATAATCCCCTCTTTATTTATGGGGGAGTTGGACTTGGAAAAACCCATTTA
ATGCATGCAATTGGCCATTATGTAATTGAACATAACCCAAATGCCAAAGTTGTATATTTA
TCATCAGAAAAATTTACAAATGAATTCATTAATTCTATTCGTGATAATAAAGCGGTCGAT
TTTCGTAATAAATACCGCAATGTAGATGTTTTATTGATAGATGATATTCAATTTTTAGCG
GGAAAAGAACAAACTCAAGAAGAGTTTTTCCATACATTCAATGCATTACACGAAGAAAGT
AAACAAATTGTAATTTCCAGTGATCGGCCACCAAAAGAAATTCCAACTTTAGAAGATCGT
CTTCGTTCTCGCTTTGAATGGGGACTCATTACGGATATTACGCCACCAGATTTAGAAACA
CGAATTGCGATTTTACGTAAAAAGGCAAAGGCTGAAGGACTTGATATACCAAATGAGGTC
ATGCTTTATATCGCAAATCAAATCGATTCAAATATTCGTGAACTAGAAGGTGCACTCATC
CGCGTTGTAGCTTATTCATCTTTAATTAACAAGGATATTAATGCTGATTTAGCAGCTGAA
GCACTTAAAGATATTATTCCAAATTCTAAACCAAAAATTATCTCCATTTATGATATTCAA
AAAGCTGTTGGAGATGTTTATCAAGTAAAATTAGAAGATTTCAAGGCGAAAAAGCGCACA
AAGTCAGTTGCCTTTCCTCGCCAAATTGCAATGTATTTGTCACGCGAACTGACAGATTCC
TCCTTACCTAAAATAGGTGAAGAATTTGGTGGACGTGATCATACAACCGTTATCCATGCC
CATGAAAAAATTTCTAAGCTACTTAAGACGGATACGCAATTACAAAAACAAGTTGAAGAA
ATTAACGATATTTTAAAGTAG
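Because locus names may themselves contain underscores (as in GBAA_RS00015_1), the allele number is best read from the last underscore. The sketch below checks the headers of a FASTA file against the >locus_allelenumber format before a database is built. It is an illustration only; the function names are hypothetical and the duplicate check is an added assumption, not a documented requirement of cvmcgmlst.

```python
def parse_allele_header(header: str) -> tuple:
    """Split a FASTA header like '>GBAA_RS00015_1' into (locus, allele number)."""
    fields = header.lstrip(">").split()
    name = fields[0] if fields else ""
    # Split on the LAST underscore so underscores inside the locus name survive.
    locus, sep, number = name.rpartition("_")
    if not (sep and locus and number.isascii() and number.isdigit()):
        raise ValueError(f"header not in locus_allelenumber form: {header!r}")
    return locus, int(number)


def check_fasta_headers(lines) -> list:
    """Parse every header line and reject repeated locus/allele pairs."""
    seen = set()
    parsed = []
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, nothing to check
        key = parse_allele_header(line.strip())
        if key in seen:
            raise ValueError(f"duplicate allele: {line.strip()!r}")
        seen.add(key)
        parsed.append(key)
    return parsed
```

For a file on disk, `check_fasta_headers(open("reference.fa"))` returns the list of (locus, allele number) pairs or raises on the first malformed header.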
The first time you run cvmcgmlst, use the -create_db parameter to initialize your database. Once the database has been created, you can run cvmcgmlst without -create_db.
You can also create the reference database with the makeblastdb command:
makeblastdb -hash_index -in reference.fa -dbtype nucl -title cgMLST -parse_seqids
Example
# Single Genome Mode
cvmcgmlst -f /PATH_TO_ASSEMBLED_GENOME/sample.fa -create_db -db /PATH_TO_DATABASE/reference.fa -o PATH_TO_OUTPUT
# Batch Mode
cvmcgmlst -i /PATH_TO_ASSEMBLED_GENOME_DIR -create_db -db /PATH_TO_DATABASE/reference.fa -o PATH_TO_OUTPUT
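When cvmcgmlst is driven from a script, the two modes above can be assembled with a small helper that also enforces the rule that -f and -i cannot be combined. This is a sketch under stated assumptions: `build_command` is a hypothetical helper, not part of the package, and it uses only the flags listed in the help text.

```python
def build_command(db, output, input_file=None, input_dir=None,
                  create_db=False, minid=None, mincov=None, threads=None):
    """Build a cvmcgmlst argument list from the documented options."""
    # -f and -i are mutually exclusive, and one of them is needed.
    if (input_file is None) == (input_dir is None):
        raise ValueError("give exactly one of input_file (-f) or input_dir (-i)")
    cmd = ["cvmcgmlst"]
    if input_file is not None:
        cmd += ["-f", str(input_file)]   # single genome mode
    else:
        cmd += ["-i", str(input_dir)]    # batch mode
    cmd += ["-db", str(db), "-o", str(output)]
    if create_db:
        cmd.append("-create_db")         # only needed on the first run
    # Optional settings are passed only when given, so the tool's defaults apply.
    for flag, value in (("-minid", minid), ("-mincov", mincov), ("-t", threads)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with paths that contain spaces.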