cgMLST analysis tool

cvmcgmlst

cvmcgmlst is a tool for core genome MLST (cgMLST) analysis, developed on the basis of cvmmlst.

usage: cvmcgmlst -i <genome assemble directory> -o <output_directory>

Author: Qingpo Cui(SZQ Lab, China Agricultural University)

optional arguments:
  -h, --help      show this help message and exit
  -i I            <input_path>: the PATH to the directory of assembled genome files. Could not use with -f
  -f F            <input_file>: the PATH of assembled genome file. Could not use with -i
  -db DB          <database_path>: path of cgMLST database
  -o O            <output_directory>: output PATH
  -minid MINID    <minimum threshold of identity>, default=95
  -mincov MINCOV  <minimum threshold of coverage>, default=90
  -create_db      <initialize the reference database>
  -t T            <number of threads>: default=8
  -v, --version   Display version

Installation

Using pip

pip3 install cvmcgmlst

Using conda

Coming soon...

Dependency

  • BLAST+ >2.7.0

The BLAST+ executables must be on your PATH.

BLAST installation

Windows

Follow this tutorial: Add BLAST to your Windows PATH

Linux/Mac

The easiest way to install BLAST is:

conda install -c bioconda blast
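
After installing, it can help to confirm that the BLAST+ executables are actually visible on your PATH. The following is a minimal sketch, not part of cvmcgmlst: the function names are hypothetical, and the list of tools is an assumption (makeblastdb is mentioned in this README; blastn is a guess at the search program used).

```python
import shutil

# Assumed tool names: makeblastdb appears in this README,
# blastn is a guess at the search program that gets called.
REQUIRED_TOOLS = ("blastn", "makeblastdb")


def find_blast_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in tools}


def missing_blast_tools(tools=REQUIRED_TOOLS):
    """Return the names of the tools that could not be found on PATH."""
    return [name for name, path in find_blast_tools(tools).items() if path is None]


if __name__ == "__main__":
    missing = missing_blast_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed BLAST+ tools were found.")
```

This only checks that the programs can be located; it does not check that the installed version is newer than 2.7.0.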

Usage

Making your own database

Users can create their own core genome database. All you need is a FASTA file of nucleotide sequences. Sequence IDs should have the format >locus_allelenumber, where locus is the locus name and allelenumber is the number of that allele. A curated core genome FASTA file should look like this:

>GBAA_RS00015_1
TTGGAAAACATCTCTGATTTATGGAACAGCGCCTTAAAAGAACTCGAAAAAAAGGTCAGT
AAACCAAGTTATGAAACATGGTTAAAATCAACAACCGCACATAATTTAAAGAAAGATGTA
TTAACAATTACGGCTCCAAATGAATTCGCCCGTGATTGGTTAGAATCTCATTATTCAGAG
CTAATTTCGGAAACACTTTATGATTTAACGGGGGCAAAATTAGCTATTCGCTTTATTATT
CCCCAAAGTCAAGCTGAAGAGGAGATTGATCTTCCTCCTGCTAAACCAAATGCAGCACAA
GATGATTCTAATCATTTACCACAGAGTATGCTAAACCCAAAATATACGTTTGATACATTT
GTTATTGGCTCTGGTAACCGTTTTGCTCACGCTGCTTCATTGGCCGTAGCCGAAGCGCCA
GCTAAAGCATATAATCCCCTCTTTATTTATGGGGGAGTTGGACTTGGAAAAACCCATTTA
ATGCATGCAATTGGCCATTATGTAATTGAACATAACCCAAATGCCAAAGTTGTATATTTA
TCATCAGAAAAATTTACAAATGAATTCATTAATTCTATTCGTGATAATAAAGCGGTCGAT
TTTCGTAATAAATACCGCAATGTAGATGTTTTATTGATAGATGATATTCAATTTTTAGCG
GGAAAAGAACAAACTCAAGAAGAGTTTTTCCATACATTCAATGCATTACACGAAGAAAGT
AAACAAATTGTAATTTCCAGTGATCGGCCACCAAAAGAAATTCCAACTTTAGAAGATCGT
CTTCGTTCTCGCTTTGAATGGGGACTCATTACGGATATTACGCCACCAGATTTAGAAACA
CGAATTGCGATTTTACGTAAAAAGGCAAAGGCTGAAGGACTTGATATACCAAATGAGGTC
ATGCTTTATATCGCAAATCAAATCGATTCAAATATTCGTGAACTAGAAGGTGCACTCATC
CGCGTTGTAGCTTATTCATCTTTAATTAACAAGGATATTAATGCTGATTTAGCAGCTGAA
GCACTTAAAGATATTATTCCAAATTCTAAACCAAAAATTATCTCCATTTATGATATTCAA
AAAGCTGTTGGAGATGTTTATCAAGTAAAATTAGAAGATTTCAAGGCGAAAAAGCGCACA
AAGTCAGTTGCCTTTCCTCGCCAAATTGCAATGTATTTGTCACGCGAACTGACAGATTCC
TCCTTACCTAAAATAGGTGAAGAATTTGGTGGACGTGATCATACAACCGTTATCCATGCC
CATGAAAAAATTTCTAAGCTACTTAAGACGGATACGCAATTACAAAAACAAGTTGAAGAA
ATTAACGATATTTTAAAGTAG
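
Before building a database, you may want to check that every sequence ID in your FASTA file follows the >locus_allelenumber format. The sketch below is one way to do that. It is not part of cvmcgmlst, the function names are hypothetical, and it assumes that the allele number is the integer after the last underscore (so GBAA_RS00015_1 is locus GBAA_RS00015, allele 1).

```python
import re
from collections import defaultdict

# Assumption: the allele number is the run of digits after the LAST underscore.
HEADER_RE = re.compile(r"^>(?P<locus>\S+)_(?P<allele>\d+)$")


def parse_header(line):
    """Split '>locus_allelenumber' into (locus, allele number).

    Only the first whitespace-delimited token of the header is examined.
    Raises ValueError if the ID does not match the expected format.
    """
    stripped = line.strip()
    seq_id = stripped.split()[0] if stripped else ""
    match = HEADER_RE.match(seq_id)
    if match is None:
        raise ValueError(f"header does not match >locus_allelenumber: {line!r}")
    return match.group("locus"), int(match.group("allele"))


def check_fasta_headers(lines):
    """Return {locus: sorted allele numbers}; raise on malformed or duplicate IDs."""
    alleles = defaultdict(set)
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, nothing to check here
        locus, allele = parse_header(line)
        if allele in alleles[locus]:
            raise ValueError(f"duplicate allele ID: {locus}_{allele}")
        alleles[locus].add(allele)
    return {locus: sorted(nums) for locus, nums in alleles.items()}


if __name__ == "__main__":
    print(parse_header(">GBAA_RS00015_1"))  # ('GBAA_RS00015', 1)
```

To check a real file, pass an open file handle, for example `check_fasta_headers(open("reference.fa"))`, where reference.fa stands in for your own database file.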

The first time you run cvmcgmlst against a database, use the -create_db parameter to initialize it. Once the database has been created, you can run cvmcgmlst without -create_db.

You can also create the reference database with the makeblastdb command:

makeblastdb -hash_index -in reference.fa -dbtype nucl -title cgMLST -parse_seqids

Example

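# Single genome mode (-create_db is only needed the first time the database is used)
cvmcgmlst -f /PATH_TO_ASSEMBLED_GENOME/sample.fa -create_db -db /PATH_TO_DATABASE/reference.fa -o PATH_TO_OUTPUT

# Batch mode (-create_db is only needed the first time the database is used)
cvmcgmlst -i /PATH_TO_ASSEMBLED_GENOME_DIR -create_db -db /PATH_TO_DATABASE/reference.fa -o PATH_TO_OUTPUT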
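
If you call cvmcgmlst from a script, the options above can be assembled programmatically. The following is a minimal sketch, assuming only the flags listed in the help text; build_command and run are hypothetical helper names, not part of the package. It enforces that exactly one of -f and -i is given, since the two cannot be combined.

```python
import subprocess


def build_command(db, out_dir, genome_file=None, genome_dir=None,
                  create_db=False, minid=None, mincov=None, threads=None):
    """Build a cvmcgmlst argument list from the flags shown in the help text."""
    # -f and -i are mutually exclusive, and one of them is required.
    if (genome_file is None) == (genome_dir is None):
        raise ValueError("pass exactly one of genome_file (-f) or genome_dir (-i)")
    cmd = ["cvmcgmlst"]
    if genome_file is not None:
        cmd += ["-f", str(genome_file)]
    else:
        cmd += ["-i", str(genome_dir)]
    cmd += ["-db", str(db), "-o", str(out_dir)]
    if create_db:
        cmd.append("-create_db")  # only needed on the first run against a database
    # Optional settings are left out so the tool's own defaults apply.
    for flag, value in (("-minid", minid), ("-mincov", mincov), ("-t", threads)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_command("reference.fa", "results",
                        genome_file="sample.fa", create_db=True)
    print(" ".join(cmd))
    # prints: cvmcgmlst -f sample.fa -db reference.fa -o results -create_db
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.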
