Presented by the Institute of Food Safety and Health, Illinois Institute of Technology

PlasmidHunter: Accurate and Fast Plasmid Prediction Based on Gene Content Using Machine Learning

Plasmids are extrachromosomal DNA molecules found in microorganisms. They often carry beneficial genes that help bacteria adapt to harsh conditions, and they are important tools in genetic engineering, gene therapy, and drug production. However, distinguishing plasmid sequences from chromosomal sequences in genomic and metagenomic data can be difficult. Here, we present PlasmidHunter, a tool that uses machine learning to predict plasmid sequences from their gene content profile. In benchmark tests on both simulated contigs and real metagenomic plasmidome data, PlasmidHunter achieved high accuracy (up to 97.6%) and high speed, outperforming other existing tools.

Keywords: artificial intelligence (AI), machine learning (ML), plasmid prediction, genomic sequencing

Installation and Usage

conda create -n plasmidhunter -c bioconda -c conda-forge -y python=3.10 diamond=2.1.8 prodigal

conda activate plasmidhunter

pip install plasmidhunter

plasmidhunter -h

Result Interpretation

The output is a tab-delimited table giving the prediction for each sequence. Its columns include Prediction (0: chromosome, 1: plasmid), Probability of 0 (chromosome), and Probability of 1 (plasmid).
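As an illustration of working with such a table, the following is a minimal sketch in Python and not part of PlasmidHunter itself. It assumes a header row and a four-column layout (sequence ID, Prediction, Probability of 0, Probability of 1). The column order, the header text, the example rows, and the helper names `read_predictions` and `confident_plasmids` are all assumptions made for this example, so check them against your actual output file before relying on it.

```python
import csv
import io
from typing import NamedTuple


class PlasmidCall(NamedTuple):
    seq_id: str
    is_plasmid: bool      # True when Prediction == 1
    p_chromosome: float   # Probability of 0
    p_plasmid: float      # Probability of 1


def read_predictions(handle, has_header=True):
    """Parse a tab-delimited table with an ASSUMED column order:
    sequence ID, Prediction, Probability of 0, Probability of 1."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row if one is present
    calls = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        seq_id, pred, p0, p1 = row[:4]
        calls.append(PlasmidCall(seq_id, int(pred) == 1, float(p0), float(p1)))
    return calls


def confident_plasmids(calls, min_prob=0.9):
    """Return IDs predicted as plasmid with probability >= min_prob."""
    return [c.seq_id for c in calls if c.is_plasmid and c.p_plasmid >= min_prob]


# Made-up rows for illustration only; not real PlasmidHunter output.
example = (
    "ID\tPrediction\tProbability of 0\tProbability of 1\n"
    "contig_1\t1\t0.03\t0.97\n"
    "contig_2\t0\t0.88\t0.12\n"
    "contig_3\t1\t0.45\t0.55\n"
)
calls = read_predictions(io.StringIO(example))
print(confident_plasmids(calls))
```

Filtering on the plasmid probability rather than on the 0/1 prediction alone lets borderline calls (such as the made-up contig_3 above) be set aside for manual review. The 0.9 cutoff is an arbitrary choice for this sketch.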

Change Log

v1.1 9/1/2022

PlasmidHunter now uses much less memory.

v1.2 11/23/2023

PlasmidHunter has an expanded database for a higher annotation rate.

PlasmidHunter now accepts shorter contigs, down to 1 kbp, and has higher accuracy for short contigs.

v1.3 4/2024

Fixed some minor bugs.

v1.4 5/17/2024

Converted the model and feature pickle files into a parameter file and a text file, respectively.

Citation

PlasmidHunter: Accurate and fast prediction of plasmid sequences using gene content profile and machine learning

Renmao Tian, Jizhong Zhou, Behzad Imanian

bioRxiv 2023.02.01.526640; doi: https://doi.org/10.1101/2023.02.01.526640

Contact

If you have any questions, please contact Renmao Tian (tianrenmao[at]gmail.com) or Behzad Imanian (bimanian[at]iit.edu).

License

Educational Community License, Version 2.0
