Multi-locus sequence type clade classifier for C. difficile
MLSTclassifier_cd
Overview
MLSTclassifier_cd is a machine learning tool that predicts clades using the K-Nearest Neighbors (KNN) algorithm. Designed specifically for Multi-Locus Sequence Type (MLST) analysis of C. difficile strains, including cryptic variants, it streamlines and accelerates clade prediction. MLSTclassifier_cd achieves a prediction accuracy of approximately 92%.
The model was built following the StatQuest methodology (https://www.youtube.com/watch?v=q90UDEgYqeI&t=3327s) and is powered by the scikit-learn library. MLSTclassifier_cd is well suited to obtaining a first classification of your C. difficile strains, including cryptic ones.
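To make the KNN idea concrete, here is a minimal standard-library sketch of nearest-neighbor voting on allele profiles. It is not the package's implementation (which relies on scikit-learn); the Hamming distance, the value of `k`, the function names, the allele numbers, and the clade labels are all illustrative assumptions.

```python
from collections import Counter


def hamming(a, b):
    """Count the loci at which two allele profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(query, training, k=3):
    """Return the majority label among the k training profiles closest to query."""
    # sorted() is stable, so ties in distance keep their training order.
    nearest = sorted(training, key=lambda item: hamming(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up allele profiles with placeholder clade labels (not real MLST data).
TRAINING = [
    ((1, 1, 1, 1, 1, 1, 1), "clade_A"),
    ((1, 1, 2, 1, 1, 1, 1), "clade_A"),
    ((1, 2, 1, 1, 1, 1, 1), "clade_A"),
    ((5, 8, 5, 11, 9, 11, 8), "clade_B"),
    ((5, 8, 5, 11, 9, 11, 3), "clade_B"),
    ((5, 8, 6, 11, 9, 11, 8), "clade_B"),
]

# The query differs from the first clade_A profile at a single locus.
prediction = knn_predict((1, 1, 1, 1, 1, 2, 1), TRAINING, k=3)
print(prediction)
```

The query's three nearest neighbors all carry the same label, so the vote is unanimous. With real data, the choice of distance and `k` affects accuracy and would need validation.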
GitHub repo: https://github.com/eliottBo/MLSTclassifier_cd
Installation:
Install PyPI package:
pip install mlstclassifier-cd
Usage:
Basic Command:
The query CSV file must have the same structure as the example file "MLST_file_example.csv".
MLSTclassifier_cd [query csv file path] [output path]
Output:
After running MLSTclassifier_cd, the output file should contain an additional column named "predicted_clade".
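As a quick sanity check on the result, the following sketch tallies how many strains were assigned to each clade. It assumes the output is a comma-delimited CSV with a header row; only the "predicted_clade" column name comes from the description above, and the function name `count_predicted_clades` is hypothetical.

```python
import csv
from collections import Counter


def count_predicted_clades(output_csv_path):
    """Tally the values found in the 'predicted_clade' column of the output file."""
    with open(output_csv_path, newline="") as handle:
        # Assumption: comma-delimited file whose first row holds column names.
        reader = csv.DictReader(handle)
        return Counter(row["predicted_clade"] for row in reader)
```

Calling `count_predicted_clades` with the output path you passed to MLSTclassifier_cd returns a `Counter` mapping each clade label to its number of rows. A `KeyError` would indicate that the column is missing or named differently.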