Multi-locus sequence type clade classifier for C. difficile
MLSTclassifier_cd
Overview
MLSTclassifier_cd is a machine learning tool that predicts clades with the K-Nearest Neighbors (KNN) algorithm. It is designed for Multi-Locus Sequence Type (MLST) analysis of C. difficile strains, including cryptic variants, and is meant to streamline and speed up clade prediction. MLSTclassifier_cd achieves a prediction accuracy of approximately 92%.
The model was built following the approach presented in a StatQuest tutorial (https://www.youtube.com/watch?v=q90UDEgYqeI&t=3327s) and is implemented with the scikit-learn library. It is intended as a first-pass classification of your C. difficile strains, including cryptic ones.
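To make the KNN idea concrete, here is a minimal sketch of nearest-neighbour voting on allele profiles. It is not the package's code: the package relies on scikit-learn, whereas this sketch uses only the standard library. The Hamming distance, the value of k, the allele profiles, and the labels `clade_A` and `clade_B` are all assumptions made up for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Count the loci at which two allele profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_profiles, train_labels, query, k=3):
    """Return the majority label among the k training profiles closest to query."""
    ranked = sorted(
        zip(train_profiles, train_labels),
        key=lambda pair: hamming(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up allele profiles and labels (assumptions, not real sequence types).
train_profiles = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 2, 1, 1, 1, 1),
    (1, 2, 1, 1, 1, 1, 1),
    (5, 8, 5, 11, 9, 11, 8),
    (5, 8, 5, 11, 9, 3, 8),
    (5, 8, 6, 11, 9, 11, 8),
]
train_labels = ["clade_A", "clade_A", "clade_A", "clade_B", "clade_B", "clade_B"]

# The query differs from the first training profile at a single locus.
query = (1, 1, 1, 1, 2, 1, 1)
predicted = knn_predict(train_profiles, train_labels, query, k=3)
print(predicted)
```

Hamming distance is used here because allele numbers are identifiers rather than quantities. The distance metric and number of neighbours the package actually uses may differ.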
Installation:
Install PyPI package:
pip install MLSTclassifier_cd
Usage:
Basic Command:
The query CSV file must have the same structure as the example file "MLST_file_example.csv".
MLSTclassifier_cd [query csv file path] [output path]
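If you prefer to call the tool from a Python script, the command above could be wrapped roughly as follows. This is only a sketch: the executable name and argument order are taken from the command line above, while the helper names `build_command` and `run_classifier` are hypothetical. Whether the output path should be a file or a directory is not stated here, so the sketch passes it through unchanged.

```python
import subprocess
from pathlib import Path


def build_command(query_csv, output_path):
    """Assemble the argument list: executable, query CSV, output path."""
    return ["MLSTclassifier_cd", str(query_csv), str(output_path)]


def run_classifier(query_csv, output_path):
    """Run the command-line tool and raise if it exits with an error."""
    # Fail early with a clear message if the query file is missing.
    if not Path(query_csv).is_file():
        raise FileNotFoundError(query_csv)
    subprocess.run(build_command(query_csv, output_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.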
Output:
After running MLSTclassifier_cd, the output file should contain an additional column named "predicted_clade".
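As a quick check of the result, the predictions can be tallied per clade. The sketch below assumes the output is a comma-separated file with a header row. Only the "predicted_clade" column name comes from the description above; the `sample` column, the demo values, and the helper name `count_predicted_clades` are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_predicted_clades(csv_file):
    """Count how many rows were assigned to each predicted clade."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or "predicted_clade" not in reader.fieldnames:
        raise ValueError("no 'predicted_clade' column found")
    return Counter(row["predicted_clade"] for row in reader)


# Made-up stand-in for an output file (hypothetical columns and values).
demo = io.StringIO(
    "sample,predicted_clade\n"
    "s1,clade_A\n"
    "s2,clade_B\n"
    "s3,clade_A\n"
)
counts = count_predicted_clades(demo)
print(counts)
```

With a real output file, open it with `open(path, newline="")` and pass the file object to `count_predicted_clades`.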
Hashes for mlstclassifier_cd-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 00805ed062fb6652aba947e8a38d62ea1ec111961336bfcdc863c210ab9840ad
MD5 | a4f1130c49a22fff23a1c84b3296b295
BLAKE2b-256 | 68cbe2798c744013992f94015441b1bc9f4741cfb50957ed38b61b1f26aa1aa4