These details have not been verified by PyPI

Project description

Migenpro

Getting started

Clone the git repo:

git clone git@gitlab.com:pig-paradigm/migenpro.git
cd migenpro

Installing the needed dependencies.

A requirements.txt file is located in the installation directory. The following command creates a conda environment named migenpro with Python 3.12.5 and pip, and installs the listed dependencies into it.

conda create -n migenpro python=3.12.5 pip --file installation/requirements.txt

Annotating genomes using SAPP

To annotate genomes, we use a cwltool workflow with SAPP that outputs the desired genome annotations as HDT files.

cwltool --no-warnings --outdir ./data https://gitlab.com/m-unlock/cwl/-/raw/dev/workflows/workflow_microbial_annotation.cwl --genome_fasta https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/005/845/GCA_000005845.2_ASM584v2/GCA_000005845.2_ASM584v2_genomic.fna.gz

Luckily, we have automated this process within the Python package.
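
As an independent sketch (not the package's own automation, whose interface is not shown here), the cwltool command above could be assembled per genome from Python. The function name build_annotation_command and the genome list are assumptions made for illustration; only the workflow URL, the flags, and the example genome come from the command above.

```python
import shlex

# Workflow URL copied from the cwltool command above.
WORKFLOW_URL = (
    "https://gitlab.com/m-unlock/cwl/-/raw/dev/workflows/"
    "workflow_microbial_annotation.cwl"
)


def build_annotation_command(genome_fasta, outdir="./data"):
    """Return the cwltool argument list for one genome (hypothetical helper)."""
    return [
        "cwltool",
        "--no-warnings",
        "--outdir", outdir,
        WORKFLOW_URL,
        "--genome_fasta", genome_fasta,
    ]


# Example genome taken from the command above; extend the list as needed.
genomes = [
    "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/005/845/"
    "GCA_000005845.2_ASM584v2/GCA_000005845.2_ASM584v2_genomic.fna.gz",
]

for genome in genomes:
    command = build_annotation_command(genome)
    # Print the command for inspection; to execute it, pass the list to
    # subprocess.run(command, check=True) with cwltool installed.
    print(shlex.join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems when a genome path contains spaces.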

Training machine learning models

python3 src/main/resources/python/machineLearning.py \
    --featureMatrix ./output/protein_domain_matrix.tsv \
    --phenotypeMatrix ./output/phenotype_matrix.tsv \
    --model_load [Location_of_model] \
    --train \
    --predict
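
Before training, it can help to confirm that the feature matrix and the phenotype matrix describe the same genomes. The sketch below assumes both files are tab-separated with a header row and the genome identifier in the first column; that layout is an assumption, not something the project states, and the helper names are hypothetical.

```python
import csv
import io


def genome_ids(tsv_text):
    """Collect first-column values, skipping the header row.

    Assumes a tab-separated matrix with genome identifiers in column one.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader, None)  # skip the header row
    return {row[0] for row in reader if row}


def compare_genomes(feature_tsv, phenotype_tsv):
    """Report which genomes appear in both matrices and which in only one."""
    features = genome_ids(feature_tsv)
    phenotypes = genome_ids(phenotype_tsv)
    return {
        "shared": sorted(features & phenotypes),
        "only_features": sorted(features - phenotypes),
        "only_phenotypes": sorted(phenotypes - features),
    }


# Tiny made-up matrices, used only to illustrate the check.
example_features = "genome\tPF00001\tPF00002\nGCA1\t1\t0\nGCA2\t0\t1\n"
example_phenotypes = "genome\tphenotype\nGCA2\tmesophilic\nGCA3\tthermophilic\n"
print(compare_genomes(example_features, example_phenotypes))
```

For real files, read the text with pathlib.Path(...).read_text() and pass it to compare_genomes.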

Predicting phenotypes with existing models

You can do this through the Docker container or from the source code.

You will need a protein domain matrix of the desired genomes, which you can obtain using the Java code.
For ease of use, we will use the Python scripts with the following command. The default output directory is "output/mloutput"; to change it, use --output [output_directory_location].

python3 src/main/resources/python/machineLearning.py \
    --featureMatrix ./output/protein_domain_matrix.tsv \
    --model_load [Location_of_model] \
    --predict
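
The optional --output flag described above does not appear in the command itself. A minimal sketch of assembling the prediction command with and without it follows; the helper name build_predict_command and the example paths are hypothetical, while the script path and flags are taken from the text above.

```python
import shlex

# Script path as used in the commands above.
SCRIPT = "src/main/resources/python/machineLearning.py"


def build_predict_command(feature_matrix, model_path, output_dir=None):
    """Return the argument list for a prediction run (hypothetical helper).

    When output_dir is None the flag is omitted, so the script falls back
    to its default output directory.
    """
    command = [
        "python3", SCRIPT,
        "--featureMatrix", feature_matrix,
        "--model_load", model_path,
        "--predict",
    ]
    if output_dir is not None:
        command += ["--output", output_dir]
    return command


# Placeholder paths; substitute your own matrix and model locations.
print(shlex.join(build_predict_command("features.tsv", "model_dir")))
print(shlex.join(build_predict_command("features.tsv", "model_dir", "results")))
```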

Wait for the script to finish, then retrieve your predictions from the output directory. The predictions are given in the following format:

##################################################
# Genome # Phenotype   # Prediction # Confidence #
# GCA123 # Temperature # mesophilic # 0.96       #
##################################################
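
For downstream analysis, a table in this format can be turned into records with a few lines of Python. This is a sketch based only on the sample shown above: the function name parse_predictions is hypothetical, and the parser assumes that no cell is empty or contains a "#" character.

```python
# Sample in the format shown above.
SAMPLE = """\
##################################################
# Genome # Phenotype   # Prediction # Confidence #
# GCA123 # Temperature # mesophilic # 0.96       #
##################################################
"""


def parse_predictions(text):
    """Parse a '#'-delimited prediction table into a list of dicts."""
    rows = []
    for line in text.splitlines():
        # Split on the delimiter and drop empty cells; border lines made
        # only of '#' characters yield no cells and are skipped.
        cells = [cell.strip() for cell in line.split("#")]
        cells = [cell for cell in cells if cell]
        if cells:
            rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    records = []
    for cells in body:
        record = dict(zip(header, cells))
        if "Confidence" in record:
            record["Confidence"] = float(record["Confidence"])
        records.append(record)
    return records


for record in parse_predictions(SAMPLE):
    print(record)
```

Converting the confidence to a float makes it easy to filter predictions below a chosen threshold.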

Recreating the results from the study

The files needed to recreate our results are located in the ./data/phenotype_output folder. We use the previously created protein_domain.tsv and phenotype.tsv files. Run the following bash script:

./recreate.sh

Project details

Release history

0.1.4: Dec 17, 2025
0.1.3: Dec 15, 2025
0.1.2: Sep 4, 2025
0.1.0: Sep 4, 2025 (this version)

Download files

Source Distribution

migenpro-0.1.0.tar.gz (38.2 kB)

Uploaded Sep 4, 2025 Source

Built Distribution

migenpro-0.1.0-py3-none-any.whl (50.3 kB)

Uploaded Sep 4, 2025 Python 3

File details

Details for the file migenpro-0.1.0.tar.gz.

File metadata

Download URL: migenpro-0.1.0.tar.gz
Upload date: Sep 4, 2025
Size: 38.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for migenpro-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`eb65863ca4d03afa2fb7d94ef1b4c4659584156cd7666c932f1ef7ed98a3c39b`
MD5	`4355068f363a133904ac3b0eae462925`
BLAKE2b-256	`dc2c432075624827d1be5f7b89c6c8131d2b459e142c33d8b8d9fd2ff89e3067`

File details

Details for the file migenpro-0.1.0-py3-none-any.whl.

File metadata

Download URL: migenpro-0.1.0-py3-none-any.whl
Upload date: Sep 4, 2025
Size: 50.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for migenpro-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`401e321c9db5a22aefbc0e63987eb1d7a9901c1ed79e3f72bf5c073d9a1fa83b`
MD5	`f9925bfccfd4f7c084b93fd5c05945ef`
BLAKE2b-256	`5f416e0556fe4149109c5846d0962295a7f8bf91f4dc8c077bdd01af728fb4de`
