Biosynthetic Gene Cluster finder with Graph Neural Network
Project description
BGCfinder : Biosynthetic Gene Cluster detection with Graph Neural Network
BGCfinder detects biosynthetic gene clusters (BGCs) in bacterial genomes using deep learning. It takes a FASTA file containing bacterial protein-coding sequences and embeds each protein sequence into a graph. A graph neural network then takes these graphs as input to detect biosynthetic gene clusters.
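To make the input format concrete, the sketch below reads protein FASTA lines into (header, sequence) pairs, one pair per protein. It is an illustration only and not BGCfinder's own parser; `read_fasta` and the example records are hypothetical.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-protein input, for illustration only.
example = [
    ">protein_1",
    "MKTAYIAKQR",
    "QISFVKSHFS",
    ">protein_2",
    "MSDNE",
]
records = list(read_fasta(example))
print(len(records), "protein records")
```

With a real file, passing the open file handle works the same way, for example `read_fasta(open("bacterial_protein_seq.fasta"))`.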
Author : Jihun Jeung, jihun@gm.gist.ac.kr, jeung4705@gmail.com, https://github.com/jihunni/BGCfinder
Installation requirements:
- PyTorch
- PyTorch Geometric
- Prodigal (https://github.com/hyattpd/Prodigal)
To construct the conda environment,
$ conda create --name BGCfinder python=3.9
$ conda init bash
$ conda activate BGCfinder
$ conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
$ conda install pyg -c pyg
$ pip install BGCfinder
To download the BGCfinder model and test files,
$ bgc-download
To predict the protein-coding genes in a bacterial genome with Prodigal,
$ prodigal -f gff -i bacterial_genome_seq.fasta -a bacterial_protein_seq.fasta -o bacterial_genome_seq.gff
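When the gene prediction step is driven from a Python script, the same Prodigal call can be wrapped as follows. This is a minimal sketch that assumes `prodigal` is on the `PATH`; the helper names `prodigal_command` and `run_prodigal` are hypothetical, and only the flags shown in the command above are used.

```python
import subprocess
from typing import List


def prodigal_command(genome_fasta: str, protein_fasta: str, gff_path: str) -> List[str]:
    """Build the Prodigal argument list matching the command above."""
    return [
        "prodigal",
        "-f", "gff",          # write gene coordinates in GFF format
        "-i", genome_fasta,   # input: genome nucleotide sequence
        "-a", protein_fasta,  # output: translated protein sequences
        "-o", gff_path,       # output: gene coordinates
    ]


def run_prodigal(genome_fasta: str, protein_fasta: str, gff_path: str) -> None:
    """Run Prodigal; raises CalledProcessError if it exits with an error."""
    subprocess.run(prodigal_command(genome_fasta, protein_fasta, gff_path), check=True)
```

Calling `run_prodigal("bacterial_genome_seq.fasta", "bacterial_protein_seq.fasta", "bacterial_genome_seq.gff")` then produces the protein FASTA file that BGCfinder takes as input.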
To run BGCfinder on the CPU with a FASTA file containing amino acid sequences,
$ bgcfinder bacterial_protein_seq.fasta -o output_filename_prefix -l log_record.log -d False
To run BGCfinder on the GPU with a FASTA file containing amino acid sequences,
$ bgcfinder bacterial_protein_seq.fasta -o output_filename_prefix -l log_record.log -d True
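The two commands above differ only in the value passed to `-d`. The sketch below builds the command line with that flag chosen automatically. The helper names are hypothetical, and checking for `nvidia-smi` on the `PATH` is only a rough heuristic assumed here for detecting an NVIDIA GPU.

```python
import shutil
from typing import List


def gpu_probably_available() -> bool:
    """Rough heuristic (assumption): nvidia-smi on PATH suggests an NVIDIA driver."""
    return shutil.which("nvidia-smi") is not None


def bgcfinder_command(protein_fasta: str, output_prefix: str,
                      log_path: str, use_gpu: bool) -> List[str]:
    """Build the bgcfinder argument list with the flags shown above."""
    return [
        "bgcfinder", protein_fasta,
        "-o", output_prefix,                 # prefix for the output files
        "-l", log_path,                      # log file
        "-d", "True" if use_gpu else "False",  # True = GPU, False = CPU
    ]


command = bgcfinder_command(
    "bacterial_protein_seq.fasta", "output_filename_prefix",
    "log_record.log", gpu_probably_available(),
)
print(" ".join(command))
```

The printed line can be pasted into a shell or passed to `subprocess.run(command, check=True)`. If PyTorch is importable, `torch.cuda.is_available()` is a more direct check than the `nvidia-smi` heuristic.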
The development environment of BGCfinder:
- torch==1.10.0
- torch-geometric==2.0.2
- torch-scatter==2.0.9
- torch-sparse==0.6.12
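To see how an installed environment compares with these development versions, a short check with the standard-library `importlib.metadata` module can help. This is a sketch, not part of BGCfinder; `find_mismatches` and `installed_version` are hypothetical names, and ignoring local build suffixes such as `+cu113` is an assumption made here for convenience.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Versions listed as the BGCfinder development environment.
DEV_ENVIRONMENT = {
    "torch": "1.10.0",
    "torch-geometric": "2.0.2",
    "torch-scatter": "2.0.9",
    "torch-sparse": "0.6.12",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pinned: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Map each package that differs from its pinned version to what was found."""
    mismatches: Dict[str, Optional[str]] = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Drop a local build suffix such as "+cu113" before comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = found
    return mismatches


for name, found in find_mismatches(DEV_ENVIRONMENT).items():
    print(f"{name}: development {DEV_ENVIRONMENT[name]}, installed {found or 'not installed'}")
```

A reported difference is not necessarily a problem; the check only shows where the local setup departs from the versions used during development.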
Hashes for BGCfinder-0.0.26-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 8a43715b54aa82b55989b60667c09c1a0808e481839298a6fadf6a26d74153d3
MD5 | 348db7e8519679a43867a17ef7ea0b1c
BLAKE2b-256 | 7610c8025ee635ef56bdc2150e367a491436b82f7f5cc700917b701d30da0733
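A downloaded wheel can be checked against the SHA256 digest listed above with the standard-library `hashlib` module. The sketch below is one way to do this; the function names are hypothetical, and the path to the wheel is whatever location it was downloaded to.

```python
import hashlib

# SHA256 digest published for BGCfinder-0.0.26-py3-none-any.whl.
EXPECTED_SHA256 = "8a43715b54aa82b55989b60667c09c1a0808e481839298a6fadf6a26d74153d3"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_published_hash("BGCfinder-0.0.26-py3-none-any.whl")` returns `True` only if the file is byte-for-byte identical to the published wheel.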