
GENIAL: GENes Identification with Abricate for Lucky biologists



Authors: Barbet Pauline, Felten Arnaud

Affiliation: Food Safety Laboratory - ANSES Maisons Alfort (France)

You can find the latest version of the tool at https://github.com/p-barbet/GENIAL

GENIAL

GENIAL identifies antimicrobial resistance and virulence genes in bacterial genomes by matching them, with ABRicate, against a database of genes of interest.

Databases

The default databases available are Resfinder, CARD, ARG-ANNOT, NCBI, EcOH, PlasmidFinder, Ecoli_VF and VFDB.

In addition to these databases, it is possible to use your own database.

The tool is divided into two scripts.

GENIALanalysis

GENIALanalysis runs ABRicate. It takes as input a tsv file containing the paths of the genome FASTA files and their IDs. If you want to use your own database, you also need to provide a multifasta file with the gene IDs as headers. The script then runs ABRicate and produces one ABRicate result file per genome: a tsv file listing the genes found in that sample.
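As an illustration, the input tsv could be generated as sketched below. The column order (FASTA path first, then strain ID), the absence of a header row and the helper name write_input_tsv are assumptions, not something this description specifies; check the GENIAL repository for the exact layout expected.

```python
import csv
from pathlib import Path


def write_input_tsv(fasta_dir, output_path, extensions=(".fasta", ".fa", ".fna")):
    """Write one 'path<TAB>ID' row per FASTA file found in fasta_dir.

    Assumed layout (hypothetical): column 1 is the FASTA path, column 2 is
    the strain ID, taken here from the file name without its extension.
    """
    rows = []
    for fasta in sorted(Path(fasta_dir).iterdir()):
        if fasta.suffix.lower() in extensions:
            rows.append((str(fasta.resolve()), fasta.stem))
    # newline="" lets the csv module control line endings itself.
    with open(output_path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t", lineterminator="\n").writerows(rows)
    return rows
```

Calling write_input_tsv("genomes", "input_file.tsv") would list every FASTA file in a hypothetical genomes directory, using each file name as the strain ID.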

GENIALresults

GENIALresults formats the ABRicate results as presence/absence matrices and heatmaps. It takes as input a temporary file produced by the ABRicate analysis, containing the paths of the per-genome ABRicate results and the genome IDs. For the vfdb database, a file containing the virulence factor names, their family and species is automatically included in the script.

The output depends on the database used:

  • In all cases, a matrix in tsv format and a heatmap in png format containing all the genes found are created.

In addition:

  • If you use one of the default databases Resfinder or VFDB, new matrices and heatmaps by gene type are produced, along with a correspondence table listing each gene name, its family and its count across all genomes.

  • If you use any other default database or your own database, only a correspondence table listing each gene name and its count across all genomes is produced in addition.
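The presence/absence matrix and the correspondence table described above can be sketched with the standard library alone. This is an illustration, not the GENIALresults code (which relies on Pandas and seaborn): the function names, the in-memory input format, the genomes-as-rows orientation and the remove_core flag are assumptions, and the count is interpreted here as the number of genomes carrying each gene.

```python
def presence_absence(genes_by_genome, remove_core=False):
    """Build a presence/absence matrix and a per-gene genome count.

    genes_by_genome maps a genome ID to the gene names found in it
    (hypothetical input; the real tool reads ABRicate tsv files).
    """
    genomes = sorted(genes_by_genome)
    found = {g: set(genes_by_genome[g]) for g in genomes}
    genes = sorted(set().union(*found.values()))
    # Number of genomes carrying each gene (the correspondence table).
    counts = {gene: sum(gene in found[g] for g in genomes) for gene in genes}
    if remove_core:
        # Drop genes present in every genome: they show no contrast.
        genes = [gene for gene in genes if counts[gene] < len(genomes)]
    matrix = {g: [int(gene in found[g]) for gene in genes] for g in genomes}
    return genes, matrix, counts


def to_tsv(genes, matrix):
    """Serialise the matrix as tsv text, one row per genome."""
    lines = ["\t".join(["genome"] + genes)]
    for genome, row in matrix.items():
        lines.append("\t".join([genome] + [str(v) for v in row]))
    return "\n".join(lines) + "\n"
```

A heatmap would then be drawn from the same matrix, for instance by loading the tsv into a Pandas DataFrame and passing it to a plotting library.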

Dependencies

The script was developed with Python 3.6 (tested with 3.6.6).

External dependencies

  • ABRicate

  • Pandas

  • seaborn

Parameters

Command line options

| Option | Description | Required | Default |
|---|---|---|---|
| -f | tsv file with FASTA file paths and strain IDs | Yes | |
| -dbp | Path to the ABRicate databases directory. Implies -dbf and --privatedb | Yes if --privatedb | |
| -dbf | Multifasta containing the private database sequences. Implies -dbp and --privatedb | Yes if --privatedb | |
| -T | Number of threads to use | No | 1 |
| -w | Working directory | No | . |
| -r | Results directory name | No | ABRicate_results |
| --defaultdb | Default database to use: resfinder, card, argannot, ecoh, ecoli_vf, plasmidfinder, vfdb or ncbi. Incompatible with --privatedb | Yes if not --privatedb | |
| --privatedb | Private database name. Implies -dbp and -dbf. Incompatible with --defaultdb | Yes if not --defaultdb | |
| --mincov | Minimum percentage of the gene covered | No | 80 |
| --minid | Minimum percentage of exact nucleotide matches | No | 90 |
| --R | Remove genes present in all genomes from the matrix | No | False |
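The constraints between these options (exactly one of --defaultdb and --privatedb, with -dbp and -dbf going together with --privatedb) can be expressed with argparse as sketched below. This is not the parser shipped with GENIAL: the function names and dest names are hypothetical, and the database list assumes the spelling ecoh for the EcOH database.

```python
import argparse

# Database names from the options list; "ecoh" is an assumed spelling.
DEFAULT_DBS = ["resfinder", "card", "argannot", "ecoh", "ecoli_vf",
               "plasmidfinder", "vfdb", "ncbi"]


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the GENIAL options")
    parser.add_argument("-f", dest="input_tsv", required=True)
    parser.add_argument("-dbp", dest="db_path")
    parser.add_argument("-dbf", dest="db_fasta")
    parser.add_argument("-T", dest="threads", type=int, default=1)
    parser.add_argument("-w", dest="workdir", default=".")
    parser.add_argument("-r", dest="results", default="ABRicate_results")
    parser.add_argument("--mincov", type=float, default=80)
    parser.add_argument("--minid", type=float, default=90)
    parser.add_argument("--R", dest="remove_core", action="store_true")
    # Exactly one of --defaultdb / --privatedb must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--defaultdb", choices=DEFAULT_DBS)
    group.add_argument("--privatedb")
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # --privatedb, -dbp and -dbf are only valid as a complete set.
    private = [args.privatedb, args.db_path, args.db_fasta]
    if any(private) and not all(private):
        parser.error("--privatedb, -dbp and -dbf must be used together")
    return args
```

With this sketch, passing --privatedb without -dbp and -dbf exits with a usage error, as does passing both --defaultdb and --privatedb.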

Test

After installing ABRicate, Pandas and seaborn, you can test the script with the following command lines:

Default database

python AntiViruce.py -f input_file.tsv --defaultdb vfdb -r results_directory --minid 90 --mincov 80

Private database

python AntiViruce.py -f input_file.tsv  --privatedb private_db_name -T 10 -r results_directory --minid 90 --mincov 80 -dbp path_to_abricate_databases_repertory -dbf private_db_multifasta_path
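For orientation, the per-genome ABRicate call that such a run wraps could be assembled as below. This is a hedged sketch, not the AntiViruce.py code: the helper names and the way the output file is written are hypothetical. The flags used (--db, --minid, --mincov, --threads, --datadir) are ABRicate command-line options; verify them against abricate --help for your installed version.

```python
import subprocess


def abricate_command(fasta, db, minid=90, mincov=80, threads=1, datadir=None):
    """Return the argv list for one ABRicate run on a single genome."""
    cmd = ["abricate", "--db", db, "--minid", str(minid),
           "--mincov", str(mincov), "--threads", str(threads)]
    if datadir is not None:
        # Needed when the database lives outside ABRicate's default folder.
        cmd += ["--datadir", datadir]
    return cmd + [fasta]


def run_abricate(fasta, output_tsv, **options):
    """Run ABRicate and write its tab-separated report to output_tsv."""
    with open(output_tsv, "w") as handle:
        subprocess.run(abricate_command(fasta, **options),
                       stdout=handle, check=True)
```

Building the command separately from running it makes the arguments easy to inspect before ABRicate is launched on every genome.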


