GENIAL : GENes Identification with Abricate for Lucky biologists
Authors : Barbet Pauline, Felten Arnaud
You can find the latest version of the tool at https://github.com/p-barbet/GENIAL
GENIAL aims to identify antimicrobial resistance and virulence genes in bacterial genomes by matching them, using ABRicate, against a database of genes of interest.
In addition to this database, it is possible to use your own database.
The tool is divided into two scripts.
GENIALanalysis runs ABRicate. It takes as input a tsv file containing the paths of the genome FASTA files and their IDs. If you want to use your own database, you also need to provide a multifasta file with gene IDs as headers. The script then runs ABRicate and produces one ABRicate result file per genome: a tsv file listing the genes found in that sample.
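As an illustration, the input tsv file can be generated from a directory of FASTA files. This is only a minimal sketch: the column order (path first, ID second), the absence of a header line, the accepted file extensions and the `write_input_tsv` helper are assumptions and not part of GENIAL, so check them against what the tool expects.

```python
import csv
from pathlib import Path


def write_input_tsv(fasta_dir, tsv_path, extensions=(".fasta", ".fa", ".fna")):
    """Write one 'path<TAB>ID' row per FASTA file found in fasta_dir.

    Assumptions (hypothetical, not confirmed by GENIAL):
    column 1 is the FASTA path, column 2 is the genome ID, no header line.
    """
    # List the FASTA files first, so the output file itself is never picked up.
    fasta_files = sorted(
        p for p in Path(fasta_dir).iterdir() if p.suffix.lower() in extensions
    )
    with open(tsv_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for fasta in fasta_files:
            # The file name without its extension is used as the genome ID.
            writer.writerow([str(fasta.resolve()), fasta.stem])
    return len(fasta_files)
```

For example, `write_input_tsv("genomes", "input_file.tsv")` would write one row per FASTA file in a `genomes` directory, and the resulting file could then be passed to the `-f` option.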
GENIALresults formats the ABRicate results as presence/absence matrices and heatmaps. It takes as input a temporary file produced by the ABRicate analysis, which contains the paths of the ABRicate result files and the genome IDs. For the vfdb database, a file containing the virulence factor names, their families and species is automatically included in the script.
The output depends on the database used:
- In all cases, a matrix in tsv format and a heatmap in png format containing all genes found are created
On top of that:
If you don't use one of the two previous databases or if you use your own database, only a correspondence table between each gene name and its number in all genomes is produced in addition.
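To make the presence/absence matrix more concrete, here is a minimal sketch using only the Python standard library. It is not the code of GENIALresults: `read_genes` and `presence_absence` are hypothetical helpers, the genes-as-rows orientation is an assumption, and the only thing relied upon is the `GENE` column of ABRicate's tab-separated output. The `drop_core` flag mimics the idea behind the `--R` option (removing genes present in all genomes).

```python
import csv


def read_genes(abricate_tsv):
    """Return the set of gene names listed in one ABRicate result file."""
    with open(abricate_tsv, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["GENE"] for row in reader}


def presence_absence(results, drop_core=False):
    """Build a gene x genome matrix of 1 (present) / 0 (absent).

    results: dict mapping genome ID -> path of its ABRicate result file.
    drop_core: if True, drop genes found in every genome.
    """
    genes_by_genome = {gid: read_genes(path) for gid, path in results.items()}
    genomes = sorted(genes_by_genome)
    all_genes = sorted(set().union(*genes_by_genome.values()))
    if drop_core:
        all_genes = [
            gene for gene in all_genes
            if not all(gene in genes_by_genome[gid] for gid in genomes)
        ]
    # One row per gene, one column per genome.
    matrix = {
        gene: {gid: int(gene in genes_by_genome[gid]) for gid in genomes}
        for gene in all_genes
    }
    return genomes, matrix
```

The returned dictionary can be written to a tsv file with `csv.writer`, or loaded into a pandas `DataFrame` and passed to seaborn's `heatmap` function to draw the kind of figure described above.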
The script was developed with Python 3.6 (tested with 3.6.6).
Command line options
|Option|Description|Required|Default|
|---|---|---|---|
|-f|tsv file with FASTA file paths and strain IDs|Yes||
|-dbp|Path to the ABRicate databases directory. Implies -dbf and --privatedb|Yes if --privatedb||
|-dbf|Multifasta containing the private database sequences. Implies -dbp and --privatedb|Yes if --privatedb||
|-T|Number of threads to use|No|1|
|-r|Results directory name|No|ABRicate_results|
|--defaultdb|Default databases available: resfinder, card, argannot, acoh, ecoli_vf, plasmidfinder, vfdb or ncbi. Incompatible with --privatedb|Yes if not --privatedb||
|--privatedb|Private database name. Implies -dbp and -dbf. Incompatible with --defaultdb|Yes if not --defaultdb||
|--mincov|Minimum proportion of gene covered|No|80|
|--minid|Minimum proportion of exact nucleotide matches|No|90|
|--R|Remove genes present in all genomes from the matrix|No|False|
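
The constraints between the options listed above can be expressed with `argparse`, as sketched below. This is an illustration only, not the parser that GENIAL ships: the `build_parser` and `parse_args` helpers, the argument types and the error messages are assumptions. Only the option names, defaults and database names are taken from the list above.

```python
import argparse

# Database names copied as-is from the option list above.
DEFAULT_DBS = ["resfinder", "card", "argannot", "acoh",
               "ecoli_vf", "plasmidfinder", "vfdb", "ncbi"]


def build_parser():
    parser = argparse.ArgumentParser(prog="GENIAL")
    parser.add_argument("-f", required=True, help="tsv file with FASTA paths and strain IDs")
    parser.add_argument("-dbp", help="path to the ABRicate databases directory")
    parser.add_argument("-dbf", help="multifasta with the private database sequences")
    parser.add_argument("-T", type=int, default=1, help="number of threads")
    parser.add_argument("-r", default="ABRicate_results", help="results directory name")
    # Exactly one of --defaultdb / --privatedb must be given.
    db_group = parser.add_mutually_exclusive_group(required=True)
    db_group.add_argument("--defaultdb", choices=DEFAULT_DBS)
    db_group.add_argument("--privatedb")
    parser.add_argument("--mincov", type=float, default=80)
    parser.add_argument("--minid", type=float, default=90)
    parser.add_argument("--R", action="store_true")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # --privatedb, -dbp and -dbf only make sense together.
    if args.privatedb and not (args.dbp and args.dbf):
        parser.error("--privatedb requires -dbp and -dbf")
    if args.defaultdb and (args.dbp or args.dbf):
        parser.error("-dbp and -dbf can only be used with --privatedb")
    return args
```

With this sketch, `parse_args(["-f", "input_file.tsv", "--defaultdb", "vfdb"])` is accepted, while giving `--privatedb` without `-dbp` and `-dbf` exits with an error message.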
After installing ABRicate, pandas and seaborn, you can test the script with the following command lines:
python AntiViruce.py -f input_file.tsv --defaultdb vfdb -r results_directory --minid 90 --mincov 80
python AntiViruce.py -f input_file.tsv --privatedb private_db_name -T 10 -r results_directory --minid 90 --mincov 80 -dbp path_to_abricate_databases_repertory -dbf private_db_multifasta_path