GENIAL: GENes Identification with Abricate for Lucky biologists
Authors : Barbet Pauline, Felten Arnaud
Affiliation: Food Safety Laboratory - ANSES Maisons Alfort (France)
You can find the latest version of the tool at https://github.com/p-barbet/GENIAL
GENIAL
GENIAL identifies antimicrobial resistance and virulence genes in bacterial genomes by matching them, with ABRicate, against a database of genes of interest.
Databases
The default databases are Resfinder, CARD, ARG-ANNOT, NCBI, EcOH, PlasmidFinder, Ecoli_VF and VFDB.
In addition to these databases, you can use your own database.
The tool is divided into two scripts.
GENIALanalysis
GENIALanalysis runs ABRicate. It takes as input a TSV file containing the paths of the genome FASTA files and the genome IDs. If you want to use your own database, you also need to provide a multi-FASTA file with gene IDs as headers. The script then runs ABRicate and produces one ABRicate result file per genome: a TSV file listing the genes found in that sample.
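As an illustration of the step described above, here is a minimal sketch of how one ABRicate command per genome could be assembled from the input TSV. It is not the GENIAL source code. The column order (path first, ID second), the example file names and the helper names `read_genome_table` and `build_abricate_command` are assumptions made for this example. The `--db`, `--minid`, `--mincov`, `--threads` and `--datadir` flags are ABRicate's own command line options.

```python
import csv
import io


def read_genome_table(handle):
    """Yield (fasta_path, genome_id) pairs from a two-column TSV handle.

    Assumption: the path is in the first column and the ID in the second.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2 and row[0].strip():
            yield row[0].strip(), row[1].strip()


def build_abricate_command(fasta_path, db, minid=90, mincov=80,
                           threads=1, datadir=None):
    """Return the argument list for one ABRicate run on one genome."""
    cmd = [
        "abricate",
        "--db", db,
        "--minid", str(minid),
        "--mincov", str(mincov),
        "--threads", str(threads),
    ]
    if datadir is not None:
        # Only needed when the database lives outside ABRicate's default folder.
        cmd += ["--datadir", datadir]
    cmd.append(fasta_path)
    return cmd


# Hypothetical input: two genomes, tab-separated path and ID.
example_tsv = (
    "genomes/strain_A.fasta\tstrain_A\n"
    "genomes/strain_B.fasta\tstrain_B\n"
)

commands = {
    genome_id: build_abricate_command(path, db="vfdb")
    for path, genome_id in read_genome_table(io.StringIO(example_tsv))
}

for genome_id, cmd in commands.items():
    print(genome_id, " ".join(cmd))
```

Each command list could then be passed to `subprocess.run` with standard output redirected to a per-genome TSV file, since ABRicate writes its report to standard output.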
GENIALresults
GENIALresults formats the ABRicate results as presence/absence matrices and heatmaps. It takes as input a temporary file produced by the ABRicate analysis, which contains the paths of the ABRicate result files and the genome IDs. For the VFDB database, a file containing the virulence factor names, their families and species is automatically included in the script.
The output depends on the database used:
- In all cases, a matrix in TSV format and a heatmap in PNG format with all genes found are created.
On top of that:
- If you use one of the default databases Resfinder or VFDB, additional matrices and heatmaps by gene type are produced, along with a correspondence table between each gene name, its family and its count across all genomes.
- If you use any other default database or your own database, only a correspondence table between each gene name and its count across all genomes is produced in addition.
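To make the presence/absence matrix and the gene count table concrete, here is a small sketch using only the Python standard library. It is not how GENIALresults is implemented (the tool relies on pandas and seaborn). The helper names, the sample gene names and the trimmed three-column reports are assumptions made for this example; real ABRicate reports contain more columns, including the `GENE` column read here. The count is taken to be the number of genomes in which a gene was found, which is also an assumption.

```python
import csv
import io
from collections import Counter


def genes_in_report(handle):
    """Return the set of gene names listed in one ABRicate tab-separated report."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["GENE"] for row in reader}


def presence_absence(reports, drop_core=False):
    """Build a presence/absence matrix and a gene count table.

    reports maps genome ID -> set of genes found in that genome.
    Returns ({genome_id: {gene: 0 or 1}}, {gene: number of genomes with it}).
    With drop_core=True, genes present in every genome are left out of the
    matrix, similar in spirit to the --R option.
    """
    counts = Counter(gene for genes in reports.values() for gene in genes)
    all_genes = sorted(counts)
    if drop_core:
        all_genes = [g for g in all_genes if counts[g] < len(reports)]
    matrix = {
        genome_id: {gene: int(gene in genes) for gene in all_genes}
        for genome_id, genes in reports.items()
    }
    return matrix, dict(counts)


# Hypothetical reports, trimmed to three columns for readability.
report_a = "#FILE\tSEQUENCE\tGENE\na.fasta\tcontig1\tinvA\na.fasta\tcontig2\tsopB\n"
report_b = "#FILE\tSEQUENCE\tGENE\nb.fasta\tcontig1\tinvA\n"

reports = {
    "strain_A": genes_in_report(io.StringIO(report_a)),
    "strain_B": genes_in_report(io.StringIO(report_b)),
}

matrix, counts = presence_absence(reports)
variable_only, _ = presence_absence(reports, drop_core=True)

print(matrix)
print(counts)
print(variable_only)
```

A matrix in this nested-dictionary form can be loaded into a pandas DataFrame and passed to a seaborn heatmap if a plot is needed.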
Dependencies
The script was developed with Python 3.6 (tested with 3.6.6).
External dependencies
- ABRicate
- pandas
- seaborn
Parameters
Command line options
Option | Description | Required | Default |
---|---|---|---|
-f | TSV file with FASTA file paths and strain IDs | Yes | |
-dbp | Path to the ABRicate databases directory. Implies -dbf and --privatedb | Yes if --privatedb | |
-dbf | Multi-FASTA file containing the private database sequences. Implies -dbp and --privatedb | Yes if --privatedb | |
-T | Number of threads to use | No | 1 |
-w | Working directory | No | . |
-r | Results directory name | No | ABRicate_results |
--defaultdb | Default database to use: resfinder, card, argannot, ecoh, ecoli_vf, plasmidfinder, vfdb or ncbi. Incompatible with --privatedb | Yes if not --privatedb | |
--privatedb | Private database name. Implies -dbp and -dbf. Incompatible with --defaultdb | Yes if not --defaultdb | |
--mincov | Minimum percentage of the gene covered | No | 80 |
--minid | Minimum percentage of exact nucleotide matches | No | 90 |
--R | Remove genes present in all genomes from the matrix | No | False |
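The dependencies between options in the table can be expressed with `argparse`. The sketch below is an illustration of those rules, not the tool's actual parser: the `dest` names, the `prog` value, the sample paths and the exact error messages are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed in the table."""
    parser = argparse.ArgumentParser(prog="GENIAL")
    parser.add_argument("-f", dest="input_tsv", required=True)
    parser.add_argument("-dbp", dest="db_dir")
    parser.add_argument("-dbf", dest="db_fasta")
    parser.add_argument("-T", dest="threads", type=int, default=1)
    parser.add_argument("-w", dest="workdir", default=".")
    parser.add_argument("-r", dest="results_dir", default="ABRicate_results")
    # Exactly one of --defaultdb / --privatedb must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--defaultdb")
    group.add_argument("--privatedb")
    parser.add_argument("--mincov", type=float, default=80)
    parser.add_argument("--minid", type=float, default=90)
    parser.add_argument("--R", dest="remove_core", action="store_true")
    return parser


def parse_args(argv):
    """Parse argv and enforce the 'implies' rules from the table."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.privatedb and not (args.db_dir and args.db_fasta):
        parser.error("--privatedb requires -dbp and -dbf")
    if args.defaultdb and (args.db_dir or args.db_fasta):
        parser.error("-dbp and -dbf can only be used with --privatedb")
    return args


# Hypothetical private database invocation.
args = parse_args([
    "-f", "input_file.tsv",
    "--privatedb", "private_db_name",
    "-dbp", "abricate_db_dir",
    "-dbf", "private_db.fasta",
    "-T", "10",
])
print(args.privatedb, args.threads, args.minid, args.mincov)
```

Using a required mutually exclusive group lets `argparse` reject both "neither database given" and "both databases given" on its own, so only the -dbp/-dbf rules need a manual check.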
Test
After installing ABRicate, pandas and seaborn, you can test the script with the following command lines:
Default database
python AntiViruce.py -f input_file.tsv --defaultdb vfdb -r results_directory --minid 90 --mincov 80
Private database
python AntiViruce.py -f input_file.tsv --privatedb private_db_name -T 10 -r results_directory --minid 90 --mincov 80 -dbp path_to_abricate_databases_repertory -dbf private_db_multifasta_path