GENIAL: GENes Identification with Abricate for Lucky biologists

License
- OSI Approved :: GNU General Public License v2 (GPLv2)
Operating System
- POSIX :: Linux
Programming Language
- Python :: 3

Project description

GENIAL : GENes Identification with Abricate for Lucky biologists

Authors : Barbet Pauline, Felten Arnaud

Affiliation: Food Safety Laboratory - ANSES Maisons Alfort (France)

You can find the latest version of the tool at https://github.com/p-barbet/GENIAL

GENIAL

GENIAL identifies antimicrobial resistance and virulence genes in bacterial genomes by using ABRicate to match them against a database of genes of interest.

Databases

The default databases are Resfinder, CARD, ARG-ANNOT, NCBI, EcOH, PlasmidFinder, Ecoli_VF and VFDB.

In addition to these databases, you can use your own database.

The tool is divided into two scripts.

GENIALanalysis

GENIALanalysis runs ABRicate. It takes as input a TSV file listing the paths of the genome FASTA files and their IDs. To use your own database, you must also provide a multi-FASTA file with gene IDs as headers. The script then runs ABRicate and produces one result file per genome: a TSV file listing the genes found in that sample.
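
As an illustration of the input file described above, here is a minimal sketch that lists the FASTA files in a directory and writes one tab-separated line per genome. The column order (path first, then ID), the use of the file stem as the strain ID, the accepted file extensions and the function names are all assumptions made for this example; check them against the tool before relying on them.

```python
import csv
from pathlib import Path

# Assumed set of FASTA extensions; adjust to your own naming scheme.
FASTA_SUFFIXES = {".fasta", ".fa", ".fna"}


def build_input_rows(genome_dir):
    """Return (fasta_path, strain_id) pairs for every FASTA file in genome_dir.

    The strain ID is simply the file name without its extension (an assumption).
    """
    rows = []
    for path in sorted(Path(genome_dir).iterdir()):
        if path.suffix.lower() in FASTA_SUFFIXES:
            rows.append((str(path.resolve()), path.stem))
    return rows


def write_input_tsv(rows, tsv_path):
    """Write the rows as a header-less, tab-separated file."""
    with open(tsv_path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)
```

Calling write_input_tsv(build_input_rows("genomes"), "input_file.tsv") would produce a file that could then be passed with -f, provided the assumed column order matches what the script expects.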

GENIALresults

GENIALresults formats the ABRicate results as presence/absence matrices and heatmaps. It takes as input a temporary file produced by the ABRicate analysis that lists the paths of the per-genome ABRicate results and the genome IDs. For the VFDB database, a file containing the virulence factor names, their family and species is automatically included in the script.

The output depends on the database used:

In all cases, a matrix in TSV format and a heatmap in PNG format covering all genes found are created.

In addition:

If you use one of the default databases Resfinder or VFDB, new matrices and heatmaps by gene type are produced, along with a correspondence table linking each gene name to its family and its number in all genomes.
If you use any other default database or your own database, the only additional output is a correspondence table linking each gene name to its number in all genomes.
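
To make the presence/absence matrix and the gene count table more concrete, here is a small sketch in plain Python. It is not the tool's implementation (GENIAL relies on Pandas and seaborn); the function name, the genes-as-rows orientation and the reading of "number in all genomes" as the number of genomes carrying the gene are assumptions for illustration. The drop_core argument mimics the idea behind the --R option.

```python
def presence_absence(genes_by_genome, drop_core=False):
    """Build a gene x genome presence/absence matrix (1 = found, 0 = not found).

    genes_by_genome maps a genome ID to the collection of gene names reported
    for that genome. Returns (matrix, counts), where counts gives, for each
    gene, the number of genomes in which it was found.
    """
    genomes = sorted(genes_by_genome)
    all_genes = sorted(set().union(*genes_by_genome.values()))

    matrix = {
        gene: {genome: int(gene in genes_by_genome[genome]) for genome in genomes}
        for gene in all_genes
    }
    counts = {gene: sum(row.values()) for gene, row in matrix.items()}

    if drop_core:
        # Remove genes present in every genome: they carry no contrast
        # between samples in a presence/absence heatmap.
        matrix = {
            gene: row for gene, row in matrix.items() if counts[gene] < len(genomes)
        }
    return matrix, counts
```

With two hypothetical genomes, s1 carrying blaTEM and tetA and s2 carrying only tetA, the blaTEM row is 1 for s1 and 0 for s2, and tetA disappears from the matrix when drop_core is set because it occurs in both genomes.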

Dependencies

The scripts were developed with Python 3.6 (tested with 3.6.6).

External dependencies

ABRicate tested with 0.8.7
Pandas tested with 0.23.4
seaborn tested with 0.9.0

Parameters

Command line options

Options	Description	Required	Default
-f	TSV file with FASTA file paths and strain IDs	Yes
-dbp	Path to the ABRicate databases directory. Implies -dbf and --privatedb	Yes if --privatedb
-dbf	Multi-FASTA containing the private database sequences. Implies -dbp and --privatedb	Yes if --privatedb
-T	Number of threads to use	No	1
-w	Working directory	No	.
-r	Results directory name	No	ABRicate_results
--defaultdb	Default database to use: resfinder, card, argannot, ecoh, ecoli_vf, plasmidfinder, vfdb or ncbi. Incompatible with --privatedb	Yes if not --privatedb
--privatedb	Private database name. Implies -dbp and -dbf. Incompatible with --defaultdb	Yes if not --defaultdb
--mincov	Minimum percentage of the gene covered	No	80
--minid	Minimum percentage of exact nucleotide matches	No	90
--R	Remove genes present in all genomes from the matrix	No	False
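
The dependencies between these options can be summarised with a short argparse sketch. This is not GENIAL's actual parser: the function names and help strings are invented for illustration, and the database list assumes that the EcOH database is spelled "ecoh" on the command line. Only the option names, defaults and the "implies" and "incompatible" rules come from the table above.

```python
import argparse

# Database names from the options table; "ecoh" is an assumed spelling for EcOH.
DEFAULT_DBS = ["resfinder", "card", "argannot", "ecoh",
               "ecoli_vf", "plasmidfinder", "vfdb", "ncbi"]


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the option rules")
    parser.add_argument("-f", required=True, help="TSV with FASTA paths and strain IDs")
    parser.add_argument("-dbp", help="path to the ABRicate databases directory")
    parser.add_argument("-dbf", help="multi-FASTA with the private database sequences")
    parser.add_argument("-T", type=int, default=1, help="number of threads")
    parser.add_argument("-w", default=".", help="working directory")
    parser.add_argument("-r", default="ABRicate_results", help="results directory name")
    # Exactly one of --defaultdb / --privatedb must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--defaultdb", choices=DEFAULT_DBS)
    group.add_argument("--privatedb")
    parser.add_argument("--mincov", type=float, default=80.0)
    parser.add_argument("--minid", type=float, default=90.0)
    parser.add_argument("--R", action="store_true",
                        help="remove genes present in all genomes")
    return parser


def parse_options(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # --privatedb implies both -dbp and -dbf, and vice versa.
    if args.privatedb and not (args.dbp and args.dbf):
        parser.error("--privatedb requires both -dbp and -dbf")
    if args.defaultdb and (args.dbp or args.dbf):
        parser.error("-dbp and -dbf only apply together with --privatedb")
    return args
```

For example, parse_options(["-f", "input_file.tsv", "--defaultdb", "vfdb"]) is accepted and falls back to the defaults in the table, while passing --privatedb without -dbp and -dbf exits with an error message.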

Test

After installing ABRicate, Pandas and seaborn, you can test the script with the following command lines:

Default database

python AntiViruce.py -f input_file.tsv --defaultdb vfdb -r results_directory --minid 90 --mincov 80

Private database

python AntiViruce.py -f input_file.tsv --privatedb private_db_name -T 10 -r results_directory --minid 90 --mincov 80 -dbp path_to_abricate_databases_repertory -dbf private_db_multifasta_path

Project details

This version

0.9.0

Feb 12, 2019

Download files


Source Distribution

GENIALbiologists-0.9.0.tar.gz (12.8 kB view details)

Uploaded Feb 12, 2019 Source

Built Distribution


GENIALbiologists-0.9.0-py3-none-any.whl (27.0 kB view details)

Uploaded Feb 12, 2019 Python 3

File details

Details for the file GENIALbiologists-0.9.0.tar.gz.

File metadata

Download URL: GENIALbiologists-0.9.0.tar.gz
Upload date: Feb 12, 2019
Size: 12.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.8

File hashes

Hashes for GENIALbiologists-0.9.0.tar.gz
Algorithm	Hash digest
SHA256	`9376d696e78349e342ea88eb6b391eb05dea0e2d0e51cdfe3f352cff64586eb4`
MD5	`718785ea87d2f40e933642c83a33c66e`
BLAKE2b-256	`7565042d99504140e45fae8e24004d93dc3db862bd8a83063ce7452ac1b3200b`
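
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. The sketch below is a generic example; the function names are illustrative and the file path is whatever location you saved the archive to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with the published hexadecimal value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling matches_expected on the downloaded GENIALbiologists-0.9.0.tar.gz with the SHA256 value from the table should return True for an intact download.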


File details

Details for the file GENIALbiologists-0.9.0-py3-none-any.whl.

File metadata

Download URL: GENIALbiologists-0.9.0-py3-none-any.whl
Upload date: Feb 12, 2019
Size: 27.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.8

File hashes

Hashes for GENIALbiologists-0.9.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`20cb8d33ebef6a8240b4d61edd89ffa384c9eb422cfe0c3a9bf767039bd3b9b6`
MD5	`4d7e8f5ffaab55009d52267fddaac81a`
BLAKE2b-256	`3f2123a3db9ceef6171f0a4fe3744c25ef87e24c71291a40a3e3c88ed6f9c813`

