Gene Cluster prediction with Conditional Random Fields.
Hi, I'm GECCO!
GECCO (Gene Cluster prediction with Conditional Random Fields) is a fast and scalable method for identifying putative novel Biosynthetic Gene Clusters (BGCs) in genomic and metagenomic data using Conditional Random Fields (CRFs).
🔧 Installing GECCO
Use pip to install GECCO on your machine:
$ pip install gecco-tool
Alternatively, GECCO can be installed with conda from the bioconda channel:
$ conda install -c bioconda gecco
This will install GECCO, its dependencies, and the data needed to run predictions. This requires around 100 MB of data to be downloaded, so it could take some time depending on your Internet connection. Once done, you will have a gecco command available in your $PATH.
Note that GECCO uses HMMER3, which can only run on PowerPC and recent x86-64 machines running a POSIX operating system. Therefore, Linux and OSX are supported platforms, but GECCO will not be able to run on Windows.
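As a rough pre-install check, the platform constraint above can be sketched in Python. This is only an illustration and not part of GECCO: the helper name `platform_looks_supported` is hypothetical, and the set of machine strings is an assumption, since the values reported by `platform.machine()` vary between operating systems.

```python
import os
import platform

# Machine names assumed to correspond to the x86-64 and PowerPC
# architectures mentioned above. This list is an assumption and
# may be incomplete.
SUPPORTED_MACHINES = {"x86_64", "amd64", "ppc64", "ppc64le"}


def platform_looks_supported(os_name=None, machine=None):
    """Return True if the platform is POSIX on an assumed-supported CPU.

    Both arguments default to the values of the current interpreter;
    they can be overridden to check another platform.
    """
    os_name = os.name if os_name is None else os_name
    machine = platform.machine() if machine is None else machine
    return os_name == "posix" and machine.lower() in SUPPORTED_MACHINES
```

Calling `platform_looks_supported()` with no arguments checks the current machine. A `False` result only means the platform falls outside the assumptions listed in the code.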
🧬 Running GECCO
Once gecco is installed, you can run it from the terminal by giving it a FASTA or GenBank file with the genomic sequence you want to analyze, as well as an output directory:
$ gecco run --genome some_genome.fna -o some_output_dir
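For scripted use, the same command can be launched from Python. The sketch below is a minimal wrapper, not an API provided by GECCO: `build_gecco_command` and `run_gecco` are hypothetical helper names, and the only command-line arguments used are the ones shown above.

```python
import shutil
import subprocess


def build_gecco_command(genome, output_dir, extra_args=()):
    """Build the argument list for the `gecco run` call shown above."""
    cmd = ["gecco", "run", "--genome", str(genome), "-o", str(output_dir)]
    # extra_args is passed through unchanged, e.g. ["--jobs", "4"].
    cmd.extend(str(arg) for arg in extra_args)
    return cmd


def run_gecco(genome, output_dir, extra_args=()):
    """Run gecco as a subprocess, failing early if it is not installed."""
    if shutil.which("gecco") is None:
        raise RuntimeError("gecco was not found in PATH; is it installed?")
    cmd = build_gecco_command(genome, output_dir, extra_args)
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(cmd, check=True)
```

For example, `run_gecco("some_genome.fna", "some_output_dir")` would run the same command as the shell line above.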
Additional parameters of interest are:
--jobs, which controls the number of threads that will be spawned by GECCO whenever a step can be parallelized. The default, 0, will autodetect the number of CPUs on the machine.
--cds, controlling the minimum number of consecutive genes a BGC region must have to be detected by GECCO (default is 3).
--threshold, controlling the minimum probability for a gene to be considered part of a BGC region. Using a lower number will increase the number (and possibly length) of predictions, but reduce accuracy.
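To make the interplay between --cds and --threshold more concrete, here is a toy sketch of the behaviour described above. It is not GECCO's implementation: the function name `candidate_regions` is hypothetical, the per-gene probabilities are made-up values, and treating the threshold as inclusive is an assumption.

```python
def candidate_regions(probabilities, threshold, min_cds=3):
    """Return (start, end) index pairs of runs of consecutive genes
    whose probability is at least `threshold` and whose length is at
    least `min_cds`. `end` is exclusive.
    """
    regions = []
    start = None
    for i, p in enumerate(probabilities):
        if p >= threshold:
            if start is None:
                start = i  # a new run begins at this gene
        else:
            # The run (if any) ends here; keep it if it is long enough.
            if start is not None and i - start >= min_cds:
                regions.append((start, i))
            start = None
    # Handle a run that extends to the last gene.
    if start is not None and len(probabilities) - start >= min_cds:
        regions.append((start, len(probabilities)))
    return regions


# Made-up per-gene probabilities, for illustration only.
probs = [0.1, 0.9, 0.8, 0.7, 0.2, 0.6, 0.9, 0.1, 0.4, 0.5, 0.45, 0.1]
print(candidate_regions(probs, threshold=0.6))  # [(1, 4)]
print(candidate_regions(probs, threshold=0.4))  # [(1, 4), (8, 11)]
```

With these invented numbers, lowering the threshold from 0.6 to 0.4 yields a second region, which matches the trade-off described for --threshold. The two-gene run at indices 5 and 6 is dropped in both cases because it is shorter than the minimum of 3 genes.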
GECCO can be cited using the following preprint:
Accurate de novo identification of biosynthetic gene clusters with GECCO. Laura M Carroll, Martin Larralde, Jonas Simon Fleck, Ruby Ponnudurai, Alessio Milanese, Elisa Cappio Barazzone, Georg Zeller. bioRxiv 2021.05.03.442509; doi:10.1101/2021.05.03.442509
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
Contributions are more than welcome! See
for more details.