busco analysis for gene predictions
Project description
BUSCOlite: simplified BUSCO analysis for gene prediction
BUSCOlite can run the miniprot/Augustus-mediated genome predictions as well as the pyhmmer HMM predictions using the BUSCO v9 or v10 databases. It also provides a Python API to run BUSCO analysis from within Python, i.e. to be used inside the eukaryotic gene prediction pipeline Funannotate.
This tool is not meant to be a replacement for BUSCO; for most general use cases you should continue to use BUSCOv5.
BUSCO models/lineages can be downloaded from the BUSCO site: v5, v4. BUSCOlite does not provide an internal method to do this, as it is trivial to download the lineage you need for your organism(s) by following these links.
BUSCOlite has a limited set of dependencies:
- augustus (note: many versions on conda have non-functional PPX/--proteinprofile mode)
- miniprot
- pyhmmer
- pyfastx
- natsort
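A quick way to confirm that these dependencies are visible in the current environment is sketched below. It is not part of BUSCOlite: the function name `check_dependencies` is hypothetical, and the sketch assumes that `augustus` and `miniprot` are installed as executables on `PATH` under those names and that the three Python packages import as `pyhmmer`, `pyfastx`, and `natsort`. It only checks presence; it does not test whether Augustus's PPX/--proteinprofile mode actually works.

```python
import importlib.util
import shutil

# Names taken from the dependency list above.
# Assumption: augustus and miniprot are command-line executables on PATH,
# the remaining three are importable Python modules.
EXECUTABLES = ["augustus", "miniprot"]
PY_MODULES = ["pyhmmer", "pyfastx", "natsort"]


def check_dependencies(executables=EXECUTABLES, modules=PY_MODULES):
    """Return a dict mapping each dependency name to True (found) or False."""
    status = {}
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH
        status[exe] = shutil.which(exe) is not None
    for mod in modules:
        # find_spec returns None when a top-level module cannot be located
        status[mod] = importlib.util.find_spec(mod) is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Anything reported as MISSING would need to be installed before running BUSCOlite.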
Why?
Funannotate uses BUSCO to find core conserved marker genes that it uses as a basis to train several ab-initio gene predictors. When BUSCO v2 came out it was python3 only and at that time funannotate was still python2, so I modified the BUSCOv2 source code to be compatible with python2 so it could be run within funannotate. BUSCOv5 is now the current release and has numerous bells and whistles that funannotate does not need (no knock against bells and whistles). The real problem is that, because of the large number of dependencies associated with these extra tools, I cannot build a conda image that includes both funannotate and BUSCOv5. So I re-wrote BUSCOv2 here so that it has limited dependencies, which makes it easier to incorporate as a dependency of funannotate. A side note: the metaeuk method that BUSCOv5 now uses by default does not produce complete gene models; in fact, the protein sequences it outputs contain lowercase sequences that are not actually found in your genome at all. So for training ab-initio predictors the metaeuk method is not useful; however, it is a faster way to get simple stats on "how complete is my genome assembly".
To install release versions use the pip package manager, like so:
python -m pip install buscolite
To install the most up-to-date code from master, run:
python -m pip install git+https://github.com/nextgenusfs/buscolite.git
File details
Details for the file buscolite-24.11.3.tar.gz.
File metadata
- Download URL: buscolite-24.11.3.tar.gz
- Upload date:
- Size: 132.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | f7ef8151b2dd16848ec8bf7e14658a81db3fbe7898e6377a2bb0553cf065f47b
MD5 | 02bc61fba0d780d91ee91cf62033fd35
BLAKE2b-256 | 434efca67ec4bb2df8eabdf74f6c21ae27c5570e7eb3821363615435f8d5f155
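The digests above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_expected` are hypothetical and not part of BUSCOlite or pip.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved to the current directory, `matches_expected("buscolite-24.11.3.tar.gz", "f7ef8151b2dd16848ec8bf7e14658a81db3fbe7898e6377a2bb0553cf065f47b")` should return `True` for an intact download.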
Provenance
The following attestation bundles were made for buscolite-24.11.3.tar.gz:
Publisher: python-publish.yml on nextgenusfs/buscolite
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: buscolite-24.11.3.tar.gz
- Subject digest: f7ef8151b2dd16848ec8bf7e14658a81db3fbe7898e6377a2bb0553cf065f47b
- Sigstore transparency entry: 146359880
- Sigstore integration time:
File details
Details for the file buscolite-24.11.3-py3-none-any.whl.
File metadata
- Download URL: buscolite-24.11.3-py3-none-any.whl
- Upload date:
- Size: 142.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 270c2cf199a05f4348b0dc57c773bc8d8e7254518c8a36314a818ff8713d3e21
MD5 | 42b397e7272531bce00047973cf32605
BLAKE2b-256 | 792be5d15a71b3758a42cc27fe393c5873435e33a3eb1897ff32833f2c064991
Provenance
The following attestation bundles were made for buscolite-24.11.3-py3-none-any.whl:
Publisher: python-publish.yml on nextgenusfs/buscolite
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: buscolite-24.11.3-py3-none-any.whl
- Subject digest: 270c2cf199a05f4348b0dc57c773bc8d8e7254518c8a36314a818ff8713d3e21
- Sigstore transparency entry: 146359881
- Sigstore integration time: