
BUSCO analysis for gene predictions

BUSCOlite: simplified BUSCO analysis for gene prediction

BUSCOlite can run miniprot/Augustus-mediated genome predictions as well as pyhmmer HMM predictions using the BUSCO v9, v10, or v12 databases. It also provides a Python API for running BUSCO analysis from within Python, i.e. for use inside the eukaryotic gene prediction pipeline Funannotate.

This tool is not meant to be a replacement for BUSCO; for most general use cases you should continue to use BUSCO.

BUSCO models/lineages can be downloaded from the BUSCO site: v5, v4. BUSCOlite does not provide an internal method to do this, as it is trivial to download the lineage you need for your organism(s) by following these links.
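
Once a lineage archive has been downloaded by hand, it has to be unpacked so that its directory can be passed to BUSCOlite. The following is a minimal sketch, assuming the download is a gzip-compressed tarball; the helper name `extract_lineage` and any file names you pass to it are hypothetical and not part of BUSCOlite:

```python
import tarfile
from pathlib import Path


def extract_lineage(archive, dest):
    """Unpack a lineage tarball into dest and return the extracted top-level folders.

    Hypothetical helper for illustration; BUSCOlite itself ships no such function.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    top_level = set()
    with tarfile.open(archive, "r:gz") as tar:
        # Record the first path component of every member before extracting.
        for member in tar.getmembers():
            parts = Path(member.name).parts
            if parts:
                top_level.add(parts[0])
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return [dest / name for name in sorted(top_level)]
```

The returned path is the directory you would then hand to the lineage option of the command line.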

BUSCOlite has limited dependencies.

Features:
  • Genome and protein mode analysis: Run BUSCO on genome assemblies or protein sets
  • BUSCO v6-compatible filtering: Implements the same filtering logic as BUSCO v6 for accurate results
  • Publication-quality plots: Generate SVG plots from results with zero additional dependencies
  • Multi-sample comparison: Compare multiple BUSCO results in a single plot
  • Python API: Use BUSCOlite programmatically in your own scripts
  • Lightweight: Minimal dependencies, easy to install and integrate

Why?

Funannotate uses BUSCO to find core conserved marker genes, which it uses as a basis for training several ab initio gene predictors. When BUSCO v2 came out it was Python 3 only, and at that time funannotate was still Python 2, so I modified the BUSCO v2 source code to be compatible with Python 2 so that it could be run within funannotate. BUSCO v5 is now the current release and has numerous bells and whistles that funannotate does not need (no knock against bells and whistles). The real problem is that, because of the large number of dependencies associated with these extra tools, I cannot build a conda image that includes both funannotate and BUSCO v5. So I rewrote BUSCO v2 here so that it has limited dependencies and is easier to incorporate as a dependency of funannotate.

As a side note, the metaeuk method that BUSCO v5 now uses by default does not produce complete gene models; in fact, the protein sequences it outputs contain lowercase sequences that are not actually found in your genome at all. For training ab initio predictors the metaeuk method is therefore not useful, although it is a faster way to get simple stats on "how complete is my genome assembly".

To install release versions use the pip package manager, like so:

python -m pip install buscolite

To install the latest code from the master branch, run:

python -m pip install git+https://github.com/nextgenusfs/buscolite.git
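
To confirm which release ended up in the active environment, the standard library can report the installed distribution's version. This is a small sketch using `importlib.metadata`; the helper name `installed_version` is an illustrative assumption, not part of BUSCOlite:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="buscolite"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when buscolite is not installed.
print(installed_version())
```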

Quick Start

Run BUSCO analysis on a genome:

buscolite -i genome.fasta -o mygenome -m genome -l /path/to/fungi_odb12 -c 8
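
If you prefer to launch this command from a Python script, one option is to wrap the command line with `subprocess`. The sketch below uses only the flags shown in the command above; the function names `buscolite_cmd` and `run_buscolite` are hypothetical, and this is not BUSCOlite's own Python API:

```python
import shlex
import shutil
import subprocess


def buscolite_cmd(fasta, out_prefix, lineage, mode="genome", cpus=8):
    """Build the argument list for the Quick Start command (flags taken from it)."""
    return [
        "buscolite",
        "-i", str(fasta),
        "-o", str(out_prefix),
        "-m", mode,
        "-l", str(lineage),
        "-c", str(cpus),
    ]


def run_buscolite(fasta, out_prefix, lineage, mode="genome", cpus=8):
    """Run buscolite; raises CalledProcessError on a non-zero exit status."""
    if shutil.which("buscolite") is None:
        raise FileNotFoundError("buscolite executable not found on PATH")
    cmd = buscolite_cmd(fasta, out_prefix, lineage, mode=mode, cpus=cpus)
    subprocess.run(cmd, check=True)


# Show the command that would be executed, without running it.
print(shlex.join(buscolite_cmd("genome.fasta", "mygenome", "/path/to/fungi_odb12")))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.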

Generate a plot from the results:

buscolite-plot mygenome.buscolite.json -o mygenome_plot.svg

Compare multiple samples:

buscolite-plot sample1.buscolite.json sample2.buscolite.json sample3.buscolite.json -o comparison.svg
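
When there are many samples, the list of result files can be assembled programmatically. The sketch below assumes, as in the examples above, that results are named `<prefix>.buscolite.json` and that they all sit in one directory; the helper name `plot_cmd` is hypothetical:

```python
from pathlib import Path


def plot_cmd(results_dir, out_svg):
    """Build a buscolite-plot command covering every *.buscolite.json in a directory."""
    results = sorted(Path(results_dir).glob("*.buscolite.json"))
    if not results:
        raise FileNotFoundError(f"no *.buscolite.json files in {results_dir}")
    # Inputs first, then the output option, mirroring the command shown above.
    return ["buscolite-plot", *[str(path) for path in results], "-o", str(out_svg)]
```

The resulting list can be handed to `subprocess.run(..., check=True)` to produce the comparison plot.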

For detailed usage instructions, see the Usage Guide.

Development

If you want to contribute to the development of BUSCOlite, follow these steps:

  1. Clone the repository:

    git clone https://github.com/nextgenusfs/buscolite.git
    cd buscolite
    
  2. Set up the development environment:

    ./scripts/setup_dev.sh
    

    This will install the development dependencies and set up pre-commit hooks.

  3. Make your changes and commit them. The pre-commit hooks will automatically check and format your code.

  4. Run the tests to make sure everything is working:

    pytest
    
