Reproducible bundles for EBP genome assemblies

genomebundle

genomebundle bundles metadata and files for Earth BioGenome Project (EBP) genome assemblies into a single, reproducible package.

The core idea is simple: when you use a genome assembly in your research, you should be able to document exactly what you downloaded, when, and from where — with checksums. genomebundle does this by aggregating data from three sources into a machine-readable manifest.json:

  • GoaT (Genomes on a Tree) — taxonomy and cross-references
  • NCBI Datasets — assembly statistics and FTP file URLs
  • BlobToolKit — BUSCO completeness results (assembly quality metrics)

The manifest makes it easier to cite the data precisely and to keep pipelines reproducible over time.
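As an illustration of the "what, when, and from where, with checksums" idea, here is a minimal sketch of how one provenance record for a downloaded file could be built. It is not genomebundle's actual implementation: the function names `sha256_of` and `provenance_record` and all field names are hypothetical, chosen only for this example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large FASTA files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(path: Path, source_url: str) -> dict:
    """Describe one downloaded file. Field names are hypothetical, not genomebundle's schema."""
    return {
        "name": path.name,                                        # what
        "retrieved_at": datetime.now(timezone.utc).isoformat(),   # when (UTC)
        "source_url": source_url,                                 # from where
        "sha256": sha256_of(path),                                # checksum
        "size_bytes": path.stat().st_size,
    }
```

A list of such records, serialized with `json.dumps`, is one plausible shape for the file section of a manifest.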

Installation

pip install genomebundle

Basic CLI usage

# Download FASTA and GFF
genomebundle fetch GCF_040938575.1 --files fasta,gff

# Download all associated files
genomebundle fetch GCF_040938575.1 --files all

# Build manifest only (no download)
genomebundle fetch GCF_040938575.1 --no-download

# Verify checksums of a downloaded bundle
genomebundle verify ./GCF_040938575.1/

# Print manifest of an existing bundle
genomebundle show ./GCF_040938575.1/
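To show what a checksum verification step involves, the following sketch recomputes SHA256 digests and compares them with expected values. It is an assumption about the general technique, not the code behind `genomebundle verify`: the function `find_mismatches` and the name-to-digest mapping it takes are hypothetical.

```python
import hashlib
from pathlib import Path


def find_mismatches(bundle_dir: Path, expected: dict[str, str]) -> list[str]:
    """Return the names of files that are missing or whose SHA256 differs.

    `expected` maps file name -> hex digest (a hypothetical layout for this sketch).
    """
    problems = []
    for name, want in expected.items():
        path = bundle_dir / name
        if not path.is_file():
            problems.append(name)  # file listed but not present
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks to keep memory use flat for large files
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != want.lower():
            problems.append(name)  # content changed since the digest was recorded
    return problems
```

An empty return value means every listed file is present and unchanged.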

Python API

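from genomebundle import fetch_assembly, fetch_assembly_report, fetch_busco

# One call per upstream source, all keyed by the same assembly accession
goat = fetch_assembly("GCF_040938575.1")         # GoaT: taxonomy and cross-references
ncbi = fetch_assembly_report("GCF_040938575.1")  # NCBI Datasets: assembly statistics and FTP file URLs
btk = fetch_busco("GCF_040938575.1")             # BlobToolKit: BUSCO completeness results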

Output

Each bundle contains:

  • manifest.json — machine-readable, includes SHA256 checksums and source URLs
  • README.txt — human-readable summary
  • downloaded files (optional)
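Because manifest.json is machine-readable, downstream scripts can load it with the standard library. The sketch below reads a manifest and produces a short text summary. The schema (an `accession` key and a `files` list with `name`, `sha256`, and `source_url`) is assumed purely for illustration and may not match what genomebundle writes; `load_manifest` and `summarize` are hypothetical helpers.

```python
import json
from pathlib import Path


def load_manifest(bundle_dir: Path) -> dict:
    """Read manifest.json from a bundle directory."""
    return json.loads((bundle_dir / "manifest.json").read_text(encoding="utf-8"))


def summarize(manifest: dict) -> str:
    """Render a short plain-text summary (assumed schema, see the note above)."""
    lines = [
        f"Assembly: {manifest['accession']}",
        f"Files: {len(manifest['files'])}",
    ]
    for entry in manifest["files"]:
        # Show only a digest prefix to keep each line readable
        lines.append(f"  {entry['name']}  sha256:{entry['sha256'][:12]}  {entry['source_url']}")
    return "\n".join(lines)
```

For a bundle directory such as `./GCF_040938575.1/`, `print(summarize(load_manifest(Path("./GCF_040938575.1"))))` would print one line per recorded file, provided the manifest follows the assumed layout.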

License

MIT
