Reproducible bundles for EBP genome assemblies

genomebundle

genomebundle bundles metadata and files for Earth BioGenome Project (EBP) genome assemblies into a single, reproducible package.

The core idea is simple: when you use a genome assembly in your research, you should be able to document exactly what you downloaded, when, and from where — with checksums. genomebundle does this by aggregating data from three sources into a machine-readable manifest.json:

  • GoaT (Genomes on a Tree) — taxonomy and cross-references
  • NCBI Datasets — assembly statistics and FTP file URLs
  • BlobToolKit — BUSCO completeness results (assembly quality metrics)

This makes it easier to cite the data precisely and to keep pipelines reproducible over time.
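
As a rough illustration of what recording "what, when, and from where, with checksums" can look like, the sketch below builds a single manifest entry for a local file. The helper names (`sha256_of`, `manifest_entry`) and the field names (`path`, `source_url`, `retrieved_at`, `sha256`) are assumptions made for this example; they are not taken from genomebundle's actual manifest.json schema.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_entry(path, source_url):
    """Describe one file: its name, origin, timestamp, and checksum.

    The field names are hypothetical, not genomebundle's real schema.
    """
    return {
        "path": Path(path).name,
        "source_url": source_url,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "sha256": sha256_of(path),
    }
```

Reading in chunks keeps memory use flat even for multi-gigabyte FASTA files, and the UTC timestamp avoids ambiguity when bundles are compared across machines.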

Installation

pip install genomebundle

Basic CLI usage

# Download FASTA and GFF
genomebundle fetch GCF_040938575.1 --files fasta,gff

# Download all associated files
genomebundle fetch GCF_040938575.1 --files all

# Build manifest only (no download)
genomebundle fetch GCF_040938575.1 --no-download

# Verify checksums of a downloaded bundle
genomebundle verify ./GCF_040938575.1/

# Print manifest of an existing bundle
genomebundle show ./GCF_040938575.1/

Python API

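from genomebundle import fetch_assembly, fetch_assembly_report, fetch_busco

# GoaT: taxonomy and cross-references
goat = fetch_assembly("GCF_040938575.1")
# NCBI Datasets: assembly statistics and FTP file URLs
ncbi = fetch_assembly_report("GCF_040938575.1")
# BlobToolKit: BUSCO completeness results
btk = fetch_busco("GCF_040938575.1")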
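
This README does not document what the three functions return. Assuming each one yields a JSON-serializable record such as a dict, one way to keep the results together is to nest them under per-source keys so that fields from different sources cannot collide. The `combine_sources` helper and the placeholder records below are hypothetical and only stand in for real return values.

```python
import json


def combine_sources(accession, goat, ncbi, btk):
    """Group per-source records under one accession, one key per source."""
    return {
        "accession": accession,
        "sources": {
            "goat": goat,
            "ncbi_datasets": ncbi,
            "blobtoolkit": btk,
        },
    }


# Placeholder records standing in for the return values of the three
# fetch functions; the real types and fields may differ.
goat = {"note": "placeholder for the GoaT record"}
ncbi = {"note": "placeholder for the NCBI Datasets record"}
btk = {"note": "placeholder for the BlobToolKit record"}

record = combine_sources("GCF_040938575.1", goat, ncbi, btk)
print(json.dumps(record, indent=2))
```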

Output

Each bundle contains:

  • manifest.json — machine-readable, includes SHA256 checksums and source URLs
  • README.txt — human-readable summary
  • downloaded files (optional)
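
Because the manifest stores SHA256 checksums, a bundle can in principle be re-checked without genomebundle itself. The sketch below shows the general idea. It assumes manifest.json has a top-level `files` list whose items carry `path` and `sha256` keys; that layout and the `verify_bundle` name are assumptions for illustration, so inspect a real manifest before relying on them.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir):
    """Return the paths that are missing or whose checksum does not match.

    Assumes a hypothetical manifest layout:
    {"files": [{"path": "...", "sha256": "..."}, ...]}
    """
    bundle = Path(bundle_dir)
    manifest = json.loads((bundle / "manifest.json").read_text())
    failures = []
    for entry in manifest.get("files", []):
        target = bundle / entry["path"]
        # A missing file counts as a failure, same as a wrong digest.
        if not target.is_file() or file_sha256(target) != entry["sha256"]:
            failures.append(entry["path"])
    return failures
```

An empty list means every listed file is present and matches its recorded digest. In normal use, `genomebundle verify` is the supported way to perform this check.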

License

MIT

Download files

Source Distribution

genomebundle-0.1.0.tar.gz (11.0 kB view details)

Uploaded Source

Built Distribution

genomebundle-0.1.0-py3-none-any.whl (11.6 kB view details)

Uploaded Python 3

File details

Details for the file genomebundle-0.1.0.tar.gz.

File metadata

  • Download URL: genomebundle-0.1.0.tar.gz
  • Size: 11.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.7

File hashes

Hashes for genomebundle-0.1.0.tar.gz

Algorithm    Hash digest
SHA256       4ddffd946da1aabcfa6518733bd96a247f6f01697750bda8fa3749eadcd2c828
MD5          7d95fe1caade20e43130b7fc4fc47f37
BLAKE2b-256  07a773f53581d472ff4b396f3d715165ba029ccd8bca8ba2960150f4546bcaea

File details

Details for the file genomebundle-0.1.0-py3-none-any.whl.

File hashes

Hashes for genomebundle-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       9e07c86e9d9951a988bf3aa50363a4a700bbb3d37ffb38d32d9968959bac8b49
MD5          07441070f4b72f13d05b0c5eb708e632
BLAKE2b-256  796d396d25e4696184e9a3e205966b656e2c7b90847f8d61a0c4d36ab02034fb
