Genomic Selection Model Benchmarking CLI for Plant Breeding

gsbench

gsbench cross-validates genomic selection models on your genotype/phenotype data and produces a comparison report with prediction accuracy, bias diagnostics, and plots — from a single command.

Installation

pip install gsbench

With gradient-boosting models (XGBoost, LightGBM):

pip install "gsbench[full]"

From source, for development:

git clone https://github.com/josh45-source/gsbench.git
cd gsbench
pip install -e ".[dev]"

Quick Start

gsbench ships with a small simulated example dataset (100 samples x 500 markers, two traits) so you can try it immediately:

# Copy the example genotype/phenotype files to the current directory
gsbench example

# Benchmark models on the example data
gsbench run example_geno.csv example_pheno.csv --trait yield --models GBLUP,BRR,RF --folds 5

This writes `gsbench_output/report.html` with the full comparison report, `gsbench_output/summary.csv`, and diagnostic plots under `gsbench_output/plots/`.
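To drive the same benchmark from a Python script, one option is to assemble the documented command line and hand it to `subprocess`. The sketch below uses only the flags listed in this README; the helper names `build_run_command` and `run_benchmark` are hypothetical and not part of gsbench.

```python
import shlex
import subprocess
from pathlib import Path


def build_run_command(geno, pheno, trait, models=("GBLUP", "BRR", "RF"),
                      folds=5, output="gsbench_output"):
    """Assemble the argv list for `gsbench run` from the documented options."""
    return [
        "gsbench", "run", str(geno), str(pheno),
        "--trait", trait,
        "--models", ",".join(models),
        "--folds", str(folds),
        "--output", str(output),
    ]


def run_benchmark(geno, pheno, trait, **kwargs):
    """Run the benchmark and return the expected report path (hypothetical helper)."""
    cmd = build_run_command(geno, pheno, trait, **kwargs)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
    return Path(kwargs.get("output", "gsbench_output")) / "report.html"


# Build (but do not execute) the Quick Start command.
cmd = build_run_command("example_geno.csv", "example_pheno.csv", "yield")
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting issues with file paths that contain spaces.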

CLI Reference

`gsbench run`

gsbench run GENO PHENO --trait TRAIT [OPTIONS]

| Argument / Option | Default | Description |
| --- | --- | --- |
| `GENO` | - | Path to the genotype file (CSV/TSV, HapMap, or numeric matrix; format auto-detected) |
| `PHENO` | - | Path to the phenotype file (CSV/TSV, first column = sample IDs) |
| `--trait` | - | Phenotype column to benchmark against (required) |
| `--models` | `all` | `all`, or a comma-separated list of abbreviations, e.g. `GBLUP,BRR,RF` |
| `--folds` | `5` | Number of cross-validation folds |
| `--repeats` | `1` | Number of times to repeat k-fold CV (uses `RepeatedKFold` when > 1) |
| `--maf` | `0.05` | Minimum minor allele frequency; markers below this are dropped |
| `--max-missing` | `0.2` | Maximum per-marker missingness fraction before a marker is dropped |
| `--impute` | `mean` | Missing-genotype imputation: `mean` or `median` |
| `--scale` | `center` | Genotype scaling: `center`, `standardize`, or `none` |
| `--output` | `gsbench_output` | Output directory for the report, summary CSV, and plots |
| `--seed` | `42` | Random seed for cross-validation splits |
| `--format` | `auto` | Genotype format override: `auto`, `csv`, `tsv`, `hapmap`, `numeric` |
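The marker-level options (`--maf`, `--max-missing`, `--impute mean`, `--scale`) describe a fairly standard preprocessing chain. The NumPy sketch below illustrates one way such a chain could work. It assumes diploid dosages coded 0/1/2 with `NaN` for missing calls, and the function name `preprocess` and the order of the steps are assumptions, not gsbench's actual implementation.

```python
import numpy as np


def preprocess(dosages, maf=0.05, max_missing=0.2, scale="center"):
    """Filter, mean-impute, and scale a samples x markers dosage matrix."""
    X = np.asarray(dosages, dtype=float)

    # Per-marker missingness: fraction of samples without a call.
    missing = np.isnan(X).mean(axis=0)
    keep = missing <= max_missing

    # Minor allele frequency from observed calls (mean dosage / 2 for diploids).
    observed = (~np.isnan(X)).sum(axis=0)
    freq = np.nansum(X, axis=0) / (2.0 * np.maximum(observed, 1))
    minor = np.minimum(freq, 1.0 - freq)
    keep &= minor >= maf
    X = X[:, keep]

    # Mean imputation: fill each missing call with its marker's observed mean.
    col_means = np.nanmean(X, axis=0)
    X = np.where(np.isnan(X), col_means, X)

    # Centering subtracts marker means; standardizing also divides by the SD.
    if scale in ("center", "standardize"):
        X = X - X.mean(axis=0)
    if scale == "standardize":
        sd = X.std(axis=0)
        X = X / np.where(sd > 0, sd, 1.0)
    return X, keep


geno = np.array([
    [0, 1,      2, 0],
    [1, np.nan, 2, np.nan],
    [2, 0,      2, np.nan],
    [1, 1,      2, 0],
    [0, 2,      2, 1],
    [2, 1,      2, 0],
])
X, kept = preprocess(geno)
print(kept)      # which of the four markers survive the filters
print(X.shape)
```

In this toy matrix the third marker is monomorphic and the fourth has a third of its calls missing, so both fall out under the default thresholds.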

`gsbench list-models`

Prints a table of all registered models and whether their dependencies are installed.
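A dependency check of this kind can be done without importing the optional packages. The sketch below uses `importlib.util.find_spec` from the standard library. The `MODEL_DEPENDENCIES` mapping is a hypothetical registry built from the Models table in this README, not gsbench's internal one.

```python
from importlib.util import find_spec

# Hypothetical registry: model abbreviation -> top-level module it needs.
MODEL_DEPENDENCIES = {
    "BRR": "sklearn",
    "BL": "sklearn",
    "RF": "sklearn",
    "XGB": "xgboost",
    "LGBM": "lightgbm",
}


def is_available(module_name):
    """Return True if the module can be located, without importing it."""
    return find_spec(module_name) is not None


for abbreviation, module in MODEL_DEPENDENCIES.items():
    status = "installed" if is_available(module) else "missing"
    print(f"{abbreviation:<6}{module:<10}{status}")
```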

`gsbench example`

gsbench example [--output DIR]

Copies the bundled example genotype/phenotype CSVs into `DIR` (defaults to the current directory) and prints the `gsbench run` command to benchmark them.

Models

| Abbreviation | Model | Notes |
| --- | --- | --- |
| GBLUP | Genomic BLUP | Kernel ridge regression on the genomic relationship matrix `G = ZZ'/p` |
| BRR | Bayesian Ridge Regression | `sklearn.linear_model.BayesianRidge` on marker dosages |
| BL | Bayesian LASSO | `sklearn.linear_model.ARDRegression`, a sparse approximation of BayesB/BayesC |
| RKHS | RKHS (Gaussian Kernel) | Kernel ridge regression with an RBF kernel; bandwidth chosen by internal CV |
| RF | Random Forest | `sklearn.ensemble.RandomForestRegressor` (500 trees) |
| XGB | XGBoost | Requires `pip install "gsbench[full]"` |
| LGBM | LightGBM | Requires `pip install "gsbench[full]"` |
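To make the GBLUP row concrete, here is a minimal NumPy sketch of kernel ridge regression on `G = ZZ'/p`. It assumes `Z` is the marker matrix centered with the training means and treats the ridge penalty `lam` as a fixed, user-chosen value. How gsbench sets that penalty is not stated here, so `lam` and the function name `gblup_predict` are illustrative.

```python
import numpy as np


def gblup_predict(Z_train, y_train, Z_test, lam=1.0):
    """Kernel ridge regression on G = ZZ'/p with an assumed penalty `lam`."""
    Z_train = np.asarray(Z_train, dtype=float)
    Z_test = np.asarray(Z_test, dtype=float)
    y_train = np.asarray(y_train, dtype=float)

    # Center markers with the training means so both sets share one reference.
    mu = Z_train.mean(axis=0)
    Zc, Zt = Z_train - mu, Z_test - mu
    p = Zc.shape[1]

    G = Zc @ Zc.T / p        # train x train genomic relationship matrix
    G_test = Zt @ Zc.T / p   # test x train relationships

    # Solve (G + lam * I) alpha = y - mean(y), then predict via the cross-kernel.
    y_mean = y_train.mean()
    alpha = np.linalg.solve(G + lam * np.eye(len(y_train)), y_train - y_mean)
    return G_test @ alpha + y_mean


# Simulated data: 30 samples, 50 markers, additive effects plus noise.
rng = np.random.default_rng(42)
Z = rng.integers(0, 3, size=(30, 50)).astype(float)
y = Z @ rng.normal(size=50) + rng.normal(size=30)
pred = gblup_predict(Z[:20], y[:20], Z[20:])
print(pred.shape)
```

With this kernel the predictions coincide with ridge regression on the centered markers using penalty `lam * p`, which is the usual equivalence between GBLUP and ridge regression on marker effects.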

Every model implements the same two-method interface (`fit` / `predict`), so adding a new one is a matter of subclassing `gsbench.models.base.GSModel`.
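As a rough illustration of that pattern, the sketch below defines a local stand-in for the base class so that it runs without gsbench installed. The method signatures `fit(X, y)` and `predict(X)` are assumed, and how a subclass gets registered with the CLI is not covered here. `MeanBaseline` is a hypothetical model.

```python
from abc import ABC, abstractmethod
from statistics import fmean


class GSModel(ABC):
    """Local stand-in for gsbench.models.base.GSModel (real signatures may differ)."""

    @abstractmethod
    def fit(self, X, y):
        ...

    @abstractmethod
    def predict(self, X):
        ...


class MeanBaseline(GSModel):
    """Hypothetical model that ignores markers and predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = fmean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


model = MeanBaseline().fit([[0, 1], [2, 1], [1, 1]], [4.0, 6.0, 5.0])
print(model.predict([[0, 0], [2, 2]]))
```

A baseline like this can be a useful sanity check: any marker-based model should beat it on RMSE.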

Metrics

Each fold reports `r` (Pearson correlation), `r2`, `rmse`, `mae`, `bias`, `slope` (regression of observed on predicted; should be close to 1), `spearman` (rank correlation), and `nrmse`. Breeders care most about `r` (prediction accuracy) and `spearman` (whether the model ranks genotypes correctly for selection).

Example Output

Model comparison (prediction accuracy per model, with fold-to-fold error bars):

[Figure: model comparison barplot]

Predicted vs. observed phenotypes per model:

[Figure: predicted vs. observed]

The full HTML report also includes a boxplot of per-fold accuracy, a bias diagnostic, a runtime comparison, and per-model detail tables.

Companion Tools

gsbench is part of a small plant-breeding data pipeline:

brapiR2 — pull data from BrAPI servers
phenoQC — QC for phenotypic trial data
vcf2dosage — VCF to dosage matrix conversion
gsbench — benchmark genomic selection models

Pipeline: retrieve → clean → prepare genotypes → benchmark models

License

MIT

Project details

Maintainers

ayojoashjoshua

Release history

This version

0.1.0

Jul 2, 2026

Download files

Source Distribution

gsbench-0.1.0.tar.gz (41.0 kB)

Uploaded Jul 2, 2026 Source

Built Distribution

gsbench-0.1.0-py3-none-any.whl (36.9 kB)

Uploaded Jul 2, 2026 Python 3

File details

Details for the file gsbench-0.1.0.tar.gz.

File metadata

Download URL: gsbench-0.1.0.tar.gz
Upload date: Jul 2, 2026
Size: 41.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gsbench-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`9bcb8d5ae85d7d1950ff90acf270b3c6c6ff49df80ef378e77d6d4078d69c0ec`
MD5	`e769617c8540b0936e1d86c3c5aedb70`
BLAKE2b-256	`ae2a61d12f3f8c78225fcdc5dc5453218be0f77125379ec6e4c493fa45ce66b5`
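A downloaded archive can be checked against the published SHA256 digest with the standard library. This is a minimal sketch: the helper names `sha256_of` and `verify` are hypothetical, and the local path assumes the file was saved under its original name in the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for gsbench-0.1.0.tar.gz.
EXPECTED_SHA256 = "9bcb8d5ae85d7d1950ff90acf270b3c6c6ff49df80ef378e77d6d4078d69c0ec"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest matches the expected hex string."""
    return sha256_of(path) == expected.lower()


sdist = Path("gsbench-0.1.0.tar.gz")  # assumed local download location
if sdist.exists():
    print("digest matches" if verify(sdist) else "digest MISMATCH")
else:
    print(f"{sdist} not found; download it first")
```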

Provenance

The following attestation bundles were made for gsbench-0.1.0.tar.gz:

Publisher: publish.yaml on josh45-source/gsbench

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gsbench-0.1.0.tar.gz
- Subject digest: 9bcb8d5ae85d7d1950ff90acf270b3c6c6ff49df80ef378e77d6d4078d69c0ec
- Sigstore transparency entry: 2048188200
- Sigstore integration time: Jul 2, 2026
Source repository:
- Permalink: josh45-source/gsbench@9396c8421943b845610b2d26b2fdee681d5f73ae
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/josh45-source
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@9396c8421943b845610b2d26b2fdee681d5f73ae
- Trigger Event: release

File details

Details for the file gsbench-0.1.0-py3-none-any.whl.

File metadata

Download URL: gsbench-0.1.0-py3-none-any.whl
Upload date: Jul 2, 2026
Size: 36.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gsbench-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8a079652bbd4ad47a0b8b41beea284eb22643e49bda9494791ba693259ea9d18`
MD5	`5a5bfd05eb84f37062e13903fbd785ab`
BLAKE2b-256	`e539c8aaa9c9daa8a8dece6a20eb71b734ff6da69c5996d42abe15ca31c309a7`

Provenance

The following attestation bundles were made for gsbench-0.1.0-py3-none-any.whl:

Publisher: publish.yaml on josh45-source/gsbench

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gsbench-0.1.0-py3-none-any.whl
- Subject digest: 8a079652bbd4ad47a0b8b41beea284eb22643e49bda9494791ba693259ea9d18
- Sigstore transparency entry: 2048188428
- Sigstore integration time: Jul 2, 2026
Source repository:
- Permalink: josh45-source/gsbench@9396c8421943b845610b2d26b2fdee681d5f73ae
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/josh45-source
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@9396c8421943b845610b2d26b2fdee681d5f73ae
- Trigger Event: release
