High-performance Multi-method Mixed-Model Association for large-scale GWAS

JAMMA

JAMMA (High-performance Multi-method Mixed-Model Association) — a modern Python and C reimplementation of GEMMA for large-scale GWAS.

  • GEMMA-compatible: Drop-in replacement with the same CLI flags and output formats (kinship and eigen files default to binary .npy; pass --legacy-text for GEMMA text format)
  • Numerical equivalence: Validated against GEMMA — 100% significance agreement, 100% effect direction agreement
  • Fast: Up to 14x faster than GEMMA 0.98.5 at scale
  • Memory-safe: Pre-flight memory checks prevent OOM crashes before allocation
  • Cross-platform: Runs on Linux, macOS, and Windows — NumPy backend works everywhere, JAX adds batch acceleration on Linux and ARM Mac
  • Optimized for Intel: Best performance on Intel CPUs with MKL BLAS. Runs well on Apple Silicon (Accelerate BLAS). Other architectures (AMD, ARM Linux) work correctly but with less BLAS optimization
  • Pure Python + optional C extensions: NumPy + optional JAX stack; C extensions for DSYEVR eigendecomposition and OpenMP-parallel Wald tests, JAX for batch MLE optimization
  • Large-scale ready: Optional numpy-mkl ILP64 wheels (numpy 2.4.2) for >46k sample eigendecomposition

Installation

macOS (Intel or ARM)

pip install jamma          # NumPy backend
pip install 'jamma[jax]'   # + JAX acceleration (ARM Mac only)

That's it. macOS Accelerate BLAS handles large matrices natively.

Linux / Windows / Intel Mac

For small datasets (<46k samples), the standard install works:

pip install jamma          # NumPy backend
pip install 'jamma[jax]'   # + JAX acceleration (Linux; not available on Windows or Intel Mac)

For large-scale GWAS (>46k samples) on x86_64 (Linux or Intel Mac), install numpy-mkl first — standard numpy uses 32-bit BLAS integers which overflow at ~46k samples. MKL is x86_64-only; ARM Mac and Windows users are limited to <46k samples. Pre-built ILP64 wheels are available for Python 3.11–3.14:

NumPy backend only:

pip install numpy \
  --extra-index-url https://michael-denyer.github.io/numpy-mkl \
  --force-reinstall --upgrade
pip install jamma --no-deps
pip install psutil loguru threadpoolctl click progressbar2 bed-reader

With JAX acceleration:

pip install numpy \
  --extra-index-url https://michael-denyer.github.io/numpy-mkl \
  --force-reinstall --upgrade
pip install 'jamma[jax]' --no-deps
pip install psutil loguru threadpoolctl click progressbar2 bed-reader \
  jax jaxlib jaxtyping

From Git (latest development version):

pip install numpy \
  --extra-index-url https://michael-denyer.github.io/numpy-mkl \
  --force-reinstall --upgrade
pip install git+https://github.com/michael-denyer/jamma.git --no-deps
pip install psutil loguru threadpoolctl click progressbar2 bed-reader

Why --no-deps? JAMMA depends on numpy>=2.0.0, so a normal pip install jamma will pull in standard numpy and overwrite the ILP64 build. --no-deps prevents this; you install the runtime dependencies manually instead.

See the User Guide for ILP64 verification steps.
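
The ~46k threshold can be checked with a little arithmetic. The sketch below assumes the quantity that overflows is an n × n element count held in a signed 32-bit integer; which exact BLAS/LAPACK argument overflows first is not stated here, so treat this as an illustration of the order of magnitude rather than a description of the library internals. `fits_in_int32` is a hypothetical helper name.

```python
import math

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_in_int32(n_samples: int) -> bool:
    """True if an n x n element count still fits in a signed 32-bit integer."""
    return n_samples * n_samples <= INT32_MAX


# Largest n whose n*n stays within int32
limit = math.isqrt(INT32_MAX)
print(limit)                     # 46340
print(fits_in_int32(limit))      # True
print(fits_in_int32(limit + 1))  # False
```

With 64-bit (ILP64) integers the same product has ample headroom, which is why the ILP64 build is suggested above this sample count.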

Platform Support

| Platform | pip install jamma | pip install jamma[jax] | BLAS | Notes |
|---|---|---|---|---|
| Linux x86_64 (Intel) | JAX (auto-included) | - | MKL (optimal) | Best performance; ILP64 for >46k samples |
| Linux x86_64 (AMD) | JAX (auto-included) | - | OpenBLAS | Works well; MKL also works on AMD but less optimized |
| ARM Mac (M1+) | JAX (auto-included) | - | Accelerate | Excellent performance via Apple's BLAS |
| ARM Linux | NumPy only | JAX manual install | OpenBLAS | Works correctly; less BLAS optimization |
| Intel Mac | NumPy only | Not available | MKL / Accelerate | JAX dropped Intel Mac; ILP64 for >46k samples |
| Windows | NumPy only | Not available | OpenBLAS | JAX dropped Windows support |

JAMMA's heavy computation (eigendecomposition, matrix multiplication, REML optimization) is BLAS-bound. Intel MKL delivers the best throughput, particularly at scale. Apple Accelerate is a close second on Apple Silicon. OpenBLAS works correctly everywhere but is less tuned for these workloads.

JAX is auto-included on Linux and ARM Mac via platform markers. Force a specific backend with --backend numpy or --backend jax.
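
As a rough illustration of what automatic backend selection with a manual override could look like, here is a minimal sketch. `pick_backend` is a hypothetical helper, not JAMMA's API; it only assumes that "JAX is available" means the `jax` package is importable.

```python
import importlib.util


def pick_backend(requested: str = "auto") -> str:
    """Hypothetical helper: resolve 'auto' to 'jax' when JAX is importable, else 'numpy'."""
    if requested not in {"auto", "numpy", "jax"}:
        raise ValueError(f"unknown backend: {requested!r}")
    jax_available = importlib.util.find_spec("jax") is not None
    if requested == "jax" and not jax_available:
        raise RuntimeError("JAX backend requested but jax is not installed")
    if requested == "auto":
        return "jax" if jax_available else "numpy"
    return requested


print(pick_backend("numpy"))  # numpy
print(pick_backend())         # 'jax' or 'numpy', depending on the environment
```

Forcing `numpy` always succeeds, while forcing `jax` on a platform without JAX fails early instead of silently falling back.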

Quick Start

# Compute kinship matrix (centered relatedness)
jamma -gk 1 -bfile data/my_study -o output
# Output: output/output.cXX.npy (binary, fast)
# Add --legacy-text for GEMMA-compatible text format

# Run LMM association (Wald test)
jamma -lmm 1 -bfile data/my_study -k output/output.cXX.npy -o results

# Multiple phenotypes (eigendecomp computed once, reused)
jamma -lmm 1 -bfile data/my_study -k output/output.cXX.npy -n "1 2 3" -o results

Output files:

  • output.cXX.npy — Kinship matrix (binary NumPy format; .cXX.txt with --legacy-text)
  • results.assoc.txt — Association results (chr, rs, ps, n_miss, allele1, allele0, af, beta, se, logl_H1, l_remle, p_wald)
  • results.log.txt — Run log

The reader auto-detects format, so existing .cXX.txt files still work as -k input.
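
For downstream filtering, the association file can be read with the standard library. This sketch assumes the file is tab-delimited with a header row naming the columns listed above (as in GEMMA's text output); the two data rows are made up purely for illustration, and `significant_snps` is a hypothetical helper.

```python
import csv
import io

# Made-up rows in the column layout listed above; values are illustrative only.
SAMPLE = (
    "chr\trs\tps\tn_miss\tallele1\tallele0\taf\tbeta\tse\tlogl_H1\tl_remle\tp_wald\n"
    "1\trs_a\t1000\t0\tA\tG\t0.25\t0.10\t0.05\t-100.0\t1.5\t4.6e-02\n"
    "1\trs_b\t2000\t0\tC\tT\t0.40\t0.90\t0.10\t-90.0\t1.2\t2.0e-09\n"
)


def significant_snps(handle, threshold=5e-8):
    """Return (rs, p_wald) pairs whose Wald p-value is below the threshold."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (row["rs"], float(row["p_wald"]))
        for row in reader
        if float(row["p_wald"]) < threshold
    ]


hits = significant_snps(io.StringIO(SAMPLE))
print(hits)  # [('rs_b', 2e-09)]
```

To use it on a real run, pass an open file handle for the .assoc.txt file in place of the `io.StringIO` object.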

Python API

One-call GWAS (recommended)

The gwas() function is the recommended way to run JAMMA from Python. It handles the full pipeline — data loading, kinship computation, eigendecomposition, and LMM association — in a single call. You don't need to compute a kinship matrix separately unless you want to reuse it across runs.

from jamma import gwas

# Simplest usage: computes kinship internally, no separate kinship step needed
result = gwas("data/my_study")
print(f"Tested {result.n_snps_tested} SNPs in {result.timing['total_s']:.1f}s")

# Or supply a pre-computed kinship matrix to skip recomputation
result = gwas("data/my_study", kinship_file="data/kinship.cXX.npy")

# Compute kinship from scratch and save it for reuse
result = gwas("data/my_study", save_kinship=True, output_dir="output")

# With covariates and LRT test
result = gwas("data/my_study", kinship_file="k.txt", covariate_file="covars.txt", lmm_mode=2)

# LOCO analysis (leave-one-chromosome-out)
result = gwas("data/my_study", loco=True)

# LOCO with eigen caching (skip eigendecomp on subsequent runs)
result = gwas("data/my_study", loco=True, write_eigen=True, eigen_dir="output/eigen")
result = gwas("data/my_study", loco=True, eigen_dir="output/eigen")  # reuses cache

# Multi-phenotype with eigendecomp reuse (Python API)
result = gwas("data/my_study", write_eigen=True, phenotype_column=1)
result = gwas("data/my_study", eigenvalue_file="output/result.eigenD.npy",
              eigenvector_file="output/result.eigenU.npy", phenotype_column=2)
# Or use the CLI for automatic multi-phenotype: jamma -lmm 1 ... -n "1 2 3"

# SNP filtering
result = gwas("data/my_study", kinship_file="k.txt", snps_file="snps.txt", hwe=0.001)

Low-level API (JAX backend)

import numpy as np

from jamma.io import load_plink_binary
from jamma.kinship import compute_centered_kinship
from jamma.lmm import run_lmm_association_streaming
from jamma.lmm.eigen import eigendecompose_kinship

# Load PLINK data and phenotypes
data = load_plink_binary("data/my_study")
phenotypes = np.loadtxt("data/my_study.pheno")  # loaded separately from .fam or phenotype file

# Compute kinship and eigendecompose (treat kinship as consumed after this)
kinship = compute_centered_kinship(data.genotypes)
eigenvalues, eigenvectors = eigendecompose_kinship(kinship)

# Run association (streaming from disk)
results, n_tested = run_lmm_association_streaming(
    bed_path="data/my_study",
    phenotypes=phenotypes,
    eigenvalues=eigenvalues,
    eigenvectors=eigenvectors,
    chunk_size=5000,
)
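
To make the `chunk_size` argument concrete, the sketch below shows how a streaming pass might split the SNP axis into fixed-size chunks. It is an assumption about the general technique, not JAMMA's implementation; `chunk_ranges` is a hypothetical helper.

```python
from collections.abc import Iterator


def chunk_ranges(n_snps: int, chunk_size: int) -> Iterator[tuple[int, int]]:
    """Yield half-open (start, stop) SNP index ranges covering 0..n_snps."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_snps, chunk_size):
        yield start, min(start + chunk_size, n_snps)


print(list(chunk_ranges(12_226, 5_000)))
# [(0, 5000), (5000, 10000), (10000, 12226)]
```

Only one chunk of genotypes needs to be resident at a time, so peak memory for the association step scales with `chunk_size` rather than with the total SNP count.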

Low-level API (NumPy backend)

import numpy as np

from jamma.io import load_plink_binary
from jamma.kinship import compute_centered_kinship
from jamma.lmm import run_lmm_association_numpy
from jamma.lmm.eigen import eigendecompose_kinship

data = load_plink_binary("data/my_study")
phenotypes = np.loadtxt("data/my_study.pheno")
kinship = compute_centered_kinship(data.genotypes)
eigenvalues, eigenvectors = eigendecompose_kinship(kinship)

snp_info = [
    {"chr": str(data.chromosome[i]), "rs": data.sid[i],
     "pos": int(data.bp_position[i]), "a1": data.allele_1[i], "a0": data.allele_2[i]}
    for i in range(data.n_snps)
]

# Returns LmmRunResult — .associations for list[AssocResult], .pve for heritability, .pve_se for SE
run_result = run_lmm_association_numpy(
    genotypes=data.genotypes,
    phenotypes=phenotypes,
    kinship=None,  # Not needed when eigenvalues/eigenvectors provided
    snp_info=snp_info,
    eigenvalues=eigenvalues,
    eigenvectors=eigenvectors,
    lmm_mode=1,
)
results = run_result.associations
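
For readers unfamiliar with the centered relatedness matrix, the following pure-Python sketch shows the underlying formula, K = Xc Xcᵀ / p, where Xc is the genotype matrix with each SNP's mean subtracted and p is the number of SNPs. It assumes no missing genotypes and is written for clarity, not speed; `centered_kinship` here is a toy stand-in and not the `compute_centered_kinship` function imported above. The toy genotypes are made up.

```python
def centered_kinship(genotypes: list[list[float]]) -> list[list[float]]:
    """K = Xc @ Xc.T / p for an n_samples x n_snps matrix with no missing values."""
    n, p = len(genotypes), len(genotypes[0])
    # Per-SNP mean across samples
    means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    # Subtract each SNP's mean from its column
    centered = [[row[j] - means[j] for j in range(p)] for row in genotypes]
    # Sample-by-sample inner products, averaged over SNPs
    return [
        [sum(a * b for a, b in zip(ri, rj)) / p for rj in centered]
        for ri in centered
    ]


# Toy genotypes (made up): 3 samples x 2 SNPs, coded as allele counts
K = centered_kinship([[0, 2], [1, 1], [2, 0]])
print(K)  # [[1.0, 0.0, -1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 1.0]]
```

With NumPy the same computation is a column-mean subtraction followed by a single matrix product, which is why this step is BLAS-bound.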

Memory Safety

Unlike GEMMA, JAMMA includes pre-flight memory checks that prevent out-of-memory crashes:

from jamma.core.memory import estimate_workflow_memory

# Check memory requirements BEFORE loading data
estimate = estimate_workflow_memory(n_samples=200_000, n_snps=95_000)
print(f"Peak memory: {estimate.total_gb:.1f}GB")
print(f"Available: {estimate.available_gb:.1f}GB")
print(f"Sufficient: {estimate.sufficient}")

Key features:

  • Pre-flight checks before large allocations (eigendecomposition, genotype loading)
  • RSS memory logging at workflow boundaries
  • Incremental result writing (no memory accumulation)
  • Safe chunk size defaults with hard caps

GEMMA can exhaust memory mid-run and be killed by the OS without a diagnostic. JAMMA fails fast with a clear error message before the allocation is attempted.
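
A back-of-envelope calculation shows why such checks matter at scale. The sketch below only sizes dense float64 matrices; it is not the accounting used by `estimate_workflow_memory`, and `dense_matrix_gb` is a hypothetical helper.

```python
def dense_matrix_gb(n_rows: int, n_cols: int, bytes_per_value: int = 8) -> float:
    """Size of a dense matrix in GB (1e9 bytes); float64 by default."""
    return n_rows * n_cols * bytes_per_value / 1e9


n_samples = 200_000
kinship_gb = dense_matrix_gb(n_samples, n_samples)  # n x n kinship matrix
eigvec_gb = dense_matrix_gb(n_samples, n_samples)   # n x n eigenvector matrix
print(f"kinship: {kinship_gb:.0f} GB, eigenvectors: {eigvec_gb:.0f} GB")
# kinship: 320 GB, eigenvectors: 320 GB
```

Both matrices grow quadratically with sample count, so doubling the cohort quadruples the requirement regardless of how many SNPs are tested.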

Performance

Benchmark on mouse_hs1940 (1,940 samples × 12,226 SNPs), Apple M2 (AC power), GEMMA 0.98.5. Best-of runs, end-to-end wall clock:

| Operation | GEMMA 0.98.5 | JAMMA NumPy | JAMMA NumPy+C | JAMMA JAX (batch) | JAMMA JAX (streaming) | C speedup | vs GEMMA |
|---|---|---|---|---|---|---|---|
| Kinship (-gk 1) | 2.2s | 262ms | 262ms | - | - | 1.0x | 8.5x |
| LMM Wald (-lmm 1) | 11.3s | 4.2s | 1.2s | 2.1s | 2.6s | 3.4x | 9.2x |
| LMM All (-lmm 4) | 20.7s | 6.0s | 1.4s | 2.8s | 4.2s | 4.2x | 14.3x |
| LMM Wald+4cov (-lmm 1 -c) | 41.7s | 12.0s | 4.6s | 4.7s | 7.4s | 2.6x | 9.0x |

NumPy+C uses a C extension with OpenMP for Wald (-lmm 1): REML optimization is compute-bound and parallelizes well across SNPs. With 4 covariates the C speedup is 2.6x (3.4x without covariates), and absolute run times rise because the Pab table recursion is more expensive. NumPy+C is the fastest backend in every LMM mode at mouse scale, including all-tests (-lmm 4, 14.3x vs GEMMA). JAX (batch) uses jax.vmap batching for MLE optimization and is competitive on -lmm 4. JAX (streaming) reads genotypes from disk in chunks and is the production code path for large datasets that don't fit in memory. Kinship is always pure NumPy/BLAS regardless of backend.

LOCO (Leave-One-Chromosome-Out)

| Backend | LOCO Wald | vs GEMMA |
|---|---|---|
| GEMMA 0.98.5 | 3m36s | 1.0x |
| JAMMA NumPy+C | 7.7s | 28.2x |
| JAMMA JAX | 13.0s | 16.6x |

The large speedup has two sources: (1) JAMMA computes per-chromosome LOCO kinship via streaming and tests only that chromosome's SNPs, while GEMMA -loco tests all SNPs against each LOCO kinship (19× redundant work on 19 chromosomes); (2) JAMMA runs all chromosomes in a single process, avoiding 19 cold-start overheads. On this dataset, NumPy+C is faster than JAX because the JIT compilation overhead per chromosome outweighs XLA's compute benefit at 1,940 samples.
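
The first source of speedup is easier to see with counts. The sketch below partitions SNPs by chromosome and compares the number of association tests under the two schemes described above. The chromosome layout is made up, and `loco_plan` is a hypothetical helper, not part of JAMMA.

```python
from collections import Counter


def loco_plan(snp_chromosomes: list[str]) -> dict[str, dict[str, int]]:
    """Per chromosome: SNPs used for kinship (all others) and SNPs tested (its own)."""
    per_chr = Counter(snp_chromosomes)
    total = len(snp_chromosomes)
    return {
        chrom: {"kinship_snps": total - count, "tested_snps": count}
        for chrom, count in per_chr.items()
    }


# Toy layout (made up): 3 chromosomes with 4, 3, and 2 SNPs
chroms = ["1"] * 4 + ["2"] * 3 + ["3"] * 2
plan = loco_plan(chroms)

# Scheme A: test only the left-out chromosome's SNPs against each LOCO kinship
tests_own_chromosome = sum(p["tested_snps"] for p in plan.values())
# Scheme B: test every SNP against every LOCO kinship
tests_all_snps = len(plan) * len(chroms)

print(plan["1"])             # {'kinship_snps': 5, 'tested_snps': 4}
print(tests_own_chromosome)  # 9
print(tests_all_snps)        # 27
```

The ratio between the two counts equals the number of chromosomes, which is the 19× factor quoted for a 19-chromosome dataset.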

Supported Features

Current

  • Kinship matrix computation — centered (-gk 1) and standardized (-gk 2)
  • Univariate LMM Wald test (-lmm 1)
  • Likelihood ratio test (-lmm 2)
  • Score test (-lmm 3)
  • All tests mode (-lmm 4)
  • LOCO kinship — leave-one-chromosome-out analysis (-loco)
  • Binary .npy I/O — default for kinship and eigen files; --legacy-text for GEMMA text format
  • Multi-phenotype support — -n "1 2 3" with single eigendecomposition reuse
  • Eigendecomposition reuse — manual via -d/-u/-eigen, automatic in multi-phenotype mode
  • LOCO eigen caching — --eigen-dir saves/loads per-chromosome eigen files across runs
  • Phenotype column selection (-n)
  • SNP subset selection for association and kinship (-snps/-ksnps)
  • HWE QC filtering (-hwe)
  • Pre-computed kinship input (-k)
  • Covariate support (-c)
  • PLINK binary format (.bed/.bim/.fam) with input dimension validation
  • Large-scale streaming I/O (>100k samples via numpy-mkl ILP64 — numpy 2.4.2)
  • JAX acceleration (CPU) with automatic device sharding
  • XLA profiling traces (--profile-dir) for TensorBoard/Perfetto
  • Lambda optimization bounds (-lmin/-lmax)
  • Individual weights for kinship (-widv)
  • Categorical covariates with one-hot encoding (-cat)
  • Pre-flight memory checks (fail-fast before OOM)
  • RSS memory logging at workflow boundaries
  • Incremental result writing
  • Optional C extensions: DSYEVR eigendecomposition (O(n) workspace, enables >100k samples) and OpenMP-parallel Wald tests (auto-fallback to pure Python)
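
One dimension check that is possible for PLINK binary input follows from the .bed layout: in SNP-major mode the file has 3 header bytes, then each SNP occupies ceil(n_samples / 4) bytes (2 bits per genotype). The sketch below computes the expected size; whether JAMMA's validation works exactly this way is an assumption, and `expected_bed_size` is a hypothetical helper.

```python
def expected_bed_size(n_samples: int, n_snps: int) -> int:
    """Bytes in a SNP-major PLINK .bed: 3 header bytes + 2 bits per genotype, padded per SNP."""
    bytes_per_snp = (n_samples + 3) // 4  # 4 genotypes per byte, rounded up
    return 3 + n_snps * bytes_per_snp


# Dimensions of the mouse_hs1940 benchmark dataset
print(expected_bed_size(1_940, 12_226))  # 5929613
```

Comparing this value with the actual file size (for example via `os.path.getsize`) catches a .bed file that does not match its .bim and .fam line counts before any genotypes are decoded.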

Planned

  • Multivariate LMM (mvLMM)

Architecture

JAMMA uses NumPy for data loading and kinship. Eigendecomposition defaults to DSYEVD (via numpy) but falls back to DSYEVR (C extension, O(n) workspace) under memory pressure — critical for >100k samples. At LMM it splits into a JAX backend (JIT, vmap, sharding) or a NumPy backend with an optional C extension for OpenMP-parallel Wald tests.

flowchart TD
    CLI["CLI / gwas()"] --> PIPE["PipelineRunner"]
    PIPE --> LOAD["Load PLINK + Phenotypes<br>(NumPy)"]
    LOAD --> KIN["Kinship<br>(NumPy matmul)"]
    KIN --> EIGMEM{"DSYEVD fits<br>in memory?"}
    EIGMEM -->|yes| EIGD["Eigendecomposition<br>(LAPACK DSYEVD · O(n²) workspace)"]
    EIGMEM -->|no| EIGR["Eigendecomposition<br>(LAPACK DSYEVR · O(n) workspace)"]
    EIGD --> DET{"detect_backend()"}
    EIGR --> DET
    DET -->|"jax"| JAX["JAX Streaming Runner<br>JIT + vmap + sharding"]
    DET -->|"numpy"| NP["NumPy Batch Runner"]
    NP --> CEXT{"C LMM extension<br>available?"}
    CEXT -->|yes| C["C Extension<br>OpenMP + SIMD"]
    CEXT -->|no| PY["Pure Python<br>fallback"]
    JAX --> RES["AssocResult"]
    C --> RES
    PY --> RES

Both backends share the same core algorithms (likelihood.py, prepare_common.py) and produce identical results. Backend-specific files follow a naming convention: *_jax.py / *_numpy.py.
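
The O(n²) versus O(n) workspace difference between the two eigendecomposition routines can be quantified from the minimum workspace sizes in the LAPACK documentation (DSYEVD with eigenvectors: LWORK ≥ 1 + 6n + 2n², LIWORK ≥ 3 + 5n; DSYEVR: LWORK ≥ 26n, LIWORK ≥ 10n). The sketch below assumes 8-byte integers (ILP64) and minimum rather than optimal workspace; the function names are hypothetical, and the threshold JAMMA uses to switch routines is not shown.

```python
def dsyevd_workspace_bytes(n: int, int_bytes: int = 8) -> int:
    """Minimum DSYEVD workspace with eigenvectors: 1 + 6n + 2n^2 doubles, 3 + 5n ints."""
    return 8 * (1 + 6 * n + 2 * n * n) + int_bytes * (3 + 5 * n)


def dsyevr_workspace_bytes(n: int, int_bytes: int = 8) -> int:
    """Minimum DSYEVR workspace: 26n doubles, 10n ints."""
    return 8 * 26 * n + int_bytes * 10 * n


n = 100_000
print(f"DSYEVD: {dsyevd_workspace_bytes(n) / 1e9:.1f} GB")  # DSYEVD: 160.0 GB
print(f"DSYEVR: {dsyevr_workspace_bytes(n) / 1e6:.1f} MB")  # DSYEVR: 28.8 MB
```

This workspace comes on top of the n × n input and eigenvector matrices, so at 100k samples the choice of routine decides whether the decomposition fits on a given machine at all.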

See Code Map for the full architecture diagram with source links.

Requirements

  • Python 3.11+
  • NumPy 2.0+
  • JAX 0.5.0+ (auto-included on Linux/ARM Mac; explicit extra on other platforms: pip install 'jamma[jax]')

License

GPL-3.0 (same as GEMMA)
