Highly-Accelerated Multi-method Mixed-Model Association for large-scale GWAS

JAMMA

JAMMA (Highly-Accelerated Multi-method Mixed-model Association) -- a modern Python and C reimplementation of GEMMA for large-scale GWAS.

  • Drop-in GEMMA replacement: Same core CLI flags, same file formats, equivalent results. Change one word in your pipeline.
  • Numerical equivalence: Validated against GEMMA -- 100% significance agreement, 100% effect direction agreement
  • Fast: Up to 30x faster than GEMMA 0.98.5 (LOCO mode); 12-17x on single-pass LMM
  • Memory-safe: Pre-flight memory checks prevent OOM crashes before allocation
  • Cross-platform: Runs on Linux, macOS, and Windows with NumPy and vendor BLAS
  • Optimized for Intel: Best performance on Intel CPUs with MKL BLAS. Runs well on Apple Silicon (Accelerate BLAS). Other architectures (AMD, ARM Linux) work correctly but with less BLAS optimization
  • Pure Python + C extensions (OpenMP SIMD): NumPy stack with vendor BLAS dispatch (MKL-ILP64, Accelerate-ILP64) via jlinalg C layer for eigendecomposition and OpenMP-parallel Wald tests
  • Large-scale ready: Optional numpy-mkl ILP64 wheels (numpy 2.4.3) for >46k sample eigendecomposition

Installation

macOS (13.3+)

pip install jamma

That's it. macOS Accelerate BLAS handles large matrices natively (Accelerate-ILP64).

Windows (10+), Windows Server (2016+) and Linux (Intel/AMD)

Install numpy-mkl first -- standard numpy uses 32-bit BLAS integers which overflow at ~46k samples. Pre-built ILP64 wheels are available for Python 3.11-3.14:

pip install psutil loguru threadpoolctl click progressbar2 bed-reader
pip install numpy \
  --extra-index-url https://michael-denyer.github.io/numpy-mkl \
  --force-reinstall --upgrade
pip install jamma --no-deps

From Git (latest development version):

pip install psutil loguru threadpoolctl click progressbar2 bed-reader
pip install numpy \
  --extra-index-url https://michael-denyer.github.io/numpy-mkl \
  --force-reinstall --upgrade
pip install git+https://github.com/michael-denyer/jamma.git --no-deps

Why --no-deps? JAMMA depends on numpy>=2.0.0, so a normal pip install jamma will pull in standard numpy and overwrite the ILP64 build. --no-deps prevents this; you install the runtime dependencies manually instead.
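
Because --no-deps skips dependency resolution, a missing runtime package would only surface at import time. The following is a minimal sketch for checking the environment with the standard library alone. The helper name installed_versions is hypothetical, and the package list is copied from the pip commands above.

```python
from importlib import metadata

# Runtime dependencies installed by hand in the commands above, plus numpy.
RUNTIME_DEPS = ["numpy", "psutil", "loguru", "threadpoolctl",
                "click", "progressbar2", "bed-reader"]


def installed_versions(packages=RUNTIME_DEPS):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name:15s} {version or 'MISSING'}")
```

This only confirms that the packages are present. It does not tell you whether the installed numpy is the ILP64 build.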

See the User Guide for ILP64 verification steps.
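
The ~46k threshold follows from integer arithmetic. As a sketch, assuming the overflow occurs once the element count of an n x n matrix no longer fits in a signed 32-bit integer (the helper names are illustrative and not part of JAMMA):

```python
import math

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit (LP64 BLAS) integer


def max_lp64_samples() -> int:
    """Largest n for which n * n still fits in a signed 32-bit integer."""
    return math.isqrt(INT32_MAX)


def needs_ilp64(n_samples: int) -> bool:
    """True when an n x n matrix has more elements than int32 can count."""
    return n_samples * n_samples > INT32_MAX


print(max_lp64_samples())   # 46340
print(needs_ilp64(46_340))  # False
print(needs_ilp64(46_341))  # True
```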

Platform Support

| Platform | BLAS | ILP64 | Notes |
|---|---|---|---|
| Linux x86_64 | MKL (optimal) | numpy-mkl | Best performance |
| ARM Linux | OpenBLAS | -- | Works correctly |
| ARM Mac (M1+) | Accelerate | native | Excellent performance |
| Intel Mac (macOS 13.3+) | Accelerate | native | Full support |
| Windows x86_64 (10+) | MKL (optimal) | numpy-mkl | Best performance |
| Windows Server x86_64 (2016+) | MKL (optimal) | numpy-mkl | Best performance |

See the User Guide for BLAS backend details.

Quick Start

# Compute kinship matrix (centered relatedness)
jamma -gk 1 -bfile data/my_study -o output
# Output: output/output.cXX.npy (binary, fast)
# Add --legacy-text for GEMMA-compatible text format

# Run LMM association (Wald test)
jamma -lmm 1 -bfile data/my_study -k output/output.cXX.npy -o results

# Multiple phenotypes (eigendecomp computed once, reused)
jamma -lmm 1 -bfile data/my_study -k output/output.cXX.npy -n "1 2 3" -o results

Output files:

  • output.cXX.npy -- Kinship matrix (binary NumPy format; .cXX.txt with --legacy-text)
  • results.assoc.txt -- Association results (chr, rs, ps, n_miss, allele1, allele0, af, beta, se, logl_H1, l_remle, p_wald)
  • results.log.txt -- Run log

The reader auto-detects format, so existing .cXX.txt files still work as -k input.
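
Downstream filtering of the association file needs nothing beyond the standard library. The sketch below assumes the file is tab-separated with a header row, as GEMMA's .assoc.txt is, and uses the -lmm 1 column names listed above. The two sample rows are made up for illustration, and significant_snps is a hypothetical helper.

```python
import csv
import io

# Synthetic two-row example in the -lmm 1 column layout (values are made up).
SAMPLE = (
    "chr\trs\tps\tn_miss\tallele1\tallele0\taf\tbeta\tse\tlogl_H1\tl_remle\tp_wald\n"
    "1\trs1\t1000\t0\tA\tG\t0.25\t0.50\t0.05\t-100.0\t1.2\t1.0e-10\n"
    "1\trs2\t2000\t1\tC\tT\t0.40\t0.01\t0.05\t-105.0\t1.3\t8.4e-01\n"
)


def significant_snps(handle, threshold=5e-8):
    """Yield (rs, p_wald) for rows whose Wald p-value is below threshold."""
    for row in csv.DictReader(handle, delimiter="\t"):
        p = float(row["p_wald"])
        if p < threshold:
            yield row["rs"], p


hits = list(significant_snps(io.StringIO(SAMPLE)))
print(hits)  # [('rs1', 1e-10)]
```

For a real run, pass an opened .assoc.txt file in place of the StringIO object.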

GEMMA CLI Parity

JAMMA supports GEMMA's core GWAS flags (-gk, -lmm, -bfile, -k, -c, -o, -n, -loco, -snps, -hwe) with identical names and semantics. Existing GEMMA commands work by changing gemma to jamma:

| GEMMA | JAMMA |
|---|---|
| gemma -gk 1 -bfile study -o out | jamma -gk 1 -bfile study -o out |
| gemma -lmm 1 -bfile study -k kinship.cXX.txt -o results | jamma -lmm 1 -bfile study -k kinship.cXX.txt -o results |
| gemma -lmm 4 -bfile study -k k.txt -c covars.txt -o results | jamma -lmm 4 -bfile study -k k.txt -c covars.txt -o results |

  • Reads and writes GEMMA .assoc.txt and .cXX.txt formats
  • Accepts PLINK binary .bed/.bim/.fam files (same as GEMMA)
  • Output columns match GEMMA (mode-dependent -- see User Guide)
  • Also supports binary .npy format for kinship (faster I/O); use --legacy-text for GEMMA text format
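
For pipelines that build command lines programmatically, the one-word change can be sketched as below. to_jamma is a hypothetical helper, not something JAMMA ships. It only swaps the executable name, so it is only appropriate for commands that stick to the supported flags listed above.

```python
import shlex
from pathlib import PurePath


def to_jamma(command: str) -> str:
    """Swap the executable name from gemma to jamma, leaving all flags untouched."""
    argv = shlex.split(command)
    if not argv or PurePath(argv[0]).name != "gemma":
        raise ValueError("expected a command that starts with 'gemma'")
    argv[0] = "jamma"
    return shlex.join(argv)


print(to_jamma("gemma -lmm 1 -bfile study -k kinship.cXX.txt -o results"))
# jamma -lmm 1 -bfile study -k kinship.cXX.txt -o results
```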

Python API

The gwas() function handles the full pipeline -- data loading, kinship computation, eigendecomposition, and LMM association -- in a single call. You don't need to compute a kinship matrix separately unless you want to reuse it across runs.

from jamma import gwas

# Simplest usage: computes kinship internally, no separate kinship step needed
result = gwas("data/my_study")
print(f"Tested {result.n_snps_tested} SNPs in {result.timing['total_s']:.1f}s")

# Or supply a pre-computed kinship matrix to skip recomputation
result = gwas("data/my_study", kinship_file="data/kinship.cXX.npy")

# Compute kinship from scratch and save it for reuse
result = gwas("data/my_study", save_kinship=True, output_dir="output")

# With covariates and the likelihood ratio test (lmm_mode=2)
result = gwas("data/my_study", kinship_file="k.txt", covariate_file="covars.txt", lmm_mode=2)

# LOCO analysis (leave-one-chromosome-out)
result = gwas("data/my_study", loco=True)

# LOCO with eigen caching (skip eigendecomp on subsequent runs)
result = gwas("data/my_study", loco=True, write_eigen=True)
# Reuse cached eigen files from a previous run
result = gwas("data/my_study", loco=True,
              eigenvalue_file="output/result.eigenD.npy",
              eigenvector_file="output/result.eigenU.npy")

# Multi-phenotype with eigendecomp reuse (Python API)
result = gwas("data/my_study", write_eigen=True, phenotype_column=1)
result = gwas("data/my_study", eigenvalue_file="output/result.eigenD.npy",
              eigenvector_file="output/result.eigenU.npy", phenotype_column=2)
# Or use the CLI for automatic multi-phenotype: jamma -lmm 1 ... -n "1 2 3"

# SNP filtering
result = gwas("data/my_study", kinship_file="k.txt", snps_file="snps.txt", hwe=0.001)

See the User Guide for the low-level component API (kinship, eigendecomposition, LMM runners).
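
The manual eigendecomposition reuse shown above generalizes to any number of phenotype columns. The sketch below wraps it in a loop. run_phenotypes is a hypothetical helper, the keyword arguments are the ones used in the examples above, and the eigen file paths are assumed to match those examples. The gwas callable is passed in as an argument so the loop can be exercised without JAMMA installed.

```python
def run_phenotypes(gwas, bfile, columns,
                   eigen_d="output/result.eigenD.npy",
                   eigen_u="output/result.eigenU.npy"):
    """Run gwas() once per phenotype column, reusing the first run's eigen files.

    The default eigen file paths are an assumption copied from the examples.
    """
    results = {}
    for i, column in enumerate(columns):
        if i == 0:
            # First run computes the eigendecomposition and writes it to disk.
            results[column] = gwas(bfile, write_eigen=True,
                                   phenotype_column=column)
        else:
            # Later runs load the cached eigenvalues and eigenvectors instead.
            results[column] = gwas(bfile, eigenvalue_file=eigen_d,
                                   eigenvector_file=eigen_u,
                                   phenotype_column=column)
    return results


# Usage with the real function (requires JAMMA and a PLINK fileset):
#   from jamma import gwas
#   results = run_phenotypes(gwas, "data/my_study", [1, 2, 3])
```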

Memory Safety

Unlike GEMMA, JAMMA includes pre-flight memory checks that prevent out-of-memory crashes:

  • Pre-flight checks before large allocations (eigendecomposition, genotype loading)
  • RSS memory logging at workflow boundaries
  • Incremental result writing (no memory accumulation)
  • Safe chunk size defaults with hard caps

GEMMA can run out of memory without warning and be killed by the OS, whereas JAMMA fails fast with a clear error message before the allocation is attempted. See the User Guide for the programmatic memory estimation API.
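
The idea of a pre-flight check can be sketched in a few lines. This is not JAMMA's estimator. The function names and the rough two-matrix estimate are assumptions for illustration, and the available-memory figure is passed in as a parameter (in practice it could come from psutil.virtual_memory().available).

```python
class InsufficientMemoryError(MemoryError):
    """Raised before allocation when an estimate exceeds available memory."""


def eigendecomp_bytes(n_samples: int, itemsize: int = 8) -> int:
    """Rough lower bound: n x n kinship plus n x n eigenvectors in float64."""
    return 2 * n_samples * n_samples * itemsize


def preflight_check(required: int, available: int, label: str) -> None:
    """Fail fast with a readable message instead of letting the OS kill the job."""
    if required > available:
        raise InsufficientMemoryError(
            f"{label} needs ~{required / 2**30:.1f} GiB "
            f"but only {available / 2**30:.1f} GiB is available"
        )


# 100,000 samples: 2 * 1e10 elements * 8 bytes = 1.6e11 bytes.
need = eigendecomp_bytes(100_000)
try:
    preflight_check(need, available=64 * 2**30, label="eigendecomposition")
except InsufficientMemoryError as err:
    print(err)
```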

Performance

Benchmark on mouse_hs1940 (1,940 samples x 12,226 SNPs), Apple M2, GEMMA 0.98.5. Times are end-to-end wall clock, taking the best of repeated runs:

| Operation | GEMMA (OpenBLAS) | GEMMA (Accelerate) | JAMMA NumPy | JAMMA NumPy+C | JAMMA NumPy+C (stream) | C speedup | vs GEMMA (OB) | vs GEMMA (Accel) |
|---|---|---|---|---|---|---|---|---|
| Kinship (-gk 1) | 2.1s | 1.7s | 262ms | 262ms | -- | 1.0x | 8.0x | 6.5x |
| LMM Wald (-lmm 1) | 11.0s | 7.6s | 4.1s | 879ms | 1.1s | 4.7x | 12.5x | 8.7x |
| LMM All (-lmm 4) | 20.5s | 13.9s | 6.0s | 1.3s | 1.4s | 4.7x | 16.0x | 10.9x |
| LMM Wald+4cov (-lmm 1 -c) | 40.8s | 18.8s | 9.1s | 2.4s | 2.6s | 3.8x | 17.0x | 7.8x |
| LOCO Wald (-loco) | 3m30s | 2m26s | -- | 7.1s | -- | -- | 29.6x | 20.6x |

See Performance for benchmark methodology and large-scale (125k) results.
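
The speedup columns are ratios of wall-clock times within a row. As a worked check, the sketch below parses the duration strings used in the table and recomputes two of the ratios. The helper names are illustrative. Because the published timings are rounded, a recomputed ratio can differ from the table in the last digit.

```python
import re

_DURATION = re.compile(r"(?:(\d+)m)?(\d+(?:\.\d+)?)(ms|s)")


def seconds(text: str) -> float:
    """Parse durations written like '2.1s', '879ms', or '3m30s' into seconds."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, value, unit = match.groups()
    total = float(value) / 1000 if unit == "ms" else float(value)
    return total + 60 * int(minutes or 0)


def speedup(baseline: str, candidate: str) -> float:
    """How many times faster `candidate` is than `baseline`."""
    return seconds(baseline) / seconds(candidate)


print(f"{speedup('11.0s', '879ms'):.1f}x")  # 12.5x  (LMM Wald vs GEMMA OpenBLAS)
print(f"{speedup('3m30s', '7.1s'):.1f}x")   # 29.6x  (LOCO vs GEMMA OpenBLAS)
```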

Supported Features

Current

  • Kinship matrix computation -- centered (-gk 1) and standardized (-gk 2)
  • Univariate LMM Wald test (-lmm 1)
  • Likelihood ratio test (-lmm 2)
  • Score test (-lmm 3)
  • All tests mode (-lmm 4)
  • LOCO kinship -- leave-one-chromosome-out analysis (-loco)
  • Binary .npy I/O -- default for kinship and eigen files; --legacy-text for GEMMA text format
  • Multi-phenotype support -- -n "1 2 3" with single eigendecomposition reuse
  • Eigendecomposition reuse -- manual via -d/-u/-eigen, automatic in multi-phenotype mode
  • LOCO eigen caching -- --eigen-dir saves/loads per-chromosome eigen files across runs
  • Phenotype column selection (-n)
  • SNP subset selection for association and kinship (-snps/-ksnps)
  • HWE QC filtering (-hwe)
  • Pre-computed kinship input (-k)
  • Covariate support (-c)
  • PLINK binary format (.bed/.bim/.fam) with input dimension validation
  • Large-scale streaming I/O (>100k samples via numpy-mkl ILP64 -- numpy 2.4.3)
  • Lambda optimization bounds (-lmin/-lmax)
  • Individual weights for kinship (-widv)
  • Categorical covariates with one-hot encoding (-cat)
  • Pre-flight memory checks (fail-fast before OOM)
  • RSS memory logging at workflow boundaries
  • Incremental result writing
  • In-place mean imputation for missing genotypes (per-chunk, zero-copy)
  • Early sample filtering -- kinship accumulated at filtered size when phenotype missingness is present
  • jlinalg C layer: vendor BLAS dispatch for eigendecomposition (DSYEVD default, DSYEVR O(n) workspace fallback under memory pressure), DSYRK, DGEMM
  • Optional C extension: OpenMP-parallel Wald tests (auto-fallback to pure Python)
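
Two of the items above, mean imputation of missing genotypes and the centered kinship (-gk 1), can be illustrated on a toy matrix. The sketch uses plain Python lists and the centered relatedness formula from the GEMMA manual, K = (1/p) * sum of (x - mean(x))(x - mean(x))^T over the p SNPs. It is not JAMMA's chunked DGEMM implementation, and the function names are hypothetical.

```python
def impute_mean(snp):
    """Replace missing calls (None) with the mean of the observed genotypes."""
    observed = [g for g in snp if g is not None]
    mean = sum(observed) / len(observed)
    return [mean if g is None else float(g) for g in snp]


def centered_kinship(snps):
    """Accumulate the centered relatedness matrix one SNP at a time."""
    n = len(snps[0])
    kinship = [[0.0] * n for _ in range(n)]
    for snp in snps:
        x = impute_mean(snp)
        mean = sum(x) / n
        centered = [g - mean for g in x]
        for i in range(n):
            for j in range(n):
                kinship[i][j] += centered[i] * centered[j] / len(snps)
    return kinship


# Two SNPs (rows) by three samples (columns); None marks a missing genotype.
toy = [[0, 1, 2],
       [2, None, 0]]
for row in centered_kinship(toy):
    print(row)
```

After imputation the second SNP becomes [2, 1, 0], so samples 1 and 3 are perfectly anti-correlated across both SNPs and sample 2 sits at the mean throughout.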

Planned

  • Multivariate LMM (mvLMM)

Architecture

JAMMA uses NumPy for data loading and kinship. Eigendecomposition goes through jlinalg.eigh, which dispatches via the jlinalg C layer to the vendor DSYEVD (default) or, under memory pressure, DSYEVR (O(n) workspace). LMM association uses a NumPy backend with an optional C extension for OpenMP-parallel Wald/Score/LRT tests. The runner is selected automatically from available memory: the batch runner when genotypes fit in RAM, the streaming runner (two-pass disk I/O) for larger datasets.

flowchart TD
    subgraph ENTRY["ENTRY"]
        CLI["CLI / gwas()"]
        PIPE["PipelineRunner"]
        CLI --> PIPE
    end

    subgraph IO["DATA LOADING"]
        LOAD["Load PLINK +\nPhenotypes"]
    end

    subgraph CORE["CORE COMPUTATION"]
        KIN["Kinship\n(DGEMM, chunked)"]
        EIG["Eigendecomposition\n(jlinalg.eigh → DSYEVD/DSYEVR)"]
        KIN --> EIG
    end

    subgraph ASSOC["ASSOCIATION TESTING"]
        MEM{"Memory\nbudget?"}
        NP["Batch Runner\n(genotypes in RAM)"]
        NPS["Streaming Runner\n(two-pass disk I/O)"]
        CEXT{"C extension?"}
        C["C Extension\nOpenMP + SIMD"]
        PY["Pure Python\nfallback"]
        MEM -->|"fits"| NP
        MEM -->|"large"| NPS
        NP --> CEXT
        NPS --> CEXT
        CEXT -->|"yes"| C
        CEXT -->|"no"| PY
    end

    RES["AssocResult\n(.assoc.txt)"]

    PIPE --> LOAD --> CORE
    EIG --> ASSOC
    C --> RES
    PY --> RES

    style ENTRY fill:#1a1a2e,stroke:#53a8b6,color:#eee,stroke-width:2px
    style IO fill:#1a1a2e,stroke:#53a8b6,color:#eee,stroke-width:2px
    style CORE fill:#0f3460,stroke:#f5b461,color:#eee,stroke-width:2px
    style ASSOC fill:#0f3460,stroke:#e94560,color:#eee,stroke-width:2px

    style CLI fill:#53a8b6,stroke:#3d8a96,color:#fff
    style PIPE fill:#53a8b6,stroke:#3d8a96,color:#fff
    style LOAD fill:#53a8b6,stroke:#3d8a96,color:#fff

    style KIN fill:#f5b461,stroke:#d4943f,color:#1a1a2e
    style EIG fill:#f5b461,stroke:#d4943f,color:#1a1a2e

    style MEM fill:#e94560,stroke:#c73550,color:#fff
    style NP fill:#7b68ae,stroke:#5a4d8a,color:#fff
    style NPS fill:#7b68ae,stroke:#5a4d8a,color:#fff
    style CEXT fill:#e94560,stroke:#c73550,color:#fff
    style C fill:#2ecc71,stroke:#27ae60,color:#1a1a2e
    style PY fill:#95a5a6,stroke:#7f8c8d,color:#1a1a2e

    style RES fill:#2ecc71,stroke:#27ae60,color:#1a1a2e

Core algorithms (likelihood.py, prepare_common.py) are shared between batch and streaming runners. See jlinalg Architecture for the C vendor BLAS dispatch layer.

See Code Map for the full architecture diagram with source links.
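
The two memory-driven decisions in this architecture, the eigensolver driver and the batch versus streaming runner, can be sketched as follows. The workspace sizes are the minimum LWORK values from the LAPACK documentation for DSYEVD (with eigenvectors) and DSYEVR. Everything else is an assumption for illustration: the function names, the float64 genotype estimate, and the simple fits-or-not thresholds are not JAMMA's actual decision rules.

```python
def dsyevd_workspace_doubles(n: int) -> int:
    """LAPACK minimum LWORK for DSYEVD with eigenvectors: 1 + 6n + 2n^2."""
    return 1 + 6 * n + 2 * n * n


def dsyevr_workspace_doubles(n: int) -> int:
    """LAPACK minimum LWORK for DSYEVR: 26n, i.e. linear in n."""
    return 26 * n


def choose_eigen_driver(n: int, free_bytes: int) -> str:
    """Prefer DSYEVD; fall back to DSYEVR when its workspace would not fit."""
    fits = 8 * dsyevd_workspace_doubles(n) <= free_bytes
    return "DSYEVD" if fits else "DSYEVR"


def choose_runner(n_samples: int, n_snps: int, budget_bytes: int) -> str:
    """Batch if a float64 genotype matrix fits in the budget, else streaming."""
    genotype_bytes = 8 * n_samples * n_snps
    return "batch" if genotype_bytes <= budget_bytes else "streaming"


GIB = 2**30
print(choose_runner(1_940, 12_226, 8 * GIB))   # batch
print(choose_eigen_driver(100_000, 64 * GIB))  # DSYEVR
```

At 100,000 samples the quadratic DSYEVD workspace alone is about 1.6e11 bytes, while the DSYEVR workspace is about 2.1e7 bytes, which is why a linear-workspace fallback matters at that scale.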

Requirements

  • Python 3.11+
  • NumPy 2.0+

License

GPL-3.0 (same as GEMMA)
