A Suite of Genotyping Tools for Genome-Wide Association Study and Genomic Selection

JanusX

Project Overview

JanusX is a high-performance, all-in-one suite for quantitative genetics that unifies genome-wide association studies (GWAS) and genomic selection (GS). It implements well-established GWAS methods (LM, LMM, and FarmCPU) and a flexible GS toolkit that includes GBLUP and various machine learning models. It also covers routine genomic analyses, from data processing to publication-ready visualisation.

It provides significant performance improvements over tools like GEMMA, GCTA, and rMVP, especially in multi-threaded computation.

Installation

PyPI

pip install janusx

From Source

git clone https://github.com/MaizeMan-JxFU/JanusX.git
cd JanusX
pip install .

Building from source requires a Rust toolchain (maturin will compile the native core).

Pre-compiled Releases

We provide pre-compiled binaries on the GitHub Releases page for Windows, Linux, and macOS. Download and extract the archive, then run the executable directly.

Running the CLI

jx -h
jx <module> [options]

Note that the first run of jx -h may take a while, because the Python interpreter compiles the source code to bytecode and caches it in __pycache__ directories. Subsequent runs reuse the cached bytecode and start much faster.

Available Modules

Module     Description
gwas       Unified GWAS wrapper (LM/LMM/fastLMM/FarmCPU)
gs         Genomic Selection (GBLUP, rrBLUP)
postGWAS   Visualization and annotation
grm        Genetic relationship matrix calculation
pca        Principal component analysis
sim        Genotype and phenotype simulation

Quick Start Examples

GWAS Analysis

# Using unified gwas module (select one or more models)
jx gwas --vcf data.vcf.gz --pheno pheno.txt --lmm -o results

# Run multiple models at once
jx gwas --vcf data.vcf.gz --pheno pheno.txt --lm --lmm --fastlmm --farmcpu -o results

# With PLINK format
jx gwas --bfile genotypes --pheno phenotypes.txt --grm 1 --qcov 3 --thread 8 -o results

# With diagnostic plots (SVG)
jx gwas --vcf data.vcf.gz --pheno pheno.txt --lmm --plot -o results
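
For scripted pipelines, the commands above can be assembled from Python before being handed to the shell. The following is a minimal sketch that only builds the argument list, using flags that appear in the examples above (--vcf, --pheno, --lm, --lmm, --fastlmm, --farmcpu, --thread, --plot, -o). The helper name build_gwas_command is hypothetical and not part of JanusX, and combining --thread with --vcf is an assumption, since the examples only show it with --bfile.

```python
import shlex

# Model flags taken from the CLI examples above.
MODEL_FLAGS = {"lm": "--lm", "lmm": "--lmm", "fastlmm": "--fastlmm", "farmcpu": "--farmcpu"}


def build_gwas_command(vcf, pheno, out, models=("lmm",), threads=None, plot=False):
    """Build (but do not run) a `jx gwas` argument list. Hypothetical helper."""
    if not models:
        raise ValueError("select at least one model")
    cmd = ["jx", "gwas", "--vcf", str(vcf), "--pheno", str(pheno)]
    for model in models:
        cmd.append(MODEL_FLAGS[model.lower()])  # KeyError on an unknown model name
    if threads is not None:
        cmd += ["--thread", str(int(threads))]
    if plot:
        cmd.append("--plot")
    cmd += ["-o", str(out)]
    return cmd


if __name__ == "__main__":
    cmd = build_gwas_command("data.vcf.gz", "pheno.txt", "results",
                             models=("lm", "lmm"), threads=8)
    print(shlex.join(cmd))
    # To actually run the analysis (requires JanusX to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run rather than a shell string avoids quoting problems when file paths contain spaces.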

Genomic Selection

# Run both GS models
jx gs --vcf data.vcf.gz --pheno pheno.txt --GBLUP --rrBLUP -o results

# Specific models
jx gs --vcf data.vcf.gz --pheno pheno.txt --GBLUP -o results

# With PCA-based dimensionality reduction
jx gs --vcf data.vcf.gz --pheno pheno.txt --GBLUP --pcd -o results

Visualization

# Generate Manhattan and QQ plots
jx postGWAS -f results/*.lmm.tsv --threshold 1e-6

# With SNP annotation
jx postGWAS -f results/*.lmm.tsv --threshold 1e-6 -a annotation.gff --annobroaden 50

[Figure: Manhattan and QQ plots]

The test data in example/ come from genetics-statistics/GEMMA and were published in Parker et al., Nature Genetics, 2016.

Population Structure

# Compute GRM
jx grm --vcf data.vcf.gz -o results

# PCA analysis
jx pca --vcf data.vcf.gz --dim 5 --plot --plot3D -o results

Input File Formats

Phenotype File

Tab-delimited; the first column is the sample ID and each subsequent column is a phenotype:

samples trait1 trait2
indv1 10.5 0.85
indv2 12.3 0.92
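
To check a phenotype file before launching an analysis, the layout above can be parsed with the Python standard library. This is a sketch of the format only, not JanusX's own reader. The function name read_phenotypes is hypothetical, and treating empty cells and "NA" as missing values is an assumption, since the source does not say how missing phenotypes are coded.

```python
import csv
import io


def read_phenotypes(handle):
    """Parse a tab-delimited phenotype table into {trait: {sample_id: value}}.

    Empty cells and "NA" are stored as None (an assumption about missing-value coding).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    traits = header[1:]  # first column holds the sample IDs
    table = {trait: {} for trait in traits}
    for row in reader:
        if not row:
            continue  # skip blank lines
        sample, values = row[0], row[1:]
        if len(values) != len(traits):
            raise ValueError(f"{sample}: expected {len(traits)} values, got {len(values)}")
        for trait, value in zip(traits, values):
            table[trait][sample] = None if value in ("", "NA") else float(value)
    return table


if __name__ == "__main__":
    example = "samples\ttrait1\ttrait2\nindv1\t10.5\t0.85\nindv2\t12.3\t0.92\n"
    print(read_phenotypes(io.StringIO(example)))
```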

Genotype Files

  • VCF: .vcf or .vcf.gz
  • PLINK: .bed/.bim/.fam (use prefix)
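
Because PLINK input is given as a prefix, all three files must sit next to each other under the same base name. A small pre-flight check along these lines can catch a missing file before a long run starts. The helper missing_plink_files is hypothetical and not part of JanusX.

```python
from pathlib import Path
import tempfile

PLINK_SUFFIXES = (".bed", ".bim", ".fam")


def missing_plink_files(prefix):
    """Return the PLINK files missing for a prefix (empty list if the set is complete)."""
    # Suffixes are appended as strings: Path.with_suffix would mangle
    # prefixes that already contain a dot, such as "cohort.v2".
    candidates = [f"{prefix}{suffix}" for suffix in PLINK_SUFFIXES]
    return [path for path in candidates if not Path(path).is_file()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "genotypes"
        for suffix in (".bed", ".bim"):
            Path(f"{prefix}{suffix}").touch()
        print(missing_plink_files(prefix))  # only the .fam path is reported
```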

Architecture

Core Libraries

  • python/janusx/pyBLUP - Core statistical engine

    • GWAS implementations (LM, LMM, FarmCPU)
    • QK matrix calculation with memory-optimized chunking
    • PCA computation with randomized SVD
    • Cross-validation utilities
  • python/janusx/gfreader - Genotype file I/O

    • VCF reader
    • PLINK binary reader (.bed/.bim/.fam)
    • NumPy format support
  • python/janusx/bioplotkit - Visualization

    • Manhattan and QQ plots
    • PCA visualization (2D and 3D GIF)
    • LD block visualization

Native Core (src/)

Rust kernels for fast linear algebra and association testing.

CLI Entry Points (python/janusx/script/)

Each module corresponds to a CLI command. The launcher script (jx) dispatches to script/<name>.py.
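
The dispatch pattern described here can be sketched as follows. This is an illustration of the idea, not the actual jx launcher: the dotted package path janusx.script is inferred from the directory layout above, and the assumption that each script exposes a main(args) function is hypothetical.

```python
import importlib

# Module names from the "Available Modules" table.
MODULES = ("gwas", "gs", "postGWAS", "grm", "pca", "sim")


def resolve(argv, package="janusx.script"):
    """Map `jx <module> [options]` to (dotted module path, remaining args)."""
    if not argv or argv[0] in ("-h", "--help"):
        return None, []
    name, rest = argv[0], list(argv[1:])
    if name not in MODULES:
        raise SystemExit(f"unknown module {name!r}; choose from {', '.join(MODULES)}")
    return f"{package}.{name}", rest


def dispatch(argv):
    """Import the selected script on demand and hand it the remaining arguments."""
    target, rest = resolve(argv)
    if target is None:
        print("usage: jx <module> [options]")
        return 0
    module = importlib.import_module(target)  # loads script/<name>.py lazily
    return module.main(rest)  # assumes a main(args) entry point (hypothetical)


if __name__ == "__main__":
    print(resolve(["gwas", "--lmm"]))
```

Importing the target module only after the name is resolved keeps start-up cheap, since the other scripts and their dependencies are never loaded.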

Key Features

  • Two Core Functions: Unified GWAS and GS workflows in one tool
  • Easy to Use: Simple CLI, minimal configuration required
  • High Performance: Optimized LMM computation with multi-threading

Key Algorithms

GWAS Methods

Method                     Description                                                   Best For
Linear Model (LM)          Standard GLM for association testing                          Large datasets without population structure
Linear Mixed Model (LMM)   Incorporates kinship matrix to control population structure   Most GWAS scenarios
fastLMM                    Fixed-lambda mixed model for speed                            Fast approximate screening
FarmCPU                    Iterative fixed/random effect alternation                     High power with strict false positive control
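
For orientation, the LM and LMM rows can be written in their usual textbook form. The notation is an assumption, since the source does not give JanusX's exact parameterization: y is the phenotype vector, X holds the covariates, s_j is the genotype vector of marker j, and K is the kinship matrix.

```latex
% LM: single-marker fixed-effect test
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{s}_j \alpha_j + \mathbf{e},
\qquad \mathbf{e} \sim \mathcal{N}(\mathbf{0},\, \sigma_e^2 \mathbf{I})

% LMM: the same test with a polygenic random effect
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{s}_j \alpha_j + \mathbf{u} + \mathbf{e},
\qquad \mathbf{u} \sim \mathcal{N}(\mathbf{0},\, \sigma_g^2 \mathbf{K}),
\quad \mathbf{e} \sim \mathcal{N}(\mathbf{0},\, \sigma_e^2 \mathbf{I})

% Null hypothesis tested for each marker j
H_0 : \alpha_j = 0
```

One common reading of "fixed-lambda" is that the variance ratio \lambda = \sigma_g^2 / \sigma_e^2 is estimated once under the null model and reused for every marker instead of being re-estimated per marker. That reading is an assumption about the table entry and is not confirmed by the source.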

GS Methods

Method   Description                               Best For
GBLUP    Genomic Best Linear Unbiased Prediction   Baseline prediction
rrBLUP   Ridge Regression BLUP                     Additive genetic value estimation
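
The two models are closely related, which the standard formulation makes explicit. The notation below is an assumption and is not taken from the JanusX source: Z is the centered marker matrix, G is a genomic relationship matrix, and c is its scaling constant.

```latex
% rrBLUP: marker effects as random effects with a common variance
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \mathbf{e},
\qquad \mathbf{u} \sim \mathcal{N}(\mathbf{0},\, \sigma_u^2 \mathbf{I})

% GBLUP: total genomic values as random effects
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{g} + \mathbf{e},
\qquad \mathbf{g} \sim \mathcal{N}(\mathbf{0},\, \sigma_g^2 \mathbf{G})

% Link between the two: setting g = Zu gives
\operatorname{Var}(\mathbf{Z}\mathbf{u}) = \sigma_u^2\, \mathbf{Z}\mathbf{Z}^{\top}
= \sigma_g^2 \mathbf{G}
\quad \text{when} \quad
\mathbf{G} = \frac{\mathbf{Z}\mathbf{Z}^{\top}}{c}, \;\; \sigma_g^2 = c\, \sigma_u^2
```

Under these assumptions both models yield the same predicted genomic values. rrBLUP works in marker space and GBLUP in individual space, so the cheaper one depends on whether markers or individuals are more numerous.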

Kinship Methods

  • Method 1 (VanRaden): Centered GRM (default)
  • Method 2 (Yang): Standardized/weighted GRM
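
The difference between the two methods can be sketched in a few lines of plain Python. This follows the commonly cited VanRaden and Yang formulas and is not JanusX's implementation. It assumes genotypes coded as 0/1/2 allele counts with no missing values, and dropping monomorphic markers in the standardized version is an assumption made here to avoid division by zero.

```python
def allele_frequencies(genotypes):
    """genotypes: list of samples, each a list of 0/1/2 allele counts."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]


def grm_centered(genotypes):
    """Centered GRM: Z Z' / (2 * sum_j p_j (1 - p_j)), with Z = M - 2p."""
    p = allele_frequencies(genotypes)
    z = [[g - 2 * pj for g, pj in zip(row, p)] for row in genotypes]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in z] for zi in z]


def grm_standardized(genotypes):
    """Standardized GRM: each marker is scaled by its own variance 2 p (1 - p)."""
    p = allele_frequencies(genotypes)
    keep = [j for j, pj in enumerate(p) if 0 < pj < 1]  # drop monomorphic markers
    w = [[(row[j] - 2 * p[j]) / (2 * p[j] * (1 - p[j])) ** 0.5 for j in keep]
         for row in genotypes]
    return [[sum(a * b for a, b in zip(wi, wk)) / len(keep) for wk in w] for wi in w]


if __name__ == "__main__":
    toy = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]  # 3 samples x 3 markers
    for row in grm_centered(toy):
        print([round(value, 3) for value in row])
```

The centered version divides by a single global constant, so common markers dominate the result. The standardized version gives every marker equal weight, which up-weights rare alleles.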

Python Version

Requires Python 3.10+

Test Data

The example data in the example/ directory are from Parker et al., Nature Genetics, 2016 (via the GEMMA project).

Citation

@software{JanusX,
  title = {JanusX: High-performance GWAS and Genomic Selection Suite},
  author = {Jingxian FU},
  url = {https://github.com/MaizeMan-JxFU/JanusX}
}
