ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Ancestry Inference

Fast Biobank-Scale Population Genetics Clustering

ADAMIXTURE is a fast CPU/GPU implementation of ADMIXTURE for biobank-scale genetic clustering. Its .P and .Q output files remain compatible with ADMIXTURE.

System requirements

Hardware requirements

This package requires a computer with enough RAM to hold the large datasets it has been designed to work with. For this reason, we recommend using compute clusters whenever available to avoid memory issues.

Software requirements

We recommend creating a fresh Python 3.10+ virtual environment. For a faster installation experience, we highly recommend using uv.

Important: If you plan to use GPU acceleration, make sure the CUDA toolkit is loaded (e.g., module load cuda) before starting the installation, so that the dependencies and internal components are configured correctly for your hardware.

As an example, using uv (recommended):

$ uv venv --python 3.10
$ source .venv/bin/activate
$ uv pip install adamixture

Installation Guide

The package can be installed with pip, typically within a few minutes (add the --upgrade flag when updating an existing installation):

$ pip install adamixture

Running ADAMIXTURE

To train a model, run the adamixture command as shown in the examples below. For the full list of arguments, run adamixture --help. BED, VCF and PGEN input files are supported.

As an example, the following ADMIXTURE call

$ ./admixture snps_data.bed 8 -s 42

is equivalent to the following ADAMIXTURE call:

$ adamixture -k 8 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_data -s 42

By default, the following files will be output to the SAVE_PATH directory (the name parameter will be used to create the full filenames):

  • A .P file, similar to ADMIXTURE.
  • A .Q file, similar to ADMIXTURE.
  • A .png plot file containing the visualization of the inferred ancestry proportions (Q matrix).
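
Because the .Q output follows the ADMIXTURE layout, it can be sanity-checked with a few lines of Python. The sketch below assumes the usual ADMIXTURE convention (whitespace-delimited text, one row per sample, K columns that sum to 1). The helper names read_matrix and check_q and the file name demo.3.Q are hypothetical and not part of the package.

```python
import os
import tempfile


def read_matrix(path):
    """Parse a whitespace-delimited numeric matrix (one row per line)."""
    rows = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:  # skip blank lines
                rows.append([float(x) for x in fields])
    return rows


def check_q(q_rows, tol=1e-4):
    """Return True if every row has the same length and sums to ~1."""
    k = len(q_rows[0])
    return all(len(r) == k and abs(sum(r) - 1.0) <= tol for r in q_rows)


# Demo on a tiny hand-written file (made-up values, K = 3).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.3.Q")
with open(demo_path, "w") as handle:
    handle.write("0.70 0.20 0.10\n0.05 0.90 0.05\n")

q = read_matrix(demo_path)
print(len(q), "samples,", len(q[0]), "clusters, valid:", check_q(q))
```

The same reader should work for a .P file, since it is also a plain numeric matrix, although the row-sum check does not apply there.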

Logs are printed to the stdout channel by default. If you want to save them to a file, you can use the command tee along with a pipe:

$ adamixture -k 8 ... | tee run.log

Running with multi-threading

To run ADAMIXTURE using multiple CPU threads, use the -t flag:

$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test -t 8

Running with GPU acceleration

To leverage GPU acceleration (highly recommended for large datasets), use the --device flag:

  • NVIDIA GPU (CUDA):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device gpu
    
  • macOS Apple Silicon (MPS):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device mps
    

Tip (GPU acceleration): Using GPUs greatly speeds up processing and is highly recommended for large datasets. You can specify the hardware to use with the --device parameter:

  • For NVIDIA GPUs, use --device gpu (requires CUDA).
  • For macOS users with Apple Silicon (M1/M2/M3/M4/M5), use --device mps to enable Metal Performance Shaders (MPS) acceleration.
  • Note that biobank-scale datasets are best handled on dedicated CUDA-capable GPUs due to high RAM requirements.

Multi-K Sweep

Instead of running ADAMIXTURE for a single K, you can automatically sweep over a range of K values using --min_k and --max_k. The data is loaded once, and each K is trained sequentially:

$ adamixture --min_k 2 --max_k 10 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_sweep
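
When runs are launched from a script, the argument list can be assembled programmatically. The sketch below is a hypothetical helper (build_command is not part of the package) that uses only flags shown in this README. It rejects mixing -k with --min_k/--max_k as a design choice of the helper, not as documented CLI behaviour.

```python
import shlex


def build_command(data_path, save_dir, name, k=None, min_k=None, max_k=None,
                  threads=None, seed=None, device=None):
    """Assemble an adamixture argument list from flags shown in this README."""
    sweep = min_k is not None and max_k is not None
    if (k is not None) == sweep:
        raise ValueError("pass either k, or both min_k and max_k")

    cmd = ["adamixture"]
    if sweep:
        cmd += ["--min_k", str(min_k), "--max_k", str(max_k)]
    else:
        cmd += ["-k", str(k)]
    cmd += ["--data_path", data_path, "--save_dir", save_dir, "--name", name]

    # Optional flags are only added when explicitly requested.
    if threads is not None:
        cmd += ["-t", str(threads)]
    if seed is not None:
        cmd += ["-s", str(seed)]
    if device is not None:
        cmd += ["--device", device]
    return cmd


sweep_cmd = build_command("snps_data.bed", "SAVE_PATH", "snps_sweep",
                          min_k=2, max_k=10)
print(shlex.join(sweep_cmd))
```

The resulting list can be passed to subprocess.run(sweep_cmd, check=True) once the package is installed.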

Cross-validation

Use --cv to estimate the optimal K by masking a fraction of genotype entries and measuring prediction error. → Full documentation

$ adamixture -k 8 --cv --data_path data.bed --save_dir out/ --name test
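
To illustrate the idea of masked-entry cross-validation, the sketch below holds out random genotype cells and scores how well a given Q and P predict them, using the admixture-model expectation 2 * sum_k q_ik * p_jk. It is an illustration only: the error metric (mean squared error), the function names, and the toy data are assumptions, the model is not refitted with the held-out cells hidden as a real cross-validation would do, and the metric used by --cv may differ.

```python
import random


def expected_genotype(q_row, p_row):
    """Expected allele count under the admixture model: 2 * sum_k q_k * p_k."""
    return 2.0 * sum(q * p for q, p in zip(q_row, p_row))


def choose_mask(n_samples, n_snps, fraction, seed):
    """Pick a random set of (sample, snp) cells to hold out."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n_samples) for j in range(n_snps)]
    n_masked = max(1, int(round(fraction * len(cells))))
    return rng.sample(cells, n_masked)


def masked_error(genotypes, q, p, mask):
    """Mean squared error between held-out genotypes and model predictions."""
    total = 0.0
    for i, j in mask:
        total += (genotypes[i][j] - expected_genotype(q[i], p[j])) ** 2
    return total / len(mask)


# Toy data: 2 samples x 3 SNPs, K = 2. P is stored SNP-by-cluster.
genotypes = [[2, 0, 2], [0, 2, 0]]
p = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]]
q_good = [[1.0, 0.0], [0.0, 1.0]]   # each sample assigned to its own cluster
q_flat = [[0.5, 0.5], [0.5, 0.5]]   # uninformative proportions

mask = choose_mask(n_samples=2, n_snps=3, fraction=0.5, seed=42)
err_good = masked_error(genotypes, q_good, p, mask)
err_flat = masked_error(genotypes, q_flat, p, mask)
print("good fit:", err_good, "flat fit:", err_flat)
```

Comparing such an error across several K values is what allows a value of K to be chosen: the K with the lowest held-out error is the usual candidate.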

Plotting

By default, ADAMIXTURE automatically generates a png plot at 300 DPI without needing any additional flags. → Full documentation

Plots can include hierarchical population labels if you provide the arguments (--labels, --labels2, --labels3).

If you want to customize the format and resolution (e.g., to generate a PDF), you must use the appropriate flag depending on your execution mode:

  • Single K runs (-k): Use --plot_single. Note that --plot will be ignored in single K mode.

    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --plot_single pdf 300
    
  • Multi-K sweeps (--min_k and --max_k): Use --plot to configure the combined sweep plot.

    $ adamixture --min_k 2 --max_k 10 --data_path data.bed --save_dir out/ --name test --plot pdf 300
    

Projection Mode

Estimate ancestry proportions for new samples using a pre-trained, fixed P matrix (Q-only optimisation). K is detected automatically from P. → Full documentation

$ adamixture-project \
    --data_path new_samples.bed \
    --p_path trained_model/results.8.P \
    --save_dir projection_out/ \
    --name projected
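
Q-only optimisation with a fixed P can be illustrated with the classic EM update for admixture proportions. The sketch below is not the solver used by adamixture-project. It is a plain EM stand-in for a single sample, and it assumes that genotypes count copies (0, 1, 2, or None for missing) of the allele whose frequency is stored in P, with P laid out SNP-by-cluster. The function name and toy values are hypothetical.

```python
def project_sample(genotype, p, n_iter=200, eps=1e-9):
    """Estimate one sample's ancestry proportions with P held fixed (EM)."""
    k = len(p[0])          # K is read off the P matrix
    q = [1.0 / k] * k      # start from uniform proportions
    for _ in range(n_iter):
        acc = [0.0] * k
        n_used = 0
        for g, p_row in zip(genotype, p):
            if g is None:  # skip missing genotypes
                continue
            alt = sum(qk * pk for qk, pk in zip(q, p_row))
            ref = sum(qk * (1.0 - pk) for qk, pk in zip(q, p_row))
            for idx in range(k):
                # Expected share of each allele copy attributed to cluster idx.
                acc[idx] += g * q[idx] * p_row[idx] / max(alt, eps)
                acc[idx] += (2 - g) * q[idx] * (1.0 - p_row[idx]) / max(ref, eps)
            n_used += 1
        if n_used == 0:
            return q       # nothing observed, keep the uniform start
        # Average over 2 allele copies per SNP, clamp, and renormalise.
        q = [max(a / (2.0 * n_used), eps) for a in acc]
        norm = sum(q)
        q = [v / norm for v in q]
    return q


# Toy P for 3 SNPs and K = 2; the sample looks like cluster 0.
p_fixed = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]]
q_new = project_sample([2, 0, 2], p_fixed)
print([round(v, 4) for v in q_new])
```

Only Q changes across iterations and P is never updated, which is what makes projection much cheaper than a full training run.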

Supervised Mode

Anchor the model with known population labels for a subset of samples while estimating Q freely for unlabeled ones. Labels use the same format as --labels (population name or -). → Full documentation

$ adamixture-supervised \
    --data_path all_samples.bed \
    --labels labels.txt \
    --save_dir supervised_out/ \
    --name supervised_run \
    -k 8
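
The labels file can be prepared or checked with a short script. The sketch below assumes one label per line, in the same order as the samples, with - marking unlabeled samples. It also shows one common way of anchoring a model (one-hot rows for labeled samples, uniform rows for the rest), which is an assumption about supervised admixture models in general rather than a description of ADAMIXTURE's internals. Function names and population names are hypothetical.

```python
def parse_labels(lines):
    """Return one entry per sample: a population name, or None for '-'."""
    labels = []
    for line in lines:
        text = line.strip()
        if text:  # ignore blank lines
            labels.append(None if text == "-" else text)
    return labels


def anchored_q(labels, populations):
    """One-hot rows for labeled samples, uniform rows for unlabeled ones."""
    k = len(populations)
    index = {name: i for i, name in enumerate(populations)}
    rows, fixed = [], []
    for label in labels:
        if label is None:
            rows.append([1.0 / k] * k)   # free to be estimated
            fixed.append(False)
        else:
            row = [0.0] * k
            row[index[label]] = 1.0      # anchored to its population
            rows.append(row)
            fixed.append(True)
    return rows, fixed


labels = parse_labels(["EUR\n", "-\n", "AFR\n"])
q_init, is_fixed = anchored_q(labels, populations=["AFR", "EUR"])
print(q_init, is_fixed)
```

A quick check worth running before a supervised job is that the number of parsed labels matches the number of samples in the input file.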

Other options

All hyperparameters and flags can be explored with:

$ adamixture --help

Key arguments:

| Argument | Default | Description |
|---|---|---|
| --init | als | Initialization method: improved SVD+ALS (als) or random EM priming (em) |
| --tol | 0.1 | Convergence tolerance for log-likelihood changes |
| --max_iter | 10000 | Maximum optimization iterations |
| -t | 1 | Number of CPU threads |
| -s | 42 | Random seed |
| --device | cpu | Device to use: cpu, gpu, or mps |
| --chunk_size | 8192 | Number of SNPs in chunk operations |
| --chromosome_mode | autosomes | Chromosome filter: autosomes keeps autosomes 1..--autosome_count; all keeps every chromosome |
| --autosome_count | 22 | Number of autosomes kept when --chromosome_mode autosomes |
| --no_freqs | False | Do not save the .P allele-frequency matrix |

Algorithm note

The ADAMIXTURE preprint introduced Adam-EM as an adaptive first-order optimizer for admixture inference. The package still includes this solver via --algorithm adamem.

In the current implementation, the default is --algorithm brqn. Empirical benchmarking showed that block relaxation with ZAL quasi-Newton acceleration, when paired with our improved SVD+ALS initialization, reaches high-quality solutions in fewer iterations and better wall-clock time. For that reason, BR-QN is the default solver, while Adam-EM remains available for experimentation and reproducibility. Adam-EM tuning parameters are documented in Troubleshooting and Tips.

Troubleshooting and Tips

Full documentation

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Cite

When using this software, please cite the following preprint:

@article{saurina2026adamixture,
  title={ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering},
  author={Saurina-i-Ricos, Joan and Mas Monserrat, Daniel and Ioannidis, Alexander G.},
  journal={bioRxiv},
  year={2026},
  doi={10.64898/2026.02.13.700171},
  url={https://doi.org/10.64898/2026.02.13.700171}
}
