ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Ancestry Inference

Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering


ADAMIXTURE is an unsupervised global ancestry inference method that scales the ADMIXTURE model to biobank-sized datasets. It combines the Expectation–Maximization (EM) framework with the Adam first-order optimizer, enabling parameter updates after a single EM step. This approach accelerates convergence while maintaining comparable or improved accuracy, substantially reducing runtime on large genotype datasets. For more information, we recommend reading our preprint.
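
To make the idea of pairing EM with Adam more concrete, here is a minimal sketch of a single update. It is not the ADAMIXTURE implementation: `em_step` is a hypothetical callable standing in for one EM update, the search direction is assumed to be the difference between the EM output and the current parameters, and the clipping bound is an assumption as well. The default values of `lr`, `beta1`, `beta2` and `eps` mirror the optimizer defaults documented under "Key optimizer arguments".

```python
import math

def new_state(n):
    """Fresh Adam state (step counter and moment estimates) for n parameters."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}

def adam_em_step(theta, em_step, state, lr=0.005, beta1=0.80, beta2=0.88,
                 eps=1e-8, bound=1e-5):
    """One Adam update whose direction comes from a single EM step."""
    target = em_step(theta)                              # hypothetical EM update
    direction = [t - x for t, x in zip(target, theta)]   # assumed pseudo-gradient
    state["t"] += 1
    step = state["t"]
    updated = []
    for i, g in enumerate(direction):
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** step)      # bias-corrected first moment
        v_hat = state["v"][i] / (1 - beta2 ** step)      # bias-corrected second moment
        x = theta[i] + lr * m_hat / (math.sqrt(v_hat) + eps)
        updated.append(min(max(x, bound), 1 - bound))    # keep values inside (0, 1)
    return updated

# Toy EM map that moves halfway towards a fixed point (illustration only).
toy_em = lambda th: [x + 0.5 * (f - x) for x, f in zip(th, [0.2, 0.7])]
theta = adam_em_step([0.5, 0.5], toy_em, new_state(2))
```

On the first step the bias-corrected moments reduce to the raw direction, so each coordinate moves by roughly `lr` towards the EM target, whatever the size of the EM jump.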

The software can be invoked via CLI and has a similar interface to ADMIXTURE (e.g. the output format is completely interchangeable).

System requirements

Hardware requirements

This package requires a machine with enough RAM to hold the large datasets the method is designed for. We therefore recommend running it on a compute cluster whenever one is available, to avoid memory issues.

Software requirements

We recommend creating a fresh Python 3.10+ virtual environment. For a faster installation experience, we highly recommend using uv.

[!IMPORTANT]
If you plan to use GPU acceleration, ensure that the CUDA toolkit is correctly loaded (e.g., module load cuda) before starting the installation. This ensures that the dependencies and internal components are correctly configured for your hardware.

As an example, using uv (recommended):

$ uv venv --python 3.10
$ source .venv/bin/activate
$ uv pip install adamixture

Installation Guide

The package can be easily installed in at most a few minutes using pip (make sure to add the --upgrade flag if updating the version):

$ pip install adamixture

Running ADAMIXTURE

To train a model, invoke the commands below from the root directory of the project. For more information about all the arguments, run adamixture --help. BED, VCF and PGEN input formats are supported.

As an example, the following ADMIXTURE call

$ ./admixture snps_data.bed 8 -s 42

is equivalent to the following ADAMIXTURE call:

$ adamixture -k 8 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_data -s 42

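Two files are written to the SAVE_PATH directory (the --name value is used to build the full filenames):

  • A .P file with the estimated allele frequencies for each cluster, in the same format as ADMIXTURE.
  • A .Q file with the estimated ancestry proportions for each sample, in the same format as ADMIXTURE.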
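
As a sketch of how the .Q output might be consumed downstream, the snippet below parses ADMIXTURE-style content (one row per sample, K whitespace-separated proportions) and reports the dominant cluster of each sample. The helper names `parse_q` and `dominant_cluster` and the inline example values are made up for illustration; they are not part of ADAMIXTURE.

```python
def parse_q(text):
    """Parse ADMIXTURE-style .Q content into a list of rows of floats."""
    rows = [[float(x) for x in line.split()]
            for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("no samples found")
    k = len(rows[0])
    if any(len(r) != k for r in rows):
        raise ValueError("rows have inconsistent numbers of columns")
    return rows

def dominant_cluster(rows):
    """Index of the largest ancestry proportion for each sample."""
    return [max(range(len(r)), key=r.__getitem__) for r in rows]

# Made-up content for two samples and K = 3.
example = """0.90 0.05 0.05
0.10 0.80 0.10
"""
q_rows = parse_q(example)
clusters = dominant_cluster(q_rows)
```

For a real run, the same function can be applied to the contents of the .Q file in SAVE_PATH, for example via `parse_q(open(path).read())`, where `path` is whatever filename the run produced.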

Logs are printed to the stdout channel by default. If you want to save them to a file, you can use the command tee along with a pipe:

$ adamixture -k 8 ... | tee run.log

Running with multi-threading

To run ADAMIXTURE using multiple CPU threads, use the -t flag:

$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test -t 8

Running with GPU acceleration

To leverage GPU acceleration (highly recommended for large datasets), use the --device flag:

  • NVIDIA GPU (CUDA):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device gpu
    
  • macOS Apple Silicon (MPS):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device mps
    

[!TIP] GPU Acceleration: Using GPUs greatly speeds up processing and is highly recommended for large datasets. You can specify the hardware to use with the --device parameter:

  • For NVIDIA GPUs, use --device gpu (requires CUDA).
  • For macOS users with Apple Silicon (M1/M2/M3/M4/M5), use --device mps to enable Metal Performance Shaders (MPS) acceleration.
  • Note that biobank-scale datasets are best handled on dedicated CUDA-capable GPUs due to high RAM requirements.

Multi-K Sweep

Instead of running ADAMIXTURE for a single K, you can automatically sweep over a range of K values using --min_k and --max_k. The data is loaded once, and each K is trained sequentially:

$ adamixture --min_k 2 --max_k 10 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_sweep
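
For comparison, the sweep is roughly what one would otherwise script by hand as a series of single-K runs. The sketch below only builds those command lines from the flags documented above; the helper `sweep_commands` is hypothetical, and treating --max_k as inclusive is an assumption. Unlike the built-in sweep, separate runs would reload the data each time.

```python
import shlex

def sweep_commands(min_k, max_k, data_path, save_dir, name):
    """Build one single-K adamixture command line per K in [min_k, max_k]."""
    commands = []
    for k in range(min_k, max_k + 1):   # inclusive upper bound (assumption)
        argv = ["adamixture", "-k", str(k), "--data_path", data_path,
                "--save_dir", save_dir, "--name", name]
        commands.append(shlex.join(argv))
    return commands

commands = sweep_commands(2, 10, "snps_data.bed", "SAVE_PATH", "snps_sweep")
```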

Cross-validation

Use --cv to estimate the optimal K by masking a fraction of genotype entries and measuring prediction error. → Full documentation

$ adamixture -k 8 --cv --data_path data.bed --save_dir out/ --name test
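
The masking idea can be sketched in a few lines. This is an illustration only, not ADAMIXTURE's cross-validation code: the helper names, the masked fraction, and the error metric (mean squared error against the expected genotype 2 · Σₖ qᵢₖ pⱼₖ) are all assumptions made for the example.

```python
import random

def mask_entries(genotypes, fraction, seed=42):
    """Hide a random fraction of entries; return the masked copy and the held-out truth."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    held = rng.sample(cells, max(1, round(fraction * len(cells))))
    masked = [list(row) for row in genotypes]
    truth = []
    for i, j in held:
        truth.append((i, j, genotypes[i][j]))
        masked[i][j] = None                 # None marks a held-out entry
    return masked, truth

def prediction_error(truth, Q, P):
    """Mean squared error between held-out genotypes and their model expectation."""
    total = 0.0
    for i, j, g in truth:
        expected = 2.0 * sum(q * p for q, p in zip(Q[i], P[j]))
        total += (g - expected) ** 2
    return total / len(truth)

# Toy data: 2 samples x 3 SNPs, with Q (samples x K) and P (SNPs x K) that fit exactly.
genotypes = [[0, 1, 2], [2, 1, 0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
P = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
masked, truth = mask_entries(genotypes, 0.5)
error = prediction_error(truth, Q, P)
```

In practice the model would be fitted on the masked matrix for each candidate K, and the K with the lowest held-out error would be preferred.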

Plotting

ADAMIXTURE can produce native high-quality visualizations with hierarchical population labels (--labels, --labels2, --labels3) and multi-run alignment. → Full documentation

$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --plot pdf 300

Projection Mode

Estimate ancestry proportions for new samples using a pre-trained, fixed P matrix (Q-only optimisation). K is detected automatically from P. → Full documentation

$ adamixture-project \
    --data_path new_samples.bed \
    --p_path trained_model/results.8.P \
    --save_dir projection_out/ \
    --name projected
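
To illustrate what Q-only optimisation with a fixed P means, the sketch below estimates the ancestry proportions of one sample with the classic EM update for the admixture model, holding P constant. It uses plain EM rather than ADAMIXTURE's Adam-based procedure, and the function name `project_sample`, the P layout (one row per SNP, K columns) and the toy values are assumptions for the example.

```python
def project_sample(genotype, P, n_iter=100):
    """Estimate one sample's ancestry proportions q with allele frequencies P fixed.

    genotype: allele counts (0, 1 or 2) per SNP, or None where missing.
    P: one row per SNP with K frequencies, each strictly inside (0, 1).
    """
    k = len(P[0])                           # K is read off the width of P
    q = [1.0 / k] * k                       # start from uniform proportions
    observed = [(g, p) for g, p in zip(genotype, P) if g is not None]
    for _ in range(n_iter):
        acc = [0.0] * k
        for g, p in observed:
            h1 = sum(qk * pk for qk, pk in zip(q, p))         # P(counted allele)
            h0 = sum(qk * (1 - pk) for qk, pk in zip(q, p))   # P(other allele)
            for a in range(k):
                acc[a] += g * q[a] * p[a] / h1 + (2 - g) * q[a] * (1 - p[a]) / h0
        q = [x / (2 * len(observed)) for x in acc]            # stays on the simplex
    return q

# Toy reference with K = 2 and a sample that matches cluster 0.
P_fixed = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
q_new = project_sample([2, 2, 0, 0], P_fixed)
```

Because P is never updated, each new sample can be processed independently of the others.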

Supervised Mode

Anchor the model with known population labels for a subset of samples while estimating Q freely for unlabeled ones. Labels use the same format as --labels (population name or -). → Full documentation

$ adamixture-supervised \
    --data_path all_samples.bed \
    --labels labels.txt \
    --save_dir supervised_out/ \
    --name supervised_run \
    -k 8
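
One way to picture the anchoring is as a constraint on the rows of Q. The sketch below turns a list of labels into starting rows: one-hot for labeled samples and uniform for unlabeled ones. It is an illustration under assumptions, not ADAMIXTURE's supervised code: one label per sample in file order, first-seen ordering of populations, and one-hot anchoring are all choices made for the example, and both helper names are hypothetical.

```python
def parse_labels(lines):
    """Map each label to a population index; '-' marks an unlabeled sample (None)."""
    populations = []
    assignment = []
    for line in lines:
        label = line.strip()
        if label == "-":
            assignment.append(None)
            continue
        if label not in populations:
            populations.append(label)       # first-seen order (assumption)
        assignment.append(populations.index(label))
    return populations, assignment

def anchored_q(assignment, k):
    """One-hot rows for labeled samples, uniform rows for unlabeled ones."""
    rows = []
    for idx in assignment:
        if idx is None:
            rows.append([1.0 / k] * k)
        else:
            rows.append([1.0 if a == idx else 0.0 for a in range(k)])
    return rows

populations, assignment = parse_labels(["EUR", "-", "AFR", "EUR"])
q_start = anchored_q(assignment, 2)
```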

Other options

All hyperparameters and flags can be explored with:

$ adamixture --help

Key optimizer arguments:

| Argument | Default | Description |
|---|---|---|
| --lr | 0.005 | Adam learning rate |
| --beta1 | 0.80 | Adam β₁ |
| --beta2 | 0.88 | Adam β₂ |
| --reg_adam | 1e-8 | Adam ε (numerical stability) |
| --lr_decay | 0.5 | Learning rate decay factor |
| --min_lr | 1e-4 | Minimum learning rate |
| --patience_adam | 3 | Checks without improvement before decaying lr |
| --tol_adam | 0.1 | Convergence tolerance |
| --max_iter | 10000 | Maximum Adam-EM iterations |
| --check | 5 | Log-likelihood evaluation frequency |
| -t | 1 | Number of CPU threads |
| -s | 42 | Random seed |
| --chunk_size | 4096 | Number of SNPs in chunk operations |
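
The interaction between --lr_decay, --min_lr and --patience_adam can be pictured as a decay-on-plateau rule. The sketch below replays a series of log-likelihood checks and returns the learning rate after each one. The exact rule ADAMIXTURE applies is not described here, so the logic is an assumption, including the use of the tolerance as the minimum improvement that counts; `replay_lr` is a hypothetical helper.

```python
def replay_lr(logliks, lr=0.005, lr_decay=0.5, min_lr=1e-4, patience=3, tol=0.1):
    """Return the learning rate in effect after each log-likelihood check."""
    best = float("-inf")
    stale = 0                               # consecutive checks without improvement
    history = []
    for ll in logliks:
        if ll > best + tol:                 # counts as an improvement (assumption)
            best = ll
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                lr = max(lr * lr_decay, min_lr)   # decay, but never below min_lr
                stale = 0
        history.append(lr)
    return history

# Made-up log-likelihood values, one per check.
lr_history = replay_lr([-100.0, -90.0, -89.95, -89.94, -89.93, -80.0])
```

With the defaults above, three stalled checks in a row halve the learning rate from 0.005 to 0.0025, and repeated plateaus can never push it below 1e-4.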

Troubleshooting and Tips

Full documentation

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Cite

When using this software, please cite the following preprint:

@article{saurina2026adamixture,
  title={ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering},
  author={Saurina-i-Ricos, Joan and Mas Monserrat, Daniel and Ioannidis, Alexander G.},
  journal={bioRxiv},
  year={2026},
  doi={10.64898/2026.02.13.700171},
  url={https://doi.org/10.64898/2026.02.13.700171}
}
