ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Ancestry Inference

Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering

ADAMIXTURE is an unsupervised global ancestry inference method that scales the ADMIXTURE model to biobank-sized datasets. It combines the Expectation–Maximization (EM) framework with the Adam first-order optimizer, enabling parameter updates after a single EM step. This approach accelerates convergence while maintaining comparable or improved accuracy, substantially reducing runtime on large genotype datasets. For more information, we recommend reading our preprint.
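The EM step mentioned above can be illustrated with the classical EM update for the ADMIXTURE model, where a genotype G[i, j] in {0, 1, 2} is binomial with allele frequency (Q @ P.T)[i, j]. The following is a minimal NumPy sketch under stated assumptions: the function names, the toy data and the clipping constant are illustrative and are not taken from the ADAMIXTURE code base, and the way ADAMIXTURE feeds this step into Adam is not shown here.

```python
import numpy as np

def log_likelihood(G, Q, P, eps=1e-10):
    """Binomial log-likelihood of genotypes G (N x M, values 0/1/2)."""
    H = np.clip(Q @ P.T, eps, 1 - eps)      # expected allele frequencies, N x M
    return float(np.sum(G * np.log(H) + (2 - G) * np.log(1 - H)))

def em_step(G, Q, P, eps=1e-10):
    """One classical EM update of Q (N x K) and P (M x K)."""
    H = np.clip(Q @ P.T, eps, 1 - eps)
    A = G / H                               # weights for the counted allele
    B = (2 - G) / (1 - H)                   # weights for the other allele
    # Q update: expected share of each sample's 2*M alleles per population.
    Q_new = Q * (A @ P + B @ (1 - P)) / (2 * G.shape[1])
    Q_new /= Q_new.sum(axis=1, keepdims=True)
    # P update: expected counted alleles over expected total alleles.
    num = P * (A.T @ Q)
    den = num + (1 - P) * (B.T @ Q)
    P_new = np.clip(num / np.maximum(den, eps), eps, 1 - eps)
    return Q_new, P_new

# Toy data (hypothetical): 50 samples, 200 SNPs, K = 3.
rng = np.random.default_rng(42)
N, M, K = 50, 200, 3
G = rng.integers(0, 3, size=(N, M)).astype(float)
Q = rng.dirichlet(np.ones(K), size=N)
P = rng.uniform(0.1, 0.9, size=(M, K))

before = log_likelihood(G, Q, P)
Q, P = em_step(G, Q, P)
after = log_likelihood(G, Q, P)
print(f"log-likelihood: {before:.1f} -> {after:.1f}")
```

A plain EM step never decreases the log-likelihood, which makes it a convenient reference point when checking an accelerated variant.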

The software can be invoked via CLI and has a similar interface to ADMIXTURE (e.g. the output format is completely interchangeable).

System requirements

Hardware requirements

Running this package requires enough RAM to hold the large datasets the method is designed for. We therefore recommend using a compute cluster whenever one is available, to avoid memory issues.

Software requirements

We recommend creating a fresh Python 3.10+ virtual environment. For a faster installation experience, we highly recommend using uv.

[!IMPORTANT]
If you plan to use GPU acceleration, ensure that the CUDA toolkit is correctly loaded (e.g., module load cuda) before starting the installation. This ensures that the dependencies and internal components are correctly configured for your hardware.

As an example, using uv (recommended):

$ uv venv --python 3.10
$ source .venv/bin/activate
$ uv pip install adamixture

Installation Guide

The package can be installed with pip, typically in a few minutes (add the --upgrade flag when updating an existing installation):

$ pip install adamixture

Running ADAMIXTURE

To train a model, invoke adamixture as shown below. For details on all arguments, run adamixture --help. BED, VCF and PGEN input files are supported.

As an example, the following ADMIXTURE call

$ ./admixture snps_data.bed 8 -s 42

is equivalent to the following ADAMIXTURE call

$ adamixture -k 8 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_data -s 42

Two files are written to the SAVE_PATH directory (the --name parameter is used to build the filenames):

  • A .P file (allele frequencies per population), in the same format as ADMIXTURE.
  • A .Q file (ancestry proportions per sample), in the same format as ADMIXTURE.
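Because the output follows the ADMIXTURE text format (whitespace-delimited, one row per sample in .Q and one row per SNP in .P), the files can be read with NumPy. The sketch below is only an illustration: load_qp is a hypothetical helper, and the <name>.<K>.Q / <name>.<K>.P naming is an assumption inferred from the results.8.P path used in the projection example. The toy files are written to a temporary directory so that the snippet runs on its own.

```python
import tempfile
from pathlib import Path

import numpy as np

def load_qp(save_dir, name, k):
    """Load Q (samples x K) and P (SNPs x K) from whitespace-delimited text."""
    # Assumed naming scheme: <name>.<K>.Q and <name>.<K>.P
    Q = np.loadtxt(Path(save_dir) / f"{name}.{k}.Q", ndmin=2)
    P = np.loadtxt(Path(save_dir) / f"{name}.{k}.P", ndmin=2)
    if Q.shape[1] != k or P.shape[1] != k:
        raise ValueError("number of columns does not match K")
    return Q, P

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in files; a real run would have produced these in SAVE_PATH.
    rng = np.random.default_rng(0)
    np.savetxt(Path(tmp) / "snps_data.3.Q", rng.dirichlet(np.ones(3), size=5))
    np.savetxt(Path(tmp) / "snps_data.3.P", rng.uniform(size=(10, 3)))

    Q, P = load_qp(tmp, "snps_data", 3)
    print(Q.shape, P.shape)
    print("rows of Q sum to 1:", np.allclose(Q.sum(axis=1), 1.0))
```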

Logs are printed to stdout by default. To also save them to a file, pipe the output through tee:

$ adamixture -k 8 ... | tee run.log

Running with multi-threading

To run ADAMIXTURE using multiple CPU threads, use the -t flag:

$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test -t 8

Running with GPU acceleration

To leverage GPU acceleration (highly recommended for large datasets), use the --device flag:

  • NVIDIA GPU (CUDA):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device gpu
    
  • macOS Apple Silicon (MPS):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device mps
    

[!TIP] GPU Acceleration: Using GPUs greatly speeds up processing and is highly recommended for large datasets. You can specify the hardware to use with the --device parameter:

  • For NVIDIA GPUs, use --device gpu (requires CUDA).
  • For macOS users with Apple Silicon (M1/M2/M3/M4/M5), use --device mps to enable Metal Performance Shaders (MPS) acceleration.
  • Note that biobank-scale datasets are best handled on dedicated CUDA-capable GPUs due to high RAM requirements.

Multi-K Sweep

Instead of running ADAMIXTURE for a single K, you can automatically sweep over a range of K values using --min_k and --max_k. The data is loaded once, and each K is trained sequentially:

$ adamixture --min_k 2 --max_k 10 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_sweep
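For comparison, the per-K loop that a sweep replaces can be sketched as follows. sweep_commands is a hypothetical helper that only builds command lines from flags shown in this README; it assumes that --max_k is inclusive. Running each command separately would reload the data for every K, which is what the built-in sweep avoids.

```python
import shlex

def sweep_commands(data_path, save_dir, name, min_k, max_k):
    """Build one single-K adamixture command per K (max_k assumed inclusive)."""
    return [
        ["adamixture", "-k", str(k), "--data_path", data_path,
         "--save_dir", save_dir, "--name", name]
        for k in range(min_k, max_k + 1)
    ]

cmds = sweep_commands("snps_data.bed", "SAVE_PATH", "snps_sweep", 2, 10)
for cmd in cmds:
    # Dry run: print only. To execute, pass cmd to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```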

Cross-validation

Use --cv to estimate the optimal K by masking a fraction of genotype entries and measuring prediction error. → Full documentation

$ adamixture -k 8 --cv --data_path data.bed --save_dir out/ --name test
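The masking idea can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the ADAMIXTURE implementation: mask_entries and masked_error are hypothetical helpers, the error metric is a plain mean squared error (the tool may use a different one), and the fitting step is skipped by scoring known matrices on simulated data.

```python
import numpy as np

def mask_entries(G, frac, rng):
    """Hide a random fraction of genotype entries; return masked copy and mask."""
    mask = rng.random(G.shape) < frac
    G_masked = G.astype(float).copy()
    G_masked[mask] = np.nan                 # NaN marks a held-out entry
    return G_masked, mask

def masked_error(G, mask, Q, P):
    """Mean squared error between held-out genotypes and 2 * Q @ P.T."""
    pred = 2.0 * (Q @ P.T)
    return float(np.mean((G[mask] - pred[mask]) ** 2))

# Simulated data (hypothetical): 100 samples, 500 SNPs, K = 3.
rng = np.random.default_rng(42)
N, M, K = 100, 500, 3
Q_true = rng.dirichlet(np.full(K, 0.3), size=N)
P_true = rng.uniform(0.05, 0.95, size=(M, K))
G = rng.binomial(2, Q_true @ P_true.T).astype(float)

G_masked, mask = mask_entries(G, 0.1, rng)

# In a real run the model would be fitted on G_masked for each K;
# here the generating Q is compared against a Q that ignores structure.
err_true = masked_error(G, mask, Q_true, P_true)
err_flat = masked_error(G, mask, np.full((N, K), 1.0 / K), P_true)
print(f"held-out error: structured {err_true:.3f}, flat {err_flat:.3f}")
```

The K whose fitted model gives the lowest held-out error is the one cross-validation would favour.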

Plotting

Native high-quality visualizations with hierarchical population labels (--labels, --labels2, --labels3) and multi-run alignment. → Full documentation

$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --plot pdf 300

Projection Mode

Estimate ancestry proportions for new samples using a pre-trained, fixed P matrix (Q-only optimisation). K is detected automatically from P. → Full documentation

$ adamixture-project \
    --data_path new_samples.bed \
    --p_path trained_model/results.8.P \
    --save_dir projection_out/ \
    --name projected
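The Q-only idea can be sketched as follows: P is held fixed, K is read from its number of columns, and only Q is updated. Note that this sketch uses plain EM updates rather than ADAMIXTURE's Adam-accelerated optimisation, and that project_q, the iteration count and the simulated data are illustrative assumptions.

```python
import numpy as np

def project_q(G, P, n_iter=200, eps=1e-10):
    """Estimate Q (N x K) for genotypes G (N x M) with P (M x K) held fixed."""
    N, M = G.shape
    K = P.shape[1]                          # K is read off the P matrix
    Q = np.full((N, K), 1.0 / K)            # uniform starting point
    for _ in range(n_iter):
        H = np.clip(Q @ P.T, eps, 1 - eps)
        # EM update for Q only; P is never modified.
        Q = Q * ((G / H) @ P + ((2 - G) / (1 - H)) @ (1 - P)) / (2 * M)
        Q /= Q.sum(axis=1, keepdims=True)
    return Q

# Simulated "new samples" (hypothetical): 20 samples, 2000 SNPs, K = 3.
rng = np.random.default_rng(42)
N, M, K = 20, 2000, 3
P = rng.uniform(0.05, 0.95, size=(M, K))
Q_true = rng.dirichlet(np.ones(K), size=N)
G = rng.binomial(2, Q_true @ P.T).astype(float)

Q_hat = project_q(G, P)
print("mean absolute error:", float(np.abs(Q_hat - Q_true).mean()))
```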

Supervised Mode

Anchor the model with known population labels for a subset of samples while estimating Q freely for unlabeled ones. Labels use the same format as --labels (population name or -). → Full documentation

$ adamixture-supervised \
    --data_path all_samples.bed \
    --labels labels.txt \
    --save_dir supervised_out/ \
    --name supervised_run \
    -k 8
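One way to picture the label handling is to turn the labels into fixed one-hot rows of Q for labeled samples and leave the rest free. The sketch below is only an assumption about how such anchoring can be represented: parse_labels is a hypothetical helper, it assumes one label per line in sample order, and the alphabetical column order is arbitrary.

```python
import numpy as np

def parse_labels(lines):
    """Map labels to one-hot Q rows; '-' marks a sample left free."""
    labels = [ln.strip() for ln in lines if ln.strip()]
    pops = sorted({lab for lab in labels if lab != "-"})
    index = {pop: k for k, pop in enumerate(pops)}
    Q_fixed = np.zeros((len(labels), len(pops)))
    is_fixed = np.zeros(len(labels), dtype=bool)
    for i, lab in enumerate(labels):
        if lab != "-":
            Q_fixed[i, index[lab]] = 1.0    # anchored to its population
            is_fixed[i] = True
    return pops, Q_fixed, is_fixed

# Toy label list (hypothetical population names).
pops, Q_fixed, is_fixed = parse_labels(["EUR", "-", "AFR", "EUR", "-"])
print(pops)
print(is_fixed.tolist())
```

During optimisation, rows where is_fixed is True would keep their one-hot value, while the remaining rows would be estimated as usual.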

Other options

All hyperparameters and flags can be explored with:

$ adamixture --help

Key optimizer arguments:

Argument         Default   Description
--lr             0.005     Adam learning rate
--beta1          0.80      Adam β₁
--beta2          0.88      Adam β₂
--reg_adam       1e-8      Adam ε (numerical stability)
--lr_decay       0.5       Learning rate decay factor
--min_lr         1e-4      Minimum learning rate
--patience_adam  3         Checks without improvement before decaying lr
--tol_adam       0.1       Convergence tolerance
--max_iter       10000     Maximum Adam-EM iterations
--check          5         Log-likelihood evaluation frequency
-t               1         Number of CPU threads
-s               42        Random seed
--chunk_size     4096      Number of SNPs in chunk operations
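To make the roles of these values concrete, here is a sketch of the standard Adam update (written as an ascent step) and a plateau-based learning rate decay, using the defaults from the table. It is a generic illustration on a scalar toy objective: adam_step and decay_on_plateau are hypothetical names, and how ADAMIXTURE derives the update direction from the EM step is not described in this README.

```python
import math

def adam_step(theta, grad, state, lr=0.005, beta1=0.80, beta2=0.88, eps=1e-8):
    """One standard Adam ascent step; state holds (m, v, t)."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta + lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, (m, v, t)

def decay_on_plateau(lr, stale_checks, lr_decay=0.5, min_lr=1e-4, patience=3):
    """Multiply lr by lr_decay after `patience` checks without improvement."""
    if stale_checks >= patience:
        return max(lr * lr_decay, min_lr), 0    # reset the stale counter
    return lr, stale_checks

# Toy objective (hypothetical): maximise f(x) = -(x - 1)^2, gradient -2 (x - 1).
x, state = 0.0, (0.0, 0.0, 0)
x_first, _ = adam_step(x, -2.0 * (x - 1.0), state)
for _ in range(100):
    x, state = adam_step(x, -2.0 * (x - 1.0), state)
print(f"after 1 step: {x_first:.4f}, after 100 steps: {x:.4f}")
```

With bias correction, the first step has magnitude close to lr regardless of the gradient scale, which is why --lr directly bounds how far parameters move per iteration.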

Troubleshooting and Tips

Full documentation

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Cite

When using this software, please cite the following preprint:

@article{saurina2026adamixture,
  title={ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering},
  author={Saurina-i-Ricos, Joan and Mas Monserrat, Daniel and Ioannidis, Alexander G.},
  journal={bioRxiv},
  year={2026},
  doi={10.64898/2026.02.13.700171},
  url={https://doi.org/10.64898/2026.02.13.700171}
}
