ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Ancestry Inference
Fast Biobank-Scale Population Genetics Clustering
ADAMIXTURE is a fast CPU/GPU implementation of ADMIXTURE for biobank-scale genetic clustering. .P and .Q outputs remain compatible with ADMIXTURE.
System requirements
Hardware requirements
Using this package requires a machine with enough RAM to hold the large datasets the software is designed for. We therefore recommend running on a compute cluster whenever one is available, to avoid memory issues.
Software requirements
We recommend creating a fresh Python 3.10+ virtual environment. For a faster installation experience, we highly recommend using uv.
[!IMPORTANT]
If you plan to use GPU acceleration, ensure that the CUDA toolkit is correctly loaded (e.g., module load cuda) before starting the installation. This ensures that the dependencies and internal components are correctly configured for your hardware.
As an example, using uv (recommended):
$ uv venv --python 3.10
$ source .venv/bin/activate
$ uv pip install adamixture
Installation Guide
The package installs with pip in a few minutes at most (add the --upgrade flag when updating an existing installation):
$ pip install adamixture
Running ADAMIXTURE
To train a model, invoke the commands below from the root directory of the project. For details on all arguments, run adamixture --help. BED, VCF, and PGEN input files are supported.
As an example, the following ADMIXTURE call
$ ./admixture snps_data.bed 8 -s 42
is equivalent to the following ADAMIXTURE call
$ adamixture -k 8 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_data -s 42
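When migrating existing ADMIXTURE scripts, it can help to generate the equivalent ADAMIXTURE call programmatically. The sketch below is only an illustration: the helper name adamixture_command is hypothetical, and it uses just the flags shown in the example above (-k, --data_path, --save_dir, --name, -s).

```python
import shlex


def adamixture_command(bed_path, k, save_dir, name, seed=42):
    """Build the argument list matching `./admixture <bed_path> <k> -s <seed>`.

    Hypothetical helper; only flags shown in the example above are used.
    """
    return [
        "adamixture",
        "-k", str(k),
        "--data_path", bed_path,
        "--save_dir", save_dir,
        "--name", name,
        "-s", str(seed),
    ]


cmd = adamixture_command("snps_data.bed", 8, "SAVE_PATH", "snps_data")
# Print the command as a single shell-quoted string.
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run(cmd, check=True) to launch the run without going through a shell.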
By default, the following files will be output to the SAVE_PATH directory (the name parameter will be used to create the full filenames):
- A .P file, similar to ADMIXTURE.
- A .Q file, similar to ADMIXTURE.
- A .png plot file containing the visualization of the inferred ancestry proportions (Q matrix).
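Because the .Q output stays compatible with ADMIXTURE, it can be inspected with a few lines of Python. The sketch below assumes the usual ADMIXTURE layout (one row per sample, one whitespace-separated column per cluster); the function names are hypothetical and the three example rows are made up for illustration.

```python
def parse_q(lines):
    """Parse .Q-style text: one row per sample, one column per cluster."""
    return [[float(x) for x in line.split()] for line in lines if line.strip()]


def rows_sum_to_one(q_rows, tol=1e-3):
    """Ancestry proportions of each sample should sum to 1 within a tolerance."""
    return all(abs(sum(row) - 1.0) <= tol for row in q_rows)


def dominant_cluster(q_rows):
    """Index of the largest ancestry proportion for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in q_rows]


# Made-up rows standing in for the contents of a .Q file with K = 3.
example = ["0.90 0.05 0.05", "0.10 0.70 0.20", "0.25 0.25 0.50"]
q = parse_q(example)
print(rows_sum_to_one(q), dominant_cluster(q))
```

For a real run, pass an open file handle to parse_q instead of the example list. The exact output filename depends on --name and K, so check the SAVE_PATH directory for it.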
Logs are printed to the stdout channel by default. If you want to save them to a file, you can use the command tee along with a pipe:
$ adamixture -k 8 ... | tee run.log
Running with multi-threading
To run ADAMIXTURE using multiple CPU threads, use the -t flag:
$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test -t 8
Running with GPU acceleration
To leverage GPU acceleration (highly recommended for large datasets), use the --device flag:
- NVIDIA GPU (CUDA):
$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device gpu
- macOS Apple Silicon (MPS):
$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device mps
[!TIP] GPU Acceleration: Using GPUs greatly speeds up processing and is highly recommended for large datasets. You can specify the hardware to use with the --device parameter:
- For NVIDIA GPUs, use --device gpu (requires CUDA).
- For macOS users with Apple Silicon (M1/M2/M3/M4/M5), use --device mps to enable Metal Performance Shaders (MPS) acceleration.
- Note that biobank-scale datasets are best handled on dedicated CUDA-capable GPUs due to high RAM requirements.
Multi-K Sweep
Instead of running ADAMIXTURE for a single K, you can automatically sweep over a range of K values using --min_k and --max_k. The data is loaded once, and each K is trained sequentially:
$ adamixture --min_k 2 --max_k 10 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_sweep
Cross-validation
Use --cv to estimate the optimal K by masking a fraction of genotype entries and measuring prediction error. → Full documentation
$ adamixture -k 8 --cv --data_path data.bed --save_dir out/ --name test
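The idea behind masked cross-validation can be sketched in a few lines. This is not ADAMIXTURE's implementation: the error metric (mean squared error here), the mask fraction, and the function names are all assumptions for illustration. The sketch takes Q and P as given, whereas a real run would fit them with the held-out entries hidden. P is laid out as in a .P file, one row per SNP and one column per cluster.

```python
import random


def expected_genotype(q_row, p_row):
    """Expected allele count under the admixture model: 2 * sum_k q_k * p_k."""
    return 2.0 * sum(q * p for q, p in zip(q_row, p_row))


def masked_error(G, Q, P, mask_fraction=0.1, seed=42):
    """Mean squared prediction error on a random subset of genotype entries.

    G: samples x SNPs genotype counts (0, 1, 2)
    Q: samples x K ancestry proportions
    P: SNPs x K allele frequencies
    """
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(G)) for j in range(len(G[0]))]
    n_masked = max(1, round(mask_fraction * len(cells)))
    held_out = rng.sample(cells, n_masked)
    squared = [(G[i][j] - expected_genotype(Q[i], P[j])) ** 2 for i, j in held_out]
    return sum(squared) / len(squared)


# Toy data: two unadmixed samples, two fully differentiated SNPs.
Q = [[1.0, 0.0], [0.0, 1.0]]
P = [[1.0, 0.0], [0.0, 1.0]]
G_good = [[2, 0], [0, 2]]  # matches the model exactly
G_bad = [[0, 2], [2, 0]]   # contradicts the model everywhere
print(masked_error(G_good, Q, P, mask_fraction=1.0))
print(masked_error(G_bad, Q, P, mask_fraction=1.0))
```

Repeating such a measurement for several values of K and picking the K with the lowest held-out error is the general pattern that cross-validation follows.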
Plotting
By default, ADAMIXTURE automatically generates a png plot at 300 DPI without needing any additional flags. → Full documentation
Plots can include hierarchical population labels if you provide the arguments (--labels, --labels2, --labels3).
If you want to customize the format and resolution (e.g., to generate a PDF), you must use the appropriate flag depending on your execution mode:
- Single K runs (-k): Use --plot_single. Note that --plot will be ignored in single K mode.
$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --plot_single pdf 300
- Multi-K sweeps (--min_k and --max_k): Use --plot to configure the combined sweep plot.
$ adamixture --min_k 2 --max_k 10 --data_path data.bed --save_dir out/ --name test --plot pdf 300
Projection Mode
Estimate ancestry proportions for new samples using a pre-trained, fixed P matrix (Q-only optimisation). K is detected automatically from P. → Full documentation
$ adamixture-project \
--data_path new_samples.bed \
--p_path trained_model/results.8.P \
--save_dir projection_out/ \
--name projected
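To make "Q-only optimisation" concrete, here is a minimal sketch of estimating one sample's ancestry proportions while P stays fixed. It uses the classic EM update for the admixture model, which is an assumption chosen for readability and not necessarily the solver that adamixture-project runs. The function name, iteration count, and toy data are hypothetical, and missing genotypes are not handled.

```python
def project_q(genotypes, P, n_iter=200, eps=1e-9):
    """Estimate ancestry proportions for one sample with P held fixed (plain EM).

    genotypes: allele counts (0, 1, 2), one per SNP
    P: SNPs x K allele frequencies, as in a .P file
    """
    k = len(P[0])
    m = len(genotypes)
    q = [1.0 / k] * k  # start from uniform proportions
    for _ in range(n_iter):
        new_q = [0.0] * k
        for g, p_row in zip(genotypes, P):
            # Probability of the counted allele for this sample at this SNP.
            p_mix = sum(qk * pk for qk, pk in zip(q, p_row))
            p_mix = min(max(p_mix, eps), 1.0 - eps)
            for idx in range(k):
                new_q[idx] += g * q[idx] * p_row[idx] / p_mix
                new_q[idx] += (2 - g) * q[idx] * (1.0 - p_row[idx]) / (1.0 - p_mix)
        q = [value / (2.0 * m) for value in new_q]
    total = sum(q)
    return [value / total for value in q]  # guard against rounding drift


# Toy P with K = 2: the first 50 SNPs are common in cluster 0, the rest in cluster 1.
P = [[0.9, 0.1]] * 50 + [[0.1, 0.9]] * 50
sample = [2] * 50 + [0] * 50  # genotype pattern typical of cluster 0
q_hat = project_q(sample, P)
print([round(value, 3) for value in q_hat])
```

Since K is the number of columns of P, it never has to be passed separately, which mirrors how projection mode detects K from the P file.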
Supervised Mode
Anchor the model with known population labels for a subset of samples while estimating Q freely for unlabeled ones. Labels use the same format as --labels (population name or -). → Full documentation
$ adamixture-supervised \
--data_path all_samples.bed \
--labels labels.txt \
--save_dir supervised_out/ \
--name supervised_run \
-k 8
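Before launching a supervised run, it can be useful to sanity-check the labels file. The sketch below assumes one entry per line in sample order, with a population name for labeled samples and - for unlabeled ones; the one-entry-per-line layout and the function names are assumptions, and the example labels are made up.

```python
def parse_labels(lines):
    """Return one entry per sample: a population name, or None where the entry is '-'."""
    labels = []
    for line in lines:
        token = line.strip()
        if token:
            labels.append(None if token == "-" else token)
    return labels


def summarize(labels):
    """List the distinct population names and count the unlabeled samples."""
    populations = sorted({label for label in labels if label is not None})
    n_unlabeled = sum(label is None for label in labels)
    return populations, n_unlabeled


# Made-up labels for five samples.
example = ["EUR", "AFR", "-", "EUR", "-"]
labels = parse_labels(example)
print(summarize(labels))
```

Comparing the number of distinct populations against the value passed to -k is a cheap way to catch typos in population names before training starts.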
Other options
All hyperparameters and flags can be explored with:
$ adamixture --help
Key arguments:
| Argument | Default | Description |
|---|---|---|
| --init | als | Initialization method: improved SVD+ALS (als) or random EM priming (em) |
| --tol | 0.1 | Convergence tolerance for log-likelihood changes |
| --max_iter | 10000 | Maximum optimization iterations |
| -t | 1 | Number of CPU threads |
| -s | 42 | Random seed |
| --device | cpu | Device to use: cpu, gpu, or mps |
| --chunk_size | 8192 | Number of SNPs in chunk operations |
| --chromosome_mode | autosomes | Chromosome filter: autosomes keeps autosomes 1..--autosome_count; all keeps every chromosome |
| --autosome_count | 22 | Number of autosomes kept when --chromosome_mode autosomes |
| --no_freqs | False | Do not save the .P allele-frequency matrix |
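The interaction between --chromosome_mode and --autosome_count can be pictured as a simple filter on chromosome codes. The sketch below is an illustration under stated assumptions, not the package's code: the function name is hypothetical, and the handling of a chr prefix is a guess at how chromosome codes might appear in input files.

```python
def keep_chromosome(chrom, mode="autosomes", autosome_count=22):
    """Decide whether a variant on chromosome `chrom` is kept.

    mode="all" keeps everything; mode="autosomes" keeps 1..autosome_count.
    """
    if mode == "all":
        return True
    # Assumption: tolerate an optional "chr" prefix such as "chr7".
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    return name.isdigit() and 1 <= int(name) <= autosome_count


codes = ["1", "chr22", "23", "X", "MT"]
print([c for c in codes if keep_chromosome(c)])
print([c for c in codes if keep_chromosome(c, mode="all")])
```

For non-human datasets, raising or lowering autosome_count changes which numeric codes pass the filter, which is the role --autosome_count plays in the table above.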
Algorithm note
The ADAMIXTURE preprint introduced Adam-EM as an adaptive first-order optimizer for admixture inference. The package still includes this solver via --algorithm adamem.
In the current implementation, the default is --algorithm brqn. Empirical benchmarking showed that block relaxation with ZAL quasi-Newton acceleration, when paired with our improved SVD+ALS initialization, reaches high-quality solutions in fewer iterations and better wall-clock time. For that reason, BR-QN is the default solver, while Adam-EM remains available for experimentation and reproducibility. Adam-EM tuning parameters are documented in Troubleshooting and Tips.
Troubleshooting and Tips
License
This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.
Cite
When using this software, please cite the following preprint:
@article{saurina2026adamixture,
title={ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering},
author={Saurina-i-Ricos, Joan and Mas Monserrat, Daniel and Ioannidis, Alexander G.},
journal={bioRxiv},
year={2026},
doi={10.64898/2026.02.13.700171},
url={https://doi.org/10.64898/2026.02.13.700171}
}