ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Ancestry Inference

Fast Biobank-Scale Population Genetics Clustering

ADAMIXTURE is a fast CPU/GPU implementation of ADMIXTURE for biobank-scale genetic clustering. Its .P and .Q outputs remain compatible with ADMIXTURE.

System requirements

Hardware requirements

Running this package requires enough RAM to hold the large datasets it is designed for. We therefore recommend using a compute cluster whenever one is available, to avoid memory issues.

Software requirements

We recommend creating a fresh Python 3.10+ virtual environment. For a faster installation experience, we highly recommend using uv.

[!IMPORTANT]
If you plan to use GPU acceleration, ensure that the CUDA toolkit is correctly loaded (e.g., module load cuda) before starting the installation. This ensures that the dependencies and internal components are correctly configured for your hardware.

As an example, using uv (recommended):

$ uv venv --python 3.10
$ source .venv/bin/activate
$ uv pip install adamixture

Installation Guide

The package installs with pip in a few minutes at most (add the --upgrade flag when updating an existing installation):

$ pip install adamixture

Running ADAMIXTURE

To train a model, invoke the following commands from the root directory of the project. For more information about all the arguments, run adamixture --help. BED, VCF, and PGEN input formats are supported.

As an example, the following ADMIXTURE call

$ ./admixture snps_data.bed 8 -s 42

is equivalent to the following ADAMIXTURE call:

$ adamixture -k 8 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_data -s 42

By default, the following files are written to the SAVE_PATH directory (the --name value is used to build the full filenames):

  • A .P file, similar to ADMIXTURE.
  • A .Q file, similar to ADMIXTURE.
  • A .png plot file containing the visualization of the inferred ancestry proportions (Q matrix).
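
For downstream analysis, the .Q output can be read like any ADMIXTURE Q matrix. The snippet below is a minimal sketch that assumes the file is plain whitespace-delimited text with one row per sample and one column per cluster, as in ADMIXTURE; the helper names load_q and dominant_cluster are illustrative and not part of the package, and the in-memory text stands in for a real file path.

```python
import io

import numpy as np


def load_q(source):
    """Read a whitespace-delimited Q matrix (rows: samples, columns: clusters)."""
    # ndmin=2 keeps a 2-D shape even when the file holds a single sample.
    return np.loadtxt(source, ndmin=2)


def dominant_cluster(q, atol=1e-3):
    """Check that each row sums to ~1 and return the largest-proportion cluster per sample."""
    if not np.allclose(q.sum(axis=1), 1.0, atol=atol):
        raise ValueError("every row of Q is expected to sum to 1")
    return q.argmax(axis=1)


# Tiny in-memory stand-in for a real .Q file (3 samples, K = 2).
# With a real run, pass the path of the .Q file written to SAVE_PATH instead.
demo = io.StringIO("0.90 0.10\n0.25 0.75\n0.50 0.50\n")
q = load_q(demo)
dominant = dominant_cluster(q)
print(q.shape, dominant.tolist())
```

Ties, as in the last demo row, resolve to the lowest cluster index because argmax returns the first maximum.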

Logs are printed to stdout by default. To also save them to a file, pipe the output through tee:

$ adamixture -k 8 ... | tee run.log
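
When the run is launched from a Python script rather than a shell, the same effect can be had with the standard library. This is a sketch, not something the package provides: run_and_tee is a hypothetical helper, and the demo command is a harmless stand-in for the real adamixture arguments. Unlike a plain shell pipe, it also captures stderr and returns the exit code of the command itself rather than that of tee.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_tee(cmd, log_path):
    """Run cmd, echo each output line to stdout, and write it to log_path."""
    with open(log_path, "w", encoding="utf-8") as log, subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
    ) as proc:
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
    # Leaving the Popen context waits for the process, so returncode is set.
    return proc.returncode


# Stand-in command for the demo; replace with e.g. ["adamixture", "-k", "8", ...].
demo_cmd = [sys.executable, "-c", "print('demo log line')"]
log_file = Path(tempfile.mkdtemp()) / "run.log"
exit_code = run_and_tee(demo_cmd, log_file)
```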

Running with multi-threading

To run ADAMIXTURE using multiple CPU threads, use the -t flag:

$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test -t 8

Running with GPU acceleration

To leverage GPU acceleration (highly recommended for large datasets), use the --device flag:

  • NVIDIA GPU (CUDA):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device gpu
    
  • macOS Apple Silicon (MPS):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device mps
    

[!TIP] GPU Acceleration: Using GPUs greatly speeds up processing and is highly recommended for large datasets. You can specify the hardware to use with the --device parameter:

  • For NVIDIA GPUs, use --device gpu (requires CUDA).
  • For macOS users with Apple Silicon (M1/M2/M3/M4/M5), use --device mps to enable Metal Performance Shaders (MPS) acceleration.
  • Note that biobank-scale datasets are best handled on dedicated CUDA-capable GPUs due to high RAM requirements.

Multi-K Sweep

Instead of running ADAMIXTURE for a single K, you can automatically sweep over a range of K values using --min_k and --max_k. The data is loaded once, and each K is trained sequentially:

$ adamixture --min_k 2 --max_k 10 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_sweep

Cross-validation

Use --cv to estimate the optimal K by masking a fraction of genotype entries and measuring prediction error. → Full documentation

$ adamixture -k 8 --cv --data_path data.bed --save_dir out/ --name test
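
To make the idea concrete, the sketch below shows what "masking a fraction of genotype entries and measuring prediction error" can look like for a fitted model. It assumes Q is samples x K, P is SNPs x K, and the predicted genotype is the expected allele count 2 * Q @ P.T. Plain squared error is used as a stand-in, so the metric the package actually reports may differ. In a real cross-validation run the model is refit with the masked entries hidden; only the scoring step is shown here, and both function names are illustrative.

```python
import numpy as np


def make_mask(shape, fraction, seed=42):
    """Boolean mask selecting roughly `fraction` of the genotype entries to hold out."""
    rng = np.random.default_rng(seed)
    return rng.random(shape) < fraction


def masked_squared_error(G, Q, P, mask):
    """Mean squared error between held-out genotypes and their expectation 2 * Q @ P.T."""
    expected = 2.0 * (Q @ P.T)  # samples x SNPs expected allele counts
    return float(np.mean((G - expected)[mask] ** 2))


# Deterministic toy example: 2 samples, 2 SNPs, K = 2.
Q = np.array([[1.0, 0.0], [0.0, 1.0]])
P = np.array([[1.0, 0.0], [0.5, 0.5]])
G = np.array([[2, 1], [0, 2]])
mask = np.array([[True, False], [False, True]])

# Held-out entries: (0, 0) is predicted exactly, (1, 1) is off by one allele.
err = masked_squared_error(G, Q, P, mask)
print(err)
```

Repeating this for several K values and comparing the errors is the usual way to pick K.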

Plotting

By default, ADAMIXTURE generates a PNG plot at 300 DPI; no additional flags are needed. → Full documentation

Plots can include hierarchical population labels if you provide the arguments (--labels, --labels2, --labels3).

If you want to customize the format and resolution (e.g., to generate a PDF), you must use the appropriate flag depending on your execution mode:

  • Single K runs (-k): Use --plot_single. Note that --plot will be ignored in single K mode.

    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --plot_single pdf 300
    
  • Multi-K sweeps (--min_k and --max_k): Use --plot to configure the combined sweep plot.

    $ adamixture --min_k 2 --max_k 10 --data_path data.bed --save_dir out/ --name test --plot pdf 300
    

Projection Mode

Estimate ancestry proportions for new samples using a pre-trained, fixed P matrix (Q-only optimisation). K is detected automatically from P. → Full documentation

$ adamixture-project \
    --data_path new_samples.bed \
    --p_path trained_model/results.8.P \
    --save_dir projection_out/ \
    --name projected
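
As an illustration of Q-only optimisation, the sketch below applies the classic EM update for the admixture model while holding P fixed. It is a generic textbook update written for clarity, not the solver the package uses, and project_q is a hypothetical name. It assumes genotypes coded 0/1/2, P stored as SNPs x K, and no missing data; K is read from the number of columns of P, mirroring the automatic detection described above.

```python
import numpy as np


def project_q(G, P, n_iter=200, eps=1e-9, seed=0):
    """Estimate Q (samples x K) by EM with allele frequencies P (SNPs x K) held fixed."""
    n, m = G.shape
    k = P.shape[1]  # K is implied by the fixed P matrix
    rng = np.random.default_rng(seed)
    Q = rng.dirichlet(np.ones(k), size=n)
    for _ in range(n_iter):
        f = np.clip(Q @ P.T, eps, 1.0 - eps)      # per-entry allele frequency
        a = (G / f) @ P                           # contribution of alternate alleles
        b = ((2 - G) / (1.0 - f)) @ (1.0 - P)     # contribution of reference alleles
        Q = Q * (a + b) / (2.0 * m)
        Q /= Q.sum(axis=1, keepdims=True)         # guard against rounding drift
    return Q


# Toy reference panel: two well-separated populations over 100 SNPs.
m = 100
P = np.vstack([
    np.tile([0.99, 0.01], (m // 2, 1)),
    np.tile([0.01, 0.99], (m // 2, 1)),
])
# Two new samples, each carrying the alleles typical of one population.
G = np.vstack([
    np.r_[np.full(m // 2, 2), np.zeros(m // 2, dtype=int)],
    np.r_[np.zeros(m // 2, dtype=int), np.full(m // 2, 2)],
])
Q = project_q(G, P)
print(np.round(Q, 3))
```

Because P never changes, each sample's row of Q can be updated independently, which is why projection is much cheaper than a full fit.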

Supervised Mode

Anchor the model with known population labels for a subset of samples while estimating Q freely for unlabeled ones. Labels use the same format as --labels (population name or -). → Full documentation

$ adamixture-supervised \
    --data_path all_samples.bed \
    --labels labels.txt \
    --save_dir supervised_out/ \
    --name supervised_run \
    -k 8
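
The label convention can be pictured as follows. This is only a sketch of what "anchoring" might mean, assuming one label per sample with - for unlabeled samples. How the package uses the labels internally is not described here, and anchor_matrix and the population names are made up for illustration.

```python
import numpy as np


def anchor_matrix(labels, k):
    """Build a starting Q: one-hot rows for labeled samples, uniform rows for '-'."""
    pops = sorted({lab for lab in labels if lab != "-"})
    if len(pops) > k:
        raise ValueError("more distinct labels than clusters")
    index = {pop: i for i, pop in enumerate(pops)}
    q = np.full((len(labels), k), 1.0 / k)   # unlabeled samples start uniform
    known = np.zeros(len(labels), dtype=bool)
    for row, lab in enumerate(labels):
        if lab != "-":
            q[row] = 0.0
            q[row, index[lab]] = 1.0         # labeled sample pinned to its population
            known[row] = True
    return pops, known, q


# Hypothetical labels, one per sample in the same order as the genotype file.
labels = ["EUR", "-", "AFR", "EUR", "-"]
pops, known, q0 = anchor_matrix(labels, k=2)
print(pops, known.tolist())
print(q0)
```

Rows flagged in known would stay fixed during optimisation, while the remaining rows are estimated freely.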

Other options

All hyperparameters and flags can be explored with:

$ adamixture --help

Key arguments:

Argument            Default     Description
--init              als         Initialization method: improved SVD+ALS (als) or random EM priming (em)
--tol               0.1         Convergence tolerance for log-likelihood changes
--max_iter          10000       Maximum optimization iterations
-t                  1           Number of CPU threads
-s                  42          Random seed
--device            cpu         Device to use: cpu, gpu, or mps
--chunk_size        8192        Number of SNPs in chunk operations
--chromosome_mode   autosomes   Chromosome filter: autosomes keeps autosomes 1..--autosome_count; all keeps every chromosome
--autosome_count    22          Number of autosomes kept when --chromosome_mode autosomes
--no_freqs          False       Do not save the .P allele-frequency matrix

Algorithm note

The ADAMIXTURE preprint introduced Adam-EM as an adaptive first-order optimizer for admixture inference. The package still includes this solver via --algorithm adamem.

In the current implementation, the default is --algorithm brqn. Empirical benchmarking showed that block relaxation with ZAL quasi-Newton acceleration, when paired with our improved SVD+ALS initialization, reaches high-quality solutions in fewer iterations and less wall-clock time. For that reason, BR-QN is the default solver, while Adam-EM remains available for experimentation and reproducibility. Adam-EM tuning parameters are documented in Troubleshooting and Tips.
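
Whichever solver is selected, the quantity being maximised in the ADMIXTURE model is the binomial log-likelihood of the genotypes given Q and P, and --tol is described as a tolerance on its change between iterations. The sketch below writes both out, assuming genotypes coded 0/1/2, Q as samples x K, and P as SNPs x K. The function names are illustrative, and the package's exact stopping rule may differ from this simple absolute-difference check.

```python
import numpy as np


def log_likelihood(G, Q, P, eps=1e-12):
    """Binomial log-likelihood: sum of g*log(f) + (2-g)*log(1-f), with f = Q @ P.T."""
    f = np.clip(Q @ P.T, eps, 1.0 - eps)
    return float(np.sum(G * np.log(f) + (2 - G) * np.log1p(-f)))


def has_converged(ll_prev, ll_curr, tol=0.1):
    """Stop once the log-likelihood changes by less than tol between iterations."""
    return abs(ll_curr - ll_prev) < tol


# One sample fully assigned to cluster 0, one SNP with frequency 0.5 there,
# observed as a heterozygote: log(0.5) + log(0.5).
Q = np.array([[1.0, 0.0]])
P = np.array([[0.5, 0.2]])
G = np.array([[1]])
ll = log_likelihood(G, Q, P)
print(ll, has_converged(-100.0, -99.95), has_converged(-100.0, -99.0))
```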

Troubleshooting and Tips

Full documentation

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Cite

When using this software, please cite the following preprint:

@article{saurina2026adamixture,
  title={ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering},
  author={Saurina-i-Ricos, Joan and Mas Monserrat, Daniel and Ioannidis, Alexander G.},
  journal={bioRxiv},
  year={2026},
  doi={10.64898/2026.02.13.700171},
  url={https://doi.org/10.64898/2026.02.13.700171}
}
