ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Ancestry Inference

Fast Biobank-Scale Population Genetics Clustering


ADAMIXTURE is a fast CPU/GPU implementation of ADMIXTURE for biobank-scale genetic clustering. .P and .Q outputs remain compatible with ADMIXTURE.

System requirements

Hardware requirements

This package requires a machine with enough RAM to hold the large datasets it is designed to process. For this reason, we recommend running on a compute cluster whenever one is available, to avoid memory issues.

Software requirements

We recommend creating a fresh Python 3.10+ virtual environment. For a faster installation experience, we highly recommend using uv.

[!IMPORTANT]
If you plan to use GPU acceleration, ensure that the CUDA toolkit is correctly loaded (e.g., module load cuda) before starting the installation. This ensures that the dependencies and internal components are correctly configured for your hardware.

As an example, using uv (recommended):

$ uv venv --python 3.10
$ source .venv/bin/activate
$ uv pip install adamixture

Installation Guide

The package can be installed with pip, typically within a few minutes (add the --upgrade flag when updating an existing installation):

$ pip install adamixture

Running ADAMIXTURE

To train a model, invoke the following commands from the root directory of the project. For details on all arguments, run adamixture --help. Input data in BED, VCF, and PGEN formats is supported.

As an example, the following ADMIXTURE call

$ ./admixture snps_data.bed 8 -s 42

is equivalent to running the following in ADAMIXTURE:

$ adamixture -k 8 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_data -s 42

By default, the following files are written to the SAVE_PATH directory (the --name parameter is used to build the full file names):

  • A .P file, similar to ADMIXTURE.
  • A .Q file, similar to ADMIXTURE.
  • A .png plot file containing the visualization of the inferred ancestry proportions (Q matrix).
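Because the outputs are ADMIXTURE-compatible, a .Q file can be read as a plain whitespace-delimited matrix with one row per sample and one column per ancestry component. The following is a minimal sketch using only the Python standard library. The helpers load_matrix and check_q are hypothetical, and the demo writes its own tiny stand-in file rather than assuming a particular output file name, so point it at whatever your run actually writes.

```python
import os
import tempfile


def load_matrix(path):
    """Read a whitespace-delimited numeric matrix (e.g. a .Q or .P file)."""
    with open(path) as handle:
        return [[float(x) for x in line.split()] for line in handle if line.strip()]


def check_q(q_rows, atol=1e-4):
    """Return True if every row is non-negative and sums to 1 (within atol)."""
    return all(
        abs(sum(row) - 1.0) <= atol and all(v >= -atol for v in row)
        for row in q_rows
    )


# Demo on a tiny hand-written file standing in for a real .Q output.
demo_text = "0.70 0.20 0.10\n0.05 0.90 0.05\n"
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.3.Q")
    with open(demo_path, "w") as handle:
        handle.write(demo_text)
    Q = load_matrix(demo_path)

n_samples, k = len(Q), len(Q[0])
print(n_samples, k, check_q(Q))
```

The same loader works for a .P file, where rows correspond to SNPs instead of samples and the row-sum check does not apply.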

Logs are printed to stdout by default. To also save them to a file, pipe the output through tee:

$ adamixture -k 8 ... | tee run.log
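When a run is launched from a Python script instead of a shell, the effect of tee can be approximated with subprocess. This is a sketch and not part of ADAMIXTURE: run_and_tee is a hypothetical helper, and the demo uses a trivial Python child process in place of a real adamixture command.

```python
import os
import subprocess
import sys
import tempfile


def run_and_tee(cmd, log_path):
    """Run cmd, echo its stdout line by line, and write a copy to log_path.

    Like a plain shell pipe into tee, only stdout is captured; stderr is
    left attached to the terminal.
    """
    with open(log_path, "w") as log:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
        return proc.wait()


with tempfile.TemporaryDirectory() as tmp:
    log_file = os.path.join(tmp, "run.log")
    # Stand-in command; replace with your own adamixture argument list.
    exit_code = run_and_tee([sys.executable, "-c", "print('iteration 1')"], log_file)
    with open(log_file) as handle:
        logged = handle.read()
```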

Running with multi-threading

To run ADAMIXTURE using multiple CPU threads, use the -t flag:

$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test -t 8

Running with GPU acceleration

To leverage GPU acceleration (highly recommended for large datasets), use the --device flag:

  • NVIDIA GPU (CUDA):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device gpu
    
  • macOS Apple Silicon (MPS):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device mps
    

[!TIP] GPU Acceleration: Using GPUs greatly speeds up processing and is highly recommended for large datasets. You can specify the hardware to use with the --device parameter:

  • For NVIDIA GPUs, use --device gpu (requires CUDA).
  • For macOS users with Apple Silicon (M1/M2/M3/M4/M5), use --device mps to enable Metal Performance Shaders (MPS) acceleration.
  • Note that biobank-scale datasets are best handled on dedicated CUDA-capable GPUs due to high RAM requirements.

Multi-K Sweep

Instead of running ADAMIXTURE for a single K, you can automatically sweep over a range of K values using --min_k and --max_k. The data is loaded once, and each K is trained sequentially:

$ adamixture --min_k 2 --max_k 10 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_sweep
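For scripted pipelines, the single-K and sweep invocations can be assembled programmatically. The sketch below builds an argument list from flags that appear in this README only; the helper adamixture_cmd is hypothetical, and the rule that -k is not combined with --min_k/--max_k is an assumption rather than documented behaviour.

```python
import shlex


def adamixture_cmd(data_path, save_dir, name, k=None, min_k=None, max_k=None,
                   seed=None, threads=None, device=None):
    """Build an argv list for adamixture from flags shown in this README."""
    sweep = min_k is not None or max_k is not None
    if sweep:
        # Assumption: a sweep needs both bounds and is not combined with -k.
        if k is not None or min_k is None or max_k is None or min_k > max_k:
            raise ValueError("use either k, or both min_k <= max_k")
        cmd = ["adamixture", "--min_k", str(min_k), "--max_k", str(max_k)]
    elif k is not None:
        cmd = ["adamixture", "-k", str(k)]
    else:
        raise ValueError("specify k or a min_k/max_k range")
    cmd += ["--data_path", data_path, "--save_dir", save_dir, "--name", name]
    if seed is not None:
        cmd += ["-s", str(seed)]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if device is not None:
        cmd += ["--device", device]
    return cmd


sweep_cmd = adamixture_cmd("snps_data.bed", "SAVE_PATH", "snps_sweep", min_k=2, max_k=10)
single_cmd = adamixture_cmd("snps_data.bed", "SAVE_PATH", "snps_data", k=8, seed=42)
print(shlex.join(sweep_cmd))
```

The resulting list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting problems with paths that contain spaces.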

Cross-validation

Use --cv to estimate the optimal K by masking a fraction of genotype entries and measuring prediction error. → Full documentation

$ adamixture -k 8 --cv --data_path data.bed --save_dir out/ --name test
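To make the idea concrete, the sketch below hides a random fraction of genotype entries and scores how well the admixture model's expected genotype, 2 * sum_k q_ik * p_jk, predicts them. It is illustrative only: the mask fraction, the squared-error score, and the function names are assumptions, and ADAMIXTURE's actual masking scheme and error metric may differ. In a real cross-validation run, Q and P would be re-estimated with the masked entries treated as missing; here they are held fixed for brevity.

```python
import random


def expected_genotype(q_row, p_row):
    """Expected allele count under the admixture model: 2 * sum_k q_ik * p_jk."""
    return 2.0 * sum(q * p for q, p in zip(q_row, p_row))


def choose_mask(n_samples, n_snps, fraction, seed=42):
    """Pick a random subset of (sample, SNP) positions to hold out."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n_samples) for j in range(n_snps)]
    n_masked = max(1, round(fraction * len(cells)))
    return rng.sample(cells, n_masked)


def masked_error(G, Q, P, mask):
    """Mean squared error between held-out genotypes and model predictions."""
    total = sum((G[i][j] - expected_genotype(Q[i], P[j])) ** 2 for i, j in mask)
    return total / len(mask)


# Toy data: 2 samples, 3 SNPs, K = 2.
# P has one row per SNP and one column per population.
G = [[2, 0, 1], [0, 2, 1]]
Q = [[1.0, 0.0], [0.0, 1.0]]
P = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

mask = choose_mask(len(G), len(P), fraction=0.5)
cv_error = masked_error(G, Q, P, mask)
print(len(mask), cv_error)
```

Repeating such a score across several values of K and choosing the K with the lowest held-out error is the usual way cross-validation is used for model selection.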

Plotting

By default, ADAMIXTURE generates a PNG plot at 300 DPI; no additional flags are needed. → Full documentation

Plots can include hierarchical population labels if you provide the arguments (--labels, --labels2, --labels3).

If you want to customize the format and resolution (e.g., to generate a PDF), you must use the appropriate flag depending on your execution mode:

  • Single K runs (-k): Use --plot_single. Note that --plot will be ignored in single K mode.

    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --plot_single pdf 300
    
  • Multi-K sweeps (--min_k and --max_k): Use --plot to configure the combined sweep plot.

    $ adamixture --min_k 2 --max_k 10 --data_path data.bed --save_dir out/ --name test --plot pdf 300
    

Projection Mode

Estimate ancestry proportions for new samples using a pre-trained, fixed P matrix (Q-only optimisation). K is detected automatically from P. → Full documentation

$ adamixture-project \
    --data_path new_samples.bed \
    --p_path trained_model/results.8.P \
    --save_dir projection_out/ \
    --name projected
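With P held fixed, each sample's ancestry vector can be estimated independently of the others. The sketch below uses the classical EM update for the binomial admixture model as a stand-in. It illustrates Q-only optimisation in general and is not claimed to be the solver adamixture-project uses; the function name, iteration count, and toy allele frequencies are assumptions.

```python
def project_sample(genotypes, P, n_iter=200):
    """Estimate one sample's ancestry proportions q with P fixed (EM updates).

    genotypes: allele counts in {0, 1, 2}, one per SNP.
    P: one row per SNP, each holding K allele frequencies strictly in (0, 1).
    """
    k = len(P[0])          # K is inferred from the width of P
    q = [1.0 / k] * k      # start from uniform proportions
    n_snps = len(genotypes)
    for _ in range(n_iter):
        new_q = [0.0] * k
        for g, p_row in zip(genotypes, P):
            # f: probability that a random allele of this sample is the counted allele
            f = sum(qk * pk for qk, pk in zip(q, p_row))
            for a in range(k):
                new_q[a] += g * q[a] * p_row[a] / f
                new_q[a] += (2 - g) * q[a] * (1.0 - p_row[a]) / (1.0 - f)
        # Each SNP contributes exactly 2 in total, so this keeps sum(q) == 1.
        q = [v / (2.0 * n_snps) for v in new_q]
    return q


# Toy fixed P (K = 2) and a sample whose genotypes match population 0.
P_fixed = [[0.9, 0.1]] * 10 + [[0.2, 0.8]] * 10
sample = [2] * 10 + [0] * 10
q_hat = project_sample(sample, P_fixed)
print([round(v, 3) for v in q_hat])
```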

Supervised Mode

Anchor the model with known population labels for a subset of samples while estimating Q freely for unlabeled ones. Labels use the same format as --labels (population name or -). → Full documentation

$ adamixture-supervised \
    --data_path all_samples.bed \
    --labels labels.txt \
    --save_dir supervised_out/ \
    --name supervised_run \
    -k 8
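A labels file can be sanity-checked before launching a supervised run. The sketch below assumes one entry per line, in the same order as the samples in the genotype file, with - marking unlabeled samples. That layout, the helper name, the population names, and the check against K are all assumptions, so verify them against the full documentation before relying on them.

```python
from collections import Counter


def summarize_labels(lines, k):
    """Count labeled samples per population; '-' marks an unlabeled sample."""
    labels = [line.strip() for line in lines if line.strip()]
    counts = Counter(lab for lab in labels if lab != "-")
    # Assumption: the number of distinct labeled populations cannot exceed K.
    if len(counts) > k:
        raise ValueError(f"{len(counts)} populations labeled but K = {k}")
    return counts, labels.count("-")


# Hypothetical file contents: three labeled samples, two unlabeled.
demo_lines = ["EUR\n", "-\n", "AFR\n", "EUR\n", "-\n"]
counts, n_unlabeled = summarize_labels(demo_lines, k=8)
print(dict(counts), n_unlabeled)
```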

Other options

All hyperparameters and flags can be explored with:

$ adamixture --help

Key arguments:

| Argument | Default | Description |
|---|---|---|
| --init | als | Initialization method: improved SVD+ALS (als) or random EM priming (em) |
| --tol | 0.1 | Convergence tolerance for log-likelihood changes |
| --max_iter | 10000 | Maximum optimization iterations |
| -t | 1 | Number of CPU threads |
| -s | 42 | Random seed |
| --device | cpu | Device to use: cpu, gpu, or mps |
| --chunk_size | 8192 | Number of SNPs in chunk operations |
| --chromosome_mode | autosomes | Chromosome filter: autosomes keeps autosomes 1..--autosome_count; all keeps every chromosome |
| --autosome_count | 22 | Number of autosomes kept when --chromosome_mode autosomes |
| --no_freqs | False | Do not save the .P allele-frequency matrix |
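For orientation, --tol compares successive values of the model log-likelihood. The sketch below evaluates the standard binomial admixture log-likelihood (the objective used by ADMIXTURE) and applies an absolute-change stopping rule. Whether ADAMIXTURE uses an absolute or a relative change, and how it handles missing genotypes, is not stated here, so both are assumptions; the function names are illustrative.

```python
import math


def log_likelihood(G, Q, P, eps=1e-10):
    """Sum over samples i and SNPs j of
    g_ij * ln(f_ij) + (2 - g_ij) * ln(1 - f_ij), with f_ij = sum_k q_ik * p_jk.

    P has one row per SNP and one column per population.
    """
    total = 0.0
    for g_row, q_row in zip(G, Q):
        for g, p_row in zip(g_row, P):
            f = sum(q * p for q, p in zip(q_row, p_row))
            f = min(max(f, eps), 1.0 - eps)  # keep the logarithms finite
            total += g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total


def has_converged(ll_new, ll_old, tol=0.1):
    """Absolute-change rule (assumed); 0.1 mirrors the documented default."""
    return abs(ll_new - ll_old) < tol


# Toy example: 1 sample, 2 SNPs, K = 2.
G = [[2, 0]]
Q = [[0.5, 0.5]]
P = [[0.8, 0.4], [0.2, 0.2]]
ll = log_likelihood(G, Q, P)
print(round(ll, 4), has_converged(-100.05, -100.0))
```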

Algorithm note

The ADAMIXTURE preprint introduced Adam-EM as an adaptive first-order optimizer for admixture inference. The package still includes this solver via --algorithm adamem.

In the current implementation, the default is --algorithm brqn. Empirical benchmarking showed that block relaxation with ZAL quasi-Newton acceleration, when paired with our improved SVD+ALS initialization, reaches high-quality solutions in fewer iterations and better wall-clock time. For that reason, BR-QN is the default solver, while Adam-EM remains available for experimentation and reproducibility. Adam-EM tuning parameters are documented in Troubleshooting and Tips.

Troubleshooting and Tips

Full documentation

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Cite

When using this software, please cite the following preprint:

@article{saurina2026adamixture,
  title={ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering},
  author={Saurina-i-Ricos, Joan and Mas Monserrat, Daniel and Ioannidis, Alexander G.},
  journal={bioRxiv},
  year={2026},
  doi={10.64898/2026.02.13.700171},
  url={https://doi.org/10.64898/2026.02.13.700171}
}
