ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Ancestry Inference

Fast Biobank-Scale Population Genetics Clustering


ADAMIXTURE is a fast CPU/GPU implementation of ADMIXTURE for biobank-scale genetic clustering. Its .P and .Q outputs remain compatible with ADMIXTURE.

System requirements

Hardware requirements

This package requires a computer with enough RAM to hold the large datasets it is designed to work with. For this reason, we recommend using compute clusters whenever available to avoid memory issues.

Software requirements

We recommend creating a fresh Python 3.10+ virtual environment. For a faster installation experience, we highly recommend using uv.

[!IMPORTANT]
If you plan to use GPU acceleration, ensure that the CUDA toolkit is correctly loaded (e.g., module load cuda) before starting the installation. This ensures that the dependencies and internal components are correctly configured for your hardware.

As an example, using uv (recommended):

$ uv venv --python 3.10
$ source .venv/bin/activate
$ uv pip install adamixture

Installation Guide

The package can be installed with pip, typically in a few minutes at most. Add the --upgrade flag when updating an existing installation:

$ pip install adamixture

Running ADAMIXTURE

To train a model, invoke the commands below from the root directory of the project. For details on all arguments, run adamixture --help. BED, VCF, and PGEN input formats are supported.

As an example, the following ADMIXTURE call

$ ./admixture snps_data.bed 8 -s 42

is equivalent to the following ADAMIXTURE call:

$ adamixture -k 8 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_data -s 42

By default, the following files are written to the SAVE_PATH directory (the --name parameter is used to build the full filenames):

  • A .P file, similar to ADMIXTURE.
  • A .Q file, similar to ADMIXTURE.
  • A .png plot file containing the visualization of the inferred ancestry proportions (Q matrix).
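Since the .Q output follows the ADMIXTURE layout, it can be read with a few lines of Python. The sketch below assumes the usual ADMIXTURE convention of one whitespace-delimited row per sample and one column per cluster. The helper names read_q and dominant_cluster are hypothetical, and the example filename pattern NAME.K.Q is inferred from the results.8.P path used elsewhere in this README.

```python
from pathlib import Path


def read_q(path):
    """Read a Q matrix: one row per sample, one column per ancestry cluster.

    Assumes the ADMIXTURE text layout (whitespace-delimited floats).
    """
    rows = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            rows.append([float(x) for x in line.split()])
    if not rows:
        raise ValueError(f"{path} contains no data rows")

    k = len(rows[0])
    for i, row in enumerate(rows):
        if len(row) != k:
            raise ValueError(f"row {i} has {len(row)} columns, expected {k}")
        # Ancestry proportions of one sample should sum to 1.
        if abs(sum(row) - 1.0) > 1e-3:
            raise ValueError(f"row {i} sums to {sum(row):.4f}, expected 1")
    return rows


def dominant_cluster(q):
    """Return, for each sample, the index of its largest ancestry proportion."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

For a run with --name snps_data and -k 8, this would presumably be called as read_q("SAVE_PATH/snps_data.8.Q"), assuming that naming pattern holds.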

Logs are printed to the stdout channel by default. If you want to save them to a file, you can use the command tee along with a pipe:

$ adamixture -k 8 ... | tee run.log

Running with multi-threading

To run ADAMIXTURE using multiple CPU threads, use the -t flag:

$ adamixture -k 8 --data_path data.bed --save_dir out/ --name test -t 8

Running with GPU acceleration

To leverage GPU acceleration (highly recommended for large datasets), use the --device flag:

  • NVIDIA GPU (CUDA):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device gpu
    
  • macOS Apple Silicon (MPS):
    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --device mps
    

[!TIP] GPU Acceleration: Using GPUs greatly speeds up processing and is highly recommended for large datasets. You can specify the hardware to use with the --device parameter:

  • For NVIDIA GPUs, use --device gpu (requires CUDA).
  • For macOS users with Apple Silicon (M1/M2/M3/M4/M5), use --device mps to enable Metal Performance Shaders (MPS) acceleration.
  • Note that biobank-scale datasets are best handled on dedicated CUDA-capable GPUs due to high RAM requirements.

Multi-K Sweep

Instead of running ADAMIXTURE for a single K, you can automatically sweep over a range of K values using --min_k and --max_k. The data is loaded once, and each K is trained sequentially:

$ adamixture --min_k 2 --max_k 10 --data_path snps_data.bed --save_dir SAVE_PATH --name snps_sweep

Cross-validation

Use --cv to estimate the optimal K by masking a fraction of genotype entries and measuring prediction error. → Full documentation

$ adamixture -k 8 --cv --data_path data.bed --save_dir out/ --name test
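To make the idea of masking and prediction error concrete, here is a minimal sketch in plain Python. It is not ADAMIXTURE's implementation: the function names are hypothetical, mean squared error stands in for whatever error measure the tool actually reports, and the model is not refit on the masked data as a real cross-validation run would do. It assumes Q has one row per sample and P has one row per SNP, so the expected genotype is 2 * sum_k Q[i][k] * P[j][k].

```python
import random


def mask_entries(genotypes, fraction, seed=42):
    """Hide a random fraction of genotype entries.

    Returns a masked copy (hidden entries set to None) and the list of
    held-out (sample, snp) positions. The input is left unchanged.
    """
    rng = random.Random(seed)
    cells = [
        (i, j)
        for i, row in enumerate(genotypes)
        for j, g in enumerate(row)
        if g is not None
    ]
    held_out = rng.sample(cells, round(fraction * len(cells)))
    masked = [list(row) for row in genotypes]
    for i, j in held_out:
        masked[i][j] = None
    return masked, held_out


def expected_genotype(q_row, p_row):
    """Expected allele count under the admixture model for one sample and SNP."""
    return 2.0 * sum(q * p for q, p in zip(q_row, p_row))


def heldout_error(genotypes, held_out, Q, P):
    """Mean squared error between held-out genotypes and model predictions."""
    if not held_out:
        raise ValueError("no held-out entries to score")
    total = sum(
        (genotypes[i][j] - expected_genotype(Q[i], P[j])) ** 2
        for i, j in held_out
    )
    return total / len(held_out)
```

In a full procedure, Q and P would be estimated from the masked matrix for each candidate K, and the K with the lowest held-out error would be preferred.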

Plotting

By default, ADAMIXTURE generates a PNG plot at 300 DPI; no additional flags are needed. → Full documentation

Plots can include hierarchical population labels, supplied via the --labels, --labels2, and --labels3 arguments.

If you want to customize the format and resolution (e.g., to generate a PDF), you must use the appropriate flag depending on your execution mode:

  • Single K runs (-k): Use --plot_single. Note that --plot will be ignored in single K mode.

    $ adamixture -k 8 --data_path data.bed --save_dir out/ --name test --plot_single pdf 300
    
  • Multi-K sweeps (--min_k and --max_k): Use --plot to configure the combined sweep plot.

    $ adamixture --min_k 2 --max_k 10 --data_path data.bed --save_dir out/ --name test --plot pdf 300
    

Projection Mode

Estimate ancestry proportions for new samples using a pre-trained, fixed P matrix (Q-only optimisation). K is detected automatically from P. → Full documentation

$ adamixture-project \
    --data_path new_samples.bed \
    --p_path trained_model/results.8.P \
    --save_dir projection_out/ \
    --name projected
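As an illustration of what Q-only optimisation means, the sketch below runs the classic EM update for the admixture model with P held fixed. This is a stand-in technique for exposition, not the solver adamixture-project uses. It assumes genotypes coded 0/1/2 with no missing values and P stored with one row per SNP and one column per cluster. All function names are hypothetical.

```python
import math

EPS = 1e-9  # keeps frequencies away from exactly 0 or 1


def em_step(g_row, q, P):
    """One EM update of a single sample's ancestry vector q, with P fixed."""
    acc = [0.0] * len(q)
    for g, p in zip(g_row, P):
        a = sum(qk * pk for qk, pk in zip(q, p))          # P(allele = 1)
        b = sum(qk * (1.0 - pk) for qk, pk in zip(q, p))  # P(allele = 0)
        for k in range(len(q)):
            acc[k] += g * q[k] * p[k] / a + (2 - g) * q[k] * (1.0 - p[k]) / b
    total = sum(acc)  # equals 2 * n_snps up to rounding
    return [x / total for x in acc]


def project_q(G, P, n_iter=50):
    """Estimate Q for new samples G against a fixed allele-frequency matrix P."""
    k = len(P[0])  # K is read off the shape of P
    P = [[min(max(p, EPS), 1.0 - EPS) for p in row] for row in P]
    Q = [[1.0 / k] * k for _ in G]  # uniform starting point
    for _ in range(n_iter):
        Q = [em_step(g_row, q, P) for g_row, q in zip(G, Q)]
    return Q


def log_likelihood(G, Q, P):
    """Binomial log-likelihood of the admixture model (constants dropped)."""
    ll = 0.0
    for g_row, q in zip(G, Q):
        for g, p in zip(g_row, P):
            a = sum(qk * pk for qk, pk in zip(q, p))
            b = sum(qk * (1.0 - pk) for qk, pk in zip(q, p))
            ll += g * math.log(a) + (2 - g) * math.log(b)
    return ll
```

Because P never changes, each sample's row of Q can be updated independently of the others, which is what makes projection cheap compared with full training.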

Supervised Mode

Anchor the model with known population labels for a subset of samples while estimating Q freely for unlabeled ones. Labels use the same format as --labels (population name or -). → Full documentation

$ adamixture-supervised \
    --data_path all_samples.bed \
    --labels labels.txt \
    --save_dir supervised_out/ \
    --name supervised_run \
    -k 8
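The label format can be pictured with a short sketch. It assumes one label per line in sample order, with - marking an unlabeled sample. It also assumes that a labeled sample is anchored to a one-hot row of Q, as in ADMIXTURE's supervised mode; whether ADAMIXTURE anchors samples exactly this way is not stated here. The function names are hypothetical.

```python
def read_labels(lines):
    """Parse labels: a population name per sample, or None where the line is '-'."""
    labels = []
    for n, line in enumerate(lines, start=1):
        token = line.strip()
        if not token:
            # A blank line would silently shift labels relative to samples.
            raise ValueError(f"line {n} is blank; use '-' for unlabeled samples")
        labels.append(None if token == "-" else token)
    return labels


def anchored_rows(labels, k):
    """Map each labeled sample to a one-hot Q row; unlabeled samples map to None."""
    pops = sorted({label for label in labels if label is not None})
    if len(pops) > k:
        raise ValueError(f"{len(pops)} populations in labels but K = {k}")
    index = {pop: i for i, pop in enumerate(pops)}

    rows = []
    for label in labels:
        if label is None:
            rows.append(None)  # estimated freely during training
        else:
            row = [0.0] * k
            row[index[label]] = 1.0
            rows.append(row)
    return pops, rows
```

With labels EUR, -, AFR, EUR and K = 3, the first and last samples share the same anchored row, and the second is left free.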

Other options

All hyperparameters and flags can be explored with:

$ adamixture --help

Key arguments:

Argument           Default    Description
--init             als        Initialization method: improved SVD+ALS (als) or random EM priming (em)
--tol              0.1        Convergence tolerance for log-likelihood changes
--max_iter         10000      Maximum optimization iterations
-t                 1          Number of CPU threads
-s                 42         Random seed
--device           cpu        Device to use: cpu, gpu, or mps
--chunk_size       8192       Number of SNPs in chunk operations
--chromosome_mode  autosomes  Chromosome filter: autosomes keeps autosomes 1..--autosome_count; all keeps every chromosome
--autosome_count   22         Number of autosomes kept when --chromosome_mode autosomes
--no_freqs         False      Do not save the .P allele-frequency matrix
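The interaction between --chromosome_mode and --autosome_count can be sketched as a simple predicate on chromosome codes. This is an illustration only: it assumes plain integer codes such as those in a PLINK .bim file, and how ADAMIXTURE treats prefixed names like chr1 is not covered here. The function name keep_chromosome is hypothetical.

```python
def keep_chromosome(code, mode="autosomes", autosome_count=22):
    """Decide whether a variant on chromosome `code` passes the filter.

    mode="all" keeps everything; mode="autosomes" keeps integer codes
    in the range 1..autosome_count.
    """
    if mode == "all":
        return True
    if mode != "autosomes":
        raise ValueError(f"unknown chromosome mode: {mode!r}")
    return code.isdigit() and 1 <= int(code) <= autosome_count
```

For a species with a different karyotype, raising autosome_count widens the kept range; for example, keep_chromosome("29", autosome_count=29) passes while the default of 22 would drop it.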

Algorithm note

The ADAMIXTURE preprint introduced Adam-EM as an adaptive first-order optimizer for admixture inference. The package still includes this solver via --algorithm adamem.

In the current implementation, the default is --algorithm brqn. Empirical benchmarking showed that block relaxation with ZAL quasi-Newton acceleration, when paired with our improved SVD+ALS initialization, reaches high-quality solutions in fewer iterations and better wall-clock time. For that reason, BR-QN is the default solver, while Adam-EM remains available for experimentation and reproducibility. Adam-EM tuning parameters are documented in Troubleshooting and Tips.

Troubleshooting and Tips

Full documentation

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Cite

When using this software, please cite the following preprint:

@article{saurina2026adamixture,
  title={ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Genetic Clustering},
  author={Saurina-i-Ricos, Joan and Mas Monserrat, Daniel and Ioannidis, Alexander G.},
  journal={bioRxiv},
  year={2026},
  doi={10.64898/2026.02.13.700171},
  url={https://doi.org/10.64898/2026.02.13.700171}
}
