ADAMIXTURE: Adaptive First-Order Optimization for Biobank-Scale Ancestry Inference

ADAMIXTURE is an unsupervised global ancestry inference method that scales the ADMIXTURE model to biobank-sized datasets. It combines the Expectation–Maximization (EM) framework with the ADAM first-order optimizer, enabling parameter updates after a single EM step. This approach accelerates convergence while maintaining comparable or improved accuracy, substantially reducing runtime on large genotype datasets. For more information, we recommend reading our pre-print.

The software can be invoked via CLI and has a similar interface to ADMIXTURE (e.g. the output format is completely interchangeable).

System requirements

Hardware requirements

Running this package requires a computer with enough RAM to handle the large datasets the software was designed for. We therefore recommend using a compute cluster whenever one is available, to avoid memory issues.

Software requirements

We recommend creating a fresh Python 3.10 virtual environment using virtualenv (or conda) and then installing the adamixture package there. For example, with virtualenv, run the following commands:

$ virtualenv --python=python3.10 ~/venv/nadmenv
$ source ~/venv/nadmenv/bin/activate
(nadmenv) $ pip install adamixture

Installation Guide

The package can be easily installed in at most a few minutes using pip (make sure to add the --upgrade flag if updating the version):

(nadmenv) $ pip install adamixture

Running ADAMIXTURE

To train a model, invoke adamixture from the root directory of the project, as in the example below. For details on all arguments, run adamixture --help. VCF and BED input files are currently supported.

As an example, the following ADMIXTURE call

$ ./admixture snps_data.bed 8 -s 42

would be mimicked in ADAMIXTURE by running

$ adamixture --k 8 --data_path snps_data.bed --save_dir SAVE_PATH --init_file INIT_FILE --name snps_data --seed 42
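When the call is launched from a pipeline rather than typed by hand, the same command can be assembled in Python. The sketch below is only illustrative: build_command and run_adamixture are hypothetical helper names, and only flags that appear in this README are used. Whether --init_file is required is not stated here, so it is treated as optional by assumption.

```python
import subprocess


def build_command(k, data_path, save_dir, name, seed=42, init_file=None, threads=None):
    """Assemble the adamixture CLI call as an argument list (hypothetical helper)."""
    cmd = [
        "adamixture",
        "--k", str(k),
        "--data_path", str(data_path),
        "--save_dir", str(save_dir),
        "--name", str(name),
        "--seed", str(seed),
    ]
    # Assumption: --init_file and --threads may be omitted when not needed.
    if init_file is not None:
        cmd += ["--init_file", str(init_file)]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    return cmd


def run_adamixture(cmd):
    """Run the assembled command; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(cmd, check=True)


# Mirrors the example call above (without --init_file); nothing is executed here.
cmd = build_command(8, "snps_data.bed", "SAVE_PATH", "snps_data", seed=42)
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.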

Two files will be written to the SAVE_PATH directory (the name parameter is used to build the filenames):

  • A .P file, similar to ADMIXTURE.
  • A .Q file, similar to ADMIXTURE.
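Because the output is described as interchangeable with ADMIXTURE's, the .Q file can presumably be read as a whitespace-delimited matrix with one row per sample and one column per ancestral population. The following is a minimal sketch under that assumption; read_matrix, check_q and dominant_cluster are hypothetical helpers, and the toy file stands in for a real output whose exact filename is not specified here.

```python
import os
import tempfile


def read_matrix(path):
    """Read a whitespace-delimited numeric matrix, one row per non-empty line."""
    with open(path) as handle:
        return [[float(x) for x in line.split()] for line in handle if line.strip()]


def check_q(rows, k, tol=1e-4):
    """Check that each row has k ancestry fractions that sum to roughly 1."""
    for i, row in enumerate(rows):
        if len(row) != k:
            raise ValueError(f"row {i}: expected {k} columns, found {len(row)}")
        if abs(sum(row) - 1.0) > tol:
            raise ValueError(f"row {i}: fractions sum to {sum(row):.6f}")
    return True


def dominant_cluster(rows):
    """Return, for each sample, the index of its largest ancestry fraction."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


# Toy stand-in for a real .Q file: 2 samples, K = 3.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "toy.Q")
    with open(demo_path, "w") as handle:
        handle.write("0.70 0.20 0.10\n0.05 0.90 0.05\n")
    q = read_matrix(demo_path)

check_q(q, k=3)
labels = dominant_cluster(q)
```

The same read_matrix helper should also work on a .P file, which in ADMIXTURE's format holds one row per SNP and one column per population; those rows are allele frequencies and are not expected to sum to 1, so check_q does not apply to them.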

Logs are printed to the stdout channel by default. If you want to save them to a file, you can use the command tee along with a pipe:

$ adamixture --k 8 ... | tee run.log

Other options

  • --lr (float, default: 0.005):
    Learning rate used by the Adam optimizer in the EM updates.

  • --min_lr (float, default: 1e-6):
    Minimum learning rate used by the Adam optimizer in the EM updates.

  • --lr_decay (float, default: 0.5):
    Learning rate decay factor.

  • --beta1 (float, default: 0.80):
    Exponential decay rate for the first moment estimates in Adam.

  • --beta2 (float, default: 0.88):
    Exponential decay rate for the second moment estimates in Adam.

  • --reg_adam (float, default: 1e-8):
    Numerical stability constant (epsilon) for the Adam optimizer.

  • --seed (int, default: 42):
    Random number generator seed for reproducibility.

  • --k (int, required):
    Number of ancestral populations (clusters) to infer.

  • --max_iter (int, default: 1500):
    Maximum number of Adam-EM iterations.

  • --check (int, default: 5):
    Frequency (in iterations) at which the log-likelihood is evaluated.

  • --max_als (int, default: 1000):
    Maximum number of iterations for the ALS solver.

  • --tole_als (float, default: 1e-4):
    Convergence tolerance for the ALS optimization.

  • --reg_als (float, default: 1e-5):
    Regularization parameter for ALS.

  • --power (int, default: 5):
    Number of power iterations used in randomized SVD (RSVD).

  • --tole_svd (float, default: 1e-1):
    Convergence tolerance for the SVD approximation.

  • --threads (int, default: 1):
    Number of CPU threads used during execution.
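To make the optimizer options concrete, here is the textbook Adam update for a single scalar parameter, using the defaults listed above for --lr, --beta1, --beta2 and --reg_adam. This is a generic illustration, not ADAMIXTURE's actual code: how the package derives its update direction from the EM step is not described here. The decayed_lr helper is likewise an assumption that shows one plausible reading of --lr_decay and --min_lr (multiplicative decay with a floor); when the decay is triggered is not stated in this README.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.005, beta1=0.80, beta2=0.88, eps=1e-8):
    """One standard Adam update for a scalar parameter (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def decayed_lr(lr, lr_decay=0.5, min_lr=1e-6):
    """Assumed schedule: shrink lr by lr_decay, never going below min_lr."""
    return max(min_lr, lr * lr_decay)


# Toy problem: minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 1501):                          # 1500 matches the --max_iter default
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly lr regardless of the gradient's scale. This is why the learning rate directly controls the per-iteration step size.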

License

NOTICE: This software is available for use free of charge for academic research use only. Academic users may fork this repository and modify and improve to suit their research needs, but also inherit these terms and must include a licensing notice to that effect. Commercial users, for profit companies or consultants, and non-profit institutions not qualifying as "academic research" should contact the authors for a separate license. This applies to this repository directly and any other repository that includes source, executables, or git commands that pull/clone this repository as part of its function. Such repositories, whether ours or others, must include this notice.

Cite

When using this software, please cite the following pre-print:
