EM-PCA for inferring population structure in the presence of missingness
EMU
EMU is software for performing principal component analysis (PCA) on genetic datasets with missing data. It handles both random and non-random missingness by modelling it directly through a truncated SVD approach. EMU takes binary PLINK files as input.
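To make the idea of "modelling missingness through a truncated SVD" concrete, here is a minimal NumPy sketch of a generic EM-style PCA loop. It is not EMU's actual implementation: the function name `em_pca`, its parameters, and the convergence rule are all assumptions made for illustration, and it uses a dense SVD on a small toy matrix. It also assumes every site has at least one observed genotype.

```python
import numpy as np


def em_pca(G, k, n_iter=100, tol=1e-6):
    """Toy EM-style PCA (hypothetical helper, not part of EMU).

    G is a (samples x sites) array with np.nan marking missing genotypes.
    Returns the top-k left singular vectors, the top-k singular values,
    and the imputed matrix.
    """
    G = np.asarray(G, dtype=float)
    missing = np.isnan(G)
    # Start by filling missing entries with the per-site mean of observed values.
    X = np.where(missing, np.nanmean(G, axis=0), G)
    for _ in range(n_iter):
        mu = X.mean(axis=0)
        # Fit a rank-k model to the centred, currently imputed matrix.
        U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k] + mu
        # Overwrite only the missing entries with the rank-k prediction.
        X_new = np.where(missing, low_rank, X)
        change = np.sqrt(np.mean((X_new - X) ** 2))
        X = X_new
        if change < tol:
            break
    # Final decomposition of the imputed matrix.
    U, s, _ = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    return U[:, :k], s[:k], X
```

Observed entries are never modified; only the missing ones are re-estimated in each pass, so the principal components are not pulled towards a fixed fill-in value such as the site mean.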
Citation
Please cite our paper in Bioinformatics: https://doi.org/10.1093/bioinformatics/btab027
Installation
# Build and install via PyPI
pip install emu-popgen
# Download source and install via pip
git clone https://github.com/Rosemeis/emu.git
cd emu
pip install .
# Download source and install in new Conda environment
git clone https://github.com/Rosemeis/emu.git
cd emu
conda env create -f environment.yml
conda activate emu
# You can now run the program with the `emu` command
Quick usage
Running EMU
Provide `emu` with the file prefix of the PLINK files.
# Check help message of the program
emu -h
# Model and extract 2 eigenvectors using the EM-PCA algorithm
emu --bfile test --eig 2 --threads 64 --out test.emu
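A binary PLINK fileset consists of three files sharing one prefix (`.bed`, `.bim`, `.fam`), so `--bfile test` refers to `test.bed`, `test.bim` and `test.fam`. The sketch below is an optional pre-flight check written for illustration; the helper names `missing_plink_files` and `plink_dimensions` are hypothetical and not part of EMU.

```python
from pathlib import Path

# The three files that make up a binary PLINK fileset.
PLINK_EXTENSIONS = (".bed", ".bim", ".fam")


def missing_plink_files(prefix):
    """Return the expected PLINK files that do not exist for this prefix."""
    return [prefix + ext for ext in PLINK_EXTENSIONS
            if not Path(prefix + ext).is_file()]


def plink_dimensions(prefix):
    """Return (n_samples, n_variants) by counting lines in .fam and .bim.

    The .fam file has one line per sample and the .bim file one line
    per variant.
    """
    with open(prefix + ".fam") as fam:
        n_samples = sum(1 for _ in fam)
    with open(prefix + ".bim") as bim:
        n_variants = sum(1 for _ in bim)
    return n_samples, n_variants
```

Calling `missing_plink_files("test")` before the run returns an empty list when all three files are present.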
Memory efficient implementation
A more memory-efficient implementation is also available. It is based on the randomized SVD algorithm, using custom matrix multiplications that can handle decomposed matrices. Only the factor matrices and the 2-bit genotype matrix are kept in memory.
# Example run using '--mem' argument
emu --mem --bfile test --eig 2 --threads 64 --out test.emu.mem
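The memory saving comes from the fact that a randomized SVD only needs products with the matrix, never the matrix itself. The NumPy sketch below illustrates that general technique; it is not EMU's code. The function name `randomized_svd`, its parameters, and the assumed decomposition `A = D + W @ H` (a stored part plus a low-rank part held as factors) are hypothetical choices for illustration.

```python
import numpy as np


def randomized_svd(matvec, rmatvec, n_cols, k, oversample=10, power_iters=4, seed=0):
    """Approximate top-k SVD of A using only products with A and A.T.

    matvec(B) must return A @ B and rmatvec(B) must return A.T @ B;
    A itself is never formed inside this function.
    """
    rng = np.random.default_rng(seed)
    # Random test matrix that samples the range of A.
    omega = rng.standard_normal((n_cols, k + oversample))
    Q, _ = np.linalg.qr(matvec(omega))
    # Power iterations sharpen the captured subspace.
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(rmatvec(Q))
        Q, _ = np.linalg.qr(matvec(Z))
    # Project A onto the small subspace: B = Q.T @ A.
    B = rmatvec(Q).T
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :k], s[:k], Vt[:k]


def decomposed_products(D, W, H):
    """Products with A = D + W @ H without building the dense sum."""
    def matvec(B):
        return D @ B + W @ (H @ B)

    def rmatvec(B):
        return D.T @ B + H.T @ (W.T @ B)

    return matvec, rmatvec
```

With `matvec, rmatvec = decomposed_products(D, W, H)`, calling `randomized_svd(matvec, rmatvec, D.shape[1], k)` only ever allocates matrices with `k + oversample` columns besides the stored parts.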
File details
Details for the file `emu_popgen-1.1.tar.gz`.
File metadata
- Download URL: emu_popgen-1.1.tar.gz
- Upload date:
- Size: 329.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1f7c1d611277347a1f019201edc5a1ee0e648567ec37f6060dcb0e73d6913208
MD5 | 8a2f120047655b193429b99eef6f5608
BLAKE2b-256 | cc58f20aa183f54c49ffc095159eab68080cfe68515211392b674de544c07778
File details
Details for the file `emu_popgen-1.1-cp311-cp311-macosx_11_0_arm64.whl`.
File metadata
- Download URL: emu_popgen-1.1-cp311-cp311-macosx_11_0_arm64.whl
- Upload date:
- Size: 183.3 kB
- Tags: CPython 3.11, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5308a484be96ec092f2fc665a29059eca7a59f3faca4b5142f7d7e456b177720
MD5 | a9671c2be1685d66cd396d6186c75f0a
BLAKE2b-256 | bb9f5435519901168fccec5d9b172e7dde18f735162e256005b4e5bb605859d8