Fast Python/Cython implementation of the PCAone Halko algorithm
Python implementation of Halko algorithm (PCAone)
This is an implementation of the PCAone Halko algorithm in Python/Cython for genetic data. It takes genotype data in binary PLINK format (*.bed, *.bim, *.fam) as input. For simplicity, missing genotypes are mean-imputed.
It is inspired by the lovely PCAone software!
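To make the two ideas in the description concrete, here is a minimal NumPy sketch of mean imputation followed by a generic Halko-style randomized SVD with power iterations. It is an illustration under stated assumptions, not the halkoSVD or PCAone implementation: the function names, the `-1` missing-value marker, the SNPs-by-samples layout, and the `oversample` parameter are all hypothetical, and the real program works on the packed PLINK data in batches rather than on a dense in-memory matrix.

```python
import numpy as np


def mean_impute_and_center(G, missing=-1):
    """Mean-impute missing calls per SNP, then center each SNP.

    G is an (n_snps, n_samples) array of 0/1/2 dosages. The `missing`
    marker is an illustrative convention, not halkoSVD's internal format.
    A SNP with no observed calls at all would yield NaN and is not handled.
    """
    G = G.astype(np.float64)
    mask = G == missing
    G[mask] = np.nan
    means = np.nanmean(G, axis=1, keepdims=True)  # per-SNP mean of observed calls
    G = np.where(mask, means, G)                  # fill missing entries with that mean
    return G - means                              # imputed entries become exactly 0


def halko_svd(A, k, power=8, oversample=10, seed=0):
    """Generic randomized truncated SVD with QR-stabilised power iterations."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(k + oversample, m, n)                 # sketch width
    Omega = rng.standard_normal((n, l))           # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)                # orthonormal basis for the sketch
    for _ in range(power):
        # Each pass multiplies by A.T and A; QR keeps the basis well conditioned.
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)
    B = Q.T @ A                                   # small (l, n) projected matrix
    Ub, S, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                                    # lift left vectors back to m dimensions
    return U[:, :k], S[:k], Vt[:k]
```

With the SNPs-by-samples layout assumed here, the rows of the returned `Vt` hold one coordinate per sample, so they play the role of the sample-level principal axes; more power iterations generally tighten the approximation when the singular values decay slowly.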
Install and build
# Install via PyPI
pip3 install halkoSVD
# Download and install in a new Conda environment
conda env create --file environment.yml
# Download and install from GitHub directly
git clone https://github.com/Rosemeis/halkoSVD.git
cd halkoSVD
pip3 install .
# You can now run the program with the `halkoSVD` command
Quick usage
Provide halkoSVD with the file prefix of the PLINK files.
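Since the program is given a prefix rather than three separate paths, it can help to check the fileset before starting a long run. The sketch below is a hypothetical helper (not part of halkoSVD) that verifies `<prefix>.bed`, `<prefix>.bim` and `<prefix>.fam` exist and are mutually consistent. It relies on the standard PLINK 1 layout: one line per sample in `.fam`, one line per variant in `.bim`, and a `.bed` file that starts with the three bytes `0x6c 0x1b 0x01` followed by 2-bit genotypes packed four samples per byte in variant-major order.

```python
from pathlib import Path

# PLINK 1 .bed header: two magic bytes plus 0x01 for variant-major mode.
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])


def count_lines(path):
    """Count non-empty lines in a text file."""
    with open(path, "rb") as fh:
        return sum(1 for line in fh if line.strip())


def inspect_plink_prefix(prefix):
    """Return (n_samples, n_snps) for a PLINK fileset, or raise if it looks broken.

    Hypothetical helper for illustration; halkoSVD performs its own input handling.
    """
    # Build paths by string concatenation so prefixes containing dots stay intact.
    bed, bim, fam = (Path(f"{prefix}.{ext}") for ext in ("bed", "bim", "fam"))
    missing = [str(p) for p in (bed, bim, fam) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing PLINK files: " + ", ".join(missing))

    n_samples = count_lines(fam)  # one line per sample
    n_snps = count_lines(bim)     # one line per variant

    with open(bed, "rb") as fh:
        if fh.read(3) != BED_MAGIC:
            raise ValueError(f"{bed} does not start with the PLINK 1 .bed header")

    # Each variant occupies ceil(n_samples / 4) bytes after the 3-byte header.
    expected_size = 3 + n_snps * ((n_samples + 3) // 4)
    if bed.stat().st_size != expected_size:
        raise ValueError(
            f"{bed} has {bed.stat().st_size} bytes, expected {expected_size}"
        )
    return n_samples, n_snps
```

Calling `inspect_plink_prefix("input")` would correspond to the `--bfile input` argument used in the commands below.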
# Check help message of the program
halkoSVD -h
# Extract top 10 PCs with a mini-batch size of 8192 SNPs
halkoSVD --bfile input --threads 32 --pca 10 --batch 8192 --out halko
# Increase power iterations to 16
halkoSVD --bfile input --threads 32 --pca 10 --power 16 --out halko
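If you want to launch these runs from a Python pipeline, one option is a thin `subprocess` wrapper around the command line. The sketch below is hypothetical and uses only the flags that appear in the examples above (`--bfile`, `--threads`, `--pca`, `--batch`, `--power`, `--out`); the function names and the wrapper's own default values are assumptions made for illustration, not the program's defaults. Check `halkoSVD -h` for the authoritative option list.

```python
import shutil
import subprocess


def build_halkosvd_command(bfile, out, pca=10, threads=1, batch=None, power=None):
    """Assemble an argument list mirroring the command-line examples.

    The defaults here belong to this wrapper only; `batch` and `power`
    are omitted from the command unless given, so the program's own
    defaults apply to them.
    """
    cmd = [
        "halkoSVD",
        "--bfile", str(bfile),
        "--threads", str(threads),
        "--pca", str(pca),
    ]
    if batch is not None:
        cmd += ["--batch", str(batch)]
    if power is not None:
        cmd += ["--power", str(power)]
    cmd += ["--out", str(out)]
    return cmd


def run_halkosvd(**kwargs):
    """Run halkoSVD and raise if it is not installed or exits with an error."""
    if shutil.which("halkoSVD") is None:
        raise RuntimeError("halkoSVD not found on PATH; install it first")
    # check=True turns a non-zero exit status into CalledProcessError.
    return subprocess.run(build_halkosvd_command(**kwargs), check=True)
```

For example, `run_halkosvd(bfile="input", out="halko", pca=10, threads=32, power=16)` would issue the second command shown above. Passing the arguments as a list avoids shell quoting issues with unusual file prefixes.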