Fast Python/Cython implementation of the PCAone Halko algorithm

Python implementation of Halko algorithm (PCAone)

This is a fast implementation of the PCAone (H+Y) Halko algorithm in Python/Cython for genetic data. It takes binary PLINK format (*.bed, *.bim, *.fam) as input. For simplicity, mean imputation is performed for missing data.

It is inspired by the lovely PCAone software, which is well worth a look!
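The mean imputation mentioned above can be illustrated with a small NumPy sketch. This is not halkoSVD's own code: the function name `mean_impute`, the SNPs-by-individuals orientation, and the use of `-9` as the missing-call marker are all assumptions made for illustration only.

```python
import numpy as np

def mean_impute(G, missing=-9):
    """Replace missing genotype calls with the per-SNP mean of observed calls.

    Hypothetical helper: `missing=-9` is an assumed marker, not halkoSVD's encoding.
    """
    G = np.asarray(G, dtype=float)
    mask = G == missing                      # True where a call is missing
    observed = np.where(mask, 0.0, G)        # zero out missing entries for the sum
    counts = (~mask).sum(axis=1)             # observed calls per SNP
    # Guard against division by zero for SNPs with no observed calls
    means = observed.sum(axis=1) / np.maximum(counts, 1)
    return np.where(mask, means[:, None], G)

# Toy matrix: rows are SNPs, columns are individuals (assumed layout)
G = np.array([[0, 2, -9, 2],
              [1, -9, -9, 1]])
G_imputed = mean_impute(G)
```

After centering each SNP by its mean, an imputed entry becomes zero, so it contributes nothing to the covariance structure that PCA decomposes.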

Installation

# Build and install via PyPI
pip install halkoSVD

# Download source and install via pip
git clone https://github.com/Rosemeis/halkoSVD.git
cd halkoSVD
pip install .

# Download source and install in new Conda environment
git clone https://github.com/Rosemeis/halkoSVD.git
cd halkoSVD
conda env create -f environment.yml
conda activate halkoSVD


# You can now run the program with the `halkoSVD` command

Quick usage

Provide halkoSVD with the file prefix of the PLINK files.

# Check help message of the program
halkoSVD -h

# Extract the top 10 PCs
halkoSVD --bfile input --threads 32 --pca 10 --out halko

Options

  • --power, specify the number of power iterations (default: 12)
  • --extra, number of extra vectors for oversampling (default: 16)
  • --batch, specify the batch size used to process SNPs (default: 4096)
  • --full, load the entire genotype matrix into memory
  • --loadings, save the SNP loadings
  • --raw, only output eigenvectors without FID/IID
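To make the roles of the power iterations and the oversampling vectors more concrete, here is a minimal NumPy sketch of a generic randomized SVD in the style of Halko et al. It is an assumption-laden illustration, not halkoSVD's implementation: it does not reproduce the PCAone (H+Y) variant, it does not process SNPs in batches, and the function name `randomized_svd` and its parameters are hypothetical. The defaults for `extra` and `power` simply mirror the values listed in the options above.

```python
import numpy as np

def randomized_svd(A, k, extra=16, power=12, seed=0):
    """Approximate the top-k SVD of A with a randomized range finder.

    Hypothetical sketch of the generic scheme, not halkoSVD's code.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(k + extra, m, n)                 # k target vectors plus oversampling
    Omega = rng.standard_normal((n, l))      # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)           # initial orthonormal range estimate
    for _ in range(power):
        # Each power iteration sharpens the estimate of the leading subspace;
        # re-orthonormalizing with QR keeps the computation numerically stable
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)
    B = Q.T @ A                              # small (l x n) projected matrix
    Ub, S, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                               # lift left vectors back to m dimensions
    return U[:, :k], S[:k], Vt[:k]

# Toy example: an exactly rank-5 matrix
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, S, Vt = randomized_svd(A, k=5)
S_exact = np.linalg.svd(A, compute_uv=False)[:5]
```

In this sketch, more power iterations generally improve accuracy when singular values decay slowly, while the extra vectors give the random projection some slack to capture the leading subspace.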
