
Fast Python/Cython implementation of the PCAone Halko algorithm


Cython/Python implementation of the Halko algorithm

This is a fast Python/Cython implementation of the Halko algorithm (randomized SVD) for genotype data. It takes binary PLINK files (*.bed, *.bim, *.fam) as input. For simplicity, missing genotypes are mean-imputed.
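
To illustrate what mean imputation does, here is a minimal NumPy sketch. It is not the package's Cython code. It assumes the genotypes are already decoded into a SNP-by-sample array of 0/1/2 values, and the `-1` missing marker and the `mean_impute` name are hypothetical choices made for this example.

```python
import numpy as np

def mean_impute(G, missing=-1):
    """Replace missing calls in each SNP (row) with that SNP's observed mean."""
    G = np.asarray(G, dtype=np.float64)
    mask = G == missing                      # True where a genotype is missing
    observed = np.where(mask, 0.0, G)        # zero out missing entries before summing
    counts = (~mask).sum(axis=1)             # number of observed calls per SNP
    # Guard against SNPs with no observed calls (their mean is set to 0).
    means = observed.sum(axis=1) / np.maximum(counts, 1)
    return np.where(mask, means[:, None], G)

# Two SNPs, four samples; -1 marks a missing call (assumed encoding).
G = np.array([[0, 1, -1, 2],
              [2, -1, 2, 2]])
X = mean_impute(G)
print(X)
```

After imputation a missing entry contributes nothing once the SNP is centered by its mean, which is why this is a common simple choice for PCA on genotypes.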

It is inspired by the lovely PCAone software!

Installation

# Option 1: Build and install via PyPI
pip install halkoSVD

# Option 2: Download source and install via pip
git clone https://github.com/Rosemeis/halkoSVD.git
cd halkoSVD
pip install .

# Option 3: Download source and install in a new Conda environment
git clone https://github.com/Rosemeis/halkoSVD.git
conda env create -f halkoSVD/environment.yml
conda activate halkoSVD

You can now run the program with the halkoSVD command.

Quick usage

Provide halkoSVD with the file prefix of the PLINK files.

# Check help message of the program
halkoSVD -h

# Extract the top 10 PCs
halkoSVD --bfile input --threads 32 --pca 10 --out halko
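
If you drive the program from a script, a small wrapper can check that all three PLINK files exist for the prefix before launching. The sketch below only uses the flags shown above; the `build_command` helper and its parameter names are hypothetical and not part of halkoSVD.

```python
from pathlib import Path

def build_command(prefix, n_pcs=10, threads=1, out="halko"):
    """Check the PLINK trio exists for `prefix`, then build the argument list."""
    prefix = str(prefix)
    # Append the extension as a string so prefixes containing dots stay intact.
    missing = [ext for ext in (".bed", ".bim", ".fam")
               if not Path(prefix + ext).is_file()]
    if missing:
        raise FileNotFoundError(
            f"missing PLINK file(s) for prefix {prefix!r}: {', '.join(missing)}"
        )
    return ["halkoSVD", "--bfile", prefix,
            "--threads", str(threads),
            "--pca", str(n_pcs),
            "--out", out]

# Usage (assumes halkoSVD is installed and on PATH):
#   import subprocess
#   subprocess.run(build_command("input", n_pcs=10, threads=32), check=True)
```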

Options

  • --seed, set random seed for reproducibility (default: 42)
  • --power, specify the number of power iterations (default: 10)
  • --batch, specify the batch size to process SNPs (default: 8192)
  • --loadings, save the SNP loadings
  • --raw, only output eigenvectors without FID/IID
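
To give a feel for what the seed, power-iteration and batch settings control, here is a textbook randomized SVD with power iterations, written in plain NumPy. It is a sketch under assumptions, not halkoSVD's implementation: the function name, the `oversample` parameter and the way row blocks are sliced are all hypothetical, and the defaults merely mirror the values listed above.

```python
import numpy as np

def randomized_svd(X, k, power=10, batch=8192, oversample=10, seed=42):
    """Approximate the top-k SVD of X (SNPs x samples) via random projection."""
    rng = np.random.default_rng(seed)          # fixed seed gives reproducible output
    m, n = X.shape
    l = min(k + oversample, m, n)              # width of the random sketch
    omega = rng.standard_normal((n, l))        # Gaussian test matrix

    def forward(M):
        # X @ M, computed one block of SNP rows at a time.
        return np.vstack([X[i:i + batch] @ M for i in range(0, m, batch)])

    def backward(M):
        # X.T @ M, accumulated over the same row blocks.
        out = np.zeros((n, M.shape[1]))
        for i in range(0, m, batch):
            out += X[i:i + batch].T @ M[i:i + batch]
        return out

    Q, _ = np.linalg.qr(forward(omega))
    for _ in range(power):                     # power iterations sharpen the subspace
        Z, _ = np.linalg.qr(backward(Q))
        Q, _ = np.linalg.qr(forward(Z))

    B = backward(Q).T                          # small (l x n) matrix Q.T @ X
    Ub, S, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], S[:k], Vt[:k]

# Small synthetic example: a rank-5 matrix with 300 "SNPs" and 40 "samples".
demo_rng = np.random.default_rng(0)
X_demo = demo_rng.standard_normal((300, 5)) @ demo_rng.standard_normal((5, 40))
U, S, Vt = randomized_svd(X_demo, k=5, batch=64)
```

More power iterations generally improve accuracy when singular values decay slowly, at the cost of extra passes over the data. A smaller batch lowers peak memory per block but adds loop overhead.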


