Fast Python/Cython implementation of the PCAone Halko algorithm

Cython/Python implementation of Halko algorithm

This is a fast Python/Cython implementation of the Halko algorithm for genotype data. It takes binary PLINK files (*.bed, *.bim, *.fam) as input. For simplicity, missing genotypes are mean-imputed.
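To illustrate what mean imputation does, here is a minimal NumPy sketch. It is not the package's Cython code: the function name `mean_impute`, the SNPs-as-rows layout, and the use of NaN to mark missing genotypes are all assumptions made for this example.

```python
import numpy as np

def mean_impute(G):
    """Replace missing genotypes (NaN) with the mean of the observed
    genotypes for the same SNP. Hypothetical helper; rows are SNPs."""
    G = np.asarray(G, dtype=float).copy()
    # Per-SNP mean over observed entries only (a SNP with no observed
    # genotypes would stay NaN and trigger a NumPy warning).
    snp_means = np.nanmean(G, axis=1, keepdims=True)
    missing = np.isnan(G)
    G[missing] = np.broadcast_to(snp_means, G.shape)[missing]
    return G

# Two SNPs, four individuals, one missing genotype per SNP.
G = np.array([[0.0, 1.0, 2.0, np.nan],
              [2.0, np.nan, 2.0, 2.0]])
G_imputed = mean_impute(G)
```

Each missing entry takes the average of the observed genotypes at that SNP, so the imputed value contributes nothing once the SNP is mean-centered.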

It is inspired by the lovely PCAone software!

Installation

# Option 1: Build and install via PyPI
pip install halkoSVD

# Option 2: Download source and install via pip
git clone https://github.com/Rosemeis/halkoSVD.git
cd halkoSVD
pip install .

# Option 3: Download source and install in a new Conda environment
git clone https://github.com/Rosemeis/halkoSVD.git
conda env create -f halkoSVD/environment.yml
conda activate halkoSVD

You can now run the program with the halkoSVD command.

Quick usage

Provide halkoSVD with the file prefix of the PLINK files.

# Check help message of the program
halkoSVD -h

# Extract the top 10 PCs
halkoSVD --bfile input --threads 32 --pca 10 --out halko
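Before launching a run, it can help to confirm that the prefix really points at a complete PLINK file set. The sketch below is a standalone check, not part of halkoSVD; `check_plink_prefix` is a hypothetical helper. It relies only on the three file extensions listed above and on the standard three-byte header of a variant-major PLINK 1 .bed file.

```python
import tempfile
from pathlib import Path

# Header of a PLINK 1 .bed file stored in variant-major mode.
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

def check_plink_prefix(prefix):
    """Return a list of problems found for a PLINK prefix (empty if none)."""
    problems = []
    for ext in (".bed", ".bim", ".fam"):
        if not Path(prefix + ext).is_file():
            problems.append(f"missing {prefix}{ext}")
    bed = Path(prefix + ".bed")
    if bed.is_file():
        with open(bed, "rb") as fh:
            if fh.read(3) != BED_MAGIC:
                problems.append(f"{bed} lacks the variant-major .bed header")
    return problems

# Demo on a throwaway directory: the .fam file is deliberately left out.
with tempfile.TemporaryDirectory() as tmp:
    prefix = str(Path(tmp) / "input")
    Path(prefix + ".bed").write_bytes(BED_MAGIC + b"\x00")
    Path(prefix + ".bim").write_text("1\trs1\t0\t1000\tA\tG\n")
    problems = check_plink_prefix(prefix)
```

An empty list means all three files exist and the .bed header looks right; it does not validate the contents any further.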

Options

  • --seed, set the random seed for reproducibility (default: 42)
  • --power, specify the number of power iterations (default: 10)
  • --batch, specify the batch size for processing SNPs (default: 8192)
  • --loadings, save the SNP loadings
  • --raw, only output eigenvectors without FID/IID
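To show roughly what the seed and the power iterations control, here is a generic NumPy sketch of randomized SVD in the style of Halko et al. It is an in-memory illustration only and is not the code halkoSVD runs: the function name, the `oversample` parameter, and the dense-matrix input are assumptions, and batching over SNPs is left out.

```python
import numpy as np

def randomized_svd(A, k, power=10, oversample=10, seed=42):
    """Approximate the top-k SVD of A with a random range finder
    followed by QR-stabilized power iterations (illustrative sketch)."""
    rng = np.random.default_rng(seed)        # fixed seed gives reproducible output
    m, n = A.shape
    l = min(k + oversample, m, n)            # sketch width
    omega = rng.standard_normal((n, l))      # random test matrix
    Q, _ = np.linalg.qr(A @ omega)           # initial basis for the range of A
    for _ in range(power):
        # Each pass sharpens the basis toward the leading singular vectors.
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)
    B = Q.T @ A                              # small l x n projected matrix
    U_small, S, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :k], S[:k], Vt[:k]

# Demo on an exactly rank-5 matrix, compared against a full SVD.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
U, S, Vt = randomized_svd(A, k=5)
S_exact = np.linalg.svd(A, compute_uv=False)[:5]
```

More power iterations cost extra passes over the data but improve accuracy when the singular values decay slowly. The same seed gives the same random test matrix, so repeated calls on the same input return the same result.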
