Dask interface for memory-efficient reading of PLINK genotype files
dask-pgen
Lazily load PLINK genotype data (.pgen or .bed files) as dask arrays.
Installation
pip install -U cython numpy
git clone https://github.com/KangchengHou/dask-pgen.git
cd dask-pgen; pip install -e .
Or install directly from GitHub:
pip install git+https://github.com/KangchengHou/dask-pgen.git
Example
# If dapgen is not found on your PATH, set its path yourself:
# chmod +x dask-pgen/bin/dapgen
# dapgen=dask-pgen/bin/dapgen
# and replace "dapgen" with "$dapgen" in the command below.
dapgen score \
--plink <plink> \
--weights <weights_path> \
--weight-col-prefix <weight_column_prefix> \
--chrom-col CHR --pos-col POS --alt-col A1 --ref-col A2 \
--out <out_path> \
--threads 4 \
--memory 20000 \
--center True  # whether to center the genotypes; default is False
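The commented lines at the top of the example describe a fallback for when dapgen is not on your PATH. A minimal shell sketch of that fallback follows. It assumes the repository was cloned into ./dask-pgen, as in the installation steps. The helper name resolve_cmd is hypothetical and is not part of dask-pgen.

```shell
# resolve_cmd NAME FALLBACK: print NAME if it is found on PATH, else FALLBACK.
# (hypothetical helper for illustration; not part of dask-pgen)
resolve_cmd() {
    name=$1
    fallback=$2
    if command -v "$name" >/dev/null 2>&1; then
        printf '%s\n' "$name"
    else
        # Make the repository copy executable if it exists,
        # mirroring the chmod step in the example above.
        if [ -f "$fallback" ]; then
            chmod +x "$fallback"
        fi
        printf '%s\n' "$fallback"
    fi
}

# Assumption: the repository was cloned into ./dask-pgen.
dapgen=$(resolve_cmd dapgen dask-pgen/bin/dapgen)
```

After this, run "$dapgen" score with the same options as in the example above.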
Source distribution: dask-pgen-0.1.1.tar.gz (10.8 kB)