Read PLINK files into Pandas data frames
Pandas-plink is a Python package for reading files in the PLINK binary format. Files are read lazily, which saves memory by loading only the genotypes that the user actually accesses.
We recommend installing it via conda:
conda install -c conda-forge pandas-plink
The above method is preferable because it does not require build tools, which makes the installation less prone to errors.
Alternatively, pandas-plink can also be installed using pip:
pip install pandas-plink
The above method performs some compilation. The installation is very likely to succeed, as we test every release on Windows, Linux, and macOS. If by any chance it fails, please consider submitting a new issue.
Usage is as simple as
from pandas_plink import read_plink
(bim, fam, G) = read_plink('/path/to/files_prefix')
where files_prefix.bed, files_prefix.bim, and files_prefix.fam contain the data. Portions of the genotype are read only as the user accesses them. Please refer to the documentation for more information.
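As a rough illustration of what accessing a portion of the genotype can look like, the sketch below selects the rows of G that belong to one chromosome. It rests on several assumptions that are not stated above: that bim is a data frame with a chrom column and an i column giving each variant's row index into G, that G is laid out as variants by samples, and that a lazy G exposes a compute() method (as a dask array does). The helper genotypes_for_chromosome and the stand-in data are hypothetical and only there so the example runs without PLINK files.

```python
import numpy as np
import pandas as pd


def genotypes_for_chromosome(bim, G, chrom):
    """Return the genotype rows of the variants on one chromosome.

    Assumes `bim` has a `chrom` column and an `i` column holding each
    variant's row index into `G` (an assumption about the layout).
    """
    rows = bim.loc[bim["chrom"] == chrom, "i"].to_numpy()
    block = G[rows, :]
    # A lazy (dask-like) array reads data only when compute() is called;
    # a plain NumPy array is already in memory.
    return block.compute() if hasattr(block, "compute") else np.asarray(block)


# Stand-in data so the sketch runs without PLINK files. With real data these
# would come from: (bim, fam, G) = read_plink('/path/to/files_prefix')
bim = pd.DataFrame(
    {"chrom": ["1", "1", "2"], "snp": ["rs1", "rs2", "rs3"], "i": [0, 1, 2]}
)
G = np.array([[0.0, 1.0], [2.0, np.nan], [1.0, 1.0]])  # variants x samples

subset = genotypes_for_chromosome(bim, G, "1")
print(subset.shape)
```

With a lazily loaded G, only the selected rows would need to be read from the .bed file at the point where compute() is called, which is the memory saving described above.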
This project is licensed under the MIT License.