
Project (ancient) human genomes onto pre-computed standard PCA


projectPCA

Project genomes onto pre-computed principal components that are widely used in ancient DNA research. This enables fast analysis without re-computing the principal components. The software accepts ancient DNA data in eigenstrat or PLINK format as input. No modern samples are required, because the package includes the pre-computed PCA weights and the PC coordinates of the relevant modern samples (based on publicly available Human Origins array data).
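To make the idea of projecting onto fixed components concrete, here is a minimal sketch in plain Python. It is not the package's implementation: the function name project_sample, the toy means, and the toy weights are all made up for illustration, and missing genotypes are simply skipped, which is only one of several possible conventions.

```python
def project_sample(genotypes, snp_means, weights):
    """Project one sample onto fixed principal components.

    genotypes: per-SNP allele counts, with None where the call is missing
    snp_means: per-SNP mean allele counts taken from the reference panel
    weights:   per-SNP list of PC weights (one value per component)
    """
    n_pcs = len(weights[0])
    coords = [0.0] * n_pcs
    for g, mean, w in zip(genotypes, snp_means, weights):
        if g is None:
            continue  # a missing SNP contributes nothing to the projection
        centered = g - mean
        for k in range(n_pcs):
            coords[k] += centered * w[k]
    return coords


# Toy reference values (hypothetical, not taken from the package)
snp_means = [1.0, 0.5, 1.5, 1.0]
weights = [[0.5, 0.5], [0.5, -0.5], [-0.5, 0.5], [-0.5, -0.5]]

# One sample with a missing call at the second SNP
sample = [2, None, 1, 0]
coords = project_sample(sample, snp_means, weights)
print(coords)
```

Because the means and weights are fixed, only the new sample has to be read. Note that skipping missing SNPs pulls low-coverage samples towards the origin, so a real implementation may correct for this differently.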

Installation

The package projectPCA is available from PyPI and can be installed with pip. To install it, run:

python3 -m pip install projectPCA
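A quick way to confirm that the installation worked is to query the installed version from Python. This sketch uses only the standard library; the helper name installed_version is hypothetical and not part of projectPCA.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if projectPCA is installed, otherwise None
print(installed_version("projectPCA"))
```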

List of available PCAs

As of early 2026, two pre-computed PCAs are officially bundled with projectPCA. The code in parentheses is the value to pass as the pca keyword to select that PCA.

  • HO Westeurasia (HO): Standard West Eurasian PCA, widely used in aDNA studies. PC1 corresponds to a West-East axis, and PC2 to a North-South axis.

  • HO Eurasian (EUAS): Standard whole-Eurasian PCA, widely used in aDNA studies. Well suited to resolving West versus East Asian ancestry (on PC1). PC2 generally corresponds to a North-South axis.

Usage

Project single samples

To project onto a PCA, the key function is project_eigenstrat. To import it and project a single sample, use:

from projectPCA.run import project_eigenstrat

project_eigenstrat(es_path="/mnt/archgen/Autorun_eager/eager_outputs/TF/SUA/SUA002/genotyping/pileupcaller.double",
                   pca="HO", es_type="default")

Besides plotting, this function returns a dataframe with the PCA coordinates. Note that es_path is the common prefix of the eigenstrat files, that is, the path up to but not including the .geno suffix.
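The prefix convention can be checked before calling the function. The sketch below assumes the usual eigenstrat trio of .geno, .snp and .ind files sharing one prefix; the helper names and the example prefix are hypothetical and not part of projectPCA.

```python
from pathlib import Path

EIGENSTRAT_SUFFIXES = (".geno", ".snp", ".ind")


def eigenstrat_files(prefix):
    """Return the three file paths implied by an eigenstrat prefix."""
    # Append the suffix as a string: Path.with_suffix would replace the last
    # dotted part of a prefix such as "pileupcaller.double".
    return [Path(str(prefix) + suffix) for suffix in EIGENSTRAT_SUFFIXES]


def missing_eigenstrat_files(prefix):
    """Return the expected files that do not exist on disk."""
    return [path for path in eigenstrat_files(prefix) if not path.is_file()]


# Hypothetical prefix; replace it with the path to your own data
missing = missing_eigenstrat_files("path/to/pileupcaller.double")
if missing:
    print("Missing files:", [str(path) for path in missing])
```

An empty result means all three files are present and the prefix can be passed as es_path.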

The keyword pca denotes which PCA type to project onto (see above).

To save the figure, pass the keyword fig_path. If it is set to a non-empty string, the program saves the resulting figure at that path. If the path ends in .html, the figure is saved as an interactive plot in which you can hover over individuals to see their labels (both ancient and modern reference samples). Otherwise, the figure is plotted and saved with matplotlib, in the format given by the file extension (for example .png or .pdf).

project_eigenstrat(es_path="/mnt/archgen/Autorun_eager/eager_outputs/TF/SUA/SUA002/genotyping/pileupcaller.double",
                   pca="EUAS", es_type="unpacked_fast", plot_bgrd_c=False, fig_path='./figs/SUA002_EUAS.html')
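The rule for choosing the output type from fig_path, as described above, can be summarised in a few lines. This is only an illustration of the documented behaviour; the helper figure_kind is hypothetical and does not exist in projectPCA.

```python
def figure_kind(fig_path):
    """Classify a fig_path value according to the rule described above."""
    if not fig_path:
        return None  # empty string: no figure is saved
    if fig_path.endswith(".html"):
        return "interactive"  # hoverable plot written as HTML
    return "static"  # matplotlib output, format taken from the extension


for path in ("", "./figs/SUA002_EUAS.html", "./figs/SUA002_EUAS.pdf"):
    print(repr(path), "->", figure_kind(path))
```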

Project multiple samples

It is also possible to project multiple samples from one file. For this, use the keyword iids, which takes a list of individual IDs. If the list is empty (the default), all samples in the file are projected and plotted. If you specify a list of IDs, only the individuals with these IDs are projected.
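To find out which IDs a file contains, you can read the eigenstrat .ind file, whose first whitespace-separated column holds the individual ID. The sketch below also mimics the selection rule described above (empty list means all samples). Both helper names are hypothetical, and the package may match IDs differently.

```python
def read_ind_ids(ind_path):
    """Return the individual IDs (first column) of an eigenstrat .ind file."""
    ids = []
    with open(ind_path) as handle:
        for line in handle:
            fields = line.split()
            if fields:  # skip blank lines
                ids.append(fields[0])
    return ids


def select_ids(all_ids, iids):
    """Keep every ID if iids is empty, otherwise only the requested ones."""
    if not iids:
        return list(all_ids)
    wanted = set(iids)
    return [iid for iid in all_ids if iid in wanted]
```

The list returned by read_ind_ids can be filtered by hand and then passed as iids to project_eigenstrat.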

Project PLINK files

To project PLINK files, use the keyword es_type="plink" and provide the common path prefix of the PLINK files, without the suffix:

project_eigenstrat(es_path="/mnt/archgen/users/hringbauer/git/EPIDEMIC/output/plink/bd_ptn_335",
                   pca="EUAS", es_type="plink", iids=[],
                   plot_bgrd_c=False, verbose=True, flip=True, 
                   fig_path='/mnt/archgen/users/hringbauer/git/projectPCA/figs/ptn335PLINK_EUAS.html')
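Before projecting, it can help to verify that the prefix points at a binary PLINK file. The sketch below assumes the binary .bed format in SNP-major mode, which starts with the three bytes 0x6c 0x1b 0x01; whether projectPCA requires this exact variant is an assumption, and the helper name is hypothetical.

```python
PLINK_BED_MAGIC = bytes([0x6C, 0x1B, 0x01])  # magic number plus SNP-major flag


def looks_like_plink_bed(prefix):
    """Return True if prefix + '.bed' exists and starts with the PLINK magic bytes."""
    bed_path = str(prefix) + ".bed"
    try:
        with open(bed_path, "rb") as handle:
            return handle.read(3) == PLINK_BED_MAGIC
    except FileNotFoundError:
        return False


# Hypothetical prefix; replace it with the path to your own data
print(looks_like_plink_bed("path/to/plink_prefix"))
```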

@Harald Ringbauer, 2026
