Re-implementation of lostruct in Python

lostruct-py

This is a Python reimplementation of lostruct, based on the original code: Lostruct. Please cite the original paper.

Demonstration / How to use

Please see the Example Notebook

Citing

Original Lostruct Paper

Please cite the original lostruct paper:

Li, Han, and Peter Ralph. "Local PCA shows how the effect of population structure differs along the genome." Genetics 211.1 (2019): 289-304.

CyVCF2

This package also uses cyvcf2 for fast VCF processing, which should be cited as well:

Brent S Pedersen, Aaron R Quinlan, cyvcf2: fast, flexible variant analysis with Python, Bioinformatics, Volume 33, Issue 12, 15 June 2017, Pages 1867–1869, https://doi.org/10.1093/bioinformatics/btx057

Requirements

Python >= 3.6 (may work with older versions)

  • numba
  • numpy
  • pandas
  • scipy
  • scikit-bio (imported as skbio)
  • scikit-learn (imported as sklearn)
  • cyvcf2

CyVCF2 requires zlib-dev, libbz2-dev, libcurl-dev, liblzma-dev, and possibly other system libraries.

The easiest way to install all of these is through conda.
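
As a quick way to confirm that an environment has everything listed above, the following sketch checks whether each module can be found by Python. The import names are taken from the requirements list; the helper name check_modules is hypothetical and is not part of lostruct-py.

```python
from importlib.util import find_spec

# Import names taken from the requirements list above.
REQUIRED_MODULES = ["numba", "numpy", "pandas", "scipy", "skbio", "sklearn", "cyvcf2"]


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it is importable."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name:10s} {'ok' if found else 'MISSING'}")
```

Any module reported as MISSING still needs to be installed before the example notebook can be run.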

Correlation Data

The comparison uses chromosome 1 of the Medicago HapMap sister taxa, processed as shown below and then run through the original lostruct.

Data

bcftools annotate chr1-filtered-set-2014Apr15.bcf -x INFO,FORMAT | bcftools view -a -i 'F_MISSING<=0.2' | bcftools view -q 0.05 -q 0.95 -m2 -M2 -a -Oz -o chr1-filtered.vcf.gz
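
To illustrate what this kind of site filter does, here is a minimal NumPy sketch. It assumes the intended result is sites with at most 20% missing genotypes and an alternate-allele frequency between 0.05 and 0.95. The genotype encoding (alternate-allele counts 0, 1, 2, with -1 for a missing call) and the function name filter_variants are assumptions made for this example. It does not replace the bcftools command and does not check that sites are biallelic.

```python
import numpy as np


def filter_variants(genotypes, max_missing=0.2, min_af=0.05, max_af=0.95):
    """Return a boolean mask of variants (rows) that pass both filters.

    genotypes: (n_variants, n_samples) integer array of alternate-allele
    counts (0, 1, 2), with -1 marking a missing call. This encoding is an
    assumption made for this sketch.
    """
    genotypes = np.asarray(genotypes)
    missing = genotypes < 0
    frac_missing = missing.mean(axis=1)

    # Alternate-allele frequency among the called diploid genotypes.
    called = (~missing).sum(axis=1)
    alt_count = np.where(missing, 0, genotypes).sum(axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        alt_af = alt_count / (2 * called)

    return (frac_missing <= max_missing) & (alt_af >= min_af) & (alt_af <= max_af)


example = np.array([
    [0, 1, 2, 1, 0],    # no missing calls, intermediate frequency
    [0, 0, 0, 0, 0],    # alternate allele never observed
    [-1, -1, 1, 1, 0],  # 40% of calls missing
    [2, 2, 2, 2, 2],    # alternate allele fixed
])
mask = filter_variants(example)
```

Only rows where the mask is True would be kept for the downstream analysis.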

Lostruct Processing

Rscript run_lostruct.R -t SNP -s 95 -k 10 -m 10 -i data/

This generates the mds_coords.tsv that is used in the correlation comparison.

FAQ / Notes

PCA, MDS, PCoA

PCoA returns essentially the same results as lostruct's MDS implementation (cmdscale). In the example Jupyter notebook, the correlation between the two is R ≈ 0.998. The notebook also includes some examples of other methods of clustering and of looking at differences.
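
R's cmdscale performs classical (Torgerson) MDS, which is the same computation as PCoA: double-centre the squared distances and take the leading eigenvectors. The sketch below shows that computation with NumPy only. It is an illustration under that assumption, not the code used by lostruct or by this package, and the function name classical_mds is hypothetical.

```python
import numpy as np


def classical_mds(dist, k=2):
    """Classical MDS / PCoA of a symmetric distance matrix.

    Returns an (n, k) array of coordinates, one row per object.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances: B = -1/2 * J D^2 J
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (dist ** 2) @ centering
    # eigh is meant for symmetric matrices and returns real eigenvalues
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:k]
    # Clip tiny negative eigenvalues caused by floating-point error
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Random 2-D points: a 2-D embedding should reproduce their distances.
rng = np.random.default_rng(0)
points = rng.normal(size=(20, 2))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(dist, k=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

The sign and orientation of each axis are arbitrary, so two implementations are better compared through the pairwise distances they reproduce, or through the absolute correlation of their coordinates, than through the raw coordinate values.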

Casting complex values to real discards the imaginary part

This warning is fine and can be ignored.

Source Distributions

No source distribution files are available for this release.

Built Distribution

lostruct_py-0.0.1-py3-none-any.whl (7.1 kB)

Uploaded Python 3

File details

Details for the file lostruct_py-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: lostruct_py-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 7.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.3

File hashes

Hashes for lostruct_py-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 8d43f5ddcb7b19b97e7f4d211431be62a50d116cecac6f8535102a6bbff69adf
MD5 9ce4494bc9dc183b855d2615d6dc87dc
BLAKE2b-256 cbe921448e74b538a976c780f4faeaaa3d1e6761309d769a43e5f62332f018f2
