Utilities for working with PGS Catalog API and scoring files

PGS Catalog utilities

This repository is a collection of tools for working with data from the PGS Catalog. They are mostly used internally by the PGS Catalog Calculator, but other users might find some of them helpful.

Overview

  • download_scorefiles: Download scoring files by PGS ID (accession) in genome builds GRCh37 or GRCh38
  • combine_scorefiles: Combine multiple scoring files into a single scoring file in 'long' format
  • match_variants: Match target variants (bim or pvar files) against the output of combine_scorefiles to produce scoring files for plink 2
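
To illustrate what a 'long' format means for the combining step, here is a minimal Python sketch. It is not the package's implementation: the function names, the column names, and the toy values are all made up for illustration, and the real output columns may differ. The idea shown is only that rows from several scoring files are stacked on top of each other, with an extra column recording which score each row came from.

```python
import csv
import io

# Hypothetical column names, chosen for illustration only; the real
# output of the combining tool may use a different schema.
FIELDS = ["accession", "variant_id", "effect_allele", "effect_weight"]


def combine_long(scorefiles):
    """Stack per-score rows into one list, tagging each row with its accession.

    `scorefiles` maps an accession (e.g. "PGS000922") to a list of row dicts.
    """
    combined = []
    for accession, rows in scorefiles.items():
        for row in rows:
            combined.append({"accession": accession, **row})
    return combined


def to_tsv(rows):
    """Serialise combined rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up toy rows, not real PGS Catalog weights.
example = {
    "PGS000922": [
        {"variant_id": "1:1000:A:G", "effect_allele": "A", "effect_weight": 0.12},
    ],
    "PGS001229": [
        {"variant_id": "1:1000:A:G", "effect_allele": "A", "effect_weight": -0.03},
        {"variant_id": "2:2000:C:T", "effect_allele": "T", "effect_weight": 0.40},
    ],
}
long_rows = combine_long(example)
print(to_tsv(long_rows))
```

A variant shared by two scores appears once per score in the long layout, which keeps each weight tied to its accession.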

Installation

$ pip install pgscatalog-utils

Quickstart

$ download_scorefiles -i PGS000922 PGS001229 -o . -b GRCh37
$ combine_scorefiles -s PGS*.txt.gz -o combined.txt 
$ match_variants -s combined.txt -t <example.pvar> --min_overlap 0.75 --outdir .

More details are available using the --help parameter.
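
The three quickstart steps could also be assembled from a script. The sketch below is an assumption-laden illustration, not part of the package: build_quickstart_commands is a hypothetical helper, it uses only the flags shown in the quickstart above, and the scoring file names passed in are placeholders (the quickstart matches them with the PGS*.txt.gz glob). It only builds the argument lists and prints them; nothing is executed.

```python
import shlex

# Genome builds named in the overview.
SUPPORTED_BUILDS = {"GRCh37", "GRCh38"}


def build_quickstart_commands(pgs_ids, build, scorefiles, target,
                              combined="combined.txt", min_overlap=0.75,
                              outdir="."):
    """Return the three quickstart commands as argument lists.

    Hypothetical helper: only flags that appear in the quickstart are used.
    """
    if build not in SUPPORTED_BUILDS:
        raise ValueError(f"unsupported genome build: {build}")
    download = ["download_scorefiles", "-i", *pgs_ids, "-o", outdir, "-b", build]
    combine = ["combine_scorefiles", "-s", *scorefiles, "-o", combined]
    match = [
        "match_variants", "-s", combined, "-t", target,
        "--min_overlap", str(min_overlap), "--outdir", outdir,
    ]
    return [download, combine, match]


commands = build_quickstart_commands(
    pgs_ids=["PGS000922", "PGS001229"],
    build="GRCh37",
    # Placeholder file names; in practice, list the downloaded files
    # after the first step (e.g. with glob.glob("PGS*.txt.gz")).
    scorefiles=["PGS000922.txt.gz", "PGS001229.txt.gz"],
    target="example.pvar",
)
for cmd in commands:
    print(shlex.join(cmd))
```

With the package installed, each list could be passed in order to subprocess.run(cmd, check=True), so that a failing step stops the run before the next one starts.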

Install from source

Requirements:

$ git clone https://github.com/PGScatalog/pgscatalog_utils.git
$ cd pgscatalog_utils
$ poetry install
$ poetry build
$ pip install --user dist/*.whl 

Credits

The pgscatalog_utils package is developed as part of the Polygenic Score (PGS) Catalog (www.PGSCatalog.org) project, a collaboration between the University of Cambridge’s Department of Public Health and Primary Care (Michael Inouye, Samuel Lambert, Laurent Gil) and the European Bioinformatics Institute (Helen Parkinson, Aoife McMahon, Ben Wingfield, Laura Harris).

A manuscript describing the tool and the larger PGS Catalog Calculator pipeline (PGSCatalog/pgsc_calc) is in preparation. In the meantime, if you use these tools, we ask you to cite the repo(s) and the paper describing the PGS Catalog resource:

This work has received funding from EMBL-EBI core funds, the Baker Institute, the University of Cambridge, Health Data Research UK (HDRUK), and the European Union's Horizon 2020 research and innovation programme under grant agreement No 101016775 INTERVENE.

Download files

Source Distribution

pgscatalog_utils-0.1.2.tar.gz (28.3 kB)

Built Distribution

pgscatalog_utils-0.1.2-py3-none-any.whl (36.5 kB)

File details

Details for the file pgscatalog_utils-0.1.2.tar.gz.

File metadata

  • Download URL: pgscatalog_utils-0.1.2.tar.gz
  • Size: 28.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.10.5 Darwin/21.6.0

File hashes

Hashes for pgscatalog_utils-0.1.2.tar.gz
Algorithm Hash digest
SHA256 a19e10a3634caedb2266144086087c89b8f7fbd8951e40c9063f9b8161ffda60
MD5 f07d5cd48f462b5988a82b8263b59e8d
BLAKE2b-256 c9292e8230bfcb11ef924248a9d339782eab1773b173481119c7d5eca3bdb321
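
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard hashlib module; the helper names sha256_of_file and matches_published_digest are hypothetical, and the local path in the commented usage is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the sdist was saved in the current directory):
# expected = "a19e10a3634caedb2266144086087c89b8f7fbd8951e40c9063f9b8161ffda60"
# print(matches_published_digest("pgscatalog_utils-0.1.2.tar.gz", expected))
```

Reading in chunks keeps memory use constant regardless of file size. The same check can be expressed at install time with pip's hash-checking mode, which takes digests in a requirements file.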

File details

Details for the file pgscatalog_utils-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pgscatalog_utils-0.1.2-py3-none-any.whl
  • Size: 36.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.10.5 Darwin/21.6.0

File hashes

Hashes for pgscatalog_utils-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 2b1b58c9e881d27a1301917152e8de5fd03f3a9fd2b48ac613044c30ff641bd9
MD5 cd7269b1e15fa2550eac09b1f8e28d6d
BLAKE2b-256 f31513a208988c86497b2b932e3bd5332a09393ff505ca956b327aee583f1621
