Utilities for working with PGS Catalog API and scoring files

PGS Catalog utilities

This repository is a collection of tools for working with data from the PGS Catalog. They are mostly used internally by the PGS Catalog calculator, but other users might find some of them helpful.

Overview

  • download_scorefiles: Download scoring files by PGS ID (accession) in genome builds GRCh37 or GRCh38
  • combine_scorefiles: Combine multiple scoring files into a single scoring file in 'long' format
  • match_variants: Match target variants (bim or pvar files) against the output of combine_scorefiles to produce scoring files for plink 2
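
To make the 'long' format and the matching step more concrete, here is a minimal Python sketch on toy data. The column names (accession, variant_id, effect_allele, effect_weight), the variant ID scheme, the accessions PGS_A and PGS_B, and the overlap calculation are all assumptions made for illustration; they are not the package's actual file layout or matching logic.

```python
from collections import defaultdict

# Toy scoring files keyed by a made-up accession.
# Each row is (variant_id, effect_allele, effect_weight); all values are invented.
scorefiles = {
    "PGS_A": [("1:1000:A:G", "G", 0.12), ("2:2000:C:T", "T", -0.05)],
    "PGS_B": [("1:1000:A:G", "G", 0.30), ("3:3000:G:A", "A", 0.08),
              ("4:4000:T:C", "C", 0.01), ("5:5000:A:T", "T", 0.02)],
}


def combine_long(files):
    """Stack every score into one table with one row per (accession, variant)."""
    rows = []
    for accession, variants in files.items():
        for variant_id, effect_allele, weight in variants:
            rows.append({
                "accession": accession,
                "variant_id": variant_id,
                "effect_allele": effect_allele,
                "effect_weight": weight,
            })
    return rows


def match_overlap(long_rows, target_ids):
    """Return, per accession, the fraction of its variants found in the target."""
    total = defaultdict(int)
    matched = defaultdict(int)
    for row in long_rows:
        total[row["accession"]] += 1
        if row["variant_id"] in target_ids:
            matched[row["accession"]] += 1
    return {acc: matched[acc] / total[acc] for acc in total}


combined = combine_long(scorefiles)

# Variant IDs present in a hypothetical target dataset (e.g. read from a pvar file).
target_variants = {"1:1000:A:G", "2:2000:C:T", "3:3000:G:A"}
overlap = match_overlap(combined, target_variants)

# Assumed meaning of a minimum overlap: a score passes only if at least
# this fraction of its variants is present in the target.
min_overlap = 0.75
for accession, fraction in sorted(overlap.items()):
    status = "ok" if fraction >= min_overlap else "below threshold"
    print(f"{accession}: {fraction:.2f} ({status})")
```

In this toy example the long table keeps the shared variant 1:1000:A:G as two rows, one per score, which is what lets a single pass over the table handle several scores at once.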

Installation

$ pip install pgscatalog-utils

Quickstart

$ download_scorefiles -i PGS000922 PGS001229 -o . -b GRCh37
$ combine_scorefiles -s PGS*.txt.gz -o combined.txt 
$ match_variants -s combined.txt -t <example.pvar> --min_overlap 0.75 --outdir .

Run any of these commands with the --help parameter for more details.
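
If you would rather drive the quickstart from a script, the following Python sketch chains the three commands. It uses only the flags shown in the quickstart; the helper names (download_cmd, combine_cmd, match_cmd, run_quickstart), the dry_run switch, and the example.pvar placeholder are hypothetical and not part of the package.

```python
import glob
import os
import shlex
import subprocess


def download_cmd(accessions, build, outdir="."):
    """Argument list for the download step."""
    return ["download_scorefiles", "-i", *accessions, "-o", outdir, "-b", build]


def combine_cmd(scorefiles, combined):
    """Argument list for the combine step."""
    return ["combine_scorefiles", "-s", *scorefiles, "-o", combined]


def match_cmd(combined, target, min_overlap=0.75, outdir="."):
    """Argument list for the matching step."""
    return ["match_variants", "-s", combined, "-t", target,
            "--min_overlap", str(min_overlap), "--outdir", outdir]


def run_quickstart(accessions, build, target, outdir=".", dry_run=True):
    """Print the three quickstart steps in order; execute them if dry_run is False."""
    combined = os.path.join(outdir, "combined.txt")

    def run(cmd):
        print("$ " + shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step

    run(download_cmd(accessions, build, outdir))

    # The shell expands PGS*.txt.gz in the quickstart; glob does the same here.
    # If nothing matches (for example in a dry run), keep the literal pattern.
    pattern = os.path.join(outdir, "PGS*.txt.gz")
    scorefiles = sorted(glob.glob(pattern)) or [pattern]

    run(combine_cmd(scorefiles, combined))
    run(match_cmd(combined, target, outdir=outdir))


if __name__ == "__main__":
    # "example.pvar" is a placeholder for your own target variant file.
    run_quickstart(["PGS000922", "PGS001229"], "GRCh37", "example.pvar")
```

With the default dry_run=True the script only prints the commands, so you can check them before running anything; pass dry_run=False once pgscatalog-utils is installed and the target file exists.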

Install from source

Requirements: git, poetry, and pip, as used in the commands below.

$ git clone https://github.com/PGScatalog/pgscatalog_utils.git
$ cd pgscatalog_utils
$ poetry install
$ poetry build
$ pip install --user dist/*.whl 

Credits

The pgscatalog_utils package is developed as part of the Polygenic Score (PGS) Catalog (www.PGSCatalog.org) project, a collaboration between the University of Cambridge’s Department of Public Health and Primary Care (Michael Inouye, Samuel Lambert, Laurent Gil) and the European Bioinformatics Institute (Helen Parkinson, Aoife McMahon, Ben Wingfield, Laura Harris).

A manuscript describing the tool and the larger PGS Catalog Calculator pipeline (PGSCatalog/pgsc_calc) is in preparation. In the meantime, if you use these tools, we ask you to cite the repo(s) and the paper describing the PGS Catalog resource.

This work has received funding from EMBL-EBI core funds, the Baker Institute, the University of Cambridge, Health Data Research UK (HDRUK), and the European Union's Horizon 2020 research and innovation programme under grant agreement No 101016775 INTERVENE.

