Map MaveDB scoresets to VRS objects

dcd-map: Map MaveDB data to computable and interoperable variant objects

This library implements a novel method for mapping MaveDB scoreset data to GA4GH Variation Representation Specification (VRS 2.0) objects, enhancing interoperability for genomic medicine applications. See Arbesfeld et al. (2024) for a preprint edition of the mapping manuscript, or download the resulting mappings directly.

If you make use of this software or the resultant mappings, please cite the manuscript:

Jeremy A. Arbesfeld, Estelle Y. Da, James S. Stevenson, Kori Kuzma, Anika Paul, Tierra Farris, Benjamin J. Capodanno, Sally B. Grindstaff, Kevin Riehle, Nuno Saraiva-Agostinho, Jordan F. Safer, Aleksandar Milosavljevic, Julia Foreman, Helen V. Firth, Sarah E. Hunt, Sumaiya Iqbal, Melissa S. Cline, Alan F. Rubin, Alex H. Wagner. bioRxiv 2023.06.20.545702; doi: https://doi.org/10.1101/2023.06.20.545702

Prerequisites

  • Universal Transcript Archive (UTA): see README for setup instructions. Users with access to Docker on their local devices can use the available Docker image; otherwise, start a relatively recent (version 14+) PostgreSQL instance and add data from the available database dump.
  • SeqRepo: see README for setup instructions. The SeqRepo data directory must be writeable; see specific instructions here for more.
  • Gene Normalizer: see documentation for data setup instructions.
  • blat: Must be available on the local PATH and executable by the user. Otherwise, its location can be set manually with the BLAT_BIN_PATH env var. See the UCSC Genome Browser FAQ for download instructions.
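
Before running the mapper, it can help to confirm that a blat executable is actually discoverable. The following is a minimal sketch of such a check, not the library's own lookup logic: the function name find_blat is hypothetical, and the choice to consult BLAT_BIN_PATH before PATH is an assumption made for illustration.

```python
import os
import shutil
from pathlib import Path


def find_blat(env=None):
    """Return the path to a blat executable, or None if none is found.

    Hypothetical helper: the precedence used here (BLAT_BIN_PATH first,
    then PATH) is an assumption, not documented dcd-mapping behavior.
    """
    env = os.environ if env is None else env

    # An explicit override must point at an existing, executable file.
    override = env.get("BLAT_BIN_PATH")
    if override:
        candidate = Path(override)
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
        return None

    # Otherwise, search the directories listed in PATH.
    return shutil.which("blat", path=env.get("PATH"))


if __name__ == "__main__":
    location = find_blat()
    print(location or "blat not found; install it or set BLAT_BIN_PATH")
```

Run as a script, this prints the resolved location or a hint about what to fix.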

Installation

Install from PyPI:

python3 -m pip install dcd-mapping

Usage

Use the dcd-map command with a scoreset URN, e.g.:

$ dcd-map urn:mavedb:00000083-c-1

Output is saved in the format <URN>_mapping_results_<ISO datetime>.json in the directory specified by the environment variable MAVEDB_STORAGE_DIR, or ~/.local/share/dcd-mapping by default.
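
The naming scheme above makes it straightforward to locate results from a script. The sketch below resolves the output directory and picks the most recent file for a URN. The helper names storage_dir and latest_mapping_result are hypothetical. It assumes the URN appears in the file name exactly as passed on the command line, and that the ISO datetime suffix sorts lexicographically in chronological order.

```python
import glob
import os
from pathlib import Path


def storage_dir(env=None):
    """Return the output directory: MAVEDB_STORAGE_DIR if set, else the default."""
    env = os.environ if env is None else env
    configured = env.get("MAVEDB_STORAGE_DIR")
    if configured:
        return Path(configured)
    return Path.home() / ".local" / "share" / "dcd-mapping"


def latest_mapping_result(urn, env=None):
    """Return the newest <URN>_mapping_results_<ISO datetime>.json, or None.

    Assumption: file names sort chronologically because the timestamp
    suffix is in a consistent ISO format.
    """
    # Escape the URN so it is matched literally rather than as a pattern.
    pattern = f"{glob.escape(urn)}_mapping_results_*.json"
    matches = sorted(storage_dir(env).glob(pattern))
    return matches[-1] if matches else None


if __name__ == "__main__":
    print(latest_mapping_result("urn:mavedb:00000083-c-1"))
```

If no matching file exists, or the directory has not been created yet, the function returns None rather than raising.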

Use dcd-map --help to see other available options.

Notebooks

Notebooks for manuscript data analysis and figure generation are provided within notebooks/analysis. See notebooks/analysis/README.md for more information.

Development

Clone the repo

git clone https://github.com/ave-dcd/dcd_mapping
cd dcd_mapping

Create and activate a virtual environment

python3 -m virtualenv venv
source venv/bin/activate

Install as editable and with developer dependencies

python3 -m pip install -e '.[dev,tests]'

Add pre-commit hooks

pre-commit install

Run tests with pytest

pytest
