Map MaveDB scoresets to VRS objects
dcd-map: Map MaveDB data to computable and interoperable variant objects
This library implements a novel method for mapping MaveDB scoreset data to GA4GH Variation Representation Specification (VRS 2.0) objects, enhancing interoperability for genomic medicine applications. See Arbesfeld et al. (2024) for a preprint edition of the mapping manuscript, or download the resulting mappings directly.
If you make use of this software or the resultant mappings, please cite the manuscript:
Jeremy A. Arbesfeld, Estelle Y. Da, James S. Stevenson, Kori Kuzma, Anika Paul, Tierra Farris, Benjamin J. Capodanno, Sally B. Grindstaff, Kevin Riehle, Nuno Saraiva-Agostinho, Jordan F. Safer, Aleksandar Milosavljevic, Julia Foreman, Helen V. Firth, Sarah E. Hunt, Sumaiya Iqbal, Melissa S. Cline, Alan F. Rubin, Alex H. Wagner. bioRxiv 2023.06.20.545702; doi: https://doi.org/10.1101/2023.06.20.545702
Prerequisites
- Universal Transcript Archive (UTA): see README for setup instructions. Users with access to Docker on their local devices can use the available Docker image; otherwise, start a relatively recent (version 14+) PostgreSQL instance and add data from the available database dump.
- SeqRepo: see README for setup instructions. The SeqRepo data directory must be writeable; see specific instructions here for more.
- Gene Normalizer: see documentation for data setup instructions.
- blat: Must be available on the local PATH and executable by the user. Otherwise, its location can be set manually with the BLAT_BIN_PATH environment variable. See the UCSC Genome Browser FAQ for download instructions.
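Before running a mapping job, it can help to confirm that a blat binary is actually reachable. The following is a minimal sketch, not part of dcd-mapping: `find_blat` is a hypothetical helper, and the choice to check BLAT_BIN_PATH before PATH is an assumption made for illustration; the library may resolve the binary differently.

```python
import os
import shutil
from pathlib import Path


def find_blat(env=None):
    """Return the path to an executable blat binary, or None if none is found.

    Hypothetical helper for illustration; not part of the dcd-mapping API.
    """
    env = os.environ if env is None else env
    # Assumption: an explicit BLAT_BIN_PATH takes priority over PATH here.
    override = env.get("BLAT_BIN_PATH")
    if override:
        candidate = Path(override)
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
        return None
    # Otherwise search the directories listed in PATH for an executable "blat".
    return shutil.which("blat", path=env.get("PATH"))


if __name__ == "__main__":
    location = find_blat()
    print(location if location else "blat not found; set BLAT_BIN_PATH or update PATH")
```

If the script reports that blat was not found, either add its directory to PATH or point BLAT_BIN_PATH at the binary.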
Installation
Install from PyPI:
python3 -m pip install dcd-mapping
Usage
Use the dcd-map command with a scoreset URN, e.g.:
$ dcd-map urn:mavedb:00000083-c-1
Output is saved in the format <URN>_mapping_results_<ISO datetime>.json in the directory specified by the environment variable MAVEDB_STORAGE_DIR, or ~/.local/share/dcd-mapping by default.
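To pick up the output from a script, the location rule above can be sketched as follows. Both `storage_dir` and `latest_mapping_result` are hypothetical helpers, not part of dcd-mapping. The sketch assumes the ISO datetime portions of the file names share one format, so that sorting by name also sorts by time.

```python
import os
from pathlib import Path


def storage_dir(env=None):
    """Resolve the output directory: MAVEDB_STORAGE_DIR if set, else the default."""
    env = os.environ if env is None else env
    configured = env.get("MAVEDB_STORAGE_DIR")
    if configured:
        return Path(configured)
    return Path.home() / ".local" / "share" / "dcd-mapping"


def latest_mapping_result(urn, env=None):
    """Return the newest <URN>_mapping_results_<ISO datetime>.json file, or None.

    Assumption: the timestamps are formatted consistently, so the
    lexicographically last file name is also the most recent one.
    """
    matches = sorted(storage_dir(env).glob(f"{urn}_mapping_results_*.json"))
    return matches[-1] if matches else None


if __name__ == "__main__":
    print(latest_mapping_result("urn:mavedb:00000083-c-1"))
```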
Use dcd-map --help to see other available options.
Notebooks
Notebooks for manuscript data analysis and figure generation are provided within notebooks/analysis. See notebooks/analysis/README.md for more information.
Development
Clone the repo
git clone https://github.com/ave-dcd/dcd_mapping
cd dcd_mapping
Create and activate a virtual environment
python3 -m virtualenv venv
source venv/bin/activate
Install as editable and with developer dependencies
python3 -m pip install -e '.[dev,tests]'
Add pre-commit hooks
pre-commit install
Run tests with pytest
pytest
File details
Details for the file dcd_mapping-0.3.0.tar.gz.
File metadata
- Download URL: dcd_mapping-0.3.0.tar.gz
- Upload date:
- Size: 3.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d5fc6a105e794ca0f07c52961d9b63c78dfb9a854343e5625680a4994b0394f |
| MD5 | b60175447e0b1195d410f7fadf0d1815 |
| BLAKE2b-256 | 058cb296d5f5165ef4cc964444d0ca360fd26378068fe74d7f8190ce8f514d83 |
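A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch: the helper names are hypothetical, and the local file path passed on the command line is wherever the archive happened to be saved.

```python
import hashlib
import sys

# SHA256 digest listed above for dcd_mapping-0.3.0.tar.gz.
EXPECTED_SDIST_SHA256 = "7d5fc6a105e794ca0f07c52961d9b63c78dfb9a854343e5625680a4994b0394f"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected=EXPECTED_SDIST_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.strip().lower()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python verify_digest.py path/to/dcd_mapping-0.3.0.tar.gz
    print("OK" if matches_digest(sys.argv[1]) else "digest mismatch")
```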
Provenance
The following attestation bundles were made for dcd_mapping-0.3.0.tar.gz:
Publisher: release.yaml on ave-dcd/dcd_mapping
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dcd_mapping-0.3.0.tar.gz
- Subject digest: 7d5fc6a105e794ca0f07c52961d9b63c78dfb9a854343e5625680a4994b0394f
- Sigstore transparency entry: 177427962
- Sigstore integration time:
- Permalink: ave-dcd/dcd_mapping@007fc5cf2805814c9ca2b3b7d3eefe72d631777c
- Branch / Tag: refs/tags/0.3.0
- Owner: https://github.com/ave-dcd
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@007fc5cf2805814c9ca2b3b7d3eefe72d631777c
- Trigger Event: release
File details
Details for the file dcd_mapping-0.3.0-py3-none-any.whl.
File metadata
- Download URL: dcd_mapping-0.3.0-py3-none-any.whl
- Upload date:
- Size: 38.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e785bbb1ca853a97a17ae99beff4076bc2afd09101e222c9648e1a8ef8e7f741 |
| MD5 | 1060d3daa5b50452ecf3a2eadd2f6486 |
| BLAKE2b-256 | ffe2630aaec60da8f1c974b755b1add3cf9ea5a26547ce50a042dc3c50ff9150 |
Provenance
The following attestation bundles were made for dcd_mapping-0.3.0-py3-none-any.whl:
Publisher: release.yaml on ave-dcd/dcd_mapping
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dcd_mapping-0.3.0-py3-none-any.whl
- Subject digest: e785bbb1ca853a97a17ae99beff4076bc2afd09101e222c9648e1a8ef8e7f741
- Sigstore transparency entry: 177427968
- Sigstore integration time:
- Permalink: ave-dcd/dcd_mapping@007fc5cf2805814c9ca2b3b7d3eefe72d631777c
- Branch / Tag: refs/tags/0.3.0
- Owner: https://github.com/ave-dcd
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@007fc5cf2805814c9ca2b3b7d3eefe72d631777c
- Trigger Event: release