Map MaveDB scoresets to VRS objects
dcd-map: Map MaveDB data to computable and interoperable variant objects
This library implements a novel method for mapping MaveDB scoreset data to GA4GH Variation Representation Specification (VRS) objects, enhancing interoperability for genomic medicine applications. See Arbesfeld et al. (2023) for a preprint edition of the mapping manuscript, or download the resulting mappings directly.
Installation
Install from PyPI:
python3 -m pip install dcd-mapping
Also ensure the following data dependencies are available:
- Universal Transcript Archive (UTA): see README for setup instructions. Users with access to Docker on their local devices can use the available Docker image; otherwise, start a relatively recent (version 14+) PostgreSQL instance and add data from the available database dump.
- SeqRepo: see README for setup instructions. The SeqRepo data directory must be writeable; see specific instructions here for more.
- Gene Normalizer: see documentation for data setup instructions.
- blat: Must be available on the local PATH and executable by the user. Otherwise, its location can be set manually with the BLAT_BIN_PATH environment variable. See the UCSC Genome Browser FAQ for download instructions. For our experiments, we placed the binary in the same directory as these notebooks.
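Before running a mapping job, it can help to confirm that the blat and SeqRepo requirements above are met. The following is a minimal sketch rather than part of dcd-mapping: find_blat and is_writeable_dir are hypothetical helper names, the SeqRepo directory path is something you supply yourself, and checking BLAT_BIN_PATH before the PATH is an assumption about precedence that the library may not share.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def find_blat() -> Optional[str]:
    """Return the path to an executable blat binary, or None if none is found.

    Assumption: an explicit BLAT_BIN_PATH is checked first, then the PATH.
    """
    override = os.environ.get("BLAT_BIN_PATH")
    if override:
        # Accept the override only if it points to an executable file.
        if os.path.isfile(override) and os.access(override, os.X_OK):
            return override
        return None
    # Fall back to searching the directories listed in PATH.
    return shutil.which("blat")


def is_writeable_dir(path: str) -> bool:
    """Return True if path is an existing directory the current user can write to."""
    directory = Path(path)
    return directory.is_dir() and os.access(directory, os.W_OK)
```

For example, calling find_blat() and is_writeable_dir() on your SeqRepo data directory before invoking the mapper gives an early, readable failure instead of an error partway through a run.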
Usage
Use the dcd-map command with a scoreset URN, e.g.:
$ dcd-map urn:mavedb:00000083-c-1
Output is saved in the format <URN>_mapping_results_<ISO datetime>.json in the directory specified by the environment variable MAVEDB_STORAGE_DIR, or ~/.local/share/dcd-mapping by default.
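To pick up the output programmatically, the naming scheme and directory rule described above can be sketched as follows. This is an illustration under stated assumptions, not the library's own API: storage_dir and latest_mapping_result are hypothetical names, and choosing the "latest" file by sorting filenames assumes every timestamp uses the same ISO format so that lexicographic order matches chronological order.

```python
import os
from pathlib import Path
from typing import Optional


def storage_dir() -> Path:
    """Resolve the output directory: MAVEDB_STORAGE_DIR if set, else the default."""
    custom = os.environ.get("MAVEDB_STORAGE_DIR")
    if custom:
        return Path(custom)
    return Path.home() / ".local" / "share" / "dcd-mapping"


def latest_mapping_result(urn: str, directory: Optional[Path] = None) -> Optional[Path]:
    """Return the most recent <URN>_mapping_results_<ISO datetime>.json file, if any."""
    directory = directory or storage_dir()
    if not directory.is_dir():
        return None
    # Assumption: timestamps share one format, so name order is time order.
    matches = sorted(
        directory.glob(f"{urn}_mapping_results_*.json"),
        key=lambda path: path.name,
    )
    return matches[-1] if matches else None
```

The returned path can then be opened with the standard json module; the structure of the file's contents is not described here, so inspect it before relying on specific keys.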
Notebooks
Notebooks for manuscript data analysis and figure generation are provided within notebooks/analysis. See notebooks/analysis/README.md for more information.
Development
Clone the repo
git clone https://github.com/ave-dcd/dcd_mapping
cd dcd_mapping
Create and activate a virtual environment
python3 -m virtualenv venv
source venv/bin/activate
Install as editable and with developer dependencies
python3 -m pip install -e '.[dev,tests]'
Add pre-commit hooks
pre-commit install
Run tests with pytest
pytest
Hashes for dcd_mapping-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 8d341de86fbed5a8855fd896a635b2276a9ba552da3e40c9c664b43607237a73
MD5 | 203c1f61bc2bc0a71e153ef33150b375
BLAKE2b-256 | e71d23f1b8b0f046d529a0e0c1d358a59efc8db3872d5aa76a69ab15fde5941a