Map genetic variants to rsIDs

Project description

variant-mapper

variant-mapper is a package that maps genetic variants to the genome in order to validate them and assign an rsID.

This will:

  1. Localise the genetic variant based on chromosome position, using either a file join approach or tabix.
  2. Determine whether the ref/alt alleles match a known variant site, assuming that ref/alt can be flipped.
  3. If a site can be identified, annotate the variant with functional information.
  4. If no site can be identified, fall back to the steps below.
  5. If the variant is an INDEL, normalise the alleles and attempt mapping again.
  6. Finally, if it still can't be mapped, validate one of the alleles against the reference genome assembly.
  7. This can also handle cases where only a single allele is known, assuming the site is bi-allelic and the ref allele can be localised.
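
The allele check in step 2 and the INDEL normalisation in step 5 can be illustrated with a minimal sketch. This is not variant-mapper's own code or API: the function names match_alleles and trim_alleles are hypothetical, and the normalisation shown only trims shared bases. Full left-alignment would also need the reference sequence, which is left out here.

```python
from typing import Optional, Sequence, Tuple


def match_alleles(ref: str, alt: str, site_ref: str,
                  site_alts: Sequence[str]) -> Optional[str]:
    """Compare a query ref/alt pair against a known site (hypothetical helper).

    Returns "exact" if the alleles match as given, "flipped" if they match
    once ref and alt are swapped, and None if they do not match at all.
    """
    ref, alt = ref.upper(), alt.upper()
    site_ref = site_ref.upper()
    site_alts = [a.upper() for a in site_alts]
    if ref == site_ref and alt in site_alts:
        return "exact"
    if alt == site_ref and ref in site_alts:
        return "flipped"
    return None


def trim_alleles(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Trim bases shared by both alleles, keeping at least one base in each.

    Only trimming is done here; shifting the INDEL to its left-most
    position would require the reference genome sequence.
    """
    # Remove the shared suffix first so the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then remove the shared prefix, moving the position to the right.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt
```

In a flow like the one described above, a variant that fails match_alleles would be passed through trim_alleles and then matched again at the adjusted position.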

The mapper works with two files: a common mapper file and a full mapper file. The common mapper file contains the common variants typically used in GWAS, and the full mapper file contains all known variants from dbSNP and from other projects.

You can localise the genetic variants either with tabix or with a table scan (file join) approach. The file join is most efficient when the input file is large, on the order of 10-20M variants. In this case the common file is used for the join, and where something can't be mapped a tabix query is tried against the full file. In many cases the common file is good enough, but it might miss some variants. In any case, please contact me for a download link. There is nothing super secret about the mapping file; UCL does not offer any file distribution and I have no other official way of distributing it, so it is on my personal pCloud at the moment.
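
The join-then-fallback idea can be sketched as a merge join over two position-sorted streams, with an indexed lookup for anything the common file misses. This is an illustration under assumptions, not the package's implementation: join_with_fallback and full_lookup are hypothetical names, both inputs are assumed to be sorted by the same (chromosome, position) ordering, and the full-file query is passed in as a plain callable rather than a real tabix call.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Key = Tuple[str, int]  # (chromosome, position)


def join_with_fallback(
    queries: Iterable[Key],
    common: Iterable[Tuple[Key, str]],
    full_lookup: Callable[[str, int], List[str]],
) -> Iterator[Tuple[Key, List[str], str]]:
    """Merge-join sorted queries against a sorted common stream.

    Yields (key, ids, source), where source is "common" when the join
    found the position and "full" when the fallback lookup was used.
    """
    common_iter = iter(common)
    current = next(common_iter, None)
    last_key: Optional[Key] = None
    last_hits: List[str] = []
    for key in queries:
        if key != last_key:
            # Skip common records that sort before the query position.
            while current is not None and current[0] < key:
                current = next(common_iter, None)
            # Collect every common record at exactly this position.
            last_hits = []
            while current is not None and current[0] == key:
                last_hits.append(current[1])
                current = next(common_iter, None)
            last_key = key
        if last_hits:
            yield key, list(last_hits), "common"
        else:
            # Not in the common stream: query the full file instead.
            yield key, full_lookup(*key), "full"


# Toy data standing in for the common file and the indexed full file.
common_rows = [(("1", 100), "rs1"), (("1", 200), "rs2"), (("2", 50), "rs3")]
full_index = {("1", 150): ["rs9"]}
results = list(join_with_fallback(
    [("1", 100), ("1", 150), ("2", 50)],
    common_rows,
    lambda chrom, pos: full_index.get((chrom, pos), []),
))
```

Each stream is read once, which is why a join suits very large inputs, while the per-variant lookup is only paid for the variants the common file misses. In practice full_lookup might wrap a tabix query on the full file.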

Installation

This can be installed using pip (from PyPI) or conda.

To install from PyPI:

pip install variant-mapper

To install using conda:

conda install -c cfin -c conda-forge variant-mapper

Documentation

There is online documentation for variant mapper.

Download files

Source Distribution

variant_mapper-0.1.0a0.tar.gz (332.9 kB view details)

Uploaded Source

Built Distribution

variant_mapper-0.1.0a0-py3-none-any.whl (368.6 kB view details)

Uploaded Python 3

File details

Details for the file variant_mapper-0.1.0a0.tar.gz.

File metadata

  • Download URL: variant_mapper-0.1.0a0.tar.gz
  • Upload date:
  • Size: 332.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for variant_mapper-0.1.0a0.tar.gz
Algorithm Hash digest
SHA256 0a8672fab101b0a209c68a6f49325696159f7e3326896051fe28011c21df5440
MD5 37be59027c1d1b9dc79730625cc719a5
BLAKE2b-256 5f767d659d644d4cbdc48ca1a8fef4da556ce3629354adda40af2da0aab4a392

File details

Details for the file variant_mapper-0.1.0a0-py3-none-any.whl.

File hashes

Hashes for variant_mapper-0.1.0a0-py3-none-any.whl
Algorithm Hash digest
SHA256 acf41fa3fdc7d9f5631604690a06e44014d8562325f6162b996fcacd65d99bb9
MD5 66c8807d16e2f6292dd4ab23e05d6855
BLAKE2b-256 34d067e1a5761d0753b2d7b9b2d8ee671a752eb4e114ef239834306bfdc8671c
