Refines approximate repeat regions identified by ExpansionHunter Denovo to exact genomic coordinates

EHdnExact

Description

EHdnExact is a command-line tool that refines the genomic regions of repeat expansions identified by ExpansionHunter Denovo (EHdn). EHdn reports only approximate locations of potential repeat expansions across the genome. EHdnExact reads the EHdn locus TSV file, performs local sequence alignment, and pinpoints the exact boundaries of these expansions. The results are written to a TSV file and can optionally be converted into an ExpansionHunter variant catalog.
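To make the idea of refining an approximate region concrete, here is a minimal sketch. It is not EHdnExact's implementation: the tool uses local sequence alignment, whereas this sketch swaps in plain exact matching of the motif with a regular expression, so it ignores interruptions and rotated motifs. The function name `refine_region` and its parameters are hypothetical, and coordinates are assumed to be 0-based and end-exclusive.

```python
import re

def refine_region(seq, approx_start, approx_end, motif, margin=1000, min_repeats=2):
    """Return (start, end, n_units) for the longest exact tandem run of `motif`
    found inside the approximate region widened by `margin`, or None.

    Simplification: exact matching only; EHdnExact itself uses local alignment.
    """
    # Widen the approximate region, clamped to the sequence bounds.
    lo = max(0, approx_start - margin)
    hi = min(len(seq), approx_end + margin)
    window = seq[lo:hi].upper()

    # Match at least `min_repeats` back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif.upper())}){{{min_repeats},}}")

    best = None
    for m in pattern.finditer(window):
        if best is None or (m.end() - m.start()) > (best[1] - best[0]):
            best = (lo + m.start(), lo + m.end())
    if best is None:
        return None
    return best[0], best[1], (best[1] - best[0]) // len(motif)

# Toy reference: 20 bp of flank, six CAG units, 20 bp of flank.
toy = "ACGT" * 5 + "CAG" * 6 + "TTGA" * 5
print(refine_region(toy, 10, 30, "CAG", margin=15))  # (20, 38, 6)
```

The `margin` and `min_repeats` parameters mirror the roles of the documented `-e` and `-m` options, though how the tool applies them internally is an assumption here.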

Installation

pip install ehdnexact

Usage

ehdnexact [options] <loci> <reference> <output_prefix>

Required Arguments

  • loci: Path to the EHdn locus TSV file. This file should have the first four columns labeled as contig, start, end, and motif.
  • reference: Path to the reference FASTA file.
  • output_prefix: Prefix for the output files. Results are written in TSV format; the -c flag additionally produces JSON output.
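Since the loci file must start with the columns contig, start, end, and motif, it can help to check an input before running the tool. The following is a small sketch of such a check; the function name `read_loci` is hypothetical, and the assumption that start and end are integers is mine rather than something stated above.

```python
import csv
import io

EXPECTED_COLUMNS = ["contig", "start", "end", "motif"]

def read_loci(handle):
    """Parse a tab-separated loci file and return (contig, start, end, motif) tuples.

    Raises ValueError if the first four header columns are not the expected ones.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None or header[:4] != EXPECTED_COLUMNS:
        raise ValueError(f"first four columns must be {EXPECTED_COLUMNS}, got {header}")

    loci = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        contig, start, end, motif = row[:4]
        # Assumption: start and end are integer coordinates.
        loci.append((contig, int(start), int(end), motif))
    return loci

# Illustrative in-memory file; the extra column is ignored.
example = io.StringIO("contig\tstart\tend\tmotif\textra\nchr1\t1000\t2000\tCAG\tx\n")
print(read_loci(example))  # [('chr1', 1000, 2000, 'CAG')]
```

In practice the handle would come from `open(path, newline="")` on the EHdn locus TSV file.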

Options

  • -e ERROR_MARGIN, --error-margin ERROR_MARGIN: Error margin, in base pairs, for regions identified by EHdn (default: 1000).
  • -m MIN_REPEATS, --min-repeats MIN_REPEATS: Minimum number of repeat units required to report a region (default: 2).
  • -r, --ref-seq: Include the reference sequence in the output file.
  • -c, --eh-variant-catalog: Generate an ExpansionHunter variant catalog as well.
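For orientation on what the -c option targets, the sketch below builds one entry in the general shape of an ExpansionHunter variant catalog (a JSON list of objects with `LocusId`, `LocusStructure`, `ReferenceRegion`, and `VariantType`). The exact fields and naming that EHdnExact emits are not documented here, so the `catalog_entry` helper, the `LocusId` scheme, and the coordinates are illustrative assumptions.

```python
import json

def catalog_entry(contig, start, end, motif):
    """Build one catalog-style entry for a simple repeat locus."""
    return {
        # Hypothetical ID scheme; any unique string would do.
        "LocusId": f"{contig}_{start}_{end}_{motif}",
        # "(MOTIF)*" denotes a repeat of the motif with variable copy number.
        "LocusStructure": f"({motif})*",
        "ReferenceRegion": f"{contig}:{start}-{end}",
        "VariantType": "Repeat",
    }

# Illustrative coordinates only.
print(json.dumps([catalog_entry("chr1", 1000, 1030, "CAG")], indent=2))
```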

Example

Run EHdnExact with custom settings:

ehdnexact -e 1500 -m 1 -r -c example/dataset.locus.tsv path/to/reference.fa example/exact.locus.tsv
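When running the tool over several inputs from a script, it can be convenient to assemble the command programmatically. This sketch only builds the argument list from the options documented above; the helper name `build_command` is hypothetical, and it does not run anything itself.

```python
import shlex

def build_command(loci, reference, output_prefix,
                  error_margin=None, min_repeats=None,
                  ref_seq=False, catalog=False):
    """Assemble an ehdnexact argument list from the documented options."""
    cmd = ["ehdnexact"]
    if error_margin is not None:
        cmd += ["-e", str(error_margin)]
    if min_repeats is not None:
        cmd += ["-m", str(min_repeats)]
    if ref_seq:
        cmd.append("-r")
    if catalog:
        cmd.append("-c")
    # Positional arguments come last, in the documented order.
    cmd += [loci, reference, output_prefix]
    return cmd

cmd = build_command("example/dataset.locus.tsv", "path/to/reference.fa",
                    "example/exact.locus.tsv",
                    error_margin=1500, min_repeats=1, ref_seq=True, catalog=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once ehdnexact is installed and on the PATH.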
