Refines approximate repeat regions identified by ExpansionHunter Denovo to exact genomic coordinates

EHdnExact

Description

EHdnExact is a command-line tool that refines the genomic regions of repeat expansions identified by ExpansionHunter Denovo (EHdn). EHdn reports only approximate locations of potential repeat expansions across the genome. EHdnExact reads the EHdn locus TSV file, performs local sequence alignment against the reference, and pinpoints the exact boundaries of these expansions. The results are written to a TSV file and can also be converted into an ExpansionHunter variant catalog.

Installation

pip install ehdnexact

Usage

ehdnexact [options] <loci> <reference> <output_prefix>

Required Arguments

  • loci: Path to the EHdn locus TSV file. This file should have the first four columns labeled as contig, start, end, and motif.
  • reference: Path to the reference FASTA file.
  • output_prefix: Prefix for the output files. Results are stored in TSV format, with an optional JSON output (see -c flag). View example output.
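The column requirement for the loci file can be checked before a run. The following is a minimal sketch, not part of EHdnExact; the helper name `check_loci_header` and the example row are hypothetical, and the only thing taken from the description above is the expected labels of the first four columns.

```python
import csv
import io

# Column labels required for the first four columns of the loci file.
EXPECTED_COLUMNS = ["contig", "start", "end", "motif"]


def check_loci_header(handle):
    """Return True if the first four column labels match the expected ones.

    `handle` is any open text file or file-like object with tab-separated data.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # empty file -> empty header -> False
    return header[:4] == EXPECTED_COLUMNS


# Made-up example content, not real EHdn output.
example = io.StringIO("contig\tstart\tend\tmotif\nchr1\t1000\t2000\tCAG\n")
print(check_loci_header(example))
```

Extra columns after the first four are ignored by this check, since the requirement above only names the first four.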

Options

  • -e ERROR_MARGIN, --error-margin ERROR_MARGIN: Error margin, in base pairs, applied to the regions identified by EHdn (default: 1000).
  • -m MIN_REPEATS, --min-repeats MIN_REPEATS: Minimum number of repeat units required to report a region (default: 2).
  • -r, --ref-seq: Include the reference sequence in the output file.
  • -c, --eh-variant-catalog: Generate an ExpansionHunter variant catalog as well.
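To make the roles of the error margin and the minimum repeat count concrete, here is a rough illustration. It is not EHdnExact's algorithm: the tool uses local sequence alignment, whereas this sketch substitutes plain exact string matching, and the function names `widen` and `longest_motif_run` are hypothetical. It assumes the margin pads the approximate region on both sides and that a run is reported only if it contains at least the minimum number of motif copies.

```python
def widen(start, end, margin, contig_length):
    """Pad an approximate region by `margin` bases, clamped to the contig."""
    return max(0, start - margin), min(contig_length, end + margin)


def longest_motif_run(sequence, motif, min_repeats=2):
    """Find the longest run of back-to-back exact copies of `motif`.

    Returns (start, end, copies) with 0-based, end-exclusive coordinates
    relative to `sequence`, or None if no run reaches `min_repeats`.
    Exact matching only; a real aligner would tolerate mismatches.
    """
    k = len(motif)
    if k == 0:
        raise ValueError("motif must not be empty")
    best = None
    for i in range(len(sequence) - k + 1):
        if sequence[i:i + k] != motif:
            continue
        # Extend the run one motif copy at a time.
        j = i
        while sequence[j:j + k] == motif:
            j += k
        copies = (j - i) // k
        if copies >= min_repeats and (best is None or copies > best[2]):
            best = (i, j, copies)
    return best


# Made-up window: four CAG copies flanked by non-repeat sequence.
window = "TTT" + "CAG" * 4 + "GGG"
print(widen(1000, 1200, 1500, 5000))
print(longest_motif_run(window, "CAG"))
```

In this picture, a larger margin widens the window that is searched, and a larger minimum repeat count filters out short runs.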

Example

Run EHdnExact with custom settings:

ehdnexact -e 1500 -m 1 -r -c example/dataset.locus.tsv path/to/reference.fa example/exact.locus.tsv
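When EHdnExact is driven from a script, the same command line can be assembled programmatically. The sketch below is only a convenience wrapper around the documented flags; the function `build_command` and its keyword names are hypothetical and are not part of the package.

```python
import shlex


def build_command(loci, reference, output_prefix, error_margin=None,
                  min_repeats=None, ref_seq=False, catalog=False):
    """Assemble an ehdnexact argument list from the documented options."""
    cmd = ["ehdnexact"]
    if error_margin is not None:
        cmd += ["-e", str(error_margin)]
    if min_repeats is not None:
        cmd += ["-m", str(min_repeats)]
    if ref_seq:
        cmd.append("-r")
    if catalog:
        cmd.append("-c")
    # Positional arguments come last: <loci> <reference> <output_prefix>.
    cmd += [loci, reference, output_prefix]
    return cmd


cmd = build_command(
    "example/dataset.locus.tsv",
    "path/to/reference.fa",
    "example/exact.locus.tsv",
    error_margin=1500,
    min_repeats=1,
    ref_seq=True,
    catalog=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once `ehdnexact` is installed and on the `PATH`.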

Download files

Source Distribution

EHdnExact-0.1.1.tar.gz (5.0 kB)

Built Distribution

EHdnExact-0.1.1-py3-none-any.whl (6.2 kB, Python 3)