
ProleTRact - Tandem Repeat Visualization Tool



This repository contains ProleTRact, a tandem repeat (TR) visualization tool and the companion to TandemTwister. It processes Variant Call Format (VCF) files generated by TandemTwister and visualizes tandem repeats in an interactive format. Users can explore motifs, compare alleles to the reference sequence, and examine the structure of tandem repeats to help interpret genomic variation.

Why ProleTRact?

Tandem repeats (TRs) are complex: alleles can differ in motif composition, length, and interrupted blocks. ProleTRact visualizes TR regions with color-coded motifs, highlights interruptions, and provides navigation across regions and samples, so that potentially pathogenic expansions or atypical structures can be spotted quickly.

Key Features

  • Individual and Cohort modes: Analyze a single VCF or an entire directory of VCFs.
  • Dynamic sequence visualization: Color-coded motifs, clear interruption highlighting, and side-by-side allele comparison.
  • Pathogenic TR reference overlay: Built-in pathogenic_TRs.bed provides context for known loci (disease, gene, thresholds).
  • Fast navigation: Move across TR records with Previous/Next controls or jump to a specific region.

Installation

pip install proletract
proletract  # launches the web application

The launcher starts both the backend API server (port 8502) and the frontend web server (port 3000), and the application opens in your browser automatically. On headless machines, no browser is opened; access the frontend at http://localhost:3000 after starting the application.
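If the page does not load, it can help to check whether both servers are actually listening. The following is a minimal sketch using only the Python standard library; the port numbers come from the paragraph above, while the host 127.0.0.1 and the helper name port_is_open are assumptions made for illustration and are not part of ProleTRact.

```python
import socket

# Ports as documented above; the host is an assumption (local machine).
BACKEND_PORT = 8502
FRONTEND_PORT = 3000


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in (("backend", BACKEND_PORT), ("frontend", FRONTEND_PORT)):
        state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

Run it in a second terminal while proletract is running. A "not reachable" result for either port suggests that the launcher has not finished starting or that the port is blocked.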

Quickstart

  1. Launch the app with the command above: proletract
  2. Open http://localhost:3000 in your browser (the URL is also shown in the terminal if you're running headless).
  3. Load an individual VCF or cohort folder from the sidebar and start exploring tandem repeats.

Usage

Individual mode 👤

  1. Select individual sample in the sidebar.
  2. Provide the absolute path to a bgzipped and tabix-indexed VCF (.vcf.gz with .tbi):
    • Enter the path in the sidebar input, then click Load VCF.
    • The app will parse records and enable navigation across TR variants.
  3. Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
  4. Inspect motif blocks, interruptions, and per-allele differences.
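Before pasting a path into the sidebar, the requirements from step 2 (absolute path, .vcf.gz extension, .tbi index next to the file) can be checked up front. This is a rough sketch, not ProleTRact's own validation; the function name check_vcf_path is hypothetical, and it assumes the index sits beside the VCF under the usual name of the VCF file plus ".tbi".

```python
from pathlib import Path
from typing import List


def check_vcf_path(path_str: str) -> List[str]:
    """Return a list of problems found with the path (empty if none)."""
    problems = []
    path = Path(path_str)
    if not path.is_absolute():
        problems.append("path is not absolute")
    if not path.name.endswith(".vcf.gz"):
        problems.append("file name does not end with .vcf.gz")
    if not path.is_file():
        problems.append("file does not exist")
    # Assumption: the tabix index is stored next to the VCF as <name>.tbi.
    index = Path(str(path) + ".tbi")
    if not index.is_file():
        problems.append(f"missing tabix index: {index.name}")
    return problems
```

An empty list means the path meets the stated requirements. The check does not verify that the file is really bgzip-compressed or that it was produced by TandemTwister.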

Cohort mode 👥👥

Reads-based VCF

  1. Select Cohort in the sidebar and choose Reads-based VCF view.
  2. Provide the absolute path to a directory containing TandemTwister VCF files.
  3. Click Load Cohort to scan the directory and enable cohort navigation.
  4. Browse records and compare across samples.
  5. Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
  6. Inspect motif blocks, interruptions, and per-allele differences.
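The region strings used for jumping, such as chr1:1000-2000, follow the common chrom:start-end pattern. The sketch below shows one way to parse and sanity-check such a string in a script. The exact syntax the app accepts is not documented beyond this example, so the regular expression and the names Region and parse_region are assumptions.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# Assumed format: <chrom>:<start>-<end>, with plain integer coordinates.
_REGION_RE = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")


def parse_region(text: str) -> Region:
    """Parse a string like 'chr1:1000-2000' into a Region."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end region: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end in region: {text!r}")
    return Region(chrom, start, end)
```

For example, parse_region("chr1:1000-2000") yields a Region with chrom "chr1", start 1000, and end 2000, while a string without coordinates raises ValueError.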

Assembly VCF

  1. Select Cohort in the sidebar and choose Assembly VCF view.
  2. Provide the absolute path to a directory containing TandemTwister VCF files.
  3. Click Load Cohort to scan the directory and enable cohort navigation.
  4. Browse records and compare across samples.
  5. Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
  6. Inspect motif blocks, interruptions, and per-allele differences.

Input Requirements

  • VCF format: Standard VCF generated by TandemTwister.
  • Cohort directory: A folder with multiple .vcf.gz files generated by TandemTwister is required for cohort mode.
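To see which files a cohort directory would contribute, the .vcf.gz files in it can be listed beforehand. This is an illustrative sketch only: it looks at the top level of the directory, and whether ProleTRact itself scans subdirectories is not stated here. The function name find_cohort_vcfs is hypothetical.

```python
from pathlib import Path
from typing import List


def find_cohort_vcfs(directory: str) -> List[Path]:
    """Return the .vcf.gz files directly inside directory, sorted by name."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(directory)
    # Non-recursive on purpose: only files at the top level are considered.
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.name.endswith(".vcf.gz")
    )
```

If the returned list has fewer than two entries, the folder does not yet satisfy the "multiple .vcf.gz files" requirement for cohort mode.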

Demo / Examples

Example screenshots and short walkthrough GIFs will be added here. For now, you can open example.svg for a preview:

Tandem Repeat Visualization Example
  • Planned: Individual-mode walkthrough
  • Planned: Cohort-mode walkthrough

Contributing

Contributions are welcome! Please open an issue to discuss changes.

License

This project is licensed under the BSD 3-Clause Non-Commercial License — see LICENSE for details. Commercial use is prohibited. This software is intended for academic research, educational purposes, and personal/private use only. For commercial licensing inquiries, please contact the author.

Citation

If you use ProleTRact in your work, please cite this repository. A formal citation entry will be added once available.
