
A user-friendly platform for interactive exploration, visualization, and analysis of tandem repeat findings from TandemTwister outputs

This repository contains ProleTRact, a tandem repeat (TR) visualization tool that serves as the companion to TandemTwister. The tool processes Variant Call Format (VCF) files generated by TandemTwister and visualizes tandem repeats in an intuitive, interactive format. Users can explore motifs, compare alleles to the reference sequence, and examine the structure of tandem repeats when interpreting genomic variation.

Why ProleTRact?

TRs are complex: alleles can differ in motif composition, length, and interrupted blocks. ProleTRact visualizes TR regions with color-coded motifs, highlights interruptions, and provides navigation across regions and samples, making potentially pathogenic expansions or atypical structures easier to spot.

Key Features

  • Individual and Cohort modes: Analyze a single VCF or an entire directory of VCFs.
  • Dynamic sequence visualization: Color-coded motifs, clear interruption highlighting, and side-by-side allele comparison.
  • Pathogenic TR reference overlay: Built-in pathogenic_TRs.bed provides context for known loci (disease, gene, thresholds).
  • Fast navigation: Move across TR records with Previous/Next controls or jump to a specific region.
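
To make the idea of motif blocks and interruptions concrete, here is a minimal sketch that splits an allele sequence into runs of known motifs and the bases between them. It is an illustration only: the function name `segment_repeat` and the greedy left-to-right strategy are assumptions for this example, not ProleTRact's or TandemTwister's actual algorithm.

```python
def segment_repeat(sequence, motifs):
    """Greedily split a sequence into motif runs and interruptions.

    Returns a list of (kind, text, copies) tuples, where kind is
    "motif" or "interruption". Hypothetical helper for illustration.
    """
    # Try longer motifs first; drop empty strings so the scan always advances.
    ordered = sorted((m for m in motifs if m), key=len, reverse=True)
    blocks = []
    pos = 0
    while pos < len(sequence):
        motif = next((m for m in ordered if sequence.startswith(m, pos)), None)
        if motif is not None:
            if blocks and blocks[-1][0] == "motif" and blocks[-1][1] == motif:
                # Extend the current run of the same motif.
                blocks[-1] = ("motif", motif, blocks[-1][2] + 1)
            else:
                blocks.append(("motif", motif, 1))
            pos += len(motif)
        else:
            base = sequence[pos]
            if blocks and blocks[-1][0] == "interruption":
                # Grow the current interruption by one base.
                blocks[-1] = ("interruption", blocks[-1][1] + base, 1)
            else:
                blocks.append(("interruption", base, 1))
            pos += 1
    return blocks
```

For example, `segment_repeat("CAGCAGCAACAGCAG", ["CAG"])` yields two CAG copies, a `CAA` interruption, and two more CAG copies. A viewer could then assign one color per motif and a distinct style to interruption blocks.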

Installation Options

Pick the workflow that fits your environment:

Option A — Install from PyPI (recommended)

pip install proleTRact
proleTRact  # launches the Streamlit app

The launcher opens a browser locally. On headless machines set STREAMLIT_SERVER_HEADLESS=true before invoking proleTRact.
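
If you start the app from a script (for example on a remote server), the headless setting can be applied programmatically. The sketch below assumes the `proleTRact` launcher is on your `PATH`; the helper names `headless_env` and `launch_headless` are hypothetical and not part of the package.

```python
import os
import subprocess


def headless_env(base=None):
    """Return a copy of the environment with Streamlit headless mode enabled."""
    env = dict(os.environ if base is None else base)
    env["STREAMLIT_SERVER_HEADLESS"] = "true"
    return env


def launch_headless(command=("proleTRact",)):
    """Run the launcher without opening a browser; blocks until it exits."""
    # Assumption: the launcher passes environment variables through to Streamlit.
    return subprocess.run(list(command), env=headless_env(), check=False)
```

Calling `launch_headless()` should behave like running `STREAMLIT_SERVER_HEADLESS=true proleTRact` in a shell.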

Option B — Clone and run locally (with conda)

git clone git@github.com:Lionward/ProleTRact.git
cd ProleTRact
conda create -n proletract python=3.9
conda activate proletract
pip install -r requirements.txt
pip install -e .
streamlit run src/proletract/app.py

Quickstart

  1. Launch the app with one of the commands above.
  2. Open the browser tab (on a headless machine, Streamlit prints the URL to open).
  3. Load an individual VCF or cohort folder from the sidebar and start exploring tandem repeats.

Usage

Individual mode 👤

  1. Select individual sample in the sidebar.
  2. Provide the absolute path to a bgzipped and tabix-indexed VCF (.vcf.gz with .tbi):
    • Enter the path in the sidebar input, then click Upload VCF File.
    • The app will parse records and enable navigation across TR variants.
  3. Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
  4. Inspect motif blocks, interruptions, and per-allele differences.
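
Before pasting a path into the sidebar, it can help to confirm that it meets the requirements from step 2. The following is a minimal preflight sketch; `check_vcf_input` is a hypothetical helper, and it assumes the tabix index sits next to the VCF as `<name>.vcf.gz.tbi`. It checks file names only and does not verify that the file is really bgzip-compressed.

```python
from pathlib import Path


def check_vcf_input(path_str):
    """Return a list of problems with a candidate VCF path (empty if none)."""
    problems = []
    path = Path(path_str)
    if not path.is_absolute():
        problems.append("path is not absolute")
    if not path.name.endswith(".vcf.gz"):
        problems.append("file name does not end with .vcf.gz")
    if not path.is_file():
        problems.append("file does not exist")
    # Assumption: the index is stored beside the VCF with a .tbi suffix.
    index = path.with_name(path.name + ".tbi")
    if not index.is_file():
        problems.append(f"missing index: {index.name}")
    return problems
```

An empty list means the path looks usable; otherwise each entry names one thing to fix.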

Cohort mode 👥👥

  1. Select Cohort in the sidebar and choose Reads-based VCF or Assembly VCF view.
  2. Provide the absolute path to a directory containing TandemTwister VCF files.
  3. Click Load Cohort to scan the directory and enable cohort navigation.
  4. Browse records and compare across samples.
  5. Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
  6. Inspect motif blocks, interruptions, and per-allele differences.
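
Both modes accept a region string such as chr1:1000-2000. As a rough sketch of how such a string can be validated and split, consider the helper below. `parse_region` is a hypothetical name, and the sketch makes no claim about how ProleTRact itself parses regions or whether it treats coordinates as 0-based or 1-based.

```python
import re

# Contig name, a colon, then start-end as plain integers.
REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(text):
    """Parse 'chr1:1000-2000' into (chrom, start, end)."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    return match["chrom"], start, end
```

Here `parse_region("chr1:1000-2000")` returns `("chr1", 1000, 2000)`, and malformed input raises `ValueError`.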

Input Requirements

  • VCF format: Standard VCF generated by TandemTwister, bgzipped and tabix-indexed (.vcf.gz with .tbi) for individual mode.
  • Cohort directory: A folder with multiple .vcf.gz files generated by TandemTwister is required for cohort mode.
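
To see what a cohort directory contains before loading it, a short script can list its .vcf.gz files. The sketch below also reports which files lack a neighboring .tbi index; whether cohort mode requires indexes is an assumption here, since the requirement above is stated only for individual mode. `scan_cohort_dir` is a hypothetical helper.

```python
from pathlib import Path


def scan_cohort_dir(directory):
    """Return (indexed, unindexed) lists of .vcf.gz file names in a directory."""
    indexed, unindexed = [], []
    for vcf in sorted(Path(directory).glob("*.vcf.gz")):
        # Assumption: an index, if present, is named <file>.vcf.gz.tbi.
        index = vcf.with_name(vcf.name + ".tbi")
        (indexed if index.is_file() else unindexed).append(vcf.name)
    return indexed, unindexed
```

If both lists come back empty, the directory holds no .vcf.gz files and cohort mode would have nothing to load.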

Demo / Examples

Example screenshots and short walkthrough GIFs will be added here. For now, you can open example.svg for a preview:

Tandem Repeat Visualization Example
  • Planned: Individual-mode walkthrough
  • Planned: Cohort-mode walkthrough

Contributing

Contributions are welcome! Please open an issue to discuss changes.

License

This project is licensed under the BSD 3-Clause Non-Commercial License — see LICENSE for details. Commercial use is prohibited. This software is intended for academic research, educational purposes, and personal/private use only. For commercial licensing inquiries, please contact the author.

Citation

If you use ProleTRact in your work, please cite this repository. A formal citation entry will be added once available.
