ProleTRact - Tandem Repeat Visualization Tool
This repository contains a tandem repeat visualization tool that serves as the companion to TandemTwister. The tool processes Variant Call Format (VCF) files generated by TandemTwister and visualizes tandem repeats in an intuitive, interactive format. Users can explore motifs, compare alleles to the reference sequence, and examine the structure of tandem repeats, which helps when interpreting genomic variation.
Why ProleTRact?
Tandem repeats (TRs) are complex: alleles can differ by motif composition, length, and interrupted blocks. ProleTRact visualizes TR regions with color-coded motifs, highlights interruptions, and provides intuitive navigation across regions and samples, enabling quick insight into potentially pathogenic expansions or atypical structures.
Key Features
- Individual and Cohort modes: Analyze a single VCF or an entire directory of VCFs.
- Dynamic sequence visualization: Color-coded motifs, clear interruption highlighting, and side-by-side allele comparison.
- Pathogenic TR reference overlay: Built-in pathogenic_TRs.bed provides context for known loci (disease, gene, thresholds).
- Fast navigation: Move across TR records with Previous/Next controls or jump to a specific region.
Installation
Requirements: Python 3.9, 3.10, 3.11, or 3.12 (Python 3.13+ may require building dependencies from source)
Install from PyPI:
pip install proletract
proletract # launches the web application
The launcher starts both the backend API server (port 8502) and frontend web server (port 3000). The application will open in your browser automatically. On headless machines, access the frontend at http://localhost:3000 after starting the application.
Note: If you encounter build errors (e.g., with Python 3.13+), ensure you're using Python 3.9–3.12, or install system dependencies: liblzma-dev (Ubuntu/Debian) or xz-devel (RHEL/CentOS/Fedora).
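On a headless machine it can be useful to confirm that both services are actually listening before opening the browser. The following is a minimal sketch, not part of ProleTRact: the helper `port_is_open` and the `SERVICES` mapping are hypothetical names, and the only facts taken from this README are the two port numbers (8502 for the backend, 3000 for the frontend).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Port numbers as stated in this README; the labels are illustrative only.
SERVICES = {"backend API": 8502, "frontend": 3000}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

Run it in a second terminal after starting `proletract`. A port reported as not reachable may simply mean the launcher has not finished starting yet.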
Quickstart
- Launch the app with the command above: proletract
- Open the browser tab to http://localhost:3000 (the URL will be shown in the terminal if you're running headless).
- Load an individual VCF or cohort folder from the sidebar and start exploring tandem repeats.
Usage
Individual mode 👤
- Select individual sample in the sidebar.
- Provide the absolute path to a bgzipped and tabix-indexed VCF (.vcf.gz with .tbi):
  - Enter the path in the sidebar input, then click Load VCF.
  - The app will parse records and enable navigation across TR variants.
- Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
- Inspect motif blocks, interruptions, and per-allele differences.
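The region string used for jumping follows the familiar CHROM:START-END shape. The sketch below shows one way such a string could be parsed and checked before pasting it into the app. It is an illustration only: `Region` and `parse_region` are hypothetical names, and the exact syntax ProleTRact accepts (for example, whether commas in coordinates are allowed) is an assumption here.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# Assumed syntax: contig name, colon, start, dash, end.
# Thousands separators such as 1,000 are tolerated and stripped.
_REGION_RE = re.compile(r"^([^:\s]+):(\d[\d,]*)-(\d[\d,]*)$")


def parse_region(text: str) -> Region:
    """Parse 'chr1:1000-2000' into Region('chr1', 1000, 2000)."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"expected CHROM:START-END, got {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return Region(chrom, start, end)


if __name__ == "__main__":
    print(parse_region("chr1:1000-2000"))
```

Rejecting a reversed interval early is a design choice of this sketch; the app itself may handle that case differently.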
Cohort mode 👥👥
Reads-based VCF
- Select Cohort in the sidebar and choose Reads-based VCF view.
- Provide the absolute path to a directory containing TandemTwister VCF files.
- Click Load Cohort to scan the directory and enable cohort navigation.
- Browse records and compare across samples.
- Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
- Inspect motif blocks, interruptions, and per-allele differences.
Assembly VCF
- Select Cohort in the sidebar and choose Assembly VCF view.
- Provide the absolute path to a directory containing TandemTwister VCF files.
- Click Load Cohort to scan the directory and enable cohort navigation.
- Browse records and compare across samples.
- Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
- Inspect motif blocks, interruptions, and per-allele differences.
Input Requirements
- VCF format: Standard VCF generated by TandemTwister.
- Cohort directory: A folder with multiple .vcf.gz files generated by TandemTwister is required for cohort mode.
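Before loading data, the requirements above can be checked with a short script. This is a rough pre-flight sketch under stated assumptions, not ProleTRact's own validation: `check_vcf` and `check_cohort_dir` are hypothetical helpers, the two-byte test only confirms a gzip header (it does not distinguish bgzip from plain gzip), and requiring a .tbi index for every cohort file is an assumption carried over from individual mode.

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip (and bgzip) file


def check_vcf(path: Path) -> list[str]:
    """Return a list of problems found for one VCF path (empty if none)."""
    problems = []
    if not path.is_absolute():
        problems.append("path is not absolute")
    if not path.name.endswith(".vcf.gz"):
        problems.append("file name does not end with .vcf.gz")
    if not path.is_file():
        problems.append("file does not exist")
        return problems
    with path.open("rb") as handle:
        if handle.read(2) != GZIP_MAGIC:
            problems.append("file does not start with the gzip magic bytes")
    # Tabix writes its index next to the data file as <name>.vcf.gz.tbi.
    if not Path(str(path) + ".tbi").is_file():
        problems.append("missing tabix index (.tbi)")
    return problems


def check_cohort_dir(directory: Path) -> dict[str, list[str]]:
    """Check every *.vcf.gz file directly inside directory."""
    return {p.name: check_vcf(p) for p in sorted(directory.glob("*.vcf.gz"))}


if __name__ == "__main__":
    import sys

    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path.cwd()
    report = check_cohort_dir(target) if target.is_dir() else {target.name: check_vcf(target)}
    for name, problems in report.items():
        print(f"{name}: {'OK' if not problems else '; '.join(problems)}")
```

Pass either a single file or a cohort directory as the first argument. An empty report for a directory means no .vcf.gz files were found there.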
Demo / Examples
Example screenshots and short walkthrough GIFs will be added here. For now, you can open example.svg for a preview. Planned additions:
- Planned: Individual-mode walkthrough
- Planned: Cohort-mode walkthrough
Contributing
Contributions are welcome! Please open an issue to discuss changes.
License
This project is licensed under the BSD 3-Clause Non-Commercial License — see LICENSE for details. Commercial use is prohibited. This software is intended for academic research, educational purposes, and personal/private use only. For commercial licensing inquiries, please contact the author.
Citation
If you use ProleTRact in your work, please cite this repository. A formal citation entry will be added once available.