TENNIS is an evolution-based model to predict unannotated isoforms and refine existing transcriptome annotations

TENNIS 🎾: Transcript EvolutioN for New Isoform Splicing

TENNIS is an evolution-based model to predict unannotated isoforms and refine existing transcriptome annotations without requiring additional data.

Installation

The only dependency of TENNIS is PySAT, which can be installed with pip. TENNIS can be installed by directly cloning this repository.

# install PySAT
pip install "python-sat[aiger,approxmc,cryptosat,pblib]"
# install TENNIS
git clone https://github.com/Shao-Group/TENNIS
cd TENNIS
chmod +x src/tennis

This repository also modifies and redistributes the GTF.py code (retrieved from here) developed by Kamil Slowikowski. Users do not need to download it separately.
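
Since PySAT is the only dependency, it can be useful to confirm that it is importable from the Python interpreter that will run TENNIS. The snippet below is a minimal sketch and not part of TENNIS; `pysat_available` is a hypothetical helper name. It relies only on the fact that the python-sat distribution is imported under the module name `pysat`.

```python
import importlib.util


def pysat_available() -> bool:
    """Return True if the `pysat` module (installed by python-sat) can be found."""
    return importlib.util.find_spec("pysat") is not None


if __name__ == "__main__":
    if pysat_available():
        print("PySAT found")
    else:
        print("PySAT not found: install python-sat with pip first")
```

Run it with the same interpreter (or inside the same environment) that will execute the tennis script, since a different environment may give a different answer.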

Test and Example

# display help message
./src/tennis -h
# run TENNIS on an example dataset
mkdir test
cd test
../src/tennis -o tennis_example ../example/example.gtf

Usage

If TENNIS was installed with conda or pip, the tennis executable should be available on your $PATH. If it was installed manually, the executable is located in the ./src/ directory.

tennis [options] -o <output_prefix> <gtf_file> 

The program outputs two files: output_prefix.stats and output_prefix.pred.gtf. More about the output format is available here.
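
For scripted runs, the invocation and the two output names described above can be assembled in Python. This is a minimal sketch under stated assumptions: `build_tennis_command` and `expected_outputs` are hypothetical helper names, not part of TENNIS, and the only flag used is `-o`, as documented here.

```python
from pathlib import Path
from typing import List, Tuple


def build_tennis_command(gtf_file: str, output_prefix: str = "tennis",
                         executable: str = "tennis") -> List[str]:
    """Assemble the argument list for: tennis -o <output_prefix> <gtf_file>."""
    return [executable, "-o", output_prefix, gtf_file]


def expected_outputs(output_prefix: str = "tennis") -> Tuple[Path, Path]:
    """Return the two documented output paths: <prefix>.stats and <prefix>.pred.gtf."""
    return Path(f"{output_prefix}.stats"), Path(f"{output_prefix}.pred.gtf")


if __name__ == "__main__":
    cmd = build_tennis_command("example/example.gtf", "tennis_example", "./src/tennis")
    print(" ".join(cmd))
    for path in expected_outputs("tennis_example"):
        print(path, "exists" if path.exists() else "not found yet")
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`; keeping the arguments as a list avoids shell quoting issues with unusual file names.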

Required positional arguments:

gtf_file : str
Input GTF file in standard format containing transcript annotations.
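
As a reminder of what "standard format" means, a GTF line has nine tab-separated columns, the last of which holds `key "value";` attribute pairs. The sketch below parses one such line. It is an illustration only and is not the GTF.py module bundled with TENNIS; `GtfRecord` and `parse_gtf_line` are hypothetical names, and the sample line is made up.

```python
import re
from typing import Dict, NamedTuple

# Matches attribute pairs of the form: key "value"
# (unquoted values, which some files use, are ignored by this sketch)
_ATTRIBUTE = re.compile(r'([^\s;]+)\s+"([^"]*)"')


class GtfRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: str
    strand: str
    frame: str
    attributes: Dict[str, str]


def parse_gtf_line(line: str) -> GtfRecord:
    """Split one GTF line into its nine columns and parse the attribute column."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    attributes = dict(_ATTRIBUTE.findall(fields[8]))
    return GtfRecord(fields[0], fields[1], fields[2], int(fields[3]),
                     int(fields[4]), fields[5], fields[6], fields[7], attributes)


if __name__ == "__main__":
    sample = 'chr1\ttest\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";'
    record = parse_gtf_line(sample)
    print(record.feature, record.start, record.end, record.attributes["transcript_id"])
```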

Optional arguments:

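-h, --help
Show the help message and exit.

-o, --output_prefix : str
Prefix for the two output files (output_prefix.stats and output_prefix.pred.gtf).
Default: "tennis"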

-p, --PctIn_threshold : float
Threshold in the range [0, 1]. Predicted isoforms whose PctIn value is below this threshold are filtered out. With -p 0.0, all isoforms are retained.
Default: 0.5

-x, --exclude_group_size : int
Skip analysis of transcript groups that have more isoforms than this threshold.
Default: 100

-m, --max_novel_isoform : int
Maximum number of novel isoforms to predict per transcript group.
Default: 4

--time_out : int
Time limit in seconds for each SAT solver instance.
Default: 900 (15 minutes)
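
The effect of the -p threshold described above can be illustrated with a small filter. This is a sketch of the documented rule only (values below the threshold are dropped), not TENNIS's own code; `filter_by_pctin` is a hypothetical name, the isoform IDs and PctIn values are invented, and how PctIn is computed is not covered here.

```python
from typing import Dict


def filter_by_pctin(pctin_by_isoform: Dict[str, float],
                    threshold: float = 0.5) -> Dict[str, float]:
    """Keep isoforms whose PctIn is not lower than the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in the range [0, 1]")
    return {isoform: pctin
            for isoform, pctin in pctin_by_isoform.items()
            if pctin >= threshold}


if __name__ == "__main__":
    # Invented example values, for illustration only.
    scores = {"iso1": 0.9, "iso2": 0.5, "iso3": 0.2}
    print(sorted(filter_by_pctin(scores)))        # default threshold 0.5
    print(sorted(filter_by_pctin(scores, 0.0)))   # nothing is filtered out
```

An isoform whose PctIn equals the threshold is retained, because only values strictly lower than the threshold are filtered out.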

Output Files

output_prefix.stats : Statistical summary. T1 (T2, T3, ...) is the collection of transcript groups that need 1 (2, 3, ...) novel isoforms to satisfy the evolution model.

output_prefix.pred.gtf : GTF format file with predicted novel isoforms.
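
The T1, T2, T3, ... notation in the stats summary can be read as a grouping by the number of novel isoforms each transcript group needs. The sketch below illustrates that definition with invented data. It does not reproduce the layout of the .stats file, `bucket_by_novel_count` is a hypothetical name, and leaving out groups that need zero novel isoforms is an assumption of this sketch.

```python
from collections import defaultdict
from typing import Dict, List


def bucket_by_novel_count(needed: Dict[str, int]) -> Dict[str, List[str]]:
    """Map 'Tk' to the transcript groups that need exactly k novel isoforms."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for group, k in needed.items():
        if k >= 1:  # assumption: groups needing no novel isoform are not in any Tk
            buckets[f"T{k}"].append(group)
    return dict(buckets)


if __name__ == "__main__":
    # Invented group names and counts, for illustration only.
    needed = {"groupA": 1, "groupB": 2, "groupC": 1, "groupD": 0}
    for name, groups in sorted(bucket_by_novel_count(needed).items()):
        print(name, groups)
```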

Contributing

For bug reports or feature requests, please open an issue on the GitHub repository here.

License & Citation

TENNIS is freely available under the BSD 3-Clause License.

Copyright (c) 2024, Xiaofei Carl Zang, Ke Chen, Mingfu Shao, and The Pennsylvania State University.

The preprint of TENNIS is available on bioRxiv here.

@article {TENNIS,
	author = {Zang, Xiaofei Carl and Chen, Ke and Khan, Irtesam Mahmud and Shao, Mingfu},
	title = {Augmenting Transcriptome Annotations through the Lens of Splicing Evolution},
	year = {2024},
	doi = {10.1101/2024.11.04.621892},
	journal = {bioRxiv}
}
