Python tools for extracting highly confident fusion transcripts from the results of several RNA-seq alignment tools.

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 4 - Beta
Intended Audience
- Science/Research
License
- OSI Approved :: GNU General Public License v3 (GPLv3)
Operating System
- Unix
Programming Language
Topic
- Scientific/Engineering :: Bio-Informatics

Project description

fusionfusion

Introduction

fusionfusion is software for detecting gene fusions from the putative chimeric transcripts generated by several well-known transcriptome alignment tools (STAR, MapSplice2 and TopHat2). Many of these predicted chimeric transcripts are false positives, but effective filtering makes sensitive and accurate gene fusion detection possible. After the alignment steps, the software generates the final gene fusion candidates, and it is easy to integrate into an existing pipeline.

Dependency

Python

Python (>= 2.7), pysam (>= 0.8.1) and annot_utils packages.

Software

blat
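
Before installing, it can help to confirm that the dependencies listed above are reachable. The following is a minimal sketch, not part of fusionfusion: the function name find_missing_dependencies is hypothetical, it only checks presence (not the minimum versions), and shutil.which requires Python 3.3 or later.

```python
import importlib
import shutil


def find_missing_dependencies(modules=("pysam", "annot_utils"), executables=("blat",)):
    """Return the names of Python modules and executables that cannot be found."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(exe) is None:
            missing.append(exe)
    return missing


if __name__ == "__main__":
    missing = find_missing_dependencies()
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```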

Install

First, download the latest release from the release section, or type the following commands:

wget https://github.com/Genomon-Project/fusionfusion/archive/v0.3.0.tar.gz
tar zxvf v0.3.0.tar.gz

Alternatively, you can download the latest development version (which may be unstable):

git clone https://github.com/Genomon-Project/fusionfusion.git

Then, install the package using the standard Python package procedure (https://docs.python.org/2/install/):

cd fusionfusion-0.3.0
python setup.py build
python setup.py install

For the last command, you may need to add --user if you are using a shared computing cluster.

python setup.py install --user

Preparation

First, you need to align the transcriptome sequencing data with one or more of STAR, MapSplice2 and TopHat2.

For STAR, our software uses the chimeric sam file

{output_prefix}.Chimeric.out.sam

For MapSplice2, our software uses the read alignment file

alignments.sam (bam)

The sorting status of this file does not matter.

For TopHat2, our software uses the read alignment file

accepted_hits.bam
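
The file names above can be collected programmatically before running fusionfusion. This is a minimal sketch under stated assumptions: the helper expected_inputs and its parameters are hypothetical, the STAR path is built as the prefix followed by ".Chimeric.out.sam" exactly as written above, and the MapSplice2 and TopHat2 files are assumed to sit directly in the directories you pass in.

```python
import os


def expected_inputs(star_prefix=None, ms2_dir=None, th2_dir=None):
    """Map each aligner to the candidate input file(s) described above."""
    inputs = {}
    if star_prefix is not None:
        inputs["star"] = [star_prefix + ".Chimeric.out.sam"]
    if ms2_dir is not None:
        # MapSplice2 output may be SAM or BAM, so both names are candidates
        inputs["ms2"] = [
            os.path.join(ms2_dir, "alignments.sam"),
            os.path.join(ms2_dir, "alignments.bam"),
        ]
    if th2_dir is not None:
        inputs["th2"] = [os.path.join(th2_dir, "accepted_hits.bam")]
    return inputs


def first_existing(paths):
    """Return the first path that exists on disk, or None if none do."""
    for path in paths:
        if os.path.exists(path):
            return path
    return None
```

Calling first_existing on each list gives the path to pass to the matching command-line argument, or None when the alignment has not been produced yet.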

Commands

fusionfusion [-h] [--version] [--star star.Chimeric.out.sam]
                  [--ms2 ms2.bam] [--th2 th2.bam] --out output_dir
                  --reference_genome reference.fa [--grc]
                  [--genome_id {hg19,hg38,mm10}]
                  [--pooled_control_file POOLED_CONTROL_FILE] [--debug]
                  [--abnormal_insert_size ABNORMAL_INSERT_SIZE]
                  [--min_major_clipping_size MIN_MAJOR_CLIPPING_SIZE]
                  [--min_read_pair_num MIN_READ_PAIR_NUM]
                  [--min_valid_read_pair_ratio MIN_VALID_READ_PAIR_RATIO]
                  [--min_cover_size MIN_COVER_SIZE]
                  [--anchor_size_thres ANCHOR_SIZE_THRES]
                  [--min_chimeric_size MIN_CHIMERIC_SIZE]
                  [--min_allowed_contig_match_diff MIN_ALLOWED_CONTIG_MATCH_DIFF]
                  [--check_contig_size_other_breakpoint CHECK_CONTIG_SIZE_OTHER_BREAKPOINT]
                  [--filter_same_gene]

At least one of the --star, --ms2 and --th2 arguments must be specified. The --out and --reference_genome arguments are mandatory. Set the genome model with --genome_id (default: hg19); hg19, hg38 and mm10 are currently supported. If you are using GRC-based files (no "chr" in chromosome names), also set --grc. For the other arguments, type fusionfusion -h. We believe the default settings are fine for 100 bp paired read data, but tuning --min_cover_size may help improve accuracy. Using a pooled control file generated by the merge_control command of chimera_utils will also greatly reduce false positives.
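
When fusionfusion is called from a Python pipeline, the argument rules above can be checked before the command is launched. The sketch below is illustrative only: build_fusionfusion_command is a hypothetical helper, the file names in the demo are placeholders, and only flags shown in the usage text are emitted.

```python
import subprocess


def build_fusionfusion_command(out_dir, reference_genome, star=None, ms2=None, th2=None,
                               genome_id="hg19", grc=False, pooled_control_file=None):
    """Build the fusionfusion argument list, enforcing the rules stated above."""
    aligner_inputs = [("--star", star), ("--ms2", ms2), ("--th2", th2)]
    if all(path is None for _, path in aligner_inputs):
        raise ValueError("specify at least one of star, ms2 or th2")
    if genome_id not in ("hg19", "hg38", "mm10"):
        raise ValueError("genome_id must be hg19, hg38 or mm10")

    cmd = ["fusionfusion"]
    for flag, path in aligner_inputs:
        if path is not None:
            cmd += [flag, path]
    cmd += ["--out", out_dir, "--reference_genome", reference_genome,
            "--genome_id", genome_id]
    if grc:
        cmd.append("--grc")
    if pooled_control_file is not None:
        cmd += ["--pooled_control_file", pooled_control_file]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace them with your own files.
    cmd = build_fusionfusion_command("output_dir", "reference.fa",
                                     star="star.Chimeric.out.sam")
    print(" ".join(cmd))
    # To actually run the tool: subprocess.check_call(cmd)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.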

Results

For the result generated by a single tool (star.fusion.result.txt, ms2.fusion.result.txt and th2.fusion.result.txt):

chromosome for the 1st breakpoint
coordinate for the 1st breakpoint
direction of the 1st breakpoint
chromosome for the 2nd breakpoint
coordinate for the 2nd breakpoint
direction of the 2nd breakpoint
inserted nucleotides within the breakpoints
#read_pairs supporting the fusion
gene overlapping the 1st breakpoint
exon-intron junction overlapping the 1st breakpoint
gene overlapping the 2nd breakpoint
exon-intron junction overlapping the 2nd breakpoint
contig match score for the 1st breakpoint
contig size for the 1st breakpoint
contig match score for the 2nd breakpoint
contig size for the 2nd breakpoint
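
The sixteen columns above can be read into named records for downstream filtering. This is a sketch under assumptions that the description above does not confirm: the file is taken to be tab-delimited with one fusion per line and no header row, and the short field names are labels invented here for illustration. All values are kept as strings.

```python
import csv
from collections import namedtuple

# Illustrative names for the sixteen columns, in the order listed above.
SINGLE_TOOL_COLUMNS = [
    "chr_1", "pos_1", "dir_1", "chr_2", "pos_2", "dir_2",
    "inserted_seq", "read_pair_num",
    "gene_1", "junction_1", "gene_2", "junction_2",
    "contig_match_1", "contig_size_1", "contig_match_2", "contig_size_2",
]
FusionRecord = namedtuple("FusionRecord", SINGLE_TOOL_COLUMNS)


def read_single_tool_result(handle):
    """Parse an open text handle into a list of FusionRecord tuples."""
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != len(SINGLE_TOOL_COLUMNS):
            raise ValueError("expected %d fields, got %d"
                             % (len(SINGLE_TOOL_COLUMNS), len(row)))
        records.append(FusionRecord(*row))
    return records


def filter_by_read_pairs(records, min_read_pairs):
    """Keep records supported by at least min_read_pairs read pairs."""
    return [rec for rec in records if int(rec.read_pair_num) >= min_read_pairs]
```

For example, `with open("star.fusion.result.txt") as handle: records = read_single_tool_result(handle)` loads the STAR result, after which filter_by_read_pairs(records, 5) keeps the better-supported candidates.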

For the merged result (fusionfusion.result.txt):

chromosome for the 1st breakpoint
coordinate for the 1st breakpoint
direction of the 1st breakpoint
chromosome for the 2nd breakpoint
coordinate for the 2nd breakpoint
direction of the 2nd breakpoint
inserted nucleotides within the breakpoints
gene overlapping the 1st breakpoint
exon-intron junction overlapping the 1st breakpoint
gene overlapping the 2nd breakpoint
exon-intron junction overlapping the 2nd breakpoint
#read_pairs supporting the variant (by MapSplice2 if --ms2 is specified)
#read_pairs supporting the variant (by STAR if --star is specified)
#read_pairs supporting the variant (by TopHat2 if --th2 is specified)
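
Because the read-pair columns of the merged result depend on which aligners were supplied, the column list can be derived from the same choices. The sketch below assumes that a count column is present only for each aligner that was specified, in the order listed above (MapSplice2, STAR, TopHat2), and that the file is tab-delimited; the column names are hypothetical labels.

```python
# Illustrative names for the eleven fixed columns, in the order listed above.
BASE_COLUMNS = [
    "chr_1", "pos_1", "dir_1", "chr_2", "pos_2", "dir_2", "inserted_seq",
    "gene_1", "junction_1", "gene_2", "junction_2",
]


def merged_result_columns(ms2=False, star=False, th2=False):
    """Return column names for fusionfusion.result.txt given the aligners used."""
    columns = list(BASE_COLUMNS)
    for used, name in ((ms2, "read_pair_num_ms2"),
                       (star, "read_pair_num_star"),
                       (th2, "read_pair_num_th2")):
        if used:
            columns.append(name)
    return columns


def parse_merged_line(line, columns):
    """Split one tab-delimited line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(columns):
        raise ValueError("expected %d fields, got %d" % (len(columns), len(fields)))
    return dict(zip(columns, fields))
```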

Release history

This version

0.5.0

May 18, 2019

0.5.0rc1 pre-release

May 18, 2019

0.5.0b1 pre-release

May 17, 2019

0.4.0

Jan 28, 2018

Download files

Source Distribution

fusionfusion-0.5.0.tar.gz (20.4 kB view details)

Uploaded May 18, 2019 Source

File details

Details for the file fusionfusion-0.5.0.tar.gz.

File metadata

Download URL: fusionfusion-0.5.0.tar.gz
Upload date: May 18, 2019
Size: 20.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/2.7.16

File hashes

Hashes for fusionfusion-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`663a3ddc183c770397bb19f0173b59dc7c9c98fd2a6d44ca872d1641ddfe5d8e`
MD5	`3212176f9961e299351f0a3c43a685ed`
BLAKE2b-256	`00a435cd00d1aa09085bd9b6b5dbe740eed9ca460b2e1fa1eb5b50bfb9f69e31`
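
A downloaded archive can be checked against the SHA256 digest listed above. This is a minimal sketch using only the standard library; sha256_of_file is a hypothetical helper name, and the archive is assumed to be in the current directory.

```python
import hashlib

# SHA256 digest for fusionfusion-0.5.0.tar.gz, copied from the table above.
EXPECTED_SHA256 = "663a3ddc183c770397bb19f0173b59dc7c9c98fd2a6d44ca872d1641ddfe5d8e"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest equals the expected digest."""
    return sha256_of_file(path) == expected
```

For example, matches_expected("fusionfusion-0.5.0.tar.gz") should return True for an intact download.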
