SearcHPV

An HPV integration point detection tool for targeted capture sequencing data

Introduction

  • SearcHPV detects HPV fusion sites on both the human genome and the HPV genome.
  • SearcHPV provides locally assembled contigs for each integration event. It reports at least one and at most two contigs per integration site. When two contigs are reported, they capture the left and right sides of the event.

Getting started

  1. Required resources
  • Unix-like environment
  • Third-party tools:
    Python/3.7.3 https://www.python.org/downloads/release/python-373/
    samtools/1.5 https://github.com/samtools/samtools/releases/tag/1.5
    BWA/0.7.15-r1140 https://github.com/lh3/bwa/releases/tag/v0.7.15
    java/1.8.0_252 https://www.oracle.com/java/technologies/javase/8all-relnotes.html
    Picard Tools/2.23.8 https://github.com/broadinstitute/picard/releases/tag/2.23.8
    PEAR/0.9.2 https://github.com/tseemann/PEAR
    CAP3/02/10/15 http://seq.cs.iastate.edu/cap3.html
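
Before installing SearcHPV, it can help to confirm that the third-party tools are reachable from the shell. The snippet below is a minimal sketch, not part of SearcHPV itself: the executable names (python3, samtools, bwa, java, pear, cap3) are assumptions and may differ on your system, and Picard is skipped because it is distributed as a jar file rather than as a command on PATH.

```shell
#!/bin/sh
# Print the name of every listed executable that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed executable names; adjust them to match your installation.
# Picard is distributed as a jar file, so it is not checked here.
missing=$(missing_tools python3 samtools bwa java pear cap3)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All listed tools were found on PATH."
fi
```

This check only reports whether a command exists; it does not verify that the installed version matches the one listed above.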

  2. Download and install
First, download and install the required resources listed above. Then run this command in your terminal:
pip install searcHPV

  3. Usage
SearcHPV has four main steps. You can either run it start-to-finish or run it step-by-step.
  • Usage:
searcHPV <options> ...
  • Standard options:
 -fastq1 <str>  sequencing data: fastq/fq.gz file
 -fastq2 <str>  sequencing data: fastq/fq.gz file
 -humRef <str>  human reference genome: fasta file
 -virRef <str>  HPV reference genome: fasta file
  • Optional options:
 -window <int>   length of the region searched for informative reads, default=300
 -output <str>   output directory, default "./"
 -align          run the alignment step, step1
 -genomeFusion   call the genome fusion points, step2
 -assemble       run local assembly for each integration event, step3
 -hpvFusion      call the HPV fusion points, step4

  • Examples:
  1. Run it start-to-finish:
searcHPV -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279

  2. Run it step-by-step:
searcHPV -align -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -genomeFusion -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -assemble -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -hpvFusion -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
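
Because the four step-by-step commands differ only in the step flag, they can be driven from a small loop. The following is a hedged sketch rather than an official wrapper: it assumes the command is spelled `searcHPV` as in the usage line, reuses the sample and reference file names from the examples, and introduces a hypothetical `DRY_RUN` variable and `run_step` helper that are not part of SearcHPV. By default it only prints the commands.

```shell
#!/bin/sh
# Sample and reference names are taken from the examples in this README.
sample=Sample_81279
out=/home/scratch/HPV_fusion/$sample

# Print (DRY_RUN=1, the default in this sketch) or execute one SearcHPV step.
# run_step and DRY_RUN are illustrative helpers, not SearcHPV features.
run_step() {
    flag=$1
    # Rebuild the positional parameters as the full command line.
    set -- searcHPV "$flag" \
        -fastq1 "$sample.R1.fastq.gz" -fastq2 "$sample.R2.fastq.gz" \
        -humRef hs37d5.fa -virRef HPV.fa -output "$out"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run the four steps in order and stop at the first one that fails.
for step in -align -genomeFusion -assemble -hpvFusion; do
    run_step "$step" || { echo "step $step failed" >&2; break; }
done
```

Setting `DRY_RUN=0` in the environment makes the script execute each command instead of printing it. Stopping at the first failure avoids running a later step on incomplete output from an earlier one.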

Output

  1. Alignment: the duplicate-marked alignment BAM file and the customized reference genome.
  2. Genome fusion point calling: original callset, filtered callset, and filtered clustered callset.
  3. Assembly: supporting reads and contigs for each integration event (unfiltered).
  4. HPV fusion point calling: alignment BAM file for the contigs against the human and HPV genomes.

Final output: a summary of all integration events and the contig sequences for all integration events.

Citation

Contact

wenjingu@umich.edu
