An HPV integration site detection tool for targeted capture sequencing data

SearcHPV

An HPV integration point detection tool for targeted capture sequencing data

Introduction

  • SearcHPV detects HPV fusion sites on both the human genome and the HPV genome.
  • SearcHPV provides locally assembled contigs for each integration event. It reports at least one and at most two contigs per integration site; the two contigs carry the information captured for the left and right sides of the event.

Getting started

  1. Required resources
  • Unix-like environment
  • Third-party tools:
Python/3.7.3 https://www.python.org/downloads/release/python-373/
samtools/1.5 https://github.com/samtools/samtools/releases/tag/1.5
BWA/0.7.15-r1140 https://github.com/lh3/bwa/releases/tag/v0.7.15
java/1.8.0_252 https://www.oracle.com/java/technologies/javase/8all-relnotes.html
Picard Tools/2.23.8 https://github.com/broadinstitute/picard/releases/tag/2.23.8
PEAR/0.9.2 https://github.com/tseemann/PEAR
CAP3/02/10/15 http://seq.cs.iastate.edu/cap3.html

After installing these tools, please make sure that their paths have been added in your ".bashrc" script so that you can run them by typing the tool names in the terminal.
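As a quick sanity check before installing, the sketch below reports which tools cannot be reached from the current PATH. The executable names in `ASSUMED_EXECUTABLES` are assumptions, not something SearcHPV documents: installations often name these commands differently, and Picard is commonly shipped as a jar file rather than a command. Adjust the list to match your setup.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
ASSUMED_EXECUTABLES = ["samtools", "bwa", "java", "picard", "pear", "cap3"]


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(ASSUMED_EXECUTABLES)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

This only confirms that a command with that name is reachable; it does not check the tool versions listed above.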

  2. Download and install
First, download and install the required resources. Then, type this command in your terminal:
pip install searcHPV

  3. Usage
SearcHPV has four main steps. You can either run it start-to-finish or run it step-by-step.
  • Usage:
searcHPV <options> ...
  • Standard options:
 -fastq1 <str>  sequencing data: fastq/fq.gz file
 -fastq2 <str>  sequencing data: fastq/fq.gz file
 -humRef <str>  human reference genome: fasta file
 -virRef <str>  HPV reference genome: fasta file
  • Optional options:
-h, --help      show this help message and exit
-window <int>   length of the region searched for informative reads, default=300
-output <str>   output directory, default "./"
-alignment      run the alignment step, step1
-genomeFusion   call the genome fusion points, step2
-assemble       local assembly for each integration event, step3
-hpvFusion      call the HPV fusion points, step4

  • Examples:
  1. Run it start-to-finish:
searcHPV -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279

  2. Run it step-by-step:
searcHPV -alignment -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -genomeFusion -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -assemble -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -hpvFusion -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279

Note: if you run it step-by-step, please make sure the output directory is the same for all steps.
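One way to keep the output directory identical across steps is to build all four command lines from a single set of shared arguments. The sketch below does this using the command name from the usage line and the step flags from the options list. The helper names `build_step_commands` and `run_steps` are hypothetical. It also assumes that searcHPV exits with a non-zero status when a step fails, which is what `check=True` relies on to stop early.

```python
import subprocess

# Step flags as listed in the options, in pipeline order.
STEP_FLAGS = ["-alignment", "-genomeFusion", "-assemble", "-hpvFusion"]


def build_step_commands(fastq1, fastq2, hum_ref, vir_ref, output_dir):
    """Build one searcHPV command per step, all sharing the same -output."""
    shared = [
        "-fastq1", fastq1,
        "-fastq2", fastq2,
        "-humRef", hum_ref,
        "-virRef", vir_ref,
        "-output", output_dir,
    ]
    return [["searcHPV", flag] + shared for flag in STEP_FLAGS]


def run_steps(commands):
    """Run the steps in order; check=True raises on the first failing step."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    commands = build_step_commands(
        "Sample_81279.R1.fastq.gz",
        "Sample_81279.R2.fastq.gz",
        "hs37d5.fa",
        "HPV.fa",
        "/home/scratch/HPV_fusion/Sample_81279",
    )
    for command in commands:
        print(" ".join(command))
    # Uncomment to actually run the pipeline (requires searcHPV on PATH):
    # run_steps(commands)
```

Because every command is derived from the same `output_dir` value, the steps cannot drift apart the way hand-edited command lines can.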

Output

  1. Alignment: the duplicate-marked alignment BAM file and the customized reference genome.
  2. Genome Fusion Point Calling: original callset, filtered callset, filtered clustered callset.
  3. Assemble: supporting reads and contigs for each integration event (unfiltered).
  4. HPV Fusion Point Calling: alignment BAM file for contigs against the human and HPV genomes.

Final outputs are under the folder "call_fusion_virus":
  • summary of all the integration events: "HPVfusionPointContig.txt"
  • contig sequences for all the integration events: "ContigsSequence.fa"
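To inspect the final contigs after a run, a minimal sketch such as the following can locate the two final files and list each contig with its length. It assumes that "call_fusion_virus" sits directly inside the directory passed to -output and that "ContigsSequence.fa" is a standard FASTA file. The helper names are hypothetical, and the summary file is only located, not parsed, since its column layout is not described here.

```python
import os


def read_fasta(path):
    """Parse a FASTA file into a dict mapping header (without '>') to sequence."""
    sequences = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:]
                sequences[header] = []
            elif header is not None:
                sequences[header].append(line)
    return {name: "".join(parts) for name, parts in sequences.items()}


def final_output_paths(output_dir):
    """Assumed locations of the final outputs inside the -output directory."""
    folder = os.path.join(output_dir, "call_fusion_virus")
    return (
        os.path.join(folder, "HPVfusionPointContig.txt"),
        os.path.join(folder, "ContigsSequence.fa"),
    )


if __name__ == "__main__":
    summary_path, contigs_path = final_output_paths(
        "/home/scratch/HPV_fusion/Sample_81279"
    )
    if os.path.exists(contigs_path):
        for name, sequence in read_fasta(contigs_path).items():
            print(name, len(sequence))
    else:
        print("No contig file found at " + contigs_path)
```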

Citation

SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer --- In progress

Contact

wenjingu@umich.edu
