An HPV integration site detection tool for targeted capture sequencing data
SearcHPV
An HPV integration point detection tool for targeted capture sequencing data
Introduction
- SearcHPV detects HPV fusion sites on both the human genome and the HPV genome
- SearcHPV can provide locally assembled contigs for each integration event. It reports at least one and at most two contigs for each integration site. When two contigs are reported, they capture the left and right sides of the event.
Getting started
- Required resources
- Unix-like environment
- Third-party tools:
Python/3.7.3 https://www.python.org/downloads/release/python-373/
samtools/1.5 https://github.com/samtools/samtools/releases/tag/1.5
BWA/0.7.15-r1140 https://github.com/lh3/bwa/releases/tag/v0.7.15
java/1.8.0_252 https://www.oracle.com/java/technologies/javase/8all-relnotes.html
Picard Tools/2.23.8 https://github.com/broadinstitute/picard/releases/tag/2.23.8
PEAR/0.9.2 https://github.com/tseemann/PEAR
CAP3/02/10/15 http://seq.cs.iastate.edu/cap3.html
After installing these tools, please make sure that their paths have been added in your ".bashrc" script so that you can run each tool by typing its name in the terminal.
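A quick way to confirm that the tools are reachable from the terminal is to look each one up on PATH. The sketch below is only an illustration: the executable names in `REQUIRED_TOOLS` are assumptions (Picard, for example, is often launched as `java -jar picard.jar` rather than through a command on PATH), so adjust them to match your installation.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
# Picard is left out because it is often run as "java -jar picard.jar".
REQUIRED_TOOLS = ["samtools", "bwa", "java", "pear", "cap3"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

If a tool is reported as missing, check the PATH entries in your ".bashrc" and open a new terminal before trying again.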
- Download and install First, download and install the required resources. Then type this command in your terminal:
pip install searcHPV
- Usage SearcHPV has four main steps. You can either run it start-to-finish or run it step-by-step.
- Usage:
searcHPV <options> ...
- Standard options:
-fastq1 <str> sequencing data: fastq/fq.gz file
-fastq2 <str> sequencing data: fastq/fq.gz file
-humRef <str> human reference genome: fasta file
-virRef <str> HPV reference genome: fasta file
- Optional options:
-h, --help show this help message and exit
-window <int> the length of the region searched for informative reads, default=300
-output <str> output directory, default "./"
-alignment run the alignment step, step1
-genomeFusion call the genome fusion points, step2
-assemble local assembly for each integration event, step3
-hpvFusion call the HPV fusion points, step4
- Examples:
- Run it start-to-finish:
searcHPV -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
- Run it step-by-step:
searcHPV -alignment -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -genomeFusion -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -assemble -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
searcHPV -hpvFusion -fastq1 Sample_81279.R1.fastq.gz -fastq2 Sample_81279.R2.fastq.gz -humRef hs37d5.fa -virRef HPV.fa -output /home/scratch/HPV_fusion/Sample_81279
Note: if you run it step-by-step, please make sure that the output directory is the same for all steps.
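When scripting the step-by-step mode, building every command from one shared argument list keeps the output directory identical across steps. The following is a minimal sketch, not part of SearcHPV itself: the helper names `build_step_commands` and `run_steps` are hypothetical, and the step flags are taken from the options list above.

```python
import subprocess

# Step flags as listed under "Optional options", in pipeline order.
STEP_FLAGS = ["-alignment", "-genomeFusion", "-assemble", "-hpvFusion"]


def build_step_commands(fastq1, fastq2, hum_ref, vir_ref, output_dir):
    """Build one searcHPV command per step, all sharing one output directory."""
    shared = [
        "-fastq1", fastq1,
        "-fastq2", fastq2,
        "-humRef", hum_ref,
        "-virRef", vir_ref,
        "-output", output_dir,
    ]
    return [["searcHPV", flag] + shared for flag in STEP_FLAGS]


def run_steps(commands):
    """Run the commands in order and stop at the first step that fails."""
    for command in commands:
        subprocess.run(command, check=True)
```

For example, `run_steps(build_step_commands("Sample_81279.R1.fastq.gz", "Sample_81279.R2.fastq.gz", "hs37d5.fa", "HPV.fa", "/home/scratch/HPV_fusion/Sample_81279"))` would run the four steps one after another, assuming `searcHPV` is installed and on PATH.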
Output
- Alignment: the duplicate-marked alignment BAM file and the customized reference genome.
- Genome Fusion Point Calling: original callset, filtered callset, filtered clustered callset.
- Assemble: supporting reads and contigs for each integration event (unfiltered).
- HPV Fusion Point Calling: alignment BAM file for contigs against the human and HPV genomes.
Final outputs are under the folder "call_fusion_virus":
- summary of all the integration events: "HPVfusionPointContig.txt"
- contig sequences for all the integration events: "ContigsSequence.fa"
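The contig file can be loaded for downstream work with a small FASTA reader. This is a sketch under two assumptions: that "ContigsSequence.fa" is a standard FASTA file, and that the "call_fusion_virus" folder sits directly inside the directory passed to `-output`. The function names are hypothetical, and nothing is assumed about how the contig headers are formatted.

```python
from pathlib import Path


def read_fasta(path):
    """Parse a FASTA file into a dict mapping header (without '>') to sequence."""
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                header = line[1:]
                records[header] = []
            elif header is not None:
                records[header].append(line)
    # Join multi-line sequences into a single string per record.
    return {name: "".join(parts) for name, parts in records.items()}


def load_contigs(output_dir):
    """Load ContigsSequence.fa, assuming call_fusion_virus is under output_dir."""
    return read_fasta(Path(output_dir) / "call_fusion_virus" / "ContigsSequence.fa")
```

Calling `load_contigs("/home/scratch/HPV_fusion/Sample_81279")` would then return the contig sequences keyed by their FASTA headers.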
Citation
SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer (accepted by Cancer)
Contact