HIFI-SE
HIFI-barcode-SE400
The BGISEQ-500 platform has launched a new test sequencing kit capable of single-end 400 bp sequencing (SE400), which offers a simple and reliable way to obtain DNA barcodes efficiently. In this study, we explored the potential of BGISEQ-500 SE400 sequencing for DNA barcode reference construction and provide an updated HIFI-Barcode software package that can generate COI barcode assemblies using HTS reads of length > 400 bp.
Versions
new release: 1.0 2018/11/13
Usage (latest)
HIFI-SE
usage: HIFI-SE [-h] {all,filter,assign,assembly,bold_identification} ...
Description
An automatic pipeline for the HIFI-SE400 project, including filtering raw reads,
assigning reads to samples, and assembling HIFI barcodes (COI sequences).
Version
1.0 2018-11-3
Author
yangchentao at genomics.cn, BGI.
mengguanliang at genomics.cn, BGI.
positional arguments:
{all,filter,assign,assembly,bold_identification}
all run filter, assign and assembly
filter filter raw reads
assign assign reads to samples
assembly do assembly from input fastq reads,
output HIFI barcodes.
bold_identification
do taxa identification on BOLD system.
optional arguments:
-h, --help show this help message and exit
run in "all"
Example:
HIFI-SE all -outpre hifi -raw test.raw.fastq -index 5 -primer index_primer.list -cid 0.98 -oid 0.95 -seqs_lim 50000 -threads 4 -tp 2
run by steps [filter -> assign -> assembly]
python3 HIFI-SE.py filter
usage: HIFI-SE filter [-h] -outpre <STR> -raw <STR> [-e <INT>]
[-q <INT> <INT>] [-n <INT>]
optional arguments:
-h, --help show this help message and exit
common arguments:
-outpre <STR> outprefix for process.
filter arguments:
-raw <STR> input raw single-end fastq file (Phred33)
-e <INT> expected error number threshold, P = 10^(-Q/10), default=10
-q <INT> <INT> filter by quality method, Q = -10 * log10(P),
filter out low quality reads. example: 20 5 means
dropping any read in which more than 5 percent of
bases have a quality score < 20.
-n <INT> remove reads containing [INT] Ns, default=1
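The two quality criteria above can be illustrated with a minimal Python sketch. It is not the HIFI-SE implementation: the function names are hypothetical, and only the formulas P = 10^(-Q/10) and the Phred33 offset come from the help text.

```python
def phred33_scores(quality):
    """Decode a Phred+33 quality string into integer Q scores."""
    return [ord(ch) - 33 for ch in quality]


def expected_errors(quality):
    """Sum of per-base error probabilities, using P = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in phred33_scores(quality))


def low_quality_percent(quality, min_q):
    """Percentage of bases whose Q score is below min_q."""
    scores = phred33_scores(quality)
    return 100.0 * sum(q < min_q for q in scores) / len(scores)


if __name__ == "__main__":
    qual = "II+I"  # 'I' encodes Q40, '+' encodes Q10
    print(expected_errors(qual))          # roughly 0.1003
    print(low_quality_percent(qual, 20))  # 25.0
```

With `-q 20 5`, a read like this one (25 percent of bases below Q20) would be dropped under the rule described above. How the tool treats values exactly at a threshold is not stated in the help text, so that detail is left out of the sketch.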
python3 HIFI-SE.py assign
usage: HIFI-SE assign [-h] -outpre <STR> -index INT -fq <STR> -primer <STR>
[-outdir <STR>]
optional arguments:
-h, --help show this help message and exit
common arguments:
-outpre <STR> outprefix for process.
index arguments:
-index INT index sequence length
when only run assign arguments:
-fq <STR> cleaned fastq file
assign arguments:
-primer <STR> tagged primer list, like the following lines:
Rev001 AAGCTAAACTTCAGGGTGACCAAAAAATCA
For001 AAGCGGTCAACAAATCATAAAGATATTGG
...
this format is necessary!
-outdir <STR> output directory for assignment
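To make the tagged primer list format concrete, here is a small Python sketch that splits each tagged primer into its index and primer parts and assigns a read by exact prefix match. The function names are hypothetical, and exact matching is an assumption made for illustration; the actual assignment step in HIFI-SE may tolerate mismatches or work differently.

```python
def parse_primer_list(lines, index_len):
    """Map each tag name to an (index, primer) pair.

    Each valid line holds a name and a tagged primer separated by
    whitespace; the first index_len bases of the tagged primer are the index.
    """
    primers = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, tagged = fields
        primers[name] = (tagged[:index_len], tagged[index_len:])
    return primers


def assign_read(read, primers):
    """Return the first tag name whose index + primer prefixes the read, else None."""
    for name, (index, primer) in primers.items():
        if read.startswith(index + primer):
            return name
    return None


if __name__ == "__main__":
    lines = [
        "Rev001 AAGCTAAACTTCAGGGTGACCAAAAAATCA",
        "For001 AAGCGGTCAACAAATCATAAAGATATTGG",
    ]
    # index_len=4 is only an illustrative value here
    primers = parse_primer_list(lines, index_len=4)
    print(assign_read("AAGCGGTCAACAAATCATAAAGATATTGGACGT", primers))  # For001
```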
python3 HIFI-SE.py assembly
usage: HIFI-SE assembly [-h] -outpre <STR> -index INT -list FILE
[-vsearch <STR>] [-threads <INT>] [-cid FLOAT]
[-min INT] [-max INT] [-oid FLOAT] [-tp INT] [-ab INT]
[-seqs_lim INT] [-len INT] [-mode INT] [-rc] [-cc]
[-codon INT] [-frame INT]
optional arguments:
-h, --help show this help message and exit
common arguments:
-outpre <STR> outprefix for process.
index arguments:
-index INT index sequence length
when only run assembly arguments:
-list FILE input file, fastq file list. [required]
software path:
-vsearch <STR> vsearch path(only needed if vsearch is not in $PATH)
-threads <INT> threads for vsearch
-cid FLOAT identity for clustering [0.98]
assembly arguments:
-min INT minimum length of overlap [80]
-max INT maximum length of overlap [90]
-oid FLOAT minimum identity of overlap region [0.95]
-tp INT number of clusters to use in assembly. default=2
-ab INT keep every cluster for assembly if its abundance >= INT
-seqs_lim INT reads number limitation. [0]
-len INT standard reads length [400]
-mode INT mode 1 is to cluster and keep the [-tp] most abundant clusters,
or clusters with abundance of at least [-ab], and then make a consensus
sequence for each cluster. mode 2 is to directly make a single
consensus sequence without clustering.
-rc whether to check amino acid translation for reads
-cc whether to check final COI contig's amino acid translation
-codon INT codon table used to check translation [5];
tables 4 and 5 have the same effect for the COI gene.
-frame INT translation start shift [1]
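The role of `-min`, `-max` and `-oid` can be sketched as a search for a suffix/prefix overlap between two sequences. This is an illustrative Python sketch only, not the HIFI-SE assembly code: the function names are hypothetical, and it assumes both sequences are already in the same orientation and compares them position by position without gaps.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length strings."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def find_overlap(left, right, min_len=80, max_len=90, min_identity=0.95):
    """Find the suffix of left / prefix of right overlap with the highest identity.

    Only overlap lengths in [min_len, max_len] are tried. Returns a
    (length, identity) tuple, or None if no length reaches min_identity.
    """
    best = None
    for length in range(min_len, max_len + 1):
        if length > len(left) or length > len(right):
            break  # sequences are too short for longer overlaps
        ident = identity(left[-length:], right[:length])
        if ident >= min_identity and (best is None or ident > best[1]):
            best = (length, ident)
    return best


if __name__ == "__main__":
    # Small numbers so the example is easy to follow by eye.
    print(find_overlap("TTTTTGATTACA", "GATTACACCCCC",
                       min_len=5, max_len=8, min_identity=1.0))  # (7, 1.0)
```

The defaults mirror the bracketed values in the help text (80, 90, 0.95); the small values in the usage lines are for readability only.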