A sequence processing tool written in Rust for manipulating FASTA/FASTQ files. A pure Rust version of seqtk.
seqtk-rs
seqtk-rs is a sequence processing tool written in Rust for manipulating FASTA/FASTQ files. I built it out of my passion for Rust. Its functionality and subcommand names are similar to those of seqtk, but I’ve made some changes based on my own design logic.
Installation
cargo install seqtk-rs
seqtk_rs -h
Current Features
- seq: Common transformations of FASTA/Q
- sample: Random sampling by a given seed and fraction
- size: Report sequence length statistics (Output: #seq, #bases, avg_size, min_size, med_size, max_size, N50)
- fqchk: Report sequence and quality statistics by position (Output: POS, #bases, %A, %C, %G, %T, %N, avgQ, errQ, ...)
  - avgQ: Average quality score
    (Q₁ + Q₂ + ... + Qₙ) / N
  - errQ: Estimated error rate, expressed as a Phred score
    -10 * log₁₀((P₁ + P₂ + ... + Pₙ) / N), where Pᵢ = 10^(-Qᵢ / 10) is the error probability for quality score Qᵢ

  Notice: Some tools treat quality scores below 3 (Q < 3) as 3 to avoid instability in downstream metrics. For example, Q = 0 yields an error probability P = 1.0, Q = 1 gives P ≈ 0.794, and Q = 2 gives P ≈ 0.631. These low Q-scores can heavily skew error rate calculations (e.g., errQ), which is why they are often floored to 3. However, this adjustment produces results that are inconsistent with the original definition, so this tool preserves the original quality scores as-is.
- comp: Report the nucleotide composition of FASTA/Q (Output: #A, #C, #G, #T, #2, #3, #4, #CG, #GC)
  - #CG, #GC: Number of CG/GC dinucleotides on the template strand
- qctrim: Trim low-quality bases from FASTQ data based on a quality threshold Q.
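To make the avgQ and errQ definitions above concrete, here is a minimal Rust sketch of both formulas. It also shows how flooring low scores to 3 changes errQ. This is not the actual seqtk-rs source: the function names `phred_to_prob`, `avg_q`, and `err_q` are hypothetical and chosen only for illustration, and the input is assumed to be already-decoded Phred scores rather than raw FASTQ quality characters.

```rust
/// Convert a Phred quality score Q into an error probability P = 10^(-Q/10).
/// (Hypothetical helper, not part of the seqtk-rs API.)
fn phred_to_prob(q: u8) -> f64 {
    10.0_f64.powf(-f64::from(q) / 10.0)
}

/// avgQ: arithmetic mean of the quality scores.
/// Returns None for an empty slice to avoid dividing by zero.
fn avg_q(quals: &[u8]) -> Option<f64> {
    if quals.is_empty() {
        return None;
    }
    let sum: f64 = quals.iter().map(|&q| f64::from(q)).sum();
    Some(sum / quals.len() as f64)
}

/// errQ: -10 * log10 of the mean error probability.
/// Returns None for an empty slice to avoid dividing by zero.
fn err_q(quals: &[u8]) -> Option<f64> {
    if quals.is_empty() {
        return None;
    }
    let sum_p: f64 = quals.iter().map(|&q| phred_to_prob(q)).sum();
    Some(-10.0 * (sum_p / quals.len() as f64).log10())
}

fn main() {
    // One very low-quality base (Q = 0) among three good ones (Q = 30).
    let quals = [0u8, 30, 30, 30];
    // The same scores with the "floor to 3" adjustment some tools apply.
    let floored: Vec<u8> = quals.iter().map(|&q| q.max(3)).collect();

    println!("avgQ           = {:.2}", avg_q(&quals).unwrap());
    println!("errQ (as-is)   = {:.2}", err_q(&quals).unwrap());
    println!("errQ (floored) = {:.2}", err_q(&floored).unwrap());
}
```

Because errQ averages error probabilities rather than scores, a single Q = 0 base pulls it far below avgQ. Flooring that base to 3 raises errQ, which is the inconsistency the notice above describes.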
TODO
- trimAdapter: Trim adapters from a FASTQ file
Acknowledgements