FASTQ utilities
heyfastq
FASTQ sequence file utilities, written in pure Python, with no dependencies.
Summary
The package comes with one program, heyfastq, which provides
utilities for single or paired FASTQ files.
Installation
Install from PyPI with:
pip install heyfastq
Or get the development version from GitHub:
git clone https://github.com/kylebittinger/heyfastq.git
cd heyfastq
pip install .
Usage
Run heyfastq -h to learn more about usage options.
Dev
heyfastq is built around the idea of piping reads (or read pairs) through filter and map functions. The fundamental unit that moves through a heyfastq pipeline is the generic R object, which can be either a Read or a ReadPair. Each pipeline function takes a ReadPipe as input and returns a ReadPipe, so stages compose easily.
from heyfastqlib.read import Read, ReadPair, R, ReadPipe
from heyfastqlib.pipelines import filter_reads, map_reads
def unit_filter(r: R) -> bool:
    return True

def unit_map(r: R) -> R:
    return r

input_fastq = (r for r in [Read("1", "ACTG", "HHHH"), Read("2", "GTCA", "HHHH"), Read("3", "AAAA", "####")])
filter_counter = {"input_reads": 0, "input_bases": 0, "output_reads": 0, "output_bases": 0}
map_counter = {"input_reads": 0, "input_bases": 0, "output_reads": 0, "output_bases": 0}
output_fastq = map_reads(filter_reads(input_fastq, unit_filter, filter_counter), unit_map, map_counter)
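To make the filter-plus-counter idea concrete, here is a minimal, self-contained sketch that does not import heyfastq. SketchRead, sketch_filter, and min_length are hypothetical stand-ins invented for illustration, and the field names (desc, seq, qual) are assumptions; heyfastq's own filter_reads may be implemented differently.

```python
from typing import Callable, Dict, Iterable, Iterator, NamedTuple


class SketchRead(NamedTuple):
    # Hypothetical stand-in for a read; field names are assumptions.
    desc: str
    seq: str
    qual: str


def sketch_filter(
    reads: Iterable[SketchRead],
    keep: Callable[[SketchRead], bool],
    counter: Dict[str, int],
) -> Iterator[SketchRead]:
    """Lazily yield the reads that pass `keep`, tallying reads and bases."""
    for read in reads:
        counter["input_reads"] += 1
        counter["input_bases"] += len(read.seq)
        if keep(read):
            counter["output_reads"] += 1
            counter["output_bases"] += len(read.seq)
            yield read


def min_length(threshold: int) -> Callable[[SketchRead], bool]:
    """Build a filter function that keeps reads of at least `threshold` bases."""
    return lambda read: len(read.seq) >= threshold


reads = [SketchRead("1", "ACTG", "HHHH"), SketchRead("2", "GT", "HH")]
counter = {"input_reads": 0, "input_bases": 0, "output_reads": 0, "output_bases": 0}
# The generator does no work until it is consumed, so the counter is
# only filled in once list() has drained it.
kept = list(sketch_filter(reads, min_length(3), counter))
```

In this sketch the counter is only meaningful after the pipeline has been fully consumed, because the filter is a generator.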
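The example above works on objects that already exist in Python. To read and write actual FASTQ files, use parse_fastq and write_fastq. Pass a single file handle for single-end reads, or a tuple of two handles for paired reads: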
from heyfastqlib.io import parse_fastq, write_fastq
with open("r1.fq") as f_in, open("o1.fq", "w") as f_out:
    write_fastq(f_out, parse_fastq(f_in))

with open("r1.fq") as f1_in, open("r2.fq") as f2_in, open("o1.fq", "w") as f1_out, open("o2.fq", "w") as f2_out:
    write_fastq((f1_out, f2_out), parse_fastq((f1_in, f2_in)))
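For readers unfamiliar with the file format, the sketch below shows roughly what parsing and writing involve for plain four-line FASTQ records (an @-prefixed description line, the sequence, a + separator line, and the quality string). sketch_parse_fastq and sketch_write_fastq are hypothetical helpers written for this illustration; they are not heyfastq's parse_fastq and write_fastq, and they assume each sequence and quality string sits on a single line.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str, str]  # (description, sequence, quality)


def sketch_parse_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield one record per four-line FASTQ entry."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual  # drop the leading '@'


def sketch_write_fastq(handle: TextIO, records: Iterable[Record]) -> None:
    """Write records back out in four-line FASTQ form."""
    for desc, seq, qual in records:
        handle.write(f"@{desc}\n{seq}\n+\n{qual}\n")


# Round trip through in-memory files instead of files on disk.
fastq_text = "@1\nACTG\n+\nHHHH\n@2\nGTCA\n+\nHHHH\n"
records = list(sketch_parse_fastq(io.StringIO(fastq_text)))
out = io.StringIO()
sketch_write_fastq(out, records)
```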
Putting it all together, and assuming the imports and objects defined above, the following reads a pair of FASTQ files, applies the filter, applies the map, and writes the result to a new pair of FASTQ files:
with open("r1.fq") as f1_in, open("r2.fq") as f2_in, open("o1.fq", "w") as f1_out, open("o2.fq", "w") as f2_out:
    reads = parse_fastq((f1_in, f2_in))
    filtered = filter_reads(reads, unit_filter, filter_counter)
    mapped = map_reads(filtered, unit_map, map_counter)
    write_fastq((f1_out, f2_out), mapped)
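As pipelines grow, nesting calls inside one another gets hard to read. One possible way to keep a chain flat is a small helper that feeds a stream through a list of stages in order. run_stages, keep_long, and to_upper below are hypothetical names used only for this sketch, which works on plain strings rather than heyfastq objects.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def run_stages(
    source: Iterable[T], *stages: Callable[[Iterable[T]], Iterator[T]]
) -> Iterator[T]:
    """Feed `source` through each stage, left to right (hypothetical helper)."""
    return reduce(lambda stream, stage: stage(stream), stages, iter(source))


def keep_long(stream: Iterable[str]) -> Iterator[str]:
    # Filter-style stage: drop items shorter than four characters.
    return (s for s in stream if len(s) >= 4)


def to_upper(stream: Iterable[str]) -> Iterator[str]:
    # Map-style stage: transform every item.
    return (s.upper() for s in stream)


result = list(run_stages(["actg", "gt", "aaaa"], keep_long, to_upper))
```

With heyfastq, each stage could be a small wrapper such as lambda rs: filter_reads(rs, unit_filter, filter_counter), which binds the function and counter so that the stage takes only the read stream.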