Fast and memory-efficient sampling of DNA-seq or RNA-seq FASTQ data with or without replacement.

fastQpick

Fast and memory-efficient sampling of DNA-seq or RNA-seq FASTQ data with replacement. Useful for generating bootstrap replicates to estimate technical variance in downstream analyses, and for subsampling large datasets for testing and benchmarking.

Installation

Install via PyPI

```
pip install fastQpick
```

Install from Source Code

Using pip:

```
pip install git+https://github.com/pachterlab/fastQpick.git
```

Usage

Command-line Interface

Run fastQpick with a specified fraction and options:

```
fastQpick [OPTIONS] FASTQ_FILE1 FASTQ_FILE2 ...
```

Python API

Use fastQpick in your Python code:

```
from fastQpick import fastQpick

fastQpick(
    input_files=['FASTQ_FILE1', 'FASTQ_FILE2', ...],
    ...
)
```

Documentation

Command-line Help: Use the following command to see all available options:
```
fastQpick --help
```
Python API Help: Use the help function to explore the API:
```
help(fastQpick)
```

Tutorials

Two Jupyter notebooks in notebooks/ walk through fastQpick end-to-end:

intro.ipynb — Getting started on synthetic data. Simulates a small RNA-seq experiment with known transcript abundances, draws bootstrap replicates with replacement (fraction=1.0, replacement=True), and shows that the bootstrap standard errors recover the analytic multinomial sampling error.
yeast_example.ipynb — Real-data application reproducing Figure 1 of the paper. Bootstraps a paired-end yeast RNA-seq dataset (SRA SRR453566), re-quantifies each replicate with kallisto, and characterizes the bootstrap distribution of the transcript abundance estimates.
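
The claim checked in intro.ipynb can be illustrated with a small standard-library sketch. It does not call fastQpick or reproduce the notebook; the transcript abundances, read count, and number of replicates below are made-up values chosen for illustration. Resampling the reads with replacement at full size should give a standard error close to the analytic binomial value sqrt(p(1-p)/n) for a single transcript's read fraction.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical toy experiment: three transcripts with known abundances.
abundances = [0.6, 0.3, 0.1]
n_reads = 2000
reads = rng.choices(range(len(abundances)), weights=abundances, k=n_reads)

# Draw full-size bootstrap replicates (same idea as fraction=1.0 with replacement)
# and record the fraction of reads assigned to transcript 0 in each replicate.
n_boot = 200
boot_fractions = []
for _ in range(n_boot):
    replicate = rng.choices(reads, k=n_reads)
    boot_fractions.append(replicate.count(0) / n_reads)

bootstrap_se = statistics.stdev(boot_fractions)

# Analytic sampling error of the same fraction under multinomial sampling.
p_hat = reads.count(0) / n_reads
analytic_se = math.sqrt(p_hat * (1 - p_hat) / n_reads)

print(f"bootstrap SE: {bootstrap_se:.5f}")
print(f"analytic SE:  {analytic_se:.5f}")
```

The two values agree up to Monte Carlo noise, which shrinks as the number of replicates grows.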

Features

Efficient sampling of large FASTQ files.
Memory efficient: the occurrence vector is sized to the largest per-read count actually drawn (one byte per read in the common case), and low-memory mode further avoids materializing the array of sampled indices.
Time efficient: streams through the FASTQ and writes output in batches. Generating a full-size bootstrap replicate (fraction=1, with replacement) of a 500M-read FASTQ takes ~26 minutes in standard mode, ~56 minutes in low-memory mode, and ~35 minutes in one-pass mode (see Benchmark below).
Gzip-compressed output by default, using the ISA-L-accelerated isal library to keep compression from bottlenecking the write pass. Pass --disable-gzip (CLI) or disable_gzip=True (Python API) to write plain FASTQ instead.
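
The occurrence-vector idea described above can be sketched roughly as follows. This is an illustrative approximation and not fastQpick's actual implementation: the helper names draw_counts and stream_sample are hypothetical, the vector is fixed at one byte per read (so it assumes no read is drawn more than 255 times), records are assumed to span exactly four lines, and the standard-library gzip module stands in for the ISA-L-accelerated isal library.

```python
import gzip
import io
import random


def draw_counts(n_reads, n_draws, replacement, seed=0):
    """Return how many times each read index was drawn (hypothetical helper)."""
    rng = random.Random(seed)
    counts = bytearray(n_reads)  # one byte per read; raises ValueError above 255
    if replacement:
        for _ in range(n_draws):
            counts[rng.randrange(n_reads)] += 1
    else:
        for index in rng.sample(range(n_reads), n_draws):
            counts[index] = 1
    return counts


def stream_sample(lines, out, counts):
    """Stream 4-line FASTQ records once, writing record i counts[i] times."""
    record = []
    index = 0
    for line in lines:
        record.append(line)
        if len(record) == 4:
            for _ in range(counts[index]):
                out.writelines(record)
            record.clear()
            index += 1


# Tiny in-memory FASTQ with five reads, used only for demonstration.
demo = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(5))

counts = draw_counts(n_reads=5, n_draws=5, replacement=True, seed=1)

buffer = io.BytesIO()
with gzip.open(buffer, "wt") as out:  # gzip-compressed output
    stream_sample(io.StringIO(demo), out, counts)

sampled = gzip.decompress(buffer.getvalue()).decode()
print(list(counts))
print(len(sampled.splitlines()) // 4, "records written")
```

Because only the per-read counts are held in memory, the input never has to be loaded whole; the cost is that the number of reads must be known before drawing, for example from a first counting pass.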

License

fastQpick is licensed under the 2-clause BSD license. See the LICENSE file for details.

Contributing

We welcome contributions! Please see the CONTRIBUTING.md file for guidelines on how to get involved.

Release history

0.3.0 (this version): Jun 21, 2026
0.2.0: Jun 20, 2026
0.1.1: Jun 17, 2026
0.1.0: Jan 22, 2025
