A command-line tool to extract and filter sequence reads from BAM and FASTQ files by ID, quality score, and length.

Project description

seq-miner

seq-miner is a lightweight tool that extracts and filters reads from BAM or FASTQ files. Reads can be selected by:

  • Specific read IDs
  • Mean quality score threshold
  • Minimum read length

Additional features:

  • Multi-threading (FASTQ only)
  • JSON/CSV-ready summary (planned)
  • GitHub release tagging and PyPI publish automation

Installation

pip install seq-miner

Or install from source:

git clone https://github.com/aeiwz/seq-miner.git
cd seq-miner
pip install .

Usage

Extract reads from BAM

seq-miner -i reads.bam -o filtered.bam -f bam -r read_ids.txt --min-qscore 10 --min-length 200
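The read-ID file passed with -r is described as one ID per line. As a rough sketch of how such a file could be loaded and applied (the function names `parse_read_ids` and `keep_by_id` are hypothetical and are not taken from seq-miner's source):

```python
def parse_read_ids(lines):
    """Return the set of non-empty, whitespace-stripped IDs from an iterable of lines."""
    return {line.strip() for line in lines if line.strip()}


def keep_by_id(read_names, wanted):
    """Yield only the read names that are present in the `wanted` set."""
    for name in read_names:
        if name in wanted:
            yield name


# Example usage with a file on disk (assumed to exist):
#     with open("read_ids.txt") as fh:
#         wanted = parse_read_ids(fh)
wanted = parse_read_ids(["read_001\n", "\n", "  read_007  \n"])
selected = list(keep_by_id(["read_001", "read_002", "read_007"], wanted))
```

A set is used so that each membership check stays constant-time even when the ID list is long.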

Filter FASTQ reads in parallel

seq-miner -i reads.fastq -o filtered.fastq -f fastq --min-qscore 15 --min-length 1000 --threads 4
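How seq-miner distributes FASTQ work across threads is not documented here. The sketch below shows one common pattern, assuming records are split into batches and each batch is filtered by a worker; all function names and the `batch_size` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def read_fastq(lines):
    """Yield (header, sequence, quality) tuples from 4-line FASTQ records."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:  # end of input or a truncated trailing record
            return
        header, seq, _plus, qual = (s.rstrip("\n") for s in record)
        yield header, seq, qual


def batches(records, size):
    """Group an iterable of records into lists of at most `size` items."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def filter_batch(batch, min_length):
    """Keep the records whose sequence is at least `min_length` bases long."""
    return [rec for rec in batch if len(rec[1]) >= min_length]


def parallel_filter(records, min_length, threads=4, batch_size=1000):
    """Filter records batch by batch on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = pool.map(lambda b: filter_batch(b, min_length),
                           batches(records, batch_size))
        for kept in results:  # Executor.map returns results in input order
            yield from kept
```

Because of CPython's global interpreter lock, threads give limited speed-up for pure-Python filtering; a process pool is a possible alternative when the per-record work is CPU-bound.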

Show version

seq-miner --version

Command-line Options

Option            Description
-i, --input       Input BAM or FASTQ file
-o, --output      Output file for passed reads
-f, --format      File format: bam or fastq
-r, --read-ids    Optional file with read IDs (one per line)
--min-qscore      Minimum mean Q-score (default: 0.0)
--min-length      Minimum read length (default: 0)
--threads         Number of CPU threads (only used for FASTQ)
--verbose         Enable verbose logging
--version         Print the current version
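To illustrate what the --min-qscore and --min-length thresholds mean for a single read, here is a minimal sketch. It assumes Phred+33 quality encoding and a mean Q-score derived from the average error probability; seq-miner may instead average the Phred values directly, and the order in which it applies the two checks is also an assumption. The names `mean_qscore` and `classify` are hypothetical.

```python
import math


def mean_qscore(quality, offset=33):
    """Mean Q-score of a quality string, averaged in error-probability space."""
    scores = [ord(ch) - offset for ch in quality]
    if not scores:
        return 0.0
    # Convert each Phred score to an error probability, average, convert back.
    mean_error = sum(10 ** (-q / 10) for q in scores) / len(scores)
    return -10 * math.log10(mean_error)


def classify(seq, quality, min_qscore=0.0, min_length=0):
    """Return 'short', 'low_quality', or 'passed' for one read.

    Assumption: length is checked before quality.
    """
    if len(seq) < min_length:
        return "short"
    if mean_qscore(quality) < min_qscore:
        return "low_quality"
    return "passed"
```

Averaging error probabilities weights low-quality bases more heavily than a plain arithmetic mean of Phred scores, so the two approaches can disagree on reads near the threshold.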

Output Summary

When finished, you'll see:

Summary:
Passed reads     : 12345
Low-quality reads : 54
Short reads      : 91

Exporting this summary as JSON or CSV is planned but not yet available.
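Until a built-in export exists, the printed summary can be converted by a small script. This is only a sketch that assumes the "label : count" layout shown above stays stable; `parse_summary` is a hypothetical helper, not part of seq-miner.

```python
import json


def parse_summary(text):
    """Parse 'label : count' lines into a dict, skipping lines without a count."""
    counts = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():  # the 'Summary:' header has no count
            counts[label.strip()] = int(value)
    return counts


summary_text = """Summary:
Passed reads     : 12345
Low-quality reads : 54
Short reads      : 91
"""

summary = parse_summary(summary_text)
summary_json = json.dumps(summary, indent=2)
```

The resulting dictionary can be written out with `json.dumps` as shown, or passed to `csv.writer` row by row.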

Auto Version + Release

  • Version is stored in seqminer/__version__.py
  • Tagged automatically with GitHub Actions on push to main
  • Published to PyPI on GitHub release

License

MIT License © Theerayut
See LICENSE for full text.

Contact

To report a bug or request a feature, please open an issue on GitHub.

Download files

Source Distribution

seq_miner-1.3.0.tar.gz (5.7 kB)

Uploaded Source

Built Distribution

seq_miner-1.3.0-py3-none-any.whl (6.4 kB)

Uploaded Python 3

File details

Details for the file seq_miner-1.3.0.tar.gz.

File metadata

  • Download URL: seq_miner-1.3.0.tar.gz
  • Upload date:
  • Size: 5.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for seq_miner-1.3.0.tar.gz
Algorithm      Hash digest
SHA256         e90695bd92dfa9d62b08bb9b54f26344eceef181777c7ff1b06d0ebbb70c0fdc
MD5            e562d24b9d65f1498870aa749221b2fc
BLAKE2b-256    7b79558db9c8b89a81e2b50665d999cbee03c5335a476c17a490294d58e977b5
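A downloaded file can be checked against the SHA256 digest listed above. The sketch below uses only the standard library; `sha256_of` is a hypothetical helper name, and the file path in the usage comment is assumed to be wherever the archive was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Example usage (path is an assumption):
#     expected = "e90695bd92dfa9d62b08bb9b54f26344eceef181777c7ff1b06d0ebbb70c0fdc"
#     if sha256_of("seq_miner-1.3.0.tar.gz") != expected:
#         raise SystemExit("SHA256 mismatch: do not install this file")
```

Reading in chunks keeps memory use constant regardless of file size.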

Provenance

The following attestation bundles were made for seq_miner-1.3.0.tar.gz:

Publisher: python-publish.yml on aeiwz/seq-miner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file seq_miner-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: seq_miner-1.3.0-py3-none-any.whl
  • Upload date:
  • Size: 6.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for seq_miner-1.3.0-py3-none-any.whl
Algorithm      Hash digest
SHA256         0aa78f26c33e129023b0c4126960e779a118437c4f64bea8a2eadc1dea22f663
MD5            2f5759b9e413e537922e07e0f34d09ed
BLAKE2b-256    001b44d4ba286144f2df939bc331e839492c21466cd818fc4b547645dc44bd55

Provenance

The following attestation bundles were made for seq_miner-1.3.0-py3-none-any.whl:

Publisher: python-publish.yml on aeiwz/seq-miner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.