A command-line tool to extract and filter sequence reads from BAM and FASTQ files by ID, quality score, and length.

seq-miner

seq-miner is a lightweight tool for extracting and filtering reads from BAM or FASTQ files. Reads can be selected by:

  • Specific read IDs
  • Mean quality score threshold
  • Minimum read length

It also provides:

  • Multi-threaded processing (FASTQ only)
  • A run summary of passed, low-quality, and short reads (JSON/CSV export is planned)
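The filters listed above can be illustrated with a minimal sketch. This is not seq-miner's actual implementation: the function names (`parse_fastq`, `mean_qscore`, `classify`) are hypothetical, and the sketch assumes Phred+33 encoding and a plain arithmetic mean of per-base scores, which may differ from how the tool computes mean quality.

```python
import io


def parse_fastq(handle):
    """Yield (read_id, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of file (or a trailing blank line)
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        # The read ID is the first whitespace-delimited token after '@'.
        yield header[1:].split()[0], seq, qual


def mean_qscore(qual, offset=33):
    """Arithmetic mean of Phred scores (assumption: Phred+33 encoding)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def classify(read_id, seq, qual, wanted_ids=None, min_qscore=0.0, min_length=0):
    """Return 'passed', 'short', 'low_quality', or 'not_selected' for one read."""
    if wanted_ids is not None and read_id not in wanted_ids:
        return "not_selected"
    if len(seq) < min_length:
        return "short"
    if mean_qscore(qual) < min_qscore:
        return "low_quality"
    return "passed"


# Two toy records: r1 is long enough with high quality, r2 is too short.
example = io.StringIO("@r1 sample\nACGTACGT\n+\nIIIIIIII\n@r2\nACG\n+\n!!!\n")
results = {
    rid: classify(rid, seq, qual, min_qscore=10, min_length=4)
    for rid, seq, qual in parse_fastq(example)
}
print(results)
```

The order of the checks (ID, then length, then quality) is a design choice of this sketch; a read that fails several criteria is counted only under the first one it fails.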

Installation

pip install seq-miner

Or clone from source:

git clone https://github.com/aeiwz/seq-miner.git
cd seq-miner
pip install .

Usage

Extract reads from BAM

seq-miner -i reads.bam -o filtered.bam -f bam -r read_ids.txt --min-qscore 10 --min-length 200

Filter FASTQ reads in parallel

seq-miner -i reads.fastq -o filtered.fastq -f fastq --min-qscore 15 --min-length 1000 --threads 4
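As a rough illustration of what parallel FASTQ filtering can look like, the sketch below splits records into batches and filters them on a thread pool. It is an assumption-laden example, not the tool's code: `batches`, `filter_batch`, `filter_parallel`, and the `batch_size` parameter are hypothetical, and seq-miner may distribute work differently. Note that in standard CPython, threads give limited speed-up for CPU-bound pure-Python work, so a real implementation might use processes instead.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(records, size):
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def filter_batch(batch, min_qscore, min_length):
    """Return IDs of reads in the batch that meet both thresholds."""
    kept = []
    for read_id, seq, qual in batch:
        # Assumption: Phred+33 encoding, arithmetic mean of per-base scores.
        mean_q = sum(ord(c) - 33 for c in qual) / len(qual) if qual else 0.0
        if len(seq) >= min_length and mean_q >= min_qscore:
            kept.append(read_id)
    return kept


def filter_parallel(records, min_qscore=0.0, min_length=0, threads=4, batch_size=2):
    """Filter records batch by batch on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map returns results in submission order.
        results = pool.map(
            lambda b: filter_batch(b, min_qscore, min_length),
            batches(records, batch_size),
        )
        return [rid for kept in results for rid in kept]


# Toy records as (read_id, sequence, quality) tuples.
records = [
    ("a", "ACGT" * 300, "I" * 1200),  # long, high quality
    ("b", "ACGT", "I" * 4),           # too short
    ("c", "A" * 1500, "#" * 1500),    # long, low quality
    ("d", "C" * 1000, "5" * 1000),    # exactly at the length threshold
]
kept_ids = filter_parallel(records, min_qscore=15, min_length=1000, threads=4)
print(kept_ids)
```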

Show version

seq-miner --version

Command-line Options

Option            Description
-i, --input       Input BAM or FASTQ file
-o, --output      Output file for passed reads
-f, --format      File format: bam or fastq
-r, --read-ids    Optional file with read IDs (one per line)
--min-qscore      Minimum mean Q-score (default: 0.0)
--min-length      Minimum read length (default: 0)
--threads         Number of CPU threads (only used for FASTQ)
--verbose         Enable verbose logging
--version         Print the current version
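For readers who want to see how such an interface maps onto code, here is a hypothetical `argparse` reconstruction of the options above. It is a sketch only and not taken from the seq-miner source: which options are required, the default of 1 for `--threads`, and the help strings are assumptions. The version string 1.3.1 is taken from the distribution file names on this page.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        prog="seq-miner",
        description="Extract and filter reads from BAM or FASTQ files.",
    )
    # Assumption: input, output, and format are mandatory.
    parser.add_argument("-i", "--input", required=True, help="Input BAM or FASTQ file")
    parser.add_argument("-o", "--output", required=True, help="Output file for passed reads")
    parser.add_argument("-f", "--format", required=True, choices=["bam", "fastq"],
                        help="File format")
    parser.add_argument("-r", "--read-ids", default=None,
                        help="Optional file with read IDs (one per line)")
    parser.add_argument("--min-qscore", type=float, default=0.0,
                        help="Minimum mean Q-score")
    parser.add_argument("--min-length", type=int, default=0,
                        help="Minimum read length")
    # Assumption: the documentation does not state a default thread count.
    parser.add_argument("--threads", type=int, default=1,
                        help="Number of CPU threads (only used for FASTQ)")
    parser.add_argument("--verbose", action="store_true", help="Enable verbose logging")
    parser.add_argument("--version", action="version", version="%(prog)s 1.3.1")
    return parser


# Parse the arguments of the FASTQ usage example.
args = build_parser().parse_args(
    ["-i", "reads.fastq", "-o", "filtered.fastq", "-f", "fastq",
     "--min-qscore", "15", "--min-length", "1000", "--threads", "4"]
)
print(args.format, args.min_qscore, args.min_length, args.threads)
```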

Output Summary

When finished, you'll see:

Summary:
Passed reads     : 12345
Low-quality reads : 54
Short reads      : 91

Exporting this summary as JSON or CSV is planned but not yet available.
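Until a built-in export exists, the printed summary can be converted by hand. The sketch below is one possible approach, assuming the summary text has exactly the "label : count" layout shown above; `parse_summary` and the CSV column names are hypothetical and not part of seq-miner.

```python
import csv
import io
import json

# The summary text as shown in the example output above.
SUMMARY = """Summary:
Passed reads     : 12345
Low-quality reads : 54
Short reads      : 91
"""


def parse_summary(text):
    """Turn 'label : count' lines into a dict, skipping lines without a count."""
    counts = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counts[label.strip()] = int(value.strip())
    return counts


counts = parse_summary(SUMMARY)

# JSON export.
as_json = json.dumps(counts)

# CSV export with a header row (column names are an arbitrary choice).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["metric", "count"])
writer.writerows(counts.items())
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```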

Auto Version + Release

  • Version is stored in seqminer/__version__.py
  • Tagged automatically with GitHub Actions on push to main
  • Published to PyPI on GitHub release

License

MIT License © Theerayut
See LICENSE for full text.

Contact

To report a bug or request a feature, open an issue on GitHub.

Download files

Source Distribution

seq_miner-1.3.1.tar.gz (5.7 kB)

Uploaded Source

Built Distribution

seq_miner-1.3.1-py3-none-any.whl (6.4 kB)

Uploaded Python 3

File details

Details for the file seq_miner-1.3.1.tar.gz.

File metadata

  • Download URL: seq_miner-1.3.1.tar.gz
  • Upload date:
  • Size: 5.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for seq_miner-1.3.1.tar.gz
Algorithm      Hash digest
SHA256         8ef78d9b0748bea2889d15cf34c5e437be9ad67d04a69326cb5d2819e02a9cf1
MD5            04a48d2d07b458b7d14decc1dd3897c7
BLAKE2b-256    cdf401d17a59f67722034da4a78d3df96c47170f6a649b7cecd46d4c08fa3413
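A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The following is a minimal sketch; the helper names `sha256_of` and `verify` are hypothetical, and the file path passed to them is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest published above for seq_miner-1.3.1.tar.gz.
EXPECTED_SHA256 = "8ef78d9b0748bea2889d15cf34c5e437be9ad67d04a69326cb5d2819e02a9cf1"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("seq_miner-1.3.1.tar.gz")` should return True for an intact copy of the source distribution saved under that name in the current directory.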

Provenance

The following attestation bundles were made for seq_miner-1.3.1.tar.gz:

Publisher: python-publish.yml on aeiwz/seq-miner

File details

Details for the file seq_miner-1.3.1-py3-none-any.whl.

File metadata

  • Download URL: seq_miner-1.3.1-py3-none-any.whl
  • Upload date:
  • Size: 6.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for seq_miner-1.3.1-py3-none-any.whl
Algorithm      Hash digest
SHA256         4d8ef02596427e4b013dedac2c4eafbe292d24ae9f0440e25abe8b0e053090bc
MD5            391e60cd082d4e8674ea36ef626489d8
BLAKE2b-256    d1c4c564e4aa87b6afc025c35b58b12b71be73da34e106523ddfde48c6b45ff1

Provenance

The following attestation bundles were made for seq_miner-1.3.1-py3-none-any.whl:

Publisher: python-publish.yml on aeiwz/seq-miner
