
A command-line tool to extract and filter sequence reads from BAM and FASTQ files by ID, quality score, and length.


seq-miner

seq-miner is a fast, Python-based command-line tool to extract and filter sequence reads from BAM and FASTQ files by:

  • Read ID (single or batch)
  • Mean quality score
  • Minimum read length

Built for researchers working in genomics, transcriptomics, and metagenomics.

Installation

Install via pip:

pip install seq-miner

Usage

seq-miner --input INPUT --output OUTPUT --format FORMAT [options]

Example 1: Filter FASTQ reads by Q-score and length

seq-miner -i sample.fastq -o filtered.fastq -f fastq --min_qscore 15 --min_length 100
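The two filters in this example can be sketched in plain Python. This is an illustration rather than seq-miner's own source: the function names `mean_qscore` and `passes_filters` are hypothetical, quality strings are assumed to be Phred+33 encoded, and the "average quality" is assumed to be the arithmetic mean of per-base Phred scores (some tools average error probabilities instead). Treating the thresholds as inclusive is also an assumption.

```python
def mean_qscore(quality: str, offset: int = 33) -> float:
    """Arithmetic mean of per-base Phred scores (assumes Phred+33 encoding)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filters(sequence: str, quality: str,
                   min_qscore: float = 0.0, min_length: int = 0) -> bool:
    """Keep a read only if it meets both thresholds (assumed inclusive)."""
    return len(sequence) >= min_length and mean_qscore(quality) >= min_qscore


# 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
print(mean_qscore("IIII"))                    # 40.0
print(passes_filters("ACGT", "IIII", 15, 4))  # True
print(passes_filters("ACGT", "####", 15, 4))  # False
```

With the default thresholds of 0, every read passes, which matches the defaults listed under Options.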

Example 2: Extract specific read IDs from a BAM file, with Q-score and length filters

seq-miner -i reads.bam -o matched.bam -f bam -r read_ids.txt --min_qscore 10 --min_length 200
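The selection logic behind this command might look like the sketch below. It assumes that a read must both appear in the ID list and meet the two thresholds, which is inferred from the example command, not confirmed by the source. The function name `select_reads` is hypothetical. Records only need the attributes `query_name`, `query_length`, and `query_qualities`, which is what pysam's `AlignedSegment` exposes, so the demo uses stand-in objects instead of a real BAM file.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator, Set


def select_reads(records: Iterable, wanted_ids: Set[str],
                 min_qscore: float = 0.0, min_length: int = 0) -> Iterator:
    """Yield records whose name is in wanted_ids and that meet both thresholds."""
    for rec in records:
        if rec.query_name not in wanted_ids:
            continue
        quals = rec.query_qualities  # per-base Phred scores, may be None
        mean_q = sum(quals) / len(quals) if quals else 0.0
        if rec.query_length >= min_length and mean_q >= min_qscore:
            yield rec


# Stand-in records for demonstration. With pysam, the iterable could be
# pysam.AlignmentFile("reads.bam", "rb", check_sq=False).
demo = [
    SimpleNamespace(query_name="read00001", query_length=250, query_qualities=[20] * 250),
    SimpleNamespace(query_name="read00044", query_length=150, query_qualities=[20] * 150),
    SimpleNamespace(query_name="read99999", query_length=300, query_qualities=[30] * 300),
]
kept = [r.query_name for r in select_reads(demo, {"read00001", "read00044"}, 10, 200)]
print(kept)  # ['read00001']
```

Here read00044 is requested but dropped for being shorter than 200 bases, and read99999 is long enough but was never requested.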

Options

Flag              Description
-i, --input       Input BAM or FASTQ file
-o, --output      Output file to write filtered/passed reads
-f, --format      File format: bam or fastq
-r, --read_ids    File containing read IDs (one per line, optional)
--min_qscore      Minimum average quality score per read (default: 0)
--min_length      Minimum length per read (default: 0)
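For readers who want to script around these flags, the sketch below declares an equivalent interface with `argparse`. It is a hypothetical reconstruction from the table above, not the tool's actual source: the choice of `float` for `--min_qscore`, `int` for `--min_length`, and marking the first three flags as required are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; types and 'required' are assumptions."""
    parser = argparse.ArgumentParser(prog="seq-miner")
    parser.add_argument("-i", "--input", required=True, help="Input BAM or FASTQ file")
    parser.add_argument("-o", "--output", required=True, help="Output file for reads that pass")
    parser.add_argument("-f", "--format", required=True, choices=["bam", "fastq"])
    parser.add_argument("-r", "--read_ids", default=None, help="File of read IDs, one per line")
    parser.add_argument("--min_qscore", type=float, default=0.0)
    parser.add_argument("--min_length", type=int, default=0)
    return parser


args = build_parser().parse_args(
    ["-i", "sample.fastq", "-o", "filtered.fastq", "-f", "fastq", "--min_qscore", "15"]
)
print(args.format, args.min_qscore, args.min_length, args.read_ids)
# fastq 15.0 0 None
```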

Input Examples

FASTQ file (.fastq)

Accepts plain or gzip-compressed FASTQ files.
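One common way to accept both plain and gzipped input is to check the first two bytes of the file for the gzip magic number. The sketch below shows that approach using only the standard library. How seq-miner itself detects compression is not stated in the source, and the helper names `open_fastq` and `read_fastq` are hypothetical. The parser assumes simple four-line records with no wrapped sequences.

```python
import gzip
import os
import tempfile
from typing import Iterator, TextIO, Tuple


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, detecting gzip by its two magic bytes."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality); assumes four-line records."""
    with open_fastq(path) as handle:
        while True:
            header = handle.readline().rstrip()
            if not header:
                break
            sequence = handle.readline().rstrip()
            handle.readline()  # '+' separator line
            quality = handle.readline().rstrip()
            # The read ID is the header up to the first whitespace, minus '@'.
            yield header[1:].split()[0], sequence, quality


# Demonstration with a temporary gzipped file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.fastq.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("@read00001 sample=1\nACGT\n+\nIIII\n")
    records = list(read_fastq(demo_path))
print(records)  # [('read00001', 'ACGT', 'IIII')]
```

Sniffing the content instead of trusting the file extension means a gzipped file named `.fastq` is still read correctly.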

BAM file (.bam)

BAM support relies on pysam.

Read ID file (optional)

read00001
read00044
read20398
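A file in this format can be loaded into a set so that each read is checked against the list in constant time. The sketch below is a minimal illustration: the helper name `load_read_ids` is hypothetical, and skipping blank lines and stripping surrounding whitespace are assumptions about how such a file would be handled.

```python
import io
from typing import Iterable, Set


def load_read_ids(lines: Iterable[str]) -> Set[str]:
    """Collect non-empty, whitespace-stripped lines into a set of read IDs."""
    return {line.strip() for line in lines if line.strip()}


# io.StringIO stands in for an open file such as open("read_ids.txt").
ids = load_read_ids(io.StringIO("read00001\nread00044\n\nread20398\n"))
print(sorted(ids))  # ['read00001', 'read00044', 'read20398']
```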

Output

  • Filtered reads saved to the specified output file.
  • CLI prints counts of:
    • Passed reads
    • Low-quality reads
    • Short reads
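The three counts can be reproduced with a small tally, sketched below. The category names and the helpers `classify` and `summarize` are hypothetical. The sketch assigns each read to exactly one category and checks length before quality. How seq-miner counts a read that is both short and low-quality is not stated in the source, so that ordering is an assumption.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify(length: int, mean_q: float, min_qscore: float, min_length: int) -> str:
    """Label a read; the length check runs first (an assumption)."""
    if length < min_length:
        return "short"
    if mean_q < min_qscore:
        return "low_quality"
    return "passed"


def summarize(reads: Iterable[Tuple[int, float]],
              min_qscore: float, min_length: int) -> Counter:
    """Tally reads per category from (length, mean_qscore) pairs."""
    return Counter(classify(n, q, min_qscore, min_length) for n, q in reads)


counts = summarize([(250, 18.0), (80, 30.0), (300, 7.5), (50, 5.0)],
                   min_qscore=10, min_length=100)
print(counts["passed"], counts["low_quality"], counts["short"])  # 1 1 2
```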

Dependencies

All dependencies are installed automatically by pip install seq-miner.

Publishing (for maintainers)

To publish:

python -m build
twine upload dist/*

Or use GitHub Actions (see .github/workflows/pypi-release.yml) for trusted publishing.

License

MIT License © Theerayut
See LICENSE for full text.

Contact

To report a bug or ask a question, please open an issue on GitHub.
