
A command-line tool to extract and filter sequence reads from BAM and FASTQ files by ID, quality score, and length.


seq-miner

seq-miner is a fast, Python-based command-line tool to extract and filter sequence reads from BAM and FASTQ files by:

  • Read ID (single or batch)
  • Mean quality score
  • Minimum read length

Built for researchers working in genomics, transcriptomics, and metagenomics.


Installation

Install via pip:

pip install seq-miner

Usage

seq-miner --input INPUT --output OUTPUT --format FORMAT [options]

Example 1: Filter FASTQ reads by Q-score and length

seq-miner -i sample.fastq -o filtered.fastq -f fastq --min_qscore 15 --min_length 100

Example 2: Extract specific read IDs from a BAM file

seq-miner -i reads.bam -o matched.bam -f bam -r read_ids.txt --min_qscore 10 --min_length 200

Options

Flag              Description
-i, --input       Input BAM or FASTQ file
-o, --output      Output file for the reads that pass the filters
-f, --format      File format: bam or fastq
-r, --read_ids    File containing read IDs, one per line (optional)
--min_qscore      Minimum mean quality score per read (default: 0)
--min_length      Minimum read length (default: 0)
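
The two thresholds can be illustrated with a minimal Python sketch. This is not seq-miner's actual implementation: the function names are hypothetical, and it assumes Phred+33 encoding, an arithmetic mean of the per-base scores, and inclusive comparisons (a read exactly at the threshold passes).

```python
def mean_qscore(quality: str, offset: int = 33) -> float:
    """Arithmetic mean of the Phred scores encoded in a FASTQ quality string.

    Assumption: Phred+33 encoding and a plain arithmetic mean; the tool
    itself may compute the average differently.
    """
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filters(sequence: str, quality: str,
                   min_qscore: float = 0.0, min_length: int = 0) -> bool:
    """True when the read is long enough and its mean quality is high enough."""
    return len(sequence) >= min_length and mean_qscore(quality) >= min_qscore


# Under Phred+33, 'I' encodes Q40 and '+' encodes Q10.
print(mean_qscore("IIII"))                            # 40.0
print(passes_filters("ACGT", "++++", min_qscore=15))  # False
```

With both defaults left at 0, every read passes, which matches the defaults listed in the table above.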

Input Examples

FASTQ file (.fastq)

Both plain and gzipped FASTQ files are supported.
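
One way to handle both variants is to check for the gzip magic bytes before opening the file. The sketch below is an illustration only; open_fastq and read_fastq are hypothetical names, and it assumes well-formed four-line FASTQ records.

```python
import gzip
from typing import Iterator, TextIO, Tuple


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, using gzip if it starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    return gzip.open(path, "rt") if is_gzip else open(path, "r")


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) for each four-line FASTQ record."""
    with open_fastq(path) as handle:
        while True:
            header = handle.readline().strip()
            if not header:
                break  # end of file (or a trailing blank line)
            sequence = handle.readline().strip()
            handle.readline()  # '+' separator line, ignored
            quality = handle.readline().strip()
            # The read ID is the first whitespace-delimited token after '@'.
            yield header[1:].split()[0], sequence, quality
```

Detecting the format from the file content rather than the extension means a gzipped file named sample.fastq is still read correctly.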

BAM file (.bam)

BAM support relies on the pysam library.
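
For readers who want to see what BAM filtering with pysam can look like, here is a rough sketch. It is not taken from seq-miner's source: keep_read and filter_bam are hypothetical names, the mean is a plain arithmetic mean of the stored base qualities, and the order of the checks is an assumption.

```python
from typing import Optional, Sequence, Set


def keep_read(name: str, length: int, qualities: Optional[Sequence[int]],
              wanted: Optional[Set[str]], min_qscore: float, min_length: int) -> bool:
    """Apply the optional ID filter, then the length and mean-quality thresholds."""
    if wanted is not None and name not in wanted:
        return False
    if length < min_length:
        return False
    # Reads without stored qualities are treated as mean quality 0 (assumption).
    mean_q = sum(qualities) / len(qualities) if qualities else 0.0
    return mean_q >= min_qscore


def filter_bam(in_path: str, out_path: str, wanted: Optional[Set[str]] = None,
               min_qscore: float = 0.0, min_length: int = 0) -> int:
    """Copy matching reads from in_path to out_path and return how many were kept."""
    import pysam  # third-party; imported here so keep_read works without it

    kept = 0
    # check_sq=False allows unaligned BAMs that have no @SQ header lines.
    with pysam.AlignmentFile(in_path, "rb", check_sq=False) as src, \
            pysam.AlignmentFile(out_path, "wb", template=src) as dst:
        for read in src.fetch(until_eof=True):
            if keep_read(read.query_name, read.query_length or 0,
                         read.query_qualities, wanted, min_qscore, min_length):
                dst.write(read)
                kept += 1
    return kept
```

Passing template=src copies the input header to the output, so the filtered file stays a valid BAM.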

Read ID file (optional)

read00001
read00044
read20398
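
A file in this format can be loaded into a set for fast membership tests. The sketch below is only an illustration with hypothetical function names; it assumes records are (read_id, sequence, quality) tuples and that IDs must match exactly.

```python
from typing import Iterable, Iterator, Set, Tuple

Record = Tuple[str, str, str]  # (read_id, sequence, quality)


def load_read_ids(path: str) -> Set[str]:
    """Read one ID per line into a set, skipping blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def select_by_id(records: Iterable[Record], wanted: Set[str]) -> Iterator[Record]:
    """Yield only the records whose read ID is in the wanted set."""
    for record in records:
        if record[0] in wanted:
            yield record
```

Using a set keeps each lookup constant-time, which matters when the ID list holds many thousands of entries.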

Output

  • Filtered reads are saved to the specified output file.
  • The CLI prints counts of:
    • Passed reads
    • Low-quality reads
    • Short reads
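
The three counts could be tallied along the following lines. This is a hypothetical sketch, not the tool's own code: it assumes each read falls into exactly one category and that quality is checked before length, so a read that is both short and low-quality counts as low-quality. The real tool may count differently.

```python
from collections import Counter
from typing import Iterable, Tuple


def summarize(reads: Iterable[Tuple[str, int, float]],
              min_qscore: float = 0.0, min_length: int = 0) -> Counter:
    """Count reads per category from (read_id, length, mean_qscore) tuples."""
    counts = Counter(passed=0, low_quality=0, short=0)
    for _read_id, length, mean_q in reads:
        if mean_q < min_qscore:        # quality is checked first (assumption)
            counts["low_quality"] += 1
        elif length < min_length:
            counts["short"] += 1
        else:
            counts["passed"] += 1
    return counts


reads = [("a", 500, 20.0), ("b", 500, 5.0), ("c", 50, 20.0), ("d", 50, 5.0)]
counts = summarize(reads, min_qscore=10, min_length=100)
print(counts["passed"], counts["low_quality"], counts["short"])  # 1 2 1
```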

Dependencies

All dependencies are installed automatically by pip install seq-miner.


Publishing (for maintainers)

To publish:

python -m build
twine upload dist/*

Or use GitHub Actions (see .github/workflows/pypi-release.yml) for trusted publishing.


License

MIT License © Theerayut
See LICENSE for full text.


Contact

To report a problem, please open an issue on GitHub.

