Streaming editor for SAM, BAM, and CRAM files.

Project description

edit-sam: edit and analyze SAM/BAM/CRAM files

A high-performance utility for executing arbitrary Python logic on SAM/BAM/CRAM alignment streams. It supports in-flight header modification, tag injection, filtering, and sidecar metadata extraction, and it reads from stdin and writes to stdout so it can sit in the middle of a Unix pipeline.

Key Features

  • Hot-Injected Logic: Execute multiple Python statements on every pysam.AlignedSegment object.
  • Persistent Namespace: Variables created in one alignment iteration persist to the next, allowing for stateful counters or cross-read comparisons.
  • Header Provenance: Automatically appends a @PG line to the SAM header containing the exact command-line execution string for reproducibility.
  • Zero-Copy Sidecars: Export specific Python variables to a JSON Lines (JSONL) file for downstream analysis without re-parsing the BAM.
  • Performance Optimized: Native support for htslib multi-threading and uncompressed BAM streaming (-u) for high-speed piping.
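
As a rough illustration of the Header Provenance feature, the sketch below builds a SAM @PG header line from an argument list. It is not edit-sam's own code: the function name build_pg_line and the choice of ID and PN values are assumptions made for this example. Only the @PG record type and the ID, PN, and CL field tags come from the SAM format itself.

```python
import shlex


def build_pg_line(argv, program="edit-sam"):
    """Return a tab-separated @PG header line recording the command line.

    Hypothetical helper: edit-sam's real ID/PN values may differ, and a real
    header needs an ID that is unique among existing @PG records.
    """
    # shlex.join re-quotes arguments so the command can be pasted into a shell.
    command_line = shlex.join(argv)
    fields = [f"ID:{program}", f"PN:{program}", f"CL:{command_line}"]
    return "@PG\t" + "\t".join(fields)


line = build_pg_line(["edit-sam", "-i", "in.bam", "-o", "out.bam"])
print(line)
```

Recording the full command line in CL is what lets a later reader of the file reconstruct how it was produced.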

Installation

pip install pysam click edit-sam

Usage Syntax

edit-sam [OPTIONS] "COMMAND_1" "COMMAND_2" ...
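
The syntax above passes several Python statements that run once per alignment with a shared namespace. One plausible way to get that behavior is sketched below using exec with a single dictionary. This is an assumption about the mechanism, not the tool's actual implementation; run_statements is a hypothetical name, and SimpleNamespace objects stand in for pysam.AlignedSegment records so the example runs without pysam.

```python
from types import SimpleNamespace


def run_statements(statements, records):
    """Run every statement on every record, reusing one namespace throughout."""
    # Compile once up front so the per-record loop only executes bytecode.
    compiled = [
        compile(source, f"<command {i}>", "exec")
        for i, source in enumerate(statements, 1)
    ]
    namespace = {}
    for record in records:
        namespace["seg"] = record  # the current alignment, as in the examples
        for code in compiled:
            exec(code, namespace)
    return namespace


# Stand-in records; real input would be pysam.AlignedSegment objects.
reads = [SimpleNamespace(mapping_quality=q) for q in (60, 0, 37)]
ns = run_statements(
    [
        "if 'total' not in locals(): total = 0",
        "total += seg.mapping_quality",
    ],
    reads,
)
print(ns["total"])
```

Because the same dictionary is passed to every exec call, a variable assigned while processing one record is still defined when the next record is processed, which is what makes running counters possible.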

Core Options

  • -i, --input: Path to input (defaults to stdin).
  • -o, --output: Path to output (defaults to stdout).
  • -O, --output-format: Force SAM, BAM, or CRAM.
  • -u, --uncompressed: Disable compression for faster piping.
  • -l, --locals-out: Path to write exported variables as JSONL.
  • -k, --export-key: Variable names to include in the JSONL output.

Examples

1. Composite Tag Injection

Concatenate CB and UB tags into a single XZ tag for multi-factor deduplication.

edit-sam -i in.bam -o out.bam \
"cb = seg.get_tag('CB') if seg.has_tag('CB') else ''" \
"ub = seg.get_tag('UB') if seg.has_tag('UB') else ''" \
"seg.set_tag('XZ', f'{cb}-{ub}') if cb and ub else None"

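The same tag logic can be written as an ordinary function, which may be easier to unit-test before putting it on the command line. The sketch below is only illustrative: add_composite_tag and FakeSegment are hypothetical names, and FakeSegment mimics just the has_tag, get_tag, and set_tag methods of pysam.AlignedSegment so the example runs without pysam installed.

```python
def add_composite_tag(seg, first="CB", second="UB", out="XZ"):
    """Set `out` to '<first>-<second>' when both source tags are present."""
    a = seg.get_tag(first) if seg.has_tag(first) else ""
    b = seg.get_tag(second) if seg.has_tag(second) else ""
    if a and b:
        seg.set_tag(out, f"{a}-{b}")
    return seg


class FakeSegment:
    """Minimal stand-in exposing the three tag methods used above."""

    def __init__(self, **tags):
        self.tags = dict(tags)

    def has_tag(self, tag):
        return tag in self.tags

    def get_tag(self, tag):
        return self.tags[tag]

    def set_tag(self, tag, value):
        self.tags[tag] = value


tagged = add_composite_tag(FakeSegment(CB="ACGT-1", UB="TTGA"))
untagged = add_composite_tag(FakeSegment(CB="ACGT-1"))
print(tagged.tags.get("XZ"), untagged.tags.get("XZ"))
```

Reads missing either source tag are passed through unchanged, matching the conditional in the command-line version.
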
2. QC Metrics Extraction

Extract mapping quality to a sidecar file.

edit-sam -i in.bam -l mq_report.json -k all_mqs \
"if 'all_mqs' not in locals(): all_mqs = []" \
"all_mqs.append(seg.mapping_quality)"

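Once a sidecar file exists, it can be read back with the standard json module. The sketch below assumes each line of the file is a JSON object that maps exported variable names to their values; the description above does not specify the layout, so check a real file before relying on this. read_sidecar is a hypothetical helper, and the sample data is made up for illustration.

```python
import io
import json


def read_sidecar(handle, key):
    """Collect the value stored under `key` from each JSONL record."""
    values = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if key in record:
            values.append(record[key])
    return values


# In-memory sample standing in for open("mq_report.json").
sample = io.StringIO('{"all_mqs": [60, 0]}\n\n{"all_mqs": [60, 0, 37]}\n')
snapshots = read_sidecar(sample, "all_mqs")
latest = snapshots[-1]
print(len(snapshots), sum(latest) / len(latest))
```
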
3. Regex Read Filtering

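Use the re module, which the example below calls without an explicit import, to set flags based on read-name patterns. Matching reads are marked as QC-fail rather than dropped from the output.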

edit-sam -i in.bam -o - -u \
"if re.search(r'^[A-Z]00', seg.query_name): seg.is_qcfail = True"

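Before running a pattern over a whole file, it can help to try it on a few read names. The sketch below applies the same regular expression to some made-up names; the names are illustrative and not taken from any real dataset.

```python
import re

# Same pattern as the command above: an uppercase letter followed by "00"
# at the start of the read name.
PATTERN = re.compile(r"^[A-Z]00")

names = [
    "A00123:45:HXXXX:1:1101:1000:2000",  # uppercase letter, then "00"
    "M01234:7:000000000-ABCDE:1:1101",   # "M01": second digit is not 0
    "a00111:1:FC:1:1:1:1",               # lowercase first letter
    "NB501234:9:HYYYY:1:11101:1:1",      # second character is a letter
]

flagged = [name for name in names if PATTERN.search(name)]
print(flagged)
```

Because the pattern is anchored with ^, re.search behaves like re.match here and only the start of each name is tested.
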
Download files

Source Distribution

edit_sam-0.1.0.tar.gz (3.7 kB view details)

Uploaded Source

Built Distribution

edit_sam-0.1.0-py3-none-any.whl (4.7 kB view details)

Uploaded Python 3

File details

Details for the file edit_sam-0.1.0.tar.gz.

File metadata

  • Download URL: edit_sam-0.1.0.tar.gz
  • Upload date:
  • Size: 3.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for edit_sam-0.1.0.tar.gz
Algorithm Hash digest
SHA256 7d678af204a93a4208615fb4cb132fe42752f22976b20f04dfddfc4c2b3d11b8
MD5 51879293483e52f4b693369151dc3475
BLAKE2b-256 a818bc97e71903ee7d88487c753d325c217d4c6f0dac64863edd91c97acbfd97

File details

Details for the file edit_sam-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: edit_sam-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 4.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for edit_sam-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 78ba04fcd488673e4c388297a7632537354c8c3bfc019e8bdee6c2ca90cd4112
MD5 df110b8b172f28664b17485bade5e78d
BLAKE2b-256 96b543834f48f2c25356d909202b7aa9d274b538f027591dda7c655afc716ac8
