Streaming editor for SAM, BAM, and CRAM files.
edit-sam: edit and analyze SAM/BAM/CRAM files
A high-performance utility for executing arbitrary Python logic on SAM/BAM/CRAM alignment streams. It enables in-flight header modification, tag injection, filtering, and sidecar metadata extraction without breaking Unix pipes.
Key Features
- Hot-Injected Logic: Execute multiple Python statements on every pysam.AlignedSegment object.
- Persistent Namespace: Variables created in one alignment iteration persist to the next, allowing for stateful counters or cross-read comparisons.
- Header Provenance: Automatically appends a @PG line to the SAM header containing the exact command-line execution string for reproducibility.
- Zero-Copy Sidecars: Export specific Python variables to a JSON Lines (JSONL) file for downstream analysis without re-parsing the BAM.
- Performance Optimized: Native support for htslib multi-threading and uncompressed BAM streaming (-u) for high-speed piping.
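The hot-injected logic and persistent namespace can be pictured with a small sketch. This is not edit-sam's actual implementation: the `run_statements` helper and the `SimpleNamespace` stand-in records are hypothetical, and the only assumption taken from this README is that each statement sees the current record as `seg` and that `re` is available.

```python
import re
from types import SimpleNamespace


def run_statements(statements, segments):
    """Run every statement on every segment, sharing one namespace."""
    # Compile once so the per-read cost is only the exec() call.
    compiled = [
        compile(src, f"<command {i}>", "exec")
        for i, src in enumerate(statements, 1)
    ]
    namespace = {"re": re}  # names made available to the statements
    for seg in segments:
        namespace["seg"] = seg  # rebind the current record
        for code in compiled:
            # One dict serves as both globals and locals, so variables
            # assigned for one read are still there for the next.
            exec(code, namespace)
    return namespace


# Stand-in records; the real tool passes pysam.AlignedSegment objects.
reads = [SimpleNamespace(mapping_quality=q) for q in (60, 0, 42)]
ns = run_statements(
    [
        "if 'n' not in locals(): n = 0; total = 0",
        "n += 1; total += seg.mapping_quality",
    ],
    reads,
)
print(ns["n"], ns["total"])
```

Because the statements run with a single shared dictionary, the `'name' not in locals()` guard initialises a variable on the first read only.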
Installation
pip install pysam click edit-sam
Usage Syntax
edit-sam [OPTIONS] "COMMAND_1" "COMMAND_2" ...
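The @PG provenance record described under Key Features stores this command line. A rough sketch of building such a record on a header dictionary shaped like the output of pysam's `AlignmentHeader.to_dict()` follows. The `append_pg` helper, the ID-uniquifying scheme, and the `PP` chaining are assumptions for illustration; how edit-sam itself fills these fields is not documented here.

```python
import shlex


def append_pg(header, argv, program="edit-sam"):
    """Return a copy of a SAM header dict with one more @PG record."""
    # Shallow-copy each section so the caller's header is left untouched.
    new_header = {
        key: (list(val) if isinstance(val, list) else dict(val))
        for key, val in header.items()
    }
    records = new_header.setdefault("PG", [])

    # The SAM specification requires @PG IDs to be unique in a header.
    used = {rec["ID"] for rec in records}
    pg_id, n = program, 1
    while pg_id in used:
        pg_id = f"{program}.{n}"
        n += 1

    # shlex.join re-quotes arguments so the line can be pasted into a shell.
    entry = {"ID": pg_id, "PN": program, "CL": shlex.join(argv)}
    if records:
        entry["PP"] = records[-1]["ID"]  # assumed: chain to the last program
    records.append(entry)
    return new_header


header = {"HD": {"VN": "1.6"}, "PG": [{"ID": "bwa", "PN": "bwa"}]}
updated = append_pg(header, ["edit-sam", "-i", "in.bam", "n = 1"])
print(updated["PG"][-1])
```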
Core Options
- -i, --input: Path to input (defaults to stdin).
- -o, --output: Path to output (defaults to stdout).
- -O, --output-format: Force SAM, BAM, or CRAM.
- -u, --uncompressed: Disable compression for faster piping.
- -l, --locals-out: Path to write exported variables as JSONL.
- -k, --export-key: Variable names to include in the JSONL output.
Examples
1. Composite Tag Injection
Concatenate CB and UB tags into a single XZ tag for multi-factor deduplication.
edit-sam -i in.bam -o out.bam \
"cb = seg.get_tag('CB') if seg.has_tag('CB') else ''" \
"ub = seg.get_tag('UB') if seg.has_tag('UB') else ''" \
"seg.set_tag('XZ', f'{cb}-{ub}') if cb and ub else None"
2. QC Metrics Extraction
Collect each read's mapping quality in a list and export that list to a JSONL sidecar file.
edit-sam -i in.bam -l mq_report.json -k all_mqs \
"if 'all_mqs' not in locals(): all_mqs = []" \
"all_mqs.append(seg.mapping_quality)"
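Downstream, the sidecar can be read without touching the BAM. The sketch below makes an assumption about the file layout: each line is a JSON object keyed by the exported variable names. Whether edit-sam writes one record per alignment or a single record at the end is not stated here, so the hypothetical `last_value` helper simply takes the most recent record that contains the key. An in-memory sample stands in for `mq_report.json`.

```python
import io
import json
from statistics import mean


def load_jsonl(handle):
    """Parse a JSON Lines stream into a list of objects, skipping blanks."""
    return [json.loads(line) for line in handle if line.strip()]


def last_value(records, key):
    """Return the value of `key` from the last record that has it."""
    for record in reversed(records):
        if key in record:
            return record[key]
    raise KeyError(key)


# Illustrative content only; not real output from edit-sam.
sample = io.StringIO('{"all_mqs": [60]}\n{"all_mqs": [60, 0, 42]}\n')
mqs = last_value(load_jsonl(sample), "all_mqs")
print(len(mqs), mean(mqs))
```

With a real file, replace the `io.StringIO` object with `open("mq_report.json")`.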
3. Regex Read Filtering
Use the built-in re module to set the QC-fail flag (0x200) on reads whose names match a pattern.
edit-sam -i in.bam -o - -u \
"if re.search(r'^[A-Z]00', seg.query_name): seg.is_qcfail = True"
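Before running a pattern over a whole file, it can help to try it on a few read names. The names below are invented for illustration; the pattern is the one from the command above, which matches names beginning with one uppercase letter followed by two zeros.

```python
import re

# Same pattern as in the edit-sam command.
pattern = re.compile(r"^[A-Z]00")

# Made-up read names, not taken from any real run.
names = [
    "A00123:45:HXXX:1:1101:1000:2000",
    "M00456:12",
    "read_7",
    "a00999",
]

flagged = [name for name in names if pattern.search(name)]
print(flagged)
```

Only the first two names are flagged: `read_7` does not start with the required prefix, and `a00999` fails because the character class is case-sensitive.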
File details
Details for the file edit_sam-0.1.0.tar.gz.
File metadata
- Download URL: edit_sam-0.1.0.tar.gz
- Upload date:
- Size: 3.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d678af204a93a4208615fb4cb132fe42752f22976b20f04dfddfc4c2b3d11b8 |
| MD5 | 51879293483e52f4b693369151dc3475 |
| BLAKE2b-256 | a818bc97e71903ee7d88487c753d325c217d4c6f0dac64863edd91c97acbfd97 |
File details
Details for the file edit_sam-0.1.0-py3-none-any.whl.
File metadata
- Download URL: edit_sam-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 78ba04fcd488673e4c388297a7632537354c8c3bfc019e8bdee6c2ca90cd4112 |
| MD5 | df110b8b172f28664b17485bade5e78d |
| BLAKE2b-256 | 96b543834f48f2c25356d909202b7aa9d274b538f027591dda7c655afc716ac8 |