Python parser for Illumina MiSeq InterOp binary files

Miseq Binary Parser

This module is built to replace the Illuminate package, which has not seen an update in over 7 years.

This code is based on the MiCall project's Error Metrics Parser, Quality Metrics Parser, and Tile Metrics Parser.

For an in-depth breakdown of the binary formats used, see the Illumina binary formats page.

Installation

pip install miseqinteropreader

Or with uv:

uv add miseqinteropreader

For development, clone the repository and install with dev dependencies:

git clone <repository-url>
cd miseqinteropreader
uv sync --all-extras

Command-Line Interface

The package includes a powerful CLI tool called miseq-interop for analyzing MiSeq InterOp metrics without writing code.

Available Commands

validate - Validate Run Directory

Check if a run directory is valid and see which metrics are available:

miseq-interop validate /path/to/run

# Example output:
# ✓ Run directory exists: 240101_M12345_0001_000000000-ABCDE
# ✓ InterOp directory found
# ✓ SampleSheet.csv found
# ✓ Marker: needsprocessing
# ✓ Marker: qc_uploaded
#
# Available metrics:
# ✓ ERROR_METRICS
# ✓ QUALITY_METRICS
# ✓ TILE_METRICS
# ✗ COLLAPSED_Q_METRICS (missing)

info - Display Run Information

Show quick statistics and metadata about a run:

miseq-interop info /path/to/run

# Example output:
# Run: 240101_M12345_0001_000000000-ABCDE
# Status: QC Uploaded, Needs Processing
# Metrics available: 8/11
# Total records: 45,232
# Lanes: 1 (range: 1-1)
# Tiles: 19
# Cycles: 301 (max: 301)

Add -v for verbose output with file sizes:

miseq-interop info /path/to/run -v

summary - Generate Quality Summaries

Generate summary statistics for quality, tiles, and error metrics:

# Get quality summary (Q30 scores)
miseq-interop summary /path/to/run --quality

# Get tile summary (cluster density, pass rate)
miseq-interop summary /path/to/run --tiles

# Get error rate summary (phiX)
miseq-interop summary /path/to/run --errors

# Get all summaries
miseq-interop summary /path/to/run --all

# Specify read lengths for proper forward/reverse separation
miseq-interop summary /path/to/run --all --read-lengths 150,8,8,150

# Export to JSON
miseq-interop summary /path/to/run --all --format json -o summary.json

# Export to CSV
miseq-interop summary /path/to/run --all --format csv -o summary.csv
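To make the effect of --read-lengths concrete, the sketch below shows one way a list of read lengths can be turned into cycle ranges, so that per-cycle metrics can be assigned to the forward read, the index reads, or the reverse read. It assumes cycles are numbered from 1 and that reads are laid out back to back in the order given. The function read_segments is hypothetical and is not part of the package.

```python
from itertools import accumulate


def read_segments(read_lengths):
    """Return (first_cycle, last_cycle) pairs, 1-based and inclusive, one per read.

    Assumption: reads occupy consecutive cycles in the order they are listed.
    """
    ends = list(accumulate(read_lengths))
    starts = [end - length + 1 for end, length in zip(ends, read_lengths)]
    return list(zip(starts, ends))


# Same layout as the CLI example: forward, index 1, index 2, reverse
segments = read_segments((150, 8, 8, 150))
print(segments)
# [(1, 150), (151, 158), (159, 166), (167, 316)]
```

With this layout, a record from cycle 160 falls in the second index read and would be excluded from both the forward and reverse summaries.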

extract - Extract Metrics to Files

Export raw metric data to various formats:

# Extract specific metrics to JSON
miseq-interop extract /path/to/run --metrics ERROR_METRICS QUALITY_METRICS --format json -o output_dir/

# Extract all available metrics to CSV
miseq-interop extract /path/to/run --all --format csv -o metrics/

# Extract to Parquet format (requires pandas)
miseq-interop extract /path/to/run --metrics QUALITY_METRICS --format parquet -o quality.parquet

# Extract single metric to stdout
miseq-interop extract /path/to/run --metrics ERROR_METRICS --format json

Available metrics:

  • ERROR_METRICS - PhiX error rates by cycle
  • QUALITY_METRICS - Q-score distributions
  • TILE_METRICS - Cluster density and counts
  • EXTENDED_TILE_METRICS - Extended tile information
  • EXTRACTION_METRICS - Focus and intensity metrics
  • IMAGE_METRICS - Image contrast metrics
  • PHASING_METRICS - Phasing/prephasing weights
  • CORRECTED_INTENSITY_METRICS - Corrected intensities
  • COLLAPSED_Q_METRICS - Collapsed quality bins (Q20/Q30)
  • INDEX_METRICS - Index read information
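For readers curious about what sits behind a metric such as ERROR_METRICS, here is a minimal sketch of decoding one InterOp file with the standard library. It assumes the version 3 error metrics layout as commonly documented: a two-byte header (version, record length) followed by 30-byte little-endian records. The function parse_error_metrics and the dictionary keys are illustrative only and are not this package's API; check the Illumina binary formats page before relying on the layout.

```python
import struct

# Assumed version 3 record layout (little-endian, 30 bytes):
# lane, tile, cycle as uint16; error rate as float32;
# five uint32 counts of reads with 0, 1, 2, 3, and 4 errors.
RECORD = struct.Struct("<HHHf5I")


def parse_error_metrics(data: bytes):
    """Decode an in-memory error metrics file into (version, list of dicts)."""
    version, record_length = struct.unpack_from("<BB", data, 0)
    if record_length != RECORD.size:
        raise ValueError(f"unexpected record length: {record_length}")
    records = []
    # Records start right after the two header bytes
    for offset in range(2, len(data) - record_length + 1, record_length):
        lane, tile, cycle, error_rate, *counts = RECORD.unpack_from(data, offset)
        records.append(
            {
                "lane": lane,
                "tile": tile,
                "cycle": cycle,
                "error_rate": error_rate,
                "read_counts": counts,
            }
        )
    return version, records


# Synthetic single-record file, built here only to exercise the parser
sample = bytes([3, RECORD.size]) + RECORD.pack(1, 1101, 5, 0.25, 10, 2, 1, 0, 0)
version, records = parse_error_metrics(sample)
print(version, records[0]["tile"], records[0]["error_rate"])
```

In practice the package handles this decoding, along with the other metric types and versions, so a sketch like this is mainly useful for understanding or debugging a file.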

Example Workflows

QC Pipeline Integration:

# Validate run before processing
miseq-interop validate /path/to/run && \
  miseq-interop summary /path/to/run --all --format json -o qc_metrics.json
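The same two-step pipeline can be driven from Python when the surrounding workflow is not a shell script. The sketch below uses only the subcommands and flags shown above. It assumes, as the && chaining implies, that validate exits with a non-zero status for an invalid run. The structure of the JSON summary is not described here, so it is returned as parsed without interpretation. The helper names build_qc_commands and run_qc are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def build_qc_commands(run_dir, output):
    """Build the validate and summary command lines used in the shell example."""
    run_dir = str(run_dir)
    validate = ["miseq-interop", "validate", run_dir]
    summary = [
        "miseq-interop", "summary", run_dir,
        "--all", "--format", "json", "-o", str(output),
    ]
    return validate, summary


def run_qc(run_dir, output):
    """Validate the run, write the summary, and return the parsed JSON.

    check=True raises CalledProcessError if either command fails, which
    mirrors the short-circuit behaviour of && in the shell version.
    """
    validate, summary = build_qc_commands(run_dir, output)
    subprocess.run(validate, check=True)
    subprocess.run(summary, check=True)
    return json.loads(Path(output).read_text())


validate_cmd, summary_cmd = build_qc_commands("/path/to/run", "qc_metrics.json")
print(" ".join(validate_cmd))
print(" ".join(summary_cmd))
```

Passing the arguments as a list avoids shell quoting problems with run directories whose paths contain spaces.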

Quick QC Check:

# Get key metrics for a run
miseq-interop info /path/to/run
miseq-interop summary /path/to/run --quality --tiles

Python API

You can also use the package programmatically in Python:

from pathlib import Path
from miseqinteropreader import InterOpReader, MetricFile

# Initialize reader
reader = InterOpReader("/path/to/run")

# Check available files
reader.check_files_present([MetricFile.ERROR_METRICS, MetricFile.QUALITY_METRICS])

# Read metric data
error_records = reader.read_file(MetricFile.ERROR_METRICS)
quality_records = reader.read_file(MetricFile.QUALITY_METRICS)

# Get as pandas DataFrame
error_df = reader.read_file_to_dataframe(MetricFile.ERROR_METRICS)

# Generate summaries
quality_summary = reader.summarize_quality_records(
    quality_records,
    read_lengths=(150, 16, 150)  # forward, indexes, reverse
)

print(f"Q30 Forward: {quality_summary.q30_forward:.2%}")
print(f"Q30 Reverse: {quality_summary.q30_reverse:.2%}")

tile_records = reader.read_file(MetricFile.TILE_METRICS)
tile_summary = reader.summarize_tile_records(tile_records)

print(f"Cluster Density: {tile_summary.cluster_density:.2f} K/mm²")
print(f"Pass Rate: {tile_summary.pass_rate:.2%}")
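One possible next step is to turn the summaries into a pass/fail decision. The sketch below reads only the attributes used above (q30_forward, q30_reverse, pass_rate) and treats them as fractions, which the percent formatting suggests. The function qc_failures and both threshold values are illustrative placeholders rather than recommendations or part of the package, and SimpleNamespace objects stand in for real summaries so the example runs on its own.

```python
from types import SimpleNamespace


def qc_failures(quality_summary, tile_summary, min_q30=0.75, min_pass_rate=0.80):
    """Return a list of human-readable QC failures (empty if the run passes).

    The default thresholds are placeholders; choose values for your own lab.
    """
    failures = []
    if quality_summary.q30_forward < min_q30:
        failures.append(f"Q30 forward {quality_summary.q30_forward:.2%} below {min_q30:.0%}")
    if quality_summary.q30_reverse < min_q30:
        failures.append(f"Q30 reverse {quality_summary.q30_reverse:.2%} below {min_q30:.0%}")
    if tile_summary.pass_rate < min_pass_rate:
        failures.append(f"Pass rate {tile_summary.pass_rate:.2%} below {min_pass_rate:.0%}")
    return failures


# Stand-in objects with made-up values, used only for demonstration
demo_quality = SimpleNamespace(q30_forward=0.91, q30_reverse=0.70)
demo_tiles = SimpleNamespace(pass_rate=0.88)

for problem in qc_failures(demo_quality, demo_tiles):
    print(problem)
```

With real data, pass quality_summary and tile_summary from the example above in place of the stand-in objects.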

Development

Run tests:

uv run pytest

Run with coverage:

uv run pytest --cov=src/miseqinteropreader

Run type checking:

uv run mypy .

Run linting:

uv run ruff check .

License

See LICENSE file for details.
