Missing-data matrix for RNA-Seq and proteomics QC

mismap-qc

A prettier missing-data matrix for RNA-Seq QC, inspired by missingno. It shows which genes are detected or missing in each sample, with multi-level colour annotations and hierarchical clustering of samples.

Quick start

No manual virtual environment setup is needed: demo.py declares its dependencies inline (PEP 723), and uv installs them when it runs the script.

uv run demo.py
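For readers new to PEP 723, inline script metadata is a comment block at the top of a script that uv reads before running it. The sketch below shows the general shape only; the dependency list and the `describe` helper are illustrative assumptions, not the actual contents of demo.py.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "numpy",
#     "pandas",
#     "matplotlib",
#     "scipy",
# ]
# ///
# Hypothetical script: `uv run this_file.py` would install the packages
# listed above into a temporary environment before executing the code below.
import sys


def describe() -> str:
    """Return the interpreter version the script is running under."""
    return f"Python {sys.version_info.major}.{sys.version_info.minor}"


if __name__ == "__main__":
    print(describe())
```

The metadata block is ordinary comments, so the same file still runs under a plain `python` interpreter that already has the packages installed.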

Or import directly:

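import pandas as pd
from mismap_qc import missing_matrix

# index_col=0 uses the first column (gene IDs) as the row index;
# header=[0, 1, 2] reads the first three rows as a three-level column MultiIndex.
df = pd.read_csv("data/toy_rnaseq.csv", index_col=0, header=[0, 1, 2])
fig = missing_matrix(df, title="Gene Detection Matrix")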

Input format

A pandas DataFrame with:

Rows = genes (or any features)
Columns = samples, optionally as a MultiIndex for annotation strips
NaN = missing / not detected

When columns are a MultiIndex, level names automatically become annotation strip labels.
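As a minimal sketch of the input described above, the following builds a small DataFrame by hand. The level names `Medium_Type` and `Medium_Condition` are borrowed from the annotation example in this README; the `Sample` level name, the gene names, and the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample annotations: (Medium_Type, Medium_Condition, Sample)
columns = pd.MultiIndex.from_tuples(
    [
        ("Fresh", "SF", "S1"),
        ("Fresh", "FBS", "S2"),
        ("Conditioned", "SF", "S3"),
        ("Conditioned", "FBS", "S4"),
    ],
    names=["Medium_Type", "Medium_Condition", "Sample"],
)

# Rows are genes; NaN marks a gene that was not detected in that sample.
df = pd.DataFrame(
    [
        [5.1, 4.8, np.nan, 6.0],
        [np.nan, 2.2, np.nan, 1.9],
        [7.3, 7.0, 6.8, 7.1],
    ],
    index=["GeneA", "GeneB", "GeneC"],
    columns=columns,
)

print(list(df.columns.names))  # level names available as strip labels
print(int(df.isna().sum().sum()), "missing cells")
```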

`missing_matrix()` -- static plot

fig = missing_matrix(
    df,
    title="Gene Detection Matrix",
    subtitle="80 genes x 30 samples | 23% missing",
    save="output.png",
)
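The subtitle in the call above is a plain string, so it can be derived from the data rather than typed by hand. A minimal sketch, assuming a hypothetical helper named `describe_missingness` (not part of mismap-qc):

```python
import numpy as np
import pandas as pd


def describe_missingness(df: pd.DataFrame) -> str:
    """Summarise shape and overall missing fraction as a subtitle string."""
    n_genes, n_samples = df.shape
    # Fraction of NaN cells over the whole matrix, as a percentage.
    pct_missing = 100 * df.isna().to_numpy().mean()
    return f"{n_genes} genes x {n_samples} samples | {pct_missing:.0f}% missing"


toy = pd.DataFrame(
    [[1.0, np.nan], [2.0, 3.0]],
    index=["g1", "g2"],
    columns=["s1", "s2"],
)
print(describe_missingness(toy))  # 2 genes x 2 samples | 25% missing
```

The resulting string could then be passed as `subtitle=describe_missingness(df)`.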

Layout (top to bottom)

Component	Description
Title + subtitle	Bold title, italic subtitle for metadata
Dendrogram	Hierarchical clustering of samples by nullity pattern
Annotation strips	One colour bar per MultiIndex column level
Nullity matrix	Dark = detected, light = missing
Completeness sparkline	Per-sample or per-gene detection rate

Parameters

Data & labels

Parameter	Type	Default	Description
`df`	`DataFrame`	required	Genes (rows) x samples (columns). NaN = missing.
`title`	`str`	`""`	Bold figure title
`subtitle`	`str`	`""`	Italic line below title (e.g. dataset metadata)
`label_level`	`int`	`-1`	Which column level to use for x-axis tick labels

Clustering & sorting

Parameter	Type	Default	Description
`cluster_samples`	`bool`	`True`	Cluster samples by binary nullity pattern
`cluster_method`	`str`	`"average"`	scipy linkage method
`show_dendrogram`	`bool`	`True`	Show dendrogram above the matrix
`sort_genes`	`str | None`	`"descending"`	Sort genes by completeness (`"ascending"`, `"descending"`, or `None`)
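To make the clustering and sorting options concrete, here is a rough sketch of how samples could be ordered by their binary nullity pattern and genes by completeness. This README only states that a scipy linkage method is used; the Hamming distance, the helper name `order_by_nullity`, and the toy data are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage


def order_by_nullity(df: pd.DataFrame, method: str = "average") -> pd.DataFrame:
    """Reorder columns by clustering detected/missing patterns and
    rows by descending completeness."""
    present = df.notna()

    # One observation per sample: a 0/1 vector over genes.
    patterns = present.to_numpy().T.astype(float)
    # Assumption: Hamming distance between the binary patterns.
    tree = linkage(patterns, method=method, metric="hamming")
    col_order = leaves_list(tree)

    # Most complete genes first; mergesort keeps ties in their original order.
    completeness = present.mean(axis=1)
    row_order = completeness.sort_values(ascending=False, kind="mergesort").index

    return df.loc[row_order].iloc[:, col_order]


toy = pd.DataFrame(
    {
        "s1": [1.0, np.nan, 1.0],
        "s2": [1.0, 1.0, np.nan],
        "s3": [1.0, np.nan, 1.0],
    },
    index=["g1", "g2", "g3"],
)
ordered = order_by_nullity(toy)
print(list(ordered.index), list(ordered.columns))
```

In the toy data, s1 and s3 share the same missingness pattern, so they end up next to each other after clustering.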

Annotations

Parameter	Type	Default	Description
`annotation_levels`	`list[int] | None`	`None`	Column levels to show as colour bars (default: all except innermost)
`annotation_colors`	`dict | None`	`None`	Custom colours per level (see below)

Custom annotation colours accept level indices or names as keys:

missing_matrix(
    df,
    annotation_colors={
        "Medium_Type": {"Fresh": "#88CCEE", "Conditioned": "#CC6677"},
        "Medium_Condition": {"SF": "#44AA99", "FBS": "#DDCC77", "AS": "#AA4499"},
    },
)

Unspecified factor levels fall back to built-in palettes.
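The fallback behaviour can be pictured as a merge between the user's mapping and a default palette. The sketch below is an assumption about the general idea, not the package's implementation: the `FALLBACK` colours and the `resolve_colors` helper are hypothetical.

```python
from itertools import cycle
from typing import Optional

# Hypothetical fallback palette; the package's built-in palettes may differ.
FALLBACK = ["#332288", "#117733", "#882255", "#999933"]


def resolve_colors(factor_levels: list, custom: Optional[dict] = None) -> dict:
    """Use the custom colour where one is given, else the next fallback colour."""
    custom = custom or {}
    spare = cycle(FALLBACK)
    return {
        level: custom[level] if level in custom else next(spare)
        for level in factor_levels
    }


colors = resolve_colors(["SF", "FBS", "AS"], {"SF": "#44AA99"})
print(colors)
```

Here only `"SF"` is specified, so `"FBS"` and `"AS"` take the first two fallback colours.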

Completeness sparkline

Parameter	Type	Default	Description
`completeness`	`str`	`"below"`	`"below"` = per-sample (horizontal), `"side"` = per-gene (vertical)
`completeness_threshold`	`float | None`	`None`	Draws a dashed red line at this value (0--1)
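The two sparkline orientations correspond to averaging the detection mask along different axes. A small sketch with made-up data, assuming completeness means the fraction of non-NaN cells:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame(
    {
        "s1": [1.0, 2.0, np.nan, 4.0],
        "s2": [np.nan, np.nan, np.nan, 1.0],
        "s3": [1.0, 2.0, 3.0, 4.0],
    },
    index=["g1", "g2", "g3", "g4"],
)

present = toy.notna()
per_sample = present.mean(axis=0)  # one value per column, as in "below"
per_gene = present.mean(axis=1)    # one value per row, as in "side"

# Samples that would fall under a threshold line drawn at 0.5.
threshold = 0.5
flagged = per_sample[per_sample < threshold].index.tolist()
print(per_sample.to_dict())
print("samples under threshold:", flagged)
```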

Legends & layout

Parameter	Type	Default	Description
`legend_loc`	`str`	`"upper right"`	Corner for legends: `"upper right"`, `"upper left"`, `"lower right"`, `"lower left"`
`figsize`	`tuple | None`	`None`	Figure size (auto-calculated if `None`)
`color_present`	`str`	`"#2d2d2d"`	Colour for detected cells
`color_missing`	`str`	`"#f0f0f0"`	Colour for missing cells

Font sizes

Parameter	Type	Default	Description
`fontsize`	`int`	`10`	Base font size (fallback)
`fontsize_legend`	`int | None`	`None`	Legend entries
`fontsize_rows`	`int | None`	`None`	Gene/row labels
`fontsize_cols`	`int | None`	`None`	Sample/column labels
`fontsize_annotations`	`int | None`	`None`	Annotation strip labels
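Since `fontsize` is described as the fallback, a per-element size left at `None` presumably resolves to the base size. The helper below, `resolve_fontsizes`, is a hypothetical sketch of that rule rather than code from the package.

```python
def resolve_fontsizes(fontsize=10, **specific):
    """Fill in any per-element size left as None with the base size."""
    elements = ("legend", "rows", "cols", "annotations")
    resolved = {}
    for name in elements:
        value = specific.get(f"fontsize_{name}")
        resolved[name] = fontsize if value is None else value
    return resolved


sizes = resolve_fontsizes(fontsize=10, fontsize_rows=7)
print(sizes)  # {'legend': 10, 'rows': 7, 'cols': 10, 'annotations': 10}
```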

Group summary

Parameter	Type	Default	Description
`group_summary`	`int | str | None`	`None`	Column level to group by; prints per-group completeness to console

fig = missing_matrix(df, group_summary="Medium_Condition")

Output:

Group Completeness (Medium_Condition)
--------------------------------
  SF               63%  (n=10)
  AS               80%  (n=10)
  FBS              88%  (n=10)

The summary is printed only when the level contains more than one group.
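A comparable summary can be computed directly with pandas. This is a sketch under the assumption that group completeness is the fraction of non-NaN cells across all samples in the group, listed from least to most complete; `group_completeness` is a hypothetical helper and the data are made up.

```python
import numpy as np
import pandas as pd


def group_completeness(df: pd.DataFrame, level) -> pd.Series:
    """Mean detection rate per group of samples, lowest first."""
    # Every sample has the same number of genes, so averaging the
    # per-sample rates equals the pooled fraction of detected cells.
    per_sample = df.notna().mean(axis=0)
    return per_sample.groupby(level=level).mean().sort_values()


columns = pd.MultiIndex.from_tuples(
    [("SF", "a"), ("SF", "b"), ("FBS", "c"), ("FBS", "d")],
    names=["Medium_Condition", "Sample"],
)
toy = pd.DataFrame(
    [[1.0, np.nan, 1.0, 1.0], [np.nan, np.nan, 1.0, np.nan]],
    index=["g1", "g2"],
    columns=columns,
)

summary = group_completeness(toy, "Medium_Condition")
sizes = toy.columns.get_level_values("Medium_Condition").value_counts()
if summary.size > 1:  # nothing to compare with a single group
    for name, frac in summary.items():
        print(f"  {name:<15}{frac:>5.0%}  (n={sizes[name]})")
```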

Split by factor

Parameter	Type	Default	Description
`split_by`	`int | str | None`	`None`	Split into side-by-side panels by this column level

fig = missing_matrix(df, split_by="Medium_Condition", annotation_levels=[0])

Each panel is independently clustered. The split level is automatically removed from annotation strips.
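The splitting step can be approximated in plain pandas: select the columns for each value of the chosen level, then drop that level so it no longer appears among the annotations. `split_columns` is a hypothetical helper, shown only to illustrate the idea.

```python
import numpy as np
import pandas as pd


def split_columns(df: pd.DataFrame, level) -> dict:
    """Return one sub-frame per value of a column level, with that level dropped."""
    values = df.columns.get_level_values(level)
    panels = {}
    for key in values.unique():
        sub = df.loc[:, values == key]
        panels[key] = sub.droplevel(level, axis=1)
    return panels


columns = pd.MultiIndex.from_tuples(
    [("SF", "a"), ("SF", "b"), ("FBS", "c"), ("FBS", "d")],
    names=["Medium_Condition", "Sample"],
)
toy = pd.DataFrame(
    [[1.0, np.nan, 1.0, 1.0], [np.nan, np.nan, 1.0, np.nan]],
    index=["g1", "g2"],
    columns=columns,
)

panels = split_columns(toy, "Medium_Condition")
for key, panel in panels.items():
    print(key, list(panel.columns))
```

Each sub-frame could then be clustered on its own, which matches the description of independently clustered panels.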

Output

Parameter	Type	Default	Description
`save`	`str | None`	`None`	Save figure to this path
`dpi`	`int`	`150`	Save resolution

`missing_matrix_html()` -- interactive HTML

Plotly-based interactive version with hover tooltips showing gene name, sample ID, all annotation levels, and detection status.

from mismap_qc import missing_matrix_html

missing_matrix_html(
    df,
    title="Gene Detection Matrix (Interactive)",
    subtitle="80 genes x 30 samples",
    completeness_threshold=0.5,
    save="output/interactive.html",
)

Supports the same clustering, sorting, annotation, and completeness options as the static version. Additional parameters:

Parameter	Type	Default	Description
`width`	`int | None`	`None`	Plot width in pixels (auto-calculated if `None`)
`height`	`int | None`	`None`	Plot height in pixels (auto-calculated if `None`)

Requires plotly. Install it with pip install plotly, or run demo.py, which already lists it in its PEP 723 dependencies.

Generating toy data

uv run make_toy_data.py

Creates data/toy_rnaseq.csv: 80 genes x 30 samples with structured missingness patterns across 6 groups (Fresh/Conditioned x SF/FBS/AS).
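For a sense of what such a dataset looks like, the sketch below generates a frame of the same shape with group-dependent dropout. It is not the contents of make_toy_data.py: the dropout rates, the sample and gene names, the `Sample` level name, and the random seed are all assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
types = ["Fresh", "Conditioned"]
conditions = ["SF", "FBS", "AS"]
reps = 5  # 2 types x 3 conditions x 5 replicates = 30 samples

columns = pd.MultiIndex.from_tuples(
    [
        (t, c, f"{t[0]}_{c}_{r + 1}")
        for t in types
        for c in conditions
        for r in range(reps)
    ],
    names=["Medium_Type", "Medium_Condition", "Sample"],
)
genes = [f"Gene{i + 1:03d}" for i in range(80)]
values = rng.lognormal(mean=2.0, sigma=1.0, size=(80, 30))

# Assumed per-condition dropout rates, chosen only to give visible structure.
dropout = {"SF": 0.4, "FBS": 0.1, "AS": 0.2}
rates = np.array([dropout[c] for c in columns.get_level_values("Medium_Condition")])
mask = rng.random((80, 30)) < rates  # one rate per column, broadcast over genes
values[mask] = np.nan

toy = pd.DataFrame(values, index=genes, columns=columns)
print(toy.shape, f"{toy.isna().to_numpy().mean():.0%} missing overall")
```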

Dependencies

numpy
matplotlib
scipy
pandas
plotly (optional, for HTML export only)
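Because plotly is optional, calling code may want to check for it before choosing the HTML path. A minimal sketch using only the standard library; the `has_module` helper and the `backend` variable are illustrative and not part of mismap-qc.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Report whether a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


# Pick the interactive output only when plotly is available.
backend = "html" if has_module("plotly") else "static"
print("using", backend, "output")
```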


Release: 0.1.0 (Mar 31, 2026)

