Skip to main content

Headless Venn diagram analysis and rendering — companion to the Venn Diagram Lab web tool.

Project description

venn-diagram-lab

CI PyPI version Python versions License: MIT

Headless Python companion to the Venn Diagram Lab web tool. Build, render, and statistically analyse Venn / UpSet diagrams from CSV / TSV / GMT / GMX inputs — same 44 SVG models, same intersection/Jaccard/hypergeometric statistics, same PDF report layout — but in a notebook, a Snakemake rule, or a CI job, with no browser involved.

Install

pip install venn-diagram-lab

That's it — all bundled SVG templates, sample datasets, and the vdl CLI ship with the wheel.

System deps (cairosvg): the PDF/PNG render path uses cairosvg, which needs the cairo native library. On Linux the wheel works out of the box once you have libcairo2. On macOS run brew install cairo pango. On Windows install the GTK3 runtime (cairo bundled). CI is currently Linux-only for this reason — see CHANGELOG.md "Known limitations".

For development (clone + editable):

git clone https://github.com/ZoliQua/Venn-Diagram-Lab.git
cd Venn-Diagram-Lab
python -m venv .venv && source .venv/bin/activate
python python/scripts/sync_data.py        # populates _data/ from the React side
pip install -e "python/[dev]"

Quickstart (30 seconds)

from venn_diagram_lab import load_sample, analyze

result = analyze(load_sample("dataset_real_cancer_drivers_4"))
print(result.set_sizes)
# {'Vogelstein': 138, 'COSMIC_CGC': 581, 'OncoKB': 1231, 'IntOGen': 633}

# Render the Venn diagram
result.render_venn().save("cancer_drivers.svg")

# Or a full multi-page PDF report
result.to_pdf_report("cancer_drivers_report.pdf")

Loading your own data

from venn_diagram_lab import load_csv, load_tsv, load_gmt, load_gmx, Dataset, analyze

# Binary 0/1 columns
ds = load_csv("genes.csv", binary=True)

# Aggregated (each column = a set, cells = item names)
ds = load_csv("pathways.csv", binary=False)

# GMT (one set per line)
ds = load_gmt("hallmark.gmt")

# In-memory dict
ds = Dataset.from_dict({
    "Set A": ["x", "y", "z"],
    "Set B": ["y", "z", "w"],
})

result = analyze(ds)

Visualisations

Method Output Best for
result.render_venn() SVG (vector) Publication; up to 9 sets via 44 bundled templates
result.render_venn(model='proportional') SVG Area-proportional; 2-3 sets only
result.render_upset() matplotlib Figure 5+ sets where Venn is hard to read
result.render_network() matplotlib Figure Pairwise relationships at a glance
result.to_pdf_report(path) Multi-page PDF One-shot publication-ready report

All visualisation methods accept the same kwargs as the underlying render.* functions — see their docstrings for full reference.

Statistics

stats = result.statistics       # lazy compute
print(stats.jaccard)            # square pandas DataFrame
print(stats.hypergeometric)     # long-form: pair, intersection, expected, p_value, p_adjusted, ...

compute_pairwise produces 5 metric tables: Jaccard, Sørensen-Dice, Overlap Coefficient, Fold Enrichment, and the hypergeometric long-form (with Benjamini-Hochberg FDR correction).

Export to TSV (matches the web tool byte-for-byte)

result.to_region_summary_tsv("regions.tsv")     # depth-sorted region table
result.to_matrix_tsv("matrix.tsv")              # one row per item with set membership
result.to_statistics_tsv("statistics.tsv")      # pairwise stats with FDR

These match the React web tool's three Export buttons exactly — including float formatting and spreadsheet-formula escaping. The Phase 7 parity tests (pytest python/tests/test_parity_with_webapp.py) prove this for all 5 bundled samples.

Command-line interface

The wheel installs a vdl console script:

Command Purpose
vdl version Print the package version
vdl list-models Table of the 44 bundled SVG models
vdl list-samples Table of bundled sample datasets
vdl analyze <input> [--model M] [--mode binary|aggregated] [--format csv|tsv|gmt|gmx] [--output-dir D] [--venn FILE] [--upset FILE] [--network FILE] [--pdf FILE] [--statistics-tsv FILE] Main entry point: load, analyse, optionally write outputs
vdl render-sample <name> [...same output flags...] Bundled-sample shortcut

Without any output flags, both commands print a Rich-styled summary table. With --output-dir, all five outputs (svg, png upset, png network, pdf, tsv) are written.

Notebook gallery

Eight executable notebooks live under python/examples/:

# Notebook Topic
01 01_quickstart.ipynb First analysis in 10 cells
02 02_real_cancer_drivers.ipynb Biological walkthrough (cancer driver catalogs)
03 03_proportional_diagrams.ipynb Area-proportional 2/3-set demos
04 04_upset_vs_venn_vs_network.ipynb Choosing the right visualisation
05 05_statistics_deep_dive.ipynb Jaccard / Dice / Hypergeometric / BH-FDR
06 06_pipeline_integration.ipynb Snakemake + Nextflow templates
07 07_pdf_reports.ipynb Multi-page PDF reports
08 08_custom_styling_and_export.ipynb lxml SVG post-processing + multi-format export

Each notebook is built from a python/scripts/notebooks/_build_NN_*.py script and executed nightly on CI to prevent bit-rot.

Bundled sample datasets

Name Sets Items Source
dataset_real_cancer_drivers_4 4 1394 Vogelstein / COSMIC CGC / OncoKB / IntOGen catalogs
dataset_real_msigdb_cancer_pathways 5 777 MSigDB Hallmark cancer pathways
dataset_real_msigdb_immune_pathways 4 521 MSigDB Hallmark immune pathways
dataset_mock_gene_sets 6 3288 Synthetic for demos
dataset_mock_streaming_platforms 8 800 TV/movie titles across 8 streaming services
from venn_diagram_lab import list_samples, load_sample
list_samples()
ds = load_sample("dataset_real_cancer_drivers_4")

Contributing

The repo monorepos the React web tool and this Python package. After cloning:

cd Venn-Diagram-Lab
python -m venv .venv && source .venv/bin/activate
python python/scripts/sync_data.py
pip install -e "python/[dev]"
pytest python/tests/ -q

Run the slow notebook suite (~3 min):

pytest python/tests/test_notebooks.py -m slow

Regenerate the parity-test fixtures (requires Node 20+):

npm install
npm run fixtures:parity

Conventional commit prefixes used: feat(python):, fix(python):, chore(python):, docs(python):, test(python):.

Versioning

Strict SemVer. Pre-1.0 minor bumps may include behavior changes; see CHANGELOG.md.

License

MIT — see LICENSE.

Citation

If you use this package in research, please cite:

Dul Z., Ölbei M., Thomas N. S. B., Si Ammour A., Csikász-Nagy A. (2026).
Venn Diagram Lab — interactive Venn / UpSet diagrams (v2.0.0).
https://www.venndiagramlab.org/
DOI: 10.5281/zenodo.20000599

DOI

See CITATION.cff for machine-readable citation metadata.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

venn_diagram_lab-2.0.2.tar.gz (4.3 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

venn_diagram_lab-2.0.2-py3-none-any.whl (4.2 MB view details)

Uploaded Python 3

File details

Details for the file venn_diagram_lab-2.0.2.tar.gz.

File metadata

  • Download URL: venn_diagram_lab-2.0.2.tar.gz
  • Upload date:
  • Size: 4.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for venn_diagram_lab-2.0.2.tar.gz
Algorithm Hash digest
SHA256 ac81dca77274ade38c1174f8f09f34d8687f4eb887db38f973b71102d8045958
MD5 6677ba731dbb14423d9c08062fb83b55
BLAKE2b-256 da62d50d9dbef23b98f28d52540edc3d92bfa021a1ec94df1537d279bdccf6a0

See more details on using hashes here.

Provenance

The following attestation bundles were made for venn_diagram_lab-2.0.2.tar.gz:

Publisher: python-publish.yml on ZoliQua/Venn-Diagram-Lab

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file venn_diagram_lab-2.0.2-py3-none-any.whl.

File metadata

File hashes

Hashes for venn_diagram_lab-2.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 195086e822380bf70cf8c7e837544e7f440b7ad0e59d7dbbaf220de0fd89e004
MD5 6b72f9bfedcc6f8e8ac230e723cebd84
BLAKE2b-256 7a1a892f4cd68c6d7894e1580e1e90fe2e5d53cfc45c05c801f531433a224052

See more details on using hashes here.

Provenance

The following attestation bundles were made for venn_diagram_lab-2.0.2-py3-none-any.whl:

Publisher: python-publish.yml on ZoliQua/Venn-Diagram-Lab

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page