Skip to main content

Probabilistic Graph Cuts with Hypergeometric Bounds

Project description

PGCuts: Probabilistic Graph Cuts

Differentiable graph cut objectives for unsupervised clustering on pre-computed embeddings. PGCuts provides tight hypergeometric upper bounds on the expected Normalized Cut, enabling gradient-based optimization without spectral decomposition.

Paper: Beyond Spectral Clustering: Probabilistic Cuts for Differentiable Graph Partitioning (AISTATS 2026)

Docs: pgcuts.readthedocs.io

Quick Start

from pgcuts import HyCut

# Like sklearn
labels = HyCut(n_clusters=10).fit_predict(X)

Works on any pre-computed embeddings (DINOv2, CLIP, etc.):

import numpy as np
from pgcuts import HyCut

X = np.load("dinov2_embeddings.npy")  # (50000, 1536)
model = HyCut(n_clusters=100, objective="hyp_ncut")
labels = model.fit_predict(X)

print(model.ncut_)  # normalized cut value
print(model.rcut_)  # ratio cut value

Installation

pip install pgcuts

Or from source:

git clone https://github.com/ayghri/pgcuts.git
cd pgcuts && pip install -e .

Requires Python >= 3.11, PyTorch >= 2.8 with CUDA, and Triton.

Algorithms

Three probabilistic graph cut objectives:

Objective Envelope What it optimizes
PRCut 1/p_bar Expected Ratio Cut / cluster size
H-RCut 2F1(-m, 1; 2; alpha) Hypergeometric bound on expected Ratio Cut
H-NCut Holder-binned 2F1 per degree bin Hypergeometric bound on expected Normalized Cut

For custom training loops, use the step functions directly:

from pgcuts.algorithms import prcut_step, hyp_rcut_step, hyp_ncut_step

Each takes a batch of edges and returns (cut_loss, balance_loss, updated_ema).

Key Components

Graph Construction

from pgcuts.graph import build_rbf_knn_graph
W = build_rbf_knn_graph(X, n_neighbors=50)

Hypergeometric Function (GPU-accelerated via Triton/CUDA)

from pgcuts import Hyp2F1
H = Hyp2F1.apply(-512, 1.0, 2.0, z)  # differentiable w.r.t. z

Gradient Mixer (prevents cluster collapse)

from pgcuts import GradientMixer

gm = GradientMixer(model.named_parameters(), loss_scale={"cut": 1.0, "balance": 1.0})
with gm("cut"):
    cut_loss.backward(retain_graph=True)
with gm("balance"):
    balance.backward()

Evaluation

from pgcuts import evaluate_clustering, compute_rcut_ncut

results = evaluate_clustering(y_true, y_pred, K)
rcut, ncut = compute_rcut_ncut(W, y_pred)

Reproducing Results

pip install "pgcuts[experiments]"

python scripts/benchmark.py --multirun \
    dataset=cifar10,cifar100,stl10,eurosat,imagenette,fashionmnist,mnist,pets,flowers,food101,resisc45,dtd,gtsrb,cub,aircraft \
    model=dinov2,dinov3b,clipvitL14 \
    objective=prcut,hyp_rcut,hyp_ncut

Results saved to results/ as JSON. Embeddings loaded via embedata.

Package Structure

pgcuts/
    cluster.py      # HyCut (sklearn-compatible API)
    algorithms/     # prcut_step, hyp_rcut_step, hyp_ncut_step
    hyp2f1/         # GPU 2F1 kernels (Triton + CUDA)
    losses/         # PRCut, H-RCut, H-NCut, FlashCut
    graph.py        # KNN graph construction
    metrics.py      # ACC, NMI, ARI, RCut, NCut
    optim.py        # GradientMixer
    utils/          # Edge sampling, data utilities

Citation

@inproceedings{ghriss2026pgcuts,
    title={Beyond Spectral Clustering: Probabilistic Cuts for Differentiable Graph Partitioning},
    author={Ghriss, Ayoub},
    booktitle={The 29th International Conference on Artificial Intelligence and Statistics (AISTATS)},
    year={2026}
}

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pgcuts-0.1.0.tar.gz (32.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pgcuts-0.1.0-py3-none-any.whl (39.7 kB view details)

Uploaded Python 3

File details

Details for the file pgcuts-0.1.0.tar.gz.

File metadata

  • Download URL: pgcuts-0.1.0.tar.gz
  • Upload date:
  • Size: 32.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.2 CPython/3.13.11 Linux/6.12.74-gentoo-x86_64

File hashes

Hashes for pgcuts-0.1.0.tar.gz
Algorithm Hash digest
SHA256 91530101d05845f75ad6b986348da03739add973a083267e30c8d1a11414ff41
MD5 cc93ca3994926d78fa2ee66b1be438fa
BLAKE2b-256 efadb4b65f5bb5ab86fcba0fd322738bb94ff84e9bb1460c8774482b73df5e15

See more details on using hashes here.

File details

Details for the file pgcuts-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pgcuts-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 39.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.2 CPython/3.13.11 Linux/6.12.74-gentoo-x86_64

File hashes

Hashes for pgcuts-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4da5f78cc53ba3e8067ed2d3f0e9c4da70da120e7a4a0311028277e949bc2944
MD5 3fb2449d67bfafdad3b0810c7569fca6
BLAKE2b-256 e75ef1e0fc8da1d1aed566896a14f3fb998ef664c875737f0a844b65191d457e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page