
Hierarchical agglomerative clustering with soft constraints (SciPy-compatible Z).


Constrained Hierarchical Agglomerative Clustering

This repository contains the implementation of the constrained linkage function for Constrained Hierarchical Agglomerative Clustering from the paper:

HEAT: Hierarchical-constrained Encoder-Assisted Time series clustering for fault detection in district heating substations
Jonne van Dreven, Abbas Cheddad, Ahmad Nauman Ghazi, Sadi Alawadi, Jad Al Koussa, Dirk Vanhoudt
Energy and AI, 21 (2025), 100548
DOI: 10.1016/j.egyai.2025.100548

If you use this library in academic or scientific work, please cite:

@article{van_Dreven-HEAT,
  title={HEAT: Hierarchical-constrained Encoder-Assisted Time series clustering for fault detection in district heating substations},
  volume={21},
  ISSN={2666-5468},
  DOI={10.1016/j.egyai.2025.100548},
  journal={Energy and AI},
  author={van Dreven, Jonne and Cheddad, Abbas and Ghazi, Ahmad Nauman and Alawadi, Sadi and Al Koussa, Jad and Vanhoudt, Dirk},
  year={2025},
  month=sep,
  pages={100548}
}

A NumPy-only hierarchical agglomerative clustering routine with soft constraints, returning a SciPy-compatible linkage matrix Z.
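For readers unfamiliar with the SciPy linkage format, the sketch below walks a small hand-written Z. It assumes only the standard SciPy convention: Z has n-1 rows, each row is (cluster a, cluster b, merge distance, size of the new cluster), and the cluster created by row i gets id n+i. The values and the helper name members_per_merge are illustrative and are not part of this library.

```python
import numpy as np

# Hand-written linkage matrix for n = 4 points (illustrative values only).
Z = np.array([
    [0, 1, 0.1, 2],    # merge points 0 and 1   -> new cluster id 4
    [2, 3, 0.1, 2],    # merge points 2 and 3   -> new cluster id 5
    [4, 5, 10.0, 4],   # merge clusters 4 and 5 -> new cluster id 6
])

def members_per_merge(Z):
    """Return the original point indices contained in each merged cluster."""
    n = Z.shape[0] + 1
    members = {i: [i] for i in range(n)}  # singleton clusters 0..n-1
    merged_sets = []
    for step, (a, b, _dist, count) in enumerate(Z):
        merged = sorted(members[int(a)] + members[int(b)])
        # The fourth column must equal the size of the new cluster.
        assert len(merged) == int(count)
        members[n + step] = merged
        merged_sets.append(merged)
    return merged_sets

print(members_per_merge(Z))  # [[0, 1], [2, 3], [0, 1, 2, 3]]
```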

✨ Features

  • Constrained linkage routine supporting these linkage methods:
    • single, complete, average, weighted, centroid, median, ward
  • Accepts either:
    • a condensed 1-D distance vector (length n*(n-1)/2)
    • a square n×n distance matrix
  • Adds soft constraints:
    • Must-link / Cannot-link via a constraint matrix M
      • M[i,j] < 0 → encourages merging (must-link)
      • M[i,j] > 0 → discourages merging (cannot-link)
    • Min/max cluster size penalties (linear in violation amount)
  • No SciPy dependency — output Z works with SciPy’s downstream tools.
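The sign convention for M listed above can be illustrated with a small helper that builds a symmetric constraint matrix from lists of index pairs. The helper build_constraint_matrix and its strength parameter are hypothetical and not part of the library; only the sign convention (negative encourages, positive discourages) comes from the feature list.

```python
import numpy as np

def build_constraint_matrix(n, must_link=(), cannot_link=(), strength=1.0):
    """Hypothetical helper: symmetric n x n matrix of soft constraints.

    Negative entries encourage a merge (must-link),
    positive entries discourage it (cannot-link), zero means no constraint.
    """
    M = np.zeros((n, n), dtype=float)
    for i, j in must_link:
        M[i, j] = M[j, i] = -strength
    for i, j in cannot_link:
        M[i, j] = M[j, i] = strength
    return M

# Encourage merging points 0 and 2, discourage merging points 0 and 1.
M = build_constraint_matrix(4, must_link=[(0, 2)], cannot_link=[(0, 1)])
print(M)
```

A matrix built this way would then be passed as constraint_matrix=M, as in the usage example.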

🔧 Install

pip install constrained-linkage
# from source:
pip install "git+https://github.com/jonnevd/constrained-linkage"

🚀 Usage Example

import numpy as np
from constrained_linkage import constrained_linkage
# SciPy is used here only for fcluster; constrained_linkage itself does not need it
from scipy.cluster import hierarchy

# ==== Example 1: Using a constraint matrix ====

# 4 points in 1D space
X = np.array([[0.0], [0.1], [10.0], [10.1]])
D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))

# Constraint matrix: discourage merging points 0 and 1 (cannot-link)
M = np.zeros_like(D)
M[0, 1] = M[1, 0] = 1.0   # Positive values discourage merges
# Could also use negative values to encourage must-link merges

# Run constrained linkage
Z_con = constrained_linkage(
    D, method="average", 
    constraint_matrix=M, 
    normalize_distances=True
)

# Cluster into 2 groups
labels_con = hierarchy.fcluster(Z_con, 2, criterion="maxclust")
print("Cluster labels (with cannot-link constraint):", labels_con)


# ==== Example 2: Enforcing a maximum cluster size ====

# 6 points in 1D space (three tight pairs)
X = np.array([[0.0], [0.1], [5.0], [5.1], [10.0], [10.1]])
D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))

# Run constrained linkage with max cluster size = 2
Z_max_size = constrained_linkage(
    D, method="average",
    max_cluster_size=2,
    max_penalty_weight=0.5,
    normalize_distances=True
)

# Cluster into 3 groups (the soft size penalty discourages clusters larger than 2)
labels_max = hierarchy.fcluster(Z_max_size, 3, criterion="maxclust")
print("Cluster labels (with max size = 2):", labels_max)
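
The feature list describes the size penalties as linear in the violation amount. The sketch below shows one way such a penalty could be computed for the maximum-size case, reusing the values from Example 2. The function max_size_penalty is hypothetical, and how the library actually combines the penalty with the merge distance is not shown in this README, so treat this as an assumption rather than the library's implementation.

```python
def max_size_penalty(merged_size, max_cluster_size, max_penalty_weight):
    """Hypothetical linear penalty: weight times the number of points over the limit."""
    violation = max(0, merged_size - max_cluster_size)
    return max_penalty_weight * violation

# Merging two pairs gives a cluster of 4 points, which is 2 over the limit of 2.
print(max_size_penalty(4, max_cluster_size=2, max_penalty_weight=0.5))  # 1.0
# A merge that stays within the limit is not penalized.
print(max_size_penalty(2, max_cluster_size=2, max_penalty_weight=0.5))  # 0.0
```

Because the penalty is soft, a large enough distance gap can still outweigh it; the size limit is discouraged rather than strictly forbidden.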
