
Hierarchical agglomerative clustering with soft constraints (SciPy-compatible Z).


Constrained Hierarchical Agglomerative Clustering

This repository contains the implementation of the constrained linkage function for Constrained Hierarchical Agglomerative Clustering from the paper:

HEAT: Hierarchical-constrained Encoder-Assisted Time series clustering for fault detection in district heating substations
Jonne van Dreven, Abbas Cheddad, Ahmad Nauman Ghazi, Sadi Alawadi, Jad Al Koussa, Dirk Vanhoudt
Energy and AI, 21 (2025), 100548
DOI: 10.1016/j.egyai.2025.100548

If you use this library in academic or scientific work, please cite:

@article{van_Dreven-HEAT,
  title={HEAT: Hierarchical-constrained Encoder-Assisted Time series clustering for fault detection in district heating substations},
  volume={21},
  ISSN={2666-5468},
  DOI={10.1016/j.egyai.2025.100548},
  journal={Energy and AI},
  author={van Dreven, Jonne and Cheddad, Abbas and Ghazi, Ahmad Nauman and Alawadi, Sadi and Al Koussa, Jad and Vanhoudt, Dirk},
  year={2025},
  month=sep,
  pages={100548}
}

A NumPy-only hierarchical agglomerative clustering routine with soft constraints, returning a SciPy-compatible linkage matrix Z.
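To make "SciPy-compatible linkage matrix" concrete, the sketch below checks the layout SciPy documents for Z: n-1 rows and 4 columns (the two merged cluster indices, the merge distance, and the size of the new cluster). The helper name `check_linkage_shape` and the hand-written `Z_example` are illustrative assumptions, not part of this library.

```python
import numpy as np

def check_linkage_shape(Z, n):
    """Hypothetical helper: basic layout check of a linkage matrix for n points."""
    Z = np.asarray(Z, dtype=float)
    if Z.shape != (n - 1, 4):
        return False
    for i, (a, b, dist, count) in enumerate(Z):
        # Indices below n are original points; index n + i names the
        # cluster created by row i, so row i may only refer to ids < n + i.
        if not (0 <= a < n + i and 0 <= b < n + i) or a == b:
            return False
        # Merge distances are non-negative; a merged cluster has 2..n members.
        if dist < 0 or count < 2 or count > n:
            return False
    # The last merge must contain every point.
    return bool(Z[-1, 3] == n)

# Hand-written linkage for 4 points: pairs (0, 1) and (2, 3) merge first,
# then the two resulting clusters (ids 4 and 5) merge.
Z_example = np.array([
    [0, 1, 0.1, 2],
    [2, 3, 0.1, 2],
    [4, 5, 10.0, 4],
])
print(check_linkage_shape(Z_example, n=4))
```

When SciPy is available, `scipy.cluster.hierarchy.is_valid_linkage` performs a fuller check of the same format.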

✨ Features

  • Drop-in constrained linkage routine supporting these linkage methods:
    • single, complete, average, weighted, centroid, median, ward
  • Accepts either:
    • condensed 1-D distances (len n*(n-1)/2)
    • n×n square distance matrix
  • Adds soft constraints:
    • Must-link / Cannot-link via a constraint matrix M
      • M[i,j] < 0 → encourages merging (must-link)
      • M[i,j] > 0 → discourages merging (cannot-link)
    • Min/max cluster size penalties (linear in violation amount)
  • No SciPy dependency — output Z works with SciPy’s downstream tools.
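The sign convention for M listed above can be sketched as a small builder that turns pair lists into a symmetric constraint matrix. The helper `build_constraint_matrix` and its `strength` argument are hypothetical names introduced for illustration; the library itself only expects the finished matrix.

```python
import numpy as np

def build_constraint_matrix(n, must_link=(), cannot_link=(), strength=1.0):
    """Hypothetical helper: symmetric n x n constraint matrix from index pairs."""
    M = np.zeros((n, n), dtype=float)
    for i, j in must_link:
        M[i, j] = M[j, i] = -strength  # negative entry: encourage merging
    for i, j in cannot_link:
        M[i, j] = M[j, i] = strength   # positive entry: discourage merging
    return M

# Encourage merging points 0 and 2, discourage merging points 0 and 1.
M = build_constraint_matrix(4, must_link=[(0, 2)], cannot_link=[(0, 1)])
print(M)
```

Unlisted pairs stay at zero, which leaves their distances unaffected by the constraint term under the convention described above.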

🔧 Install

# from source:
pip install "git+https://github.com/jonnevd/constrained-linkage"

🚀 Usage Example

import numpy as np
from constrained_linkage import constrained_linkage
# SciPy is needed only for the downstream fcluster calls below
from scipy.cluster import hierarchy

# ==== Example 1: Using a constraint matrix ====

# 4 points in 1D space
X = np.array([[0.0], [0.1], [10.0], [10.1]])
D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))

# Constraint matrix: discourage merging points 0 and 1 (cannot-link)
M = np.zeros_like(D)
M[0, 1] = M[1, 0] = 1.0   # Positive values discourage merges
# Could also use negative values to encourage must-link merges

# Run constrained linkage
Z_con = constrained_linkage(
    D, method="average", 
    constraint_matrix=M, 
    normalize_distances=True
)

# Cluster into 2 groups
labels_con = hierarchy.fcluster(Z_con, 2, criterion="maxclust")
print("Cluster labels (with cannot-link constraint):", labels_con)


# ==== Example 2: Enforcing a maximum cluster size ====

# 6 points in 1D space (three tight pairs)
X = np.array([[0.0], [0.1], [5.0], [5.1], [10.0], [10.1]])
D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))

# Run constrained linkage with max cluster size = 2
Z_max_size = constrained_linkage(
    D, method="average",
    max_cluster_size=2,
    max_penalty_weight=0.5,
    normalize_distances=True
)

# Cluster into 3 groups (the size penalty is soft: it discourages clusters larger than 2)
labels_max = hierarchy.fcluster(Z_max_size, 3, criterion="maxclust")
print("Cluster labels (with max size = 2):", labels_max)
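To show what a penalty that is "linear in violation amount" could look like for Example 2, here is a minimal sketch. The formula is an assumption made for illustration and is not taken from the library's source; the function `max_size_penalty` is hypothetical, and only the parameter names `max_cluster_size` and `max_penalty_weight` come from the example above.

```python
def max_size_penalty(merged_size, max_cluster_size, max_penalty_weight):
    """Assumed form: penalty grows linearly with how far a merge exceeds the limit."""
    violation = max(0, merged_size - max_cluster_size)
    return max_penalty_weight * violation

# With the settings from Example 2, a merge that stays within the limit
# is not penalized, and each extra member adds the same fixed amount.
for size in (2, 3, 4):
    print(size, max_size_penalty(size, max_cluster_size=2, max_penalty_weight=0.5))
```

Under this assumed form the penalty never forbids a merge outright, which is what makes the constraint soft.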
