Max-Min Hill-Climbing (MMHC) Bayesian network structure learning algorithm
Project description
MMHC — Max-Min Hill-Climbing Algorithm
A Python implementation of the Max-Min Hill-Climbing (MMHC) Bayesian network structure learning algorithm.
Based on: Tsamardinos, Brown & Aliferis, "The max-min hill-climbing Bayesian network structure learning algorithm", Machine Learning, 2006. DOI: 10.1007/s10994-006-6889-7.
Installation
pip install -e ".[dev]"
Quick Start
from mmhc import MMHC, make_student, MMHCConfig
# Generate synthetic data from a known Bayesian network
data = make_student(5000, random_state=42)
# Learn the network structure
model = MMHC(config=MMHCConfig(random_seed=42))
result = model.fit(data)
# Inspect results
print(result.adjacency_matrix)
print(f"BDeu score: {result.score}")
print(f"Converged: {result.converged} in {result.n_iterations} iterations")
Or use the convenience function:
from mmhc import mmhc, make_student
data = make_student(5000, random_state=42)
result = mmhc(data)
Algorithm
The MMHC algorithm reconstructs Bayesian Networks from observational data in two phases:
Phase 1: MMPC (Max-Min Parents and Children)
Builds the undirected skeleton using conditional independence tests (G-test / chi-squared):
$$G = 2 \sum_{i,j} O_{ij} \ln\frac{O_{ij}}{E_{ij}}$$
The forward phase iteratively adds the most dependent variable to the candidate parent-children set. The backward phase removes variables that become conditionally independent given subsets of the current set. A symmetry constraint ensures consistency.
Phase 2: Hill-Climbing with BDeu Scoring
Directs edges using greedy hill-climbing with BDeu (Bayesian Dirichlet likelihood-equivalence uniform) scoring:
$$\text{BDeu}(X_i, \Pi_i) = \sum_{j=1}^{q_i} \left[ \ln\frac{\Gamma(\eta/q_i)}{\Gamma(N_{ij} + \eta/q_i)} + \sum_{k=1}^{r_i} \ln\frac{\Gamma(N_{ijk} + \eta/(q_i r_i))}{\Gamma(\eta/(q_i r_i))} \right]$$
Edge operations (add, reverse, delete) are applied greedily, with cycle detection to maintain DAG validity. Early stopping triggers after 5 non-improving rounds.
Configuration
| Parameter | Default | Description |
|---|---|---|
alpha |
0.05 | Significance level for conditional independence tests |
eta |
1.0 | Equivalent sample size for BDeu scoring |
max_iterations |
100 | Maximum hill-climbing iterations |
early_stop_rounds |
5 | Stop after N non-improving rounds |
random_seed |
None | Seed for reproducibility |
API Reference
Classes
MMHC(config=None)— Main class withfit(data),fit_predict(data)methodsMMHCConfig(...)— Configuration dataclassMMHCResult— Result dataclass withadjacency_matrix,parent_children,score,node_scores,n_iterations,converged,column_names
Functions
mmhc(data, config=None)— Convenience function for one-shot usagemake_student(n_samples, random_state)— Generate student network datamake_rainy(n_samples, random_state)— Generate sprinkler/rain network dataplot_dag(adjacency, labels, ax, title)— Visualize the learned DAG
Use Cases
- Medical diagnosis networks: Learn dependencies between symptoms, diseases, and test results from patient records
- Gene regulatory network inference: Discover gene interaction networks from expression data
- Supply chain dependency analysis: Identify causal relationships between supply chain variables
Running Tests
pytest --cov=src/mmhc tests/ -v
Key Improvements Over Original R/C++ Implementation
- No conditioning variable limit: Original supported max 3 conditioning variables; Python version handles arbitrary numbers via vectorized contingency tables
- Cycle detection: DAG constraint enforced during hill-climbing (original did not check)
- Reproducibility: Deterministic with seed (original used
srand(time(NULL))) - Configurable parameters: All hardcoded values now configurable via
MMHCConfig
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file mmhc-2.0.0.tar.gz.
File metadata
- Download URL: mmhc-2.0.0.tar.gz
- Upload date:
- Size: 25.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
09f4c2c21d5e08f62b56df937c138a1a85469fdfebb6472bdb7967cebe77761f
|
|
| MD5 |
69a119bd9f555a8dced5f8f1f515c574
|
|
| BLAKE2b-256 |
2efc83da390ba941a2846ddfd32b3c6df671cd5e6a3404e5919381f0abc5e6f1
|
File details
Details for the file mmhc-2.0.0-py3-none-any.whl.
File metadata
- Download URL: mmhc-2.0.0-py3-none-any.whl
- Upload date:
- Size: 16.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a014a965e644f1476431a3c4a86c6c9802cbde8bce259cddad88b2a393caee38
|
|
| MD5 |
181e8b68a4757f89e216af1734dfeedb
|
|
| BLAKE2b-256 |
d3a3c4355b0fa910c83e6229a8b56e8d47ad87dfb2c5827dc086fe32d4994a02
|