A Python port of the bnlearn R package for Bayesian Network structure learning
bnlearn-py
A Python port of the popular R package bnlearn for Bayesian Network structure learning.
This project aims to provide a functionally identical implementation of the bnlearn Hill Climbing (hc) algorithm, ensuring exactly the same results as the R implementation on the same datasets.
Implementation Details
The implementation mirrors the logic of bnlearn's C backend:
- Algorithm: Hill Climbing with restarts and perturbation.
- Score: Discrete BIC (Bayesian Information Criterion), AIC, and Log-Likelihood.
- Search Strategy:
  - Greedy search with specific operation ordering: Add -> Delete -> Reverse.
  - Floating point tolerance matching R's machine epsilon behavior.
  - Cycle detection and invalid operation filtering.
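To make the bullets above concrete, here is a minimal sketch of a single greedy step. It is not the package's actual code: the names `local_bic`, `has_path`, `gain`, `best_operation`, and `TOL` are hypothetical, the tolerance value (square root of double-precision machine epsilon) is an assumption, and the tie-breaking rule (the first operation that beats the current best by more than the tolerance wins) is only one plausible reading of "specific operation ordering". The score is the BIC in the form log-likelihood minus (log n / 2) times the number of free parameters, with levels counted from the observed data.

```python
import math
import sys
from collections import Counter
from itertools import permutations

# Assumed tolerance: square root of double-precision machine epsilon.
TOL = math.sqrt(sys.float_info.epsilon)

def local_bic(data, node, parents=()):
    """BIC term of one node: log-likelihood - (log n / 2) * free parameters."""
    cols = sorted(parents) + [node]
    rows = list(zip(*(data[c] for c in cols)))
    joint = Counter(rows)                      # counts per (parent config, node value)
    marg = Counter(row[:-1] for row in rows)   # counts per parent config
    loglik = sum(c * math.log(c / marg[key[:-1]]) for key, c in joint.items())
    n_params = (len(set(data[node])) - 1) * math.prod(len(set(data[p])) for p in parents)
    return loglik - 0.5 * math.log(len(rows)) * n_params

def has_path(parents, src, dst, skip=None):
    """True if a directed path src -> ... -> dst exists, ignoring the arc `skip`."""
    stack, seen = [dst], set()
    while stack:                               # walk backwards from dst via parents
        cur = stack.pop()
        if cur == src:
            return True
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(p for p in parents[cur] if (p, cur) != skip)
    return False

def gain(data, parents, node, new_parents):
    """Score change of `node` when its parent set becomes `new_parents`."""
    return local_bic(data, node, new_parents) - local_bic(data, node, parents[node])

def best_operation(data, parents):
    """Scan Add, then Delete, then Reverse; return ((op, frm, to), delta) or (None, 0.0)."""
    best, best_delta = None, 0.0
    for op in ("add", "delete", "reverse"):
        for frm, to in permutations(parents, 2):
            present = frm in parents[to]
            if op == "add":
                # Skip existing arcs and arcs that would close a cycle.
                if present or has_path(parents, to, frm):
                    continue
                delta = gain(data, parents, to, parents[to] | {frm})
            elif not present:
                continue                       # delete/reverse need an existing arc
            elif op == "delete":
                delta = gain(data, parents, to, parents[to] - {frm})
            else:
                # Reversal is invalid if another path frm -> ... -> to remains.
                if has_path(parents, frm, to, skip=(frm, to)):
                    continue
                delta = (gain(data, parents, to, parents[to] - {frm})
                         + gain(data, parents, frm, parents[frm] | {to}))
            if delta > best_delta + TOL:       # must improve by more than the tolerance
                best, best_delta = (op, frm, to), delta
    return best, best_delta
```

Here `data` maps column names to sequences of values (a pandas DataFrame also works) and `parents` maps each node to the set of its current parents. A full hill climber would apply the returned operation and repeat until `best_operation` returns `None`; restarts and perturbation are left out of this sketch.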
Validation Results
We have conducted extensive verification by comparing the outputs of this Python implementation against the original R bnlearn package (v4.9) on identical datasets.
Method
- Reference Generation: Use R to generate synthetic data (Gaussian/Discrete) and learn a network using hc(data, score="bic"). Export the learned arcs and the final score.
- Reproduction: Load the same data in Python, run the ported hc function.
- Comparison: Assert exact equality of the arc set and floating-point equality of the score.
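The comparison step can be sketched as follows. This is an illustration only, not the project's test code: `compare_runs` and its argument names are hypothetical, arcs are assumed to be (from, to) pairs, and the default absolute tolerance of 1e-4 is an assumption chosen to reflect a match to four decimal places.

```python
import math

def compare_runs(r_arcs, py_arcs, r_score, py_score, abs_tol=1e-4):
    """Compare two learned networks given as iterables of (from, to) arcs."""
    r_set = {tuple(arc) for arc in r_arcs}
    py_set = {tuple(arc) for arc in py_arcs}
    return {
        # Directed arcs must match exactly; direction matters.
        "structure_match": r_set == py_set,
        "missing_in_python": sorted(r_set - py_set),
        "extra_in_python": sorted(py_set - r_set),
        # Scores are compared with an absolute tolerance, not bitwise.
        "score_diff": abs(r_score - py_score),
        "score_match": math.isclose(r_score, py_score, rel_tol=0.0, abs_tol=abs_tol),
    }
```

Reporting the missing and extra arcs separately makes a mismatch easier to diagnose than a bare boolean, since a single reversed arc shows up once in each list.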
Summary
| Dataset | Type | Nodes | Observations | Structure Match | Score Match |
|---|---|---|---|---|---|
| Small Test | Discrete | 5 | 1000 | ✅ Exact | ✅ Exact |
| Alarm (Subset) | Discrete | 37 | 2000 | ✅ Exact | ✅ Exact |
(Note: "Exact" structure match means the sets of directed edges are identical. Score match is verified to at least 4 decimal places; the differences reported in the logs below are on the order of 1e-11.)
Detailed Verification Logs
--- Comparing Results for: Small Network (5 Nodes) ---
Data shape: (1000, 5)
R Score: -2460.827068
R Arcs count: 5
Python Score: -2460.827068
Python Arcs count: 5
Python Execution Time: 0.0605s
Score Difference: 9.549694e-12
>> Scores MATCH ✅
>> Structures MATCH EXACTLY ✅
--- Comparing Results for: Large Network (37 Nodes, Subset) ---
Data shape: (2000, 37)
R Score: -23176.667311
R Arcs count: 47
Python Score: -23176.667311
Python Arcs count: 47
Python Execution Time: 19.7716s
Score Difference: 3.637979e-11
>> Scores MATCH ✅
>> Structures MATCH EXACTLY ✅
Usage
import pandas as pd
from bnlearn.learning import hc
from bnlearn.score import score_network
# Load your data (pandas DataFrame with categorical columns)
df = pd.read_csv("data.csv")
for col in df.columns:
    df[col] = df[col].astype('category')
# Learn structure
bn = hc(df, score='bic')
# Inspect learned arcs
print(bn.arcs)
# Calculate score
print(score_network(bn, df, score_type='bic'))
Running Tests
To verify the equivalence yourself:
- Install Dependencies:
  pip install pandas numpy scipy
  (You need R installed to regenerate reference data, but pre-generated data is included.)
- Generate Reference Data (Optional):
  Rscript tests/generate_reference.R
- Run Equivalence Tests:
  python3 -m unittest tests/test_equivalence.py
Performance
While strictly equivalent, this pure Python implementation is currently slower than the highly optimized C backend of bnlearn; the 37-node run in the logs above took about 20 seconds. It is intended for research, education, and environments where R is not available, or where Python-native integration matters more than raw speed on massive datasets.
File details
Details for the file bnlearn_py-0.1.0.tar.gz.
File metadata
- Download URL: bnlearn_py-0.1.0.tar.gz
- Upload date:
- Size: 12.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 23ff7cddd8762ce5b10ebb5c7d1fa76616d2a6f7685853b80d1389a1ac38d524 |
| MD5 | b972bed15a72c973f934ca453e2c49e2 |
| BLAKE2b-256 | e5a6572b1d44b1a0bdffcd39c147b2c1af30137adf83a772b30defaf2c95fc07 |
File details
Details for the file bnlearn_py-0.1.0-py3-none-any.whl.
File metadata
- Download URL: bnlearn_py-0.1.0-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4dad4477a161f31a0e5529c7688e5c33f9cf4320fd2bc215a2c1c6a6eb03061a |
| MD5 | 4fc7db5546c2fa9387c6cb7b3bbeb6ee |
| BLAKE2b-256 | c73b78db13ce0d462e6e74b2f1b063d4199311614f675f977178891371da9d67 |