bnlearn-py

A Python port of the popular R package bnlearn for Bayesian Network structure learning.

This project aims to provide a functionally identical implementation of the bnlearn Hill Climbing (hc) algorithm, ensuring exactly the same results as the R implementation on the same datasets.

Implementation Details

The implementation mirrors the logic of bnlearn's C backend:

  • Algorithm: Hill Climbing with restarts and perturbation.
  • Score: Discrete BIC (Bayesian Information Criterion), AIC, and Log-Likelihood.
  • Search Strategy:
    • Greedy search with specific operation ordering: Add -> Delete -> Reverse.
    • Floating point tolerance matching R's machine epsilon behavior.
    • Cycle detection and invalid operation filtering.
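
The search strategy above can be illustrated with a minimal sketch. It is not the package's actual code: creates_cycle, candidate_ops, best_op and EPS are hypothetical names, and the tolerance (square root of the machine epsilon) is an assumption about how "machine epsilon behavior" is applied. Candidates are enumerated in the order Add -> Delete -> Reverse, cyclic candidates are filtered out, and a later operation replaces the current best only if its score gain is larger by more than the tolerance.

```python
import sys

# Assumed tolerance: sqrt of the double-precision machine epsilon.
EPS = sys.float_info.epsilon ** 0.5


def creates_cycle(arcs, frm, to):
    """Return True if adding frm -> to would close a directed cycle,
    i.e. if `frm` is already reachable from `to` along existing arcs."""
    children = {}
    for a, b in arcs:
        children.setdefault(a, []).append(b)
    stack, seen = [to], set()
    while stack:
        node = stack.pop()
        if node == frm:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return False


def candidate_ops(nodes, arcs):
    """Yield legal operations in a fixed order: adds, then deletes, then reversals."""
    arcset = set(arcs)
    for a in nodes:
        for b in nodes:
            if a == b or (a, b) in arcset or (b, a) in arcset:
                continue
            if not creates_cycle(arcset, a, b):
                yield ("add", a, b)
    for a, b in sorted(arcset):
        yield ("delete", a, b)
    for a, b in sorted(arcset):
        # Reversing a -> b means removing it and adding b -> a.
        if not creates_cycle(arcset - {(a, b)}, b, a):
            yield ("reverse", a, b)


def best_op(ops_with_delta, eps=EPS):
    """Pick the operation with the largest score gain; a later candidate wins
    only if it beats the current best by more than eps, so ties go to the
    earlier operation in the enumeration order."""
    best, best_delta = None, 0.0
    for op, delta in ops_with_delta:
        if delta - best_delta > eps:
            best, best_delta = op, delta
    return best, best_delta
```

Because ties are resolved by enumeration order, a fixed operation ordering and a shared tolerance are what make two implementations pick the same arc at every step.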

Validation Results

We have conducted extensive verification by comparing the outputs of this Python implementation against the original R bnlearn package (v4.9) on identical datasets.

Method

  1. Reference Generation: Use R to generate synthetic data (Gaussian/Discrete) and learn a network using hc(data, score="bic"). Export the learned arcs and the final score.
  2. Reproduction: Load the same data in Python, run the ported hc function.
  3. Comparison: Assert exact equality of the arc set and floating-point equality of the score.
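
The comparison step can be sketched as follows. This is an illustrative helper, not the project's test code: compare_runs and its rel_tol default are assumptions, and arcs are assumed to be exported as (from, to) pairs. Arc sets are compared as sets so that ordering does not matter, and scores are compared with a tolerance because two floating-point implementations rarely agree to the last bit.

```python
import math


def compare_runs(r_arcs, py_arcs, r_score, py_score, rel_tol=1e-9):
    """Compare a reference result (from R) with a reproduced result (from Python)."""
    r_set = {tuple(arc) for arc in r_arcs}
    py_set = {tuple(arc) for arc in py_arcs}
    return {
        "structure_match": r_set == py_set,
        "missing": sorted(r_set - py_set),  # arcs in the R result only
        "extra": sorted(py_set - r_set),    # arcs in the Python result only
        "score_diff": abs(r_score - py_score),
        "score_match": math.isclose(r_score, py_score, rel_tol=rel_tol),
    }


# Hypothetical arcs, for illustration only.
report = compare_runs(
    r_arcs=[("A", "B"), ("B", "C")],
    py_arcs=[["B", "C"], ["A", "B"]],
    r_score=-2460.827068,
    py_score=-2460.827068,
)
```

Reporting the missing and extra arcs separately makes a mismatch easier to diagnose than a bare pass/fail flag.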

Summary

Dataset        | Type     | Nodes | Observations | Structure Match | Score Match
Small Test     | Discrete | 5     | 1000         | Exact           | Exact
Alarm (Subset) | Discrete | 37    | 2000         | Exact           | Exact

Note: an "Exact" structure match means the sets of directed arcs are identical. The score match is verified to at least 4 decimal places; in the logs below, the absolute score differences are on the order of 1e-11.

Detailed Verification Logs

--- Comparing Results for: Small Network (5 Nodes) ---
Data shape: (1000, 5)
R Score: -2460.827068
R Arcs count: 5
Python Score: -2460.827068
Python Arcs count: 5
Python Execution Time: 0.0605s
Score Difference: 9.549694e-12
>> Scores MATCH ✅
>> Structures MATCH EXACTLY ✅

--- Comparing Results for: Large Network (37 Nodes, Subset) ---
Data shape: (2000, 37)
R Score: -23176.667311
R Arcs count: 47
Python Score: -23176.667311
Python Arcs count: 47
Python Execution Time: 19.7716s
Score Difference: 3.637979e-11
>> Scores MATCH ✅
>> Structures MATCH EXACTLY ✅

Usage

import pandas as pd
from bnlearn.learning import hc
from bnlearn.score import score_network

# Load your data (pandas DataFrame with categorical columns)
df = pd.read_csv("data.csv")
for col in df.columns:
    df[col] = df[col].astype('category')

# Learn structure
bn = hc(df, score='bic')

# Inspect learned arcs
print(bn.arcs)

# Calculate score
print(score_network(bn, df, score_type='bic'))
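
For orientation, the per-node term of a discrete BIC can be sketched as below. This is a minimal sketch and not the package's score_network: local_bic, rows and levels are hypothetical names. It assumes the bnlearn convention that BIC is the log-likelihood minus (log n / 2) times the number of free parameters, so higher is better, and that parent configurations are counted as the product of the parents' level counts. The network score is the sum of these terms over all nodes.

```python
import math
from collections import Counter


def local_bic(rows, node, parents, levels):
    """BIC contribution of one node given its parents.

    rows   : list of dicts mapping variable name -> observed value
    levels : dict mapping variable name -> number of categories
    """
    n = len(rows)
    # Counts of (parent configuration, node value) and of parent configuration alone.
    joint = Counter((tuple(r[p] for p in parents), r[node]) for r in rows)
    marginal = Counter(tuple(r[p] for p in parents) for r in rows)
    # Maximum-likelihood log-likelihood: sum of N_ijk * log(N_ijk / N_ij).
    loglik = sum(c * math.log(c / marginal[pa]) for (pa, _), c in joint.items())
    # Free parameters: (levels of node - 1) * number of parent configurations.
    n_params = (levels[node] - 1) * math.prod(levels[p] for p in parents)
    return loglik - 0.5 * math.log(n) * n_params


# Tiny illustrative dataset where Y is an exact copy of X.
rows = [{"X": "a", "Y": "a"}, {"X": "a", "Y": "a"},
        {"X": "b", "Y": "b"}, {"X": "b", "Y": "b"}]
levels = {"X": 2, "Y": 2}
score_no_parent = local_bic(rows, "Y", [], levels)
score_with_parent = local_bic(rows, "Y", ["X"], levels)
```

In this toy case, adding X as a parent of Y raises the log-likelihood to zero, and the gain outweighs the extra parameter penalty, so the arc X -> Y improves the score.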

Running Tests

To verify the equivalence yourself:

  1. Install Dependencies:

    pip install pandas numpy scipy
    

    (You need R installed to regenerate reference data, but pre-generated data is included).

  2. Generate Reference Data (Optional):

    Rscript tests/generate_reference.R
    
  3. Run Equivalence Tests:

    python3 -m unittest tests/test_equivalence.py
    

Performance

Although it produces equivalent results, this pure-Python implementation is currently slower than bnlearn's highly optimized C backend; the 37-node network above took about 20 seconds to learn. It is intended for research, education, and environments where R is not available or where Python-native integration matters more than raw speed on large datasets.
