A Python port of the bnlearn R package for Bayesian Network structure learning

bnlearn-py

A Python port of the popular R package bnlearn for Bayesian Network structure learning, optimized with JAX.

This project provides a functionally equivalent implementation of the bnlearn Hill Climbing (hc) algorithm, designed to produce the same results as the R implementation on the same datasets (see Validation Results for the cases verified so far).

Implementation Details

The implementation mirrors the logic of bnlearn's C backend:

  • Algorithm: Hill Climbing with restarts and perturbation.
  • Scores: Discrete BIC, AIC, and log-likelihood.
  • Optimization: JAX-accelerated batched scoring using vmap and JIT.
  • Search Strategy:
    • Greedy search with specific operation ordering: Add -> Delete -> Reverse.
    • Floating point tolerance matching R's machine epsilon behavior.
    • Exact matching of R's loop order to ensure identical results in case of ties.
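
The search strategy above can be illustrated with a minimal sketch of how one greedy step might select a move. This is not the package's code: the function name pick_best_move, the candidate tuple layout, and the tolerance value (the square root of machine epsilon) are assumptions made for illustration.

```python
import math
import sys

# Operation priority, following the Add -> Delete -> Reverse order described above.
OP_ORDER = {"add": 0, "delete": 1, "reverse": 2}

# Assumed tolerance: sqrt of double-precision machine epsilon (about 1.49e-8).
TOL = math.sqrt(sys.float_info.epsilon)


def pick_best_move(candidates, tol=TOL):
    """Pick the move with the largest score improvement.

    candidates: iterable of (op, from_node, to_node, delta) tuples.
    A later candidate replaces the incumbent only if it is better by more
    than tol, so near-ties are resolved in favor of the earliest candidate
    in loop order.
    """
    best, best_delta = None, 0.0
    # sorted() is stable, so the node loop order within each operation is kept.
    ordered = sorted(candidates, key=lambda c: OP_ORDER[c[0]])
    for op, src, dst, delta in ordered:
        if delta - best_delta > tol:
            best, best_delta = (op, src, dst), delta
    return best, best_delta


moves = [
    ("reverse", "A", "B", 1.5),
    ("add", "B", "C", 1.5),
    ("delete", "C", "D", 1.5 + 1e-12),  # better, but only within tolerance
]
print(pick_best_move(moves))  # (('add', 'B', 'C'), 1.5)
```

Because the comparison is strict and uses a tolerance, two implementations that visit candidates in a different order can return different (equally scored) structures, which is why loop order matters for reproducing R's output.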

Validation Results

We verified this Python implementation by comparing its outputs against the original R bnlearn package (v4.9) on identical datasets.

Summary

Dataset         Type      Nodes  Observations  Structure Match  Score Match
Small Test      Discrete  5      1000          Exact            Exact
Alarm (Subset)  Discrete  37     2000          Exact            Exact

(Note: An "Exact" structure match means the two sets of directed edges are identical. Score matches are verified to the limits of double precision.)
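
As a rough illustration of the two checks described in the note, the sketch below compares arc sets and scores. The helper names compare_arcs and scores_match and the relative tolerance of 1e-12 are hypothetical; the project's actual test code may differ.

```python
import math


def compare_arcs(arcs_a, arcs_b):
    """Compare two collections of directed arcs given as (from, to) pairs.

    Direction matters: ("A", "B") and ("B", "A") are different arcs.
    """
    a = {tuple(arc) for arc in arcs_a}
    b = {tuple(arc) for arc in arcs_b}
    return {
        "match": a == b,
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
    }


def scores_match(score_a, score_b, rel_tol=1e-12):
    """Check that two network scores agree up to an assumed relative tolerance."""
    return math.isclose(score_a, score_b, rel_tol=rel_tol)


r_arcs = [("A", "B"), ("B", "C")]
py_arcs = [["B", "C"], ["A", "B"]]  # same arcs, different order and container
print(compare_arcs(r_arcs, py_arcs)["match"])      # True
print(compare_arcs(r_arcs, [("B", "A"), ("B", "C")]))
print(scores_match(-2460.827068, -2460.827068))    # True
```

Comparing sets rather than lists makes the check independent of the order in which each implementation reports its arcs.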

Detailed Verification Logs (JAX Optimized)

--- Comparing Results for: Small Network (5 Nodes) ---
Data shape: (1000, 5)
R Score: -2460.827068
Python Score: -2460.827068
>> Scores MATCH ✅
>> Structures MATCH EXACTLY ✅

--- Comparing Results for: Large Network (37 Nodes, Subset) ---
Data shape: (2000, 37)
R Score: -23176.667311
Python Score: -23176.667311
Python Execution Time: ~57s (including JIT compilation)
>> Scores MATCH ✅
>> Structures MATCH EXACTLY ✅

Usage

import pandas as pd
from bnlearn.learning import hc
from bnlearn.score import score_network

# Load your data (pandas DataFrame with categorical columns)
df = pd.read_csv("data.csv")
for col in df.columns:
    df[col] = df[col].astype('category')

# Learn structure
bn = hc(df, score='bic')

# Inspect learned arcs
print(bn.arcs)

# Calculate score
print(score_network(bn, df, score_type='bic'))

Running Tests

To verify equivalence with the R implementation:

  1. Install Dependencies:

    pip install .
    
  2. Run Equivalence Tests:

    PYTHONPATH=src python3 -m unittest tests/test_equivalence.py
    

Performance

This package leverages JAX for high-performance score calculation:

  • Vectorized Scoring: Using jax.vmap, we evaluate all possible edge candidates in batches.
  • XLA Compilation: Scoring kernels are JIT-compiled by XLA into native machine code.
  • Precision: 64-bit precision is enabled (jax_enable_x64) for numerical parity with R.
  • Packaging: Available on PyPI as bnlearn-py.
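
The batched scoring idea can be sketched as follows. This is an illustrative example under stated assumptions, not the package's kernel: the function node_bic, the count-table layout (parent configurations by child states), and the toy tables are hypothetical. It uses the BIC convention of R bnlearn, where the penalty is subtracted from the log-likelihood and higher scores are better.

```python
import jax
import jax.numpy as jnp

# Enable 64-bit floats before any arrays are created.
jax.config.update("jax_enable_x64", True)


def node_bic(counts, n_obs):
    """BIC contribution of one node from its contingency table.

    counts: array of shape (parent_configs, child_states).
    n_obs:  total number of observations in the dataset.
    """
    counts = counts.astype(jnp.float64)
    parent_totals = counts.sum(axis=1, keepdims=True)
    # Replace zeros before taking logs; empty cells are masked out below,
    # which implements the convention 0 * log(0) = 0.
    safe_counts = jnp.where(counts > 0, counts, 1.0)
    safe_totals = jnp.where(parent_totals > 0, parent_totals, 1.0)
    terms = counts * (jnp.log(safe_counts) - jnp.log(safe_totals))
    loglik = jnp.sum(jnp.where(counts > 0, terms, 0.0))
    # Free parameters: (child_states - 1) per parent configuration.
    n_params = counts.shape[0] * (counts.shape[1] - 1)
    return loglik - 0.5 * jnp.log(n_obs) * n_params


# vmap maps over the leading (candidate) axis; n_obs is shared by all candidates.
batched_bic = jax.jit(jax.vmap(node_bic, in_axes=(0, None)))

# Two candidate parent sets for the same binary node, 100 observations each.
tables = jnp.array([
    [[30, 20], [10, 40]],  # child depends on the parent
    [[25, 25], [25, 25]],  # child independent of the parent
])
scores = batched_bic(tables, 100.0)
print(scores)
```

Note that jax.vmap requires every table in a batch to have the same shape. Candidates with different numbers of parent configurations would need to be grouped or padded, and with zero-padded rows the parameter count in this sketch would have to be passed in separately rather than derived from the table shape.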
