PyBNesian

  • PyBNesian is a Python package that implements Bayesian networks. Currently, it is mainly dedicated to learning Bayesian networks.

  • PyBNesian is implemented in C++, to achieve significant performance gains. It uses Apache Arrow to enable fast interoperability between Python and C++. In addition, some parts are implemented in OpenCL to achieve GPU acceleration.

  • PyBNesian allows extending its functionality using Python code, so new research can be easily developed.

Implementation

Currently, PyBNesian implements the following features:

Models

  • Bayesian networks.

  • Conditional Bayesian networks (see section 5.6 of [1]).

  • Dynamic Bayesian networks.

These models can have different types of CPDs:

  • Multinomial.

  • Linear Gaussian.

  • Conditional kernel density estimation (ratio of two kernel density estimation models). Accelerated with OpenCL.

With these combinations of CPDs, we implement the following types of networks (which can also be conditional or dynamic):

  • Discrete networks.

  • Gaussian networks.

  • Semiparametric networks.

  • Hybrid networks (conditional linear Gaussian networks and semiparametric networks).
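To make the linear Gaussian CPD concrete, here is a minimal pure-Python sketch of the conditional log-density of a node given its parents. It is independent of PyBNesian's own implementation; the function name `linear_gaussian_logpdf` and the argument layout (intercept first, then one coefficient per parent) are assumptions chosen for illustration.

```python
import math

def linear_gaussian_logpdf(x, parent_values, betas, variance):
    """Log-density of x under N(betas[0] + sum(betas[i + 1] * parent_i), variance)."""
    if len(betas) != len(parent_values) + 1:
        raise ValueError("betas must hold an intercept plus one coefficient per parent")
    mean = betas[0] + sum(b * p for b, p in zip(betas[1:], parent_values))
    return -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)

# Density of d = 3 + 1.2 * c + noise (noise variance 0.5), evaluated at its mean.
logp = linear_gaussian_logpdf(x=4.2, parent_values=[1.0], betas=[3, 1.2], variance=0.5)
```

A discrete (multinomial) CPD would replace the Gaussian density with a lookup in a table of conditional probabilities.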

Graphs

  • DAGs.

  • Directed graphs.

  • Undirected graphs.

  • Partially directed graphs.

Graph classes implement useful functionality for probabilistic graphical models, such as converting between DAG and PDAG representations or fast access to roots and leaves.
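As an illustration of what root, leaf and topological-sort queries compute, the following sketch works on plain Python lists rather than on PyBNesian graph objects. The helper name `roots_leaves_toposort` is hypothetical, and the ordering uses Kahn's algorithm, which is one common choice and not necessarily what the library uses internally.

```python
from collections import deque

def roots_leaves_toposort(nodes, arcs):
    """Return (roots, leaves, topological order) of a DAG given as node and arc lists."""
    parents = {n: set() for n in nodes}
    children = {n: set() for n in nodes}
    for source, target in arcs:
        parents[target].add(source)
        children[source].add(target)

    roots = [n for n in nodes if not parents[n]]
    leaves = [n for n in nodes if not children[n]]

    # Kahn's algorithm: repeatedly emit a node whose parents were all emitted.
    remaining = {n: len(parents[n]) for n in nodes}
    queue = deque(roots)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in sorted(children[node]):
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    if len(order) != len(nodes):
        raise ValueError("the graph contains a cycle")
    return roots, leaves, order

roots, leaves, order = roots_leaves_toposort(
    ["a", "b", "c", "d"], [("a", "c"), ("b", "c"), ("c", "d")]
)
```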

Learning

It implements different structure learning algorithms:

  • Greedy hill-climbing (for Bayesian networks and conditional Bayesian networks).

  • PC-stable (for Bayesian networks and conditional Bayesian networks).

  • MMPC (for Bayesian networks and conditional Bayesian networks).

  • MMHC (for Bayesian networks and conditional Bayesian networks).

  • DMMHC (for dynamic Bayesian networks).

The score-and-search algorithms can be used with the following scores:

  • BIC.

  • BGe.

  • BDe.

  • Cross-validation likelihood.

  • Holdout likelihood.

  • Cross-validated likelihood with validation dataset. This score combines the cross-validation likelihood with a validation dataset to control overfitting.

and the following learning operators:

  • Arc operations: add arc, remove arc, flip arc.

  • Change Node Type (for semiparametric Bayesian networks).
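The arc operations can be pictured as the set of moves a score-and-search algorithm evaluates at each step. The sketch below enumerates the add, remove and flip moves that keep a graph acyclic; it is a simplified illustration on plain tuples, with hypothetical helper names, and not PyBNesian's operator classes.

```python
def creates_cycle(arcs, source, target):
    """True if adding source -> target would close a directed cycle."""
    # A cycle appears exactly when source is already reachable from target.
    stack, seen = [target], set()
    while stack:
        node = stack.pop()
        if node == source:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(t for s, t in arcs if s == node)
    return False

def candidate_operations(nodes, arcs):
    """List the add, remove and flip operations that keep the graph acyclic."""
    arcs = set(arcs)
    ops = []
    for s in nodes:
        for t in nodes:
            if s == t:
                continue
            if (s, t) in arcs:
                ops.append(("remove", s, t))
                # Flipping s -> t means removing it and adding t -> s.
                if not creates_cycle(arcs - {(s, t)}, t, s):
                    ops.append(("flip", s, t))
            elif (t, s) not in arcs and not creates_cycle(arcs, s, t):
                ops.append(("add", s, t))
    return ops

ops = candidate_operations(["a", "b", "c"], [("a", "b"), ("b", "c")])
```

A greedy hill-climbing step would score each candidate and apply the one with the largest improvement, stopping when no candidate improves the score.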

The following independence tests are implemented for the constraint-based algorithms:

  • Chi-square test.

  • Partial correlation t-test.

  • A likelihood-ratio test based on mutual information assuming a Gaussian distribution for the continuous data.

  • CMIknn [2].

  • RCoT [3].
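For intuition about the partial correlation t-test, here is a minimal standard-library sketch for the special case of a single conditioning variable. The function names are hypothetical, the data is synthetic, and the p-value step is omitted because it needs a Student t distribution that the standard library does not provide.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation_t(x, y, z):
    """Partial correlation of x and y given one variable z, and its t statistic."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    r = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    dof = len(x) - 3  # n - 2 - k with k = 1 conditioning variable
    t = r * math.sqrt(dof / (1 - r ** 2))
    return r, t, dof

# x and y both depend on z, so they are correlated marginally
# but close to independent once z is given.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [v + rng.gauss(0, 1) for v in z]
y = [v + rng.gauss(0, 1) for v in z]
r, t, dof = partial_correlation_t(x, y, z)
```

Under the null hypothesis of conditional independence, `t` is compared against a Student t distribution with `dof` degrees of freedom.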

It also implements parameter learning:

  • Maximum Likelihood Estimator.
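As a rough illustration of maximum likelihood estimation for a linear Gaussian CPD, the following sketch fits a node with a single parent in closed form. The function name `fit_linear_gaussian` is hypothetical, and PyBNesian's own estimator may normalize the variance differently.

```python
def fit_linear_gaussian(child, parent):
    """Maximum likelihood intercept, slope and variance for child = b0 + b1 * parent + noise."""
    n = len(child)
    mean_p = sum(parent) / n
    mean_c = sum(child) / n
    cov = sum((p - mean_p) * (c - mean_c) for p, c in zip(parent, child))
    var_p = sum((p - mean_p) ** 2 for p in parent)
    slope = cov / var_p
    intercept = mean_c - slope * mean_p
    residuals = [c - (intercept + slope * p) for p, c in zip(parent, child)]
    # The maximum likelihood variance divides by n; an unbiased estimate would divide by n - 2.
    variance = sum(r ** 2 for r in residuals) / n
    return intercept, slope, variance

parent = [0.0, 1.0, 2.0, 3.0]
child = [3.0, 4.2, 5.4, 6.6]  # exactly 3 + 1.2 * parent, so the residual variance is zero
intercept, slope, variance = fit_linear_gaussian(child, parent)
```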

Inference

Not implemented right now, as the priority is the learning algorithms. However, all the CPDs and models have a sample() method, which can be used to easily create an approximate inference engine based on sampling.
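A sampling-based inference engine can be as simple as rejection sampling: draw samples, keep those consistent with the evidence, and average the quantity of interest. The sketch below hard-codes a two-node linear Gaussian network (a → c) in pure Python so that it runs without the library; with a fitted PyBNesian model, the samples would come from its sample() method instead. All function names here are hypothetical.

```python
import math
import random

def sample_network(rng):
    """Ancestral sampling: draw each node after its parents."""
    a = rng.gauss(3.0, math.sqrt(0.5))
    c = -4.2 - 1.2 * a + rng.gauss(0.0, math.sqrt(0.75))
    return {"a": a, "c": c}

def conditional_mean(target, evidence, n_samples, rng):
    """Estimate E[target | evidence] by rejection sampling."""
    samples = (sample_network(rng) for _ in range(n_samples))
    accepted = [s[target] for s in samples if evidence(s)]
    if not accepted:
        raise ValueError("no sample satisfied the evidence")
    return sum(accepted) / len(accepted)

rng = random.Random(0)
# Unconditional mean of c; the exact value is -4.2 - 1.2 * 3 = -7.8.
mean_c = conditional_mean("c", lambda s: True, 20000, rng)
# Mean of a when c is below its own mean; a and c are negatively related.
mean_a_given_low_c = conditional_mean("a", lambda s: s["c"] < -7.8, 20000, rng)
```

Rejection sampling wastes samples when the evidence is unlikely; likelihood weighting is a common alternative in that case.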

Serialization

All relevant objects (graphs, CPDs, Bayesian networks, etc.) can be saved and loaded using the pickle format.

Other implementations

PyBNesian exposes the implementation of other models or techniques used within the library.

  • Apply cross-validation to a dataset.

  • Apply holdout to a dataset.

  • Kernel Density Estimation. Accelerated with OpenCL.

  • K-d tree (implemented but not exposed yet).

Weighted sums of chi-squared random variables:

  • Hall-Buckley-Eagleson approximation (implemented but not exposed yet).

  • Lindsay-Pilla-Basak approximation (implemented but not exposed yet).
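To show what applying holdout and cross-validation to a dataset amounts to, here is a small sketch that produces row-index splits with the standard library only. The function names and parameters are assumptions for illustration and do not mirror PyBNesian's own classes.

```python
import random

def holdout_split(n_rows, test_ratio, seed):
    """Shuffle row indices and split them into (train, test) lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_ratio))
    return indices[n_test:], indices[:n_test]

def k_fold_splits(n_rows, k, seed):
    """Yield (train, test) index lists for each of k folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

train, test = holdout_split(100, test_ratio=0.2, seed=0)
folds = list(k_fold_splits(100, k=10, seed=0))
```

The index lists can then be used to select rows of a pandas.DataFrame, for example with `df.iloc[train]`.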

Usage example

from pybnesian.models import GaussianNetwork, GaussianNetworkType
from pybnesian.factors.continuous import LinearGaussianCPD
# Create a GaussianNetwork with 4 nodes and no arcs.
gbn = GaussianNetwork(['a', 'b', 'c', 'd'])
# Create a GaussianNetwork with 4 nodes and 3 arcs.
gbn = GaussianNetwork(['a', 'b', 'c', 'd'], [('a', 'c'), ('b', 'c'), ('c', 'd')])

# Return the nodes of the network.
print("Nodes: " + str(gbn.nodes()))
# Return the arcs of the network.
print("Arcs: " + str(gbn.arcs()))
# Return the parents of c.
print("Parents of c: " + str(gbn.parents('c')))
# Return the children of c.
print("Children of c: " + str(gbn.children('c')))

# You can access the graph of the network.
graph = gbn.graph()
# Return the roots of the graph.
print("Roots: " + str(graph.roots()))
# Return the leaves of the graph.
print("Leaves: " + str(graph.leaves()))
# Return the topological sort.
print("Topological sort: " + str(graph.topological_sort()))

# Add an arc.
gbn.add_arc('a', 'b')
# Flip (reverse) an arc.
gbn.flip_arc('a', 'b')
# Remove an arc.
gbn.remove_arc('b', 'a')

# We can also add nodes.
gbn.add_node('e')
# We can get the number of nodes
assert gbn.num_nodes() == 5
# ... and the number of arcs
assert gbn.num_arcs() == 3
# Remove a node.
gbn.remove_node('b')

# Each node has a unique index to identify it
print("Indices: " + str(gbn.indices()))
idx_a = gbn.index('a')

# And we can get the node name from the index
print("Node 2: " + str(gbn.name(2)))

# The model is not fitted right now.
assert not gbn.fitted()

# Create a LinearGaussianCPD (variable, parents, betas, variance)
d_cpd = LinearGaussianCPD("d", ["c"], [3, 1.2], 0.5)

# Add the CPD to the GaussianNetwork
gbn.add_cpds([d_cpd])

# The model is still not fitted because there are 3 nodes without a CPD.
assert not gbn.fitted()

# Let's generate some random data to fit the model.
import numpy as np
import pandas as pd
DATA_SIZE = 100
a_array = np.random.normal(3, np.sqrt(0.5), size=DATA_SIZE)
c_array = -4.2 - 1.2*a_array + np.random.normal(0, np.sqrt(0.75), size=DATA_SIZE)
d_array = 3 + 1.2 * c_array + np.random.normal(0, np.sqrt(0.5), size=DATA_SIZE)
e_array = np.random.normal(0, 1, size=DATA_SIZE)
df = pd.DataFrame({'a': a_array,
                   'c': c_array,
                   'd': d_array,
                   'e': e_array
                })

# Fit the model. You can pass a pandas.DataFrame or a pyarrow.RecordBatch as argument.
# This fits the remaining CPDs
gbn.fit(df)
assert gbn.fitted()

# Check the learned CPDs.
print(gbn.cpd('a'))
print(gbn.cpd('c'))
print(gbn.cpd('d'))
print(gbn.cpd('e'))

# You can sample some data
sample = gbn.sample(50)

# Compute the log-likelihood of each instance
ll = gbn.logl(sample)
# or the sum of log-likelihoods.
sll = gbn.slogl(sample)
assert np.isclose(ll.sum(), sll)

# Save the model, include the CPDs in the file.
gbn.save('test', include_cpd=True)

# Load the model
from pybnesian import load
loaded_gbn = load('test.pickle')

# Learn the structure using greedy hill-climbing.
from pybnesian.learning.algorithms import hc
learned = hc(df, bn_type=GaussianNetworkType())
print("Learned arcs: " + str(learned.arcs()))

Dependencies

  • Python 3.6, 3.7, 3.8 and 3.9.

The library has been tested on Ubuntu 16.04/20.04 and Windows 10, but should be compatible with other operating systems.

Libraries

The library depends on NumPy, Apache Arrow, and pybind11.

Installation

PyBNesian can be installed with pip:

pip install pybnesian

Build from Source

Prerequisites

  • Python 3.6, 3.7, 3.8 or 3.9.
  • C++17 compatible compiler.
  • OpenCL 1.2 headers/library available.

If needed, you can select a C++ compiler by setting the environment variable CC. For example, on Ubuntu, Clang 11 can be selected with the following command before installing PyBNesian:

export CC=clang-11

Building

Clone the repository and install the package:

git clone https://github.com/davenza/PyBNesian.git
cd PyBNesian
git checkout v0.1.0 # You can check out a specific version if you want
python setup.py install

Testing

The library contains tests that can be executed using pytest. They also require SciPy to be installed.

pip install pytest scipy

Run the tests with:

pytest

References

[1] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, The MIT Press, 2009.

[2] J. Runge, Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information. International Conference on Artificial Intelligence and Statistics, AISTATS 2018, 84, 2018, pp. 938–947.

[3] E. V. Strobl, K. Zhang, and S. Visweswaran, Approximate kernel-based conditional independence tests for fast non-parametric causal discovery. Journal of Causal Inference, 7(1), 2019, pp. 1-24.
