Interpretable machine learning on graph-structured data using path-based boosting.

Path Boost

Path Boost is a Python library for interpretable machine learning on graph-structured data. It implements the PathBoost and SequentialPathBoost algorithms, which iteratively construct features based on paths in graphs and use boosting to build predictive models. The library is designed for tasks where input data consists of collections of graphs (e.g., molecules, social networks) and supports variable importance analysis for interpretability.
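To make the idea of a path-based feature concrete, the sketch below counts the label sequences found along simple paths that start at anchor nodes. It is illustrative only and is not the library's implementation: the graph is a plain adjacency dict rather than a networkx graph, path length is counted in nodes, and the names count_label_paths, adjacency, and labels are hypothetical.

```python
from collections import Counter


def count_label_paths(adjacency, labels, anchor_labels, max_path_length):
    """Count label sequences of simple paths that start at an anchor node.

    adjacency: dict mapping node -> list of neighbouring nodes
    labels: dict mapping node -> node label
    anchor_labels: labels that mark a node as a valid path start
    max_path_length: maximum number of nodes on a path (an assumption of this sketch)
    """
    counts = Counter()

    def extend(path):
        # Each path contributes one occurrence of its label sequence.
        counts[tuple(labels[node] for node in path)] += 1
        if len(path) == max_path_length:
            return
        for neighbour in adjacency[path[-1]]:
            if neighbour not in path:  # keep paths simple (no repeated nodes)
                extend(path + [neighbour])

    for node in adjacency:
        if labels[node] in anchor_labels:
            extend([node])
    return counts


# Toy graph: a chain a - b - c with node labels 0, 1, 2.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
labels = {"a": 0, "b": 1, "c": 2}
features = count_label_paths(adjacency, labels, anchor_labels={0}, max_path_length=3)
print(dict(features))
```

Each key of the resulting counter can be read as one candidate feature, and its count as that feature's value for the graph.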

Features

PathBoost: Ensemble learning over graph paths, partitioned by anchor nodes.
SequentialPathBoost: Boosting with path-based features, iteratively expanding the feature space.
Variable Importance: Quantifies the importance of paths/features in prediction.
Parallel Training: Supports multi-core training for large datasets.
Evaluation and Visualization: Built-in tools for error tracking and variable importance plotting.
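The interplay between boosting, per-iteration error tracking, and variable importance can be sketched with generic componentwise least-squares boosting. This is a stand-in technique chosen for brevity, not the PathBoost or SequentialPathBoost algorithm: the function componentwise_boost, the fixed feature columns, and the importance rule (accumulated reduction in training MSE) are all assumptions of this example.

```python
def componentwise_boost(columns, y, n_iter=50, learning_rate=0.1):
    """Generic componentwise least-squares boosting over named feature columns.

    columns: dict mapping feature name -> list of values (one per sample)
    y: list of targets
    Returns (predictions, training MSE per iteration, importance per feature).
    """
    n = len(y)
    pred = [sum(y) / n] * n  # start from the mean prediction
    train_mse = []
    importance = {name: 0.0 for name in columns}

    for _ in range(n_iter):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        mse_before = sum(r * r for r in residual) / n

        # Pick the single feature whose least-squares fit best explains the residual.
        best_name, best_coef, best_gain = None, 0.0, 0.0
        for name, x in columns.items():
            sxx = sum(v * v for v in x)
            if sxx == 0:
                continue
            sxr = sum(v * r for v, r in zip(x, residual))
            gain = sxr * sxr / sxx  # squared-error reduction of a full step
            if gain > best_gain:
                best_name, best_coef, best_gain = name, sxr / sxx, gain
        if best_name is None:
            break  # no feature improves the fit any further

        # Take a shrunken step along the selected feature.
        step = learning_rate * best_coef
        pred = [p + step * v for p, v in zip(pred, columns[best_name])]
        mse_after = sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / n

        importance[best_name] += mse_before - mse_after
        train_mse.append(mse_after)

    return pred, train_mse, importance


# Toy data: centered feature values; the target depends only on path_A.
columns = {
    "path_A": [-1.0, 1.0, -1.0, 1.0],
    "path_B": [1.0, 1.0, -1.0, -1.0],
}
y = [-2.0, 2.0, -2.0, 2.0]
pred, train_mse, importance = componentwise_boost(columns, y)
print(train_mse[0], train_mse[-1], importance)
```

In this toy run only path_A is ever selected, so it collects all of the importance while path_B stays at zero.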

Installation

Install from PyPI:

pip install path_boost
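A quick way to confirm that the installation succeeded is to query the installed distribution's version with the standard library. The helper name installed_version is hypothetical; importlib.metadata is part of Python 3.8 and later.

```python
from importlib import metadata


def installed_version(distribution_name="path_boost"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if path_boost is installed, otherwise None.
print(installed_version())
```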

Usage Example

Below is a minimal example using the PathBoost model:

from sklearn.model_selection import train_test_split
from path_boost import PathBoost
from path_boost.utils.datasets_for_examples.generate_example_dataset import generate_synthetic_graph_dataset


if __name__ == "__main__":

    # Generate a synthetic dataset: graphs, targets, and the ground-truth paths and weights
    nx_graphs, y, true_paths, true_weights = generate_synthetic_graph_dataset()

    print(f"Generated {len(nx_graphs)} graphs.")
    print(f"Example y values: {y[:5]}")
    print(f"True paths definitions: {true_paths}")
    print(f"True path weights: {true_weights}")

    # Node labels used as anchor nodes
    list_anchor_nodes_labels = [0, 1, 2]

    # Settings for the variable importance computation
    parameters_variable_importance: dict = {
        'criterion': 'absolute',
        'error_used': 'mse',
        'use_correlation': False,
        'normalize': True,
    }

    # Hold out 25% of the graphs and track their error during training
    X_train, X_test, y_train, y_test = train_test_split(nx_graphs, y, test_size=0.25, random_state=42)
    eval_set = [(X_test, y_test)]

    path_boost = PathBoost(
        n_iter=50,  # Kept small so the example runs quickly
        max_path_length=5,
        learning_rate=0.1,
        n_of_cores=1,  # Set to >1 for parallel processing if desired
        verbose=True,
        parameters_variable_importance=parameters_variable_importance
    )

    # Fit the model.
    # anchor_nodes_label_name must name the node attribute that stores node types ('feature_0' here).
    path_boost.fit(
        X=X_train,
        y=y_train,
        eval_set=eval_set,
        list_anchor_nodes_labels=list_anchor_nodes_labels,
        anchor_nodes_label_name="feature_0"
    )

    # Plot training and evaluation error per iteration
    path_boost.plot_training_and_eval_errors(skip_first_n_iterations=0, plot_eval_sets_error=True)

    # Plot variable importance only if it was requested and computed
    if path_boost.parameters_variable_importance is not None and hasattr(path_boost, 'variable_importance_'):
        path_boost.plot_variable_importance(top_n_features=10)
    else:
        print("Variable importance not computed or available.")

    print("Example run finished.")

API Overview

PathBoost

Methods:
- fit(X, y, anchor_nodes_label_name, list_anchor_nodes_labels, eval_set=None)
- predict(X)
- predict_step_by_step(X)
- evaluate(X, y)
- plot_training_and_eval_errors(skip_first_n_iterations=True)
- plot_variable_importance()
Attributes:
- train_mse_: Training error (MSE) at each iteration
- mse_eval_set_: Evaluation set error (MSE) at each iteration (if eval_set is provided)
- variable_importance_: Variable/path importance scores (if enabled)
- is_fitted_: Whether the model is fitted
- models_list_: List of fitted SequentialPathBoost models (one per anchor node); each model exposes the attributes listed under SequentialPathBoost
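A per-iteration error curve such as the one described for mse_eval_set_ can be used to find the iteration with the lowest evaluation error. The sketch below assumes the curve is available as a flat sequence of floats for a single evaluation set; the attribute's actual container shape is not documented here, and the helper best_iteration is hypothetical.

```python
def best_iteration(errors):
    """Return (index, value) of the smallest error in a per-iteration curve."""
    if len(errors) == 0:
        raise ValueError("errors must contain at least one value")
    index = min(range(len(errors)), key=lambda i: errors[i])
    return index, errors[index]


# Made-up evaluation curve: the error falls, then rises again as the model overfits.
eval_curve = [4.0, 2.5, 1.8, 1.9, 2.1]
print(best_iteration(eval_curve))  # (2, 1.8)
```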

SequentialPathBoost

Methods:
- fit(X, y, list_anchor_nodes_labels, name_of_label_attribute, eval_set=None)
- predict(X)
- predict_step_by_step(X)
- evaluate(X, y)
- plot_training_and_eval_errors(skip_first_n_iterations=True)
- plot_variable_importance()
Attributes:
- train_mse_: Training error (MSE) at each iteration
- train_mae_: Training MAE at each iteration
- eval_sets_mse_: Evaluation set error (MSE) at each iteration (if eval_set is provided)
- eval_sets_mae_: Evaluation set MAE at each iteration (if eval_set is provided)
- variable_importance_: Variable/path importance scores (if enabled)
- paths_selected_by_epb_: Set of selected paths during boosting
- columns_names_: Names of EBM columns/features used
- is_fitted_: Whether the model is fitted
Requirements

Python 3.10+
numpy
pandas
scikit-learn
networkx
matplotlib

(See requirements.txt for the full list.)
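The requirements above can be checked from Python before installing or running the library. This is a minimal sketch using only the standard library; the helper names are hypothetical, and note that scikit-learn is imported under the module name sklearn.

```python
import sys
from importlib.util import find_spec

# Import names of the packages listed above (scikit-learn imports as "sklearn").
REQUIRED_MODULES = ["numpy", "pandas", "sklearn", "networkx", "matplotlib"]


def python_is_supported(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter version meets the minimum requirement."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(module_names):
    """Return the subset of top-level modules that cannot be found."""
    return [name for name in module_names if find_spec(name) is None]


print("Python OK:", python_is_supported())
print("Missing modules:", missing_modules(REQUIRED_MODULES))
```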

Citation

If you use this library in your research, please cite the corresponding paper (add citation here).

License

BSD 3-Clause License

Version 2.1.0, released May 26, 2026

Download files

Source Distribution

path_boost-2.1.0.tar.gz (177.7 kB), uploaded May 26, 2026

Built Distribution

path_boost-2.1.0-py3-none-any.whl (61.6 kB), uploaded May 26, 2026

File details

Details for the file path_boost-2.1.0.tar.gz.

File metadata

Download URL: path_boost-2.1.0.tar.gz
Upload date: May 26, 2026
Size: 177.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.6

File hashes

Hashes for path_boost-2.1.0.tar.gz
Algorithm	Hash digest
SHA256	`65acc2dac849cef16c327d4da307efe0517dbcba1b6e8a4623aa7285ad09dc2a`
MD5	`9edf32166327411c1dca5c62d2ae532d`
BLAKE2b-256	`86e209c8bda4c70d4b3169271b6e74b92083948dde49da77c06e41a1ae468006`
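A downloaded archive can be checked against the SHA256 digest listed above with the standard hashlib module. The sketch below is one way to do it; the helper names and the local file path in the final comment are assumptions.

```python
import hashlib

# SHA256 digest published for path_boost-2.1.0.tar.gz (copied from the table above).
EXPECTED_SHA256 = "65acc2dac849cef16c327d4da307efe0517dbcba1b6e8a4623aa7285ad09dc2a"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_matches(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example (assumes the archive was downloaded to the current directory):
# file_matches("path_boost-2.1.0.tar.gz", EXPECTED_SHA256)
```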

File details

Details for the file path_boost-2.1.0-py3-none-any.whl.

File metadata

Download URL: path_boost-2.1.0-py3-none-any.whl
Upload date: May 26, 2026
Size: 61.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.6

File hashes

Hashes for path_boost-2.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9780ead44306cf4753091051b169d55419f70c1864fc42804758d182d66e4f16`
MD5	`10b76c1bba780f895c805f5d9700e00e`
BLAKE2b-256	`8b00f479524af1e336106b5dffbfb31f0bc7aa3bdf4f1757e13e9a0364f9f516`
