
PyMELib (Python Minimal Enumeration Library) - A Python Library for Fixed-Parameter-Linear Delay Enumeration of Minimal Dominating Sets and Minimal Hitting Sets

This repository houses a Python implementation of the algorithm presented in the research paper "Enumeration of Minimal Hitting Sets Parameterized by Treewidth" by Kenig and Mizrahi (2024) [1]. The library provides an efficient way to enumerate the minimal hitting sets of hypergraphs (and the minimal dominating sets of graphs), leveraging the theoretical guarantees outlined in the paper, in particular fixed-parameter tractability with respect to treewidth. The library also includes an implementation of the novel disjoint branch nice tree decomposition data structure introduced there.

Introduction

Minimal Hitting Set (MHS) enumeration is a fundamental problem with applications in various domains such as databases, AI, bioinformatics, and constraint satisfaction. Given a hypergraph (a collection of sets), the goal is to find all minimal subsets of the hypergraph's vertex set that have a non-empty intersection with every hyperedge (set in the collection). While it is not known whether the problem admits a tractable-delay solution in general, the algorithm by Kenig and Mizrahi provides an efficient approach for hypergraphs of bounded treewidth, achieving fixed-parameter-linear delay after a fixed-parameter-tractable (FPT) preprocessing phase. This library implements that approach.
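To make the definition concrete, here is a minimal brute-force sketch (not part of PyMELib, and exponential in the number of vertices, unlike the library's algorithms) that enumerates the minimal hitting sets of a tiny hypergraph:

```python
from itertools import chain, combinations

def minimal_hitting_sets_bruteforce(hyperedges):
    """Enumerate all minimal hitting sets of a small hypergraph by brute force.

    For illustration of the definition only: every subset of the vertex set
    is tested, so this is exponential in the number of vertices.
    """
    vertices = sorted(set(chain.from_iterable(hyperedges)))
    hitting = []
    for r in range(len(vertices) + 1):
        for cand in combinations(vertices, r):
            s = set(cand)
            # a hitting set intersects every hyperedge
            if all(s & set(e) for e in hyperedges):
                hitting.append(frozenset(s))
    # keep only the inclusion-minimal hitting sets
    return [s for s in hitting if not any(t < s for t in hitting)]

edges = [{1, 2, 3}, {2, 4}]
print(minimal_hitting_sets_bruteforce(edges))
# → [frozenset({2}), frozenset({1, 4}), frozenset({3, 4})]
```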

Files

  • PyMELib/
    • utils/: Directory containing helper functions for:
      • readHypergraphFromFile.py: Reading input files and transforming them into the reduction graph.
      • addConstraints.py: Adding constraints on vertex inclusion/exclusion in the output.
      • graph_utils.py: General graph utilities.
      • comb_utils.py: Combinatorial utilities.
      • labels_utils.py: Utilities related to labels.
    • EnumerationAlgorithms.py: Contains the enumeration algorithms.
    • TreeDecompositions.py: Handles tree decomposition related tasks.
    • PreprocessingAlgorithms.py: Contains preprocessing algorithms.
    • Factors.py: Contains classes used as factors.
    • labels.py: Handles label definitions and logic as defined in [1].
  • tests/: Contains unit tests for various components (e.g., test_basicEnumDSAlgo.py, test_basicEnumHSAlgo.py).
  • setup.py: Build and installation script.

Functionality

The following list includes the main functionalities of the library; more detailed descriptions can be found in the rest of this README and in the docstrings of the code:
  • Tree Decompositions: Includes different classes for different types of tree decompositions and functions to work with them. You can look at TreeDecompositions.py which includes:
    1. RootedTreeDecomposition: Class that represents a rooted junction tree of a graph. Its __init__ takes a NetworkX graph and constructs a rooted tree decomposition of it using NetworkX.junction_tree. This class also includes a basic visualisation option for the tree decomposition, self.draw.
    2. RootedNiceTreeDecomposition: Class that represents a nice tree decomposition (it inherits from RootedTreeDecomposition). It has two modes of "niceness": 1. regular "niceness" - when semi_nice=False, it transforms the rooted tree decomposition into a nice one using self.transform_to_nice_rec. 2. semi-nice "niceness" - when semi_nice=True, it transforms the rooted tree decomposition using self.transform_to_semi_nice_rec: if the rooted tree decomposition can be transformed into a disjoint branch one, it will be; otherwise, the function transforms it into the closest tree to a disjoint one, i.e. the tree won't repeat a vertex in two branches unless it has to. Additionally, this class provides a complete order of the vertices Q as described in [1].
    3. NodeType: This IntEnum represents the types of nodes (bags) in a nice tree decomposition, e.g. Forget, Join, ...
    4. RootedDisjointBranchNiceTreeDecomposition: This new data structure introduced in [1] is implemented using this class. Like before, it includes a regular option using semi_dntd=False and a "semi" one using semi_dntd=True.
  • Preprocessing: Implements the necessary preprocessing algorithm described in [1], constructing the appropriate factors for the RootedDisjointBranchNiceTreeDecomposition before enumeration. You can look at PreprocessingAlgorithms.py, which includes:
    1. create_factors(td: RootedDisjointBranchNiceTreeDecomposition): Creates empty factors for the nodes of the TD before the preprocessing phase.
    2. calculate_factors_for_mds_enum(td: RootedDisjointBranchNiceTreeDecomposition, current_node: int, options_for_labels=False): This is a dynamic programming algorithm that calculates the factors of the TD as described in [1].
    3. calculate_factors_for_mds_enum_iterative(td: RootedDisjointBranchNiceTreeDecomposition, options_for_labels=False): This is an iterative version of calculate_factors_for_mds_enum, using an explicit stack. It is much better in terms of both memory and time consumption.
  • Core Enumeration Algorithms: The main implementation of the minimal dominating set enumeration with fixed-parameter-linear delay. You can look at EnumerationAlgorithms.py which includes:
    1. IsExtendable(td: RootedDisjointBranchNiceTreeDecomposition, theta, i): IsExtendable procedure described in [1].
    2. IncrementLabeling(td: RootedDisjointBranchNiceTreeDecomposition, theta: Dict[str, Label], i, c: Label, V_label_S, V_label_W): An efficient version of IncrementLabeling procedure described in [1].
    3. EnumMDS(td: RootedDisjointBranchNiceTreeDecomposition, theta: Dict[str, Label] = dict(), i=0, debug_flag=False, options_for_labels=False): The algorithm for minimal dominating set enumeration described in [1]. Theoretically, the delay is bounded by $O(nw)$, where $n$ is the number of nodes in the graph and $w$ is the graph's treewidth.
    4. EnumMDS_iterative(td: RootedDisjointBranchNiceTreeDecomposition, debug_flag=False, options_for_labels=False): This is an iterative version of EnumMDS, using an explicit stack.
    5. EnumMHS(td: RootedDisjointBranchNiceTreeDecomposition, theta: Dict[str, Label] = dict(), i=0, debug_flag=False): This is a version of EnumMDS designed for minimal hitting set enumeration.
    6. EnumMHS_iterative(td: RootedDisjointBranchNiceTreeDecomposition, debug_flag=False): This is an iterative version of EnumMHS, using an explicit stack.
  • Input: Utilities for reading hypergraph data in the input format of the dualization repository [2], constructing the reduction to minimal dominating set enumeration described in [1], and adding inclusion/exclusion constraints to the problem. You can look at utils/readHypergraphFromFile.py and utils/addConstraints.py, which include:
    1. add_constraints_on_graph(G: nx.Graph, include_in_ds: Iterable = [], exclude_from_ds: Iterable = []): This helper function adds constraints to a graph before running the preprocessing or enumeration phase. There are two kinds of constraints: include_in_ds and exclude_from_ds. After using this function, remember to call the preprocessing and enumeration phases with options_for_labels=True.
    2. read_hypergraph(path_to_file: str) -> nx.Graph: Reads a hypergraph from a file in the designated format (see Data - Input Format) and transforms it into the reduction graph described in [1].

Getting Started

Prerequisites

  • Python 3.9.6 or higher
  • Dependencies:
    • NetworkX
    • Matplotlib
    • EoN (Epidemics on Networks) [3]
    • Plotly
    • tqdm

Installation

Using pip: pip install PyMELib

or clone the repository and install it locally:

  1. Clone the repository:
    git clone https://github.com/Dan551Mizrahi/PyMELib.git
    cd PyMELib
  2. Install the package (preferably in a virtual environment):
    pip install .
    or for development:
    pip install -e .

Running

Import the library contents:


from PyMELib.TreeDecompositions import RootedDisjointBranchNiceTreeDecomposition
from PyMELib.PreprocessingAlgorithms import create_factors, calculate_factors_for_mds_enum_iterative
from PyMELib.EnumerationAlgorithms import EnumMHS_iterative
from PyMELib.utils.readHypergraphFromFile import read_hypergraph
from PyMELib.utils.addConstraints import add_constraints_on_graph

Read your hypergraph:


H = read_hypergraph("path/to/your/hypergraph.hg") # Replace with the actual path to your hypergraph file

Add constraints:


include_in_ds = [1, 2] # Example vertices to include
exclude_from_ds = [3, 4] # Example vertices to exclude
add_constraints_on_graph(H, include_in_ds, exclude_from_ds)

Build tree decomposition:


td = RootedDisjointBranchNiceTreeDecomposition(H, semi_dntd=True)

Create factors and run the preprocessing phase:


create_factors(td)
calculate_factors_for_mds_enum_iterative(td, options_for_labels=True)

Run the enumeration algorithm:


results = EnumMHS_iterative(td)

results is now a Python generator that yields all the minimal hitting sets of the hypergraph.
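Because the results come back as a generator, you can consume them lazily. The sketch below uses a hypothetical stand-in generator to show the pattern; with fixed-parameter-linear delay, taking just the first few solutions is cheap even when the full output is large:

```python
from itertools import islice

def fake_results():
    """Hypothetical stand-in for the generator returned by EnumMHS_iterative(td)."""
    yield frozenset({2})
    yield frozenset({1, 4})
    yield frozenset({3, 4})

results = fake_results()
first_two = list(islice(results, 2))  # consume only the first two solutions
print(first_two)
```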

Examples


from PyMELib.TreeDecompositions import RootedDisjointBranchNiceTreeDecomposition
from PyMELib.PreprocessingAlgorithms import create_factors, calculate_factors_for_mds_enum_iterative
from PyMELib.EnumerationAlgorithms import EnumMHS_iterative
from PyMELib.utils.readHypergraphFromFile import read_hypergraph
from PyMELib.utils.addConstraints import add_constraints_on_graph

# 1. Load hypergraph
# Replace with the actual path to your hypergraph file
H = read_hypergraph("path/to/your/hypergraph.hg") 

# 2. Create tree decomposition
td = RootedDisjointBranchNiceTreeDecomposition(H, semi_dntd=True)

# 3. Preprocess hypergraph
create_factors(td)
calculate_factors_for_mds_enum_iterative(td, options_for_labels=True)

# 4. Run enumeration
results = EnumMHS_iterative(td)

# 5. Print results
for mhs in results:
    print(mhs)

Data - Input Format

If you want to enumerate the minimal dominating sets of an ordinary graph, you can simply pass a NetworkX graph object to the constructor of the tree decomposition.
In the case of hypergraphs, we are consistent with the input format of the dualization repository [2]. Each line of the input file represents a hyperedge (set) of the hypergraph; vertices are non-negative integers separated by spaces:

vertex1 vertex2 vertex3
vertex2 vertex4
...
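As a sketch, a hypothetical helper (not part of PyMELib) for writing hyperedges in this format might look like:

```python
import os
import tempfile

def write_hypergraph(hyperedges, path):
    """Write hyperedges in the dualization-repository format:
    one hyperedge per line, vertices as space-separated non-negative integers.
    Hypothetical helper for illustration; not part of PyMELib."""
    with open(path, "w") as f:
        for edge in hyperedges:
            f.write(" ".join(str(v) for v in sorted(edge)) + "\n")

edges = [{1, 2, 3}, {2, 4}]  # example data
path = os.path.join(tempfile.gettempdir(), "example.hg")
write_hypergraph(edges, path)
print(open(path).read())
```

The resulting file can then be passed to read_hypergraph.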

Results - Output Format

The minimal hitting sets/dominating sets are returned as Python frozensets:


# Example output
frozenset({1, 2, 3})
frozenset({2, 4})
frozenset({1, 4})
...
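Since each result is a frozenset, standard Python set and sorting tools apply directly. For example, with hypothetical results collected from the generator:

```python
# hypothetical results, as collected from the enumeration generator
results = [frozenset({1, 2, 3}), frozenset({2, 4}), frozenset({1, 4})]

# sort by size, then lexicographically, for stable reporting
ordered = sorted(results, key=lambda s: (len(s), sorted(s)))
for mhs in ordered:
    print(sorted(mhs))

# pick a minimum-cardinality hitting set
smallest = min(results, key=len)
```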

Limitations

  • The efficiency relies on the input hypergraph having small treewidth; the complexity of the preprocessing phase depends exponentially on the treewidth.
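Since NetworkX is already a dependency, one way to gauge whether an input is suitable is to compute an upper bound on its treewidth with NetworkX's approximation module (a heuristic upper bound, not the exact treewidth):

```python
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

# treewidth_min_degree returns (width, tree decomposition);
# the width is an upper bound on the true treewidth
G = nx.cycle_graph(6)  # example graph; a cycle has treewidth 2
width, _ = treewidth_min_degree(G)
print(width)
```

If the reported width is large, the preprocessing phase may be prohibitively expensive.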

Authors

  • Dan S. Mizrahi
  • Batya Kenig

AI usage

This code was created with the help of GitHub Copilot and Gemini.

Further Reading

[1] Kenig, Batya, and Dan Shlomo Mizrahi. "Enumeration of Minimal Hitting Sets Parameterized by Treewidth." arXiv preprint arXiv:2408.15776 (2024).

[2] Keisuke Murakami & Takeaki Uno (uno@nii.jp). Hypergraph Dualization Repository - Program Codes and Instances for Hypergraph Dualization (minimal hitting set enumeration). https://research.nii.ac.jp/~uno/dualization.html.

[3] Miller et al., (2019). EoN (Epidemics on Networks): a fast, flexible Python package for simulation, analytic approximation, and analysis of epidemics on networks. Journal of Open Source Software, 4(44), 1731, https://doi.org/10.21105/joss.01731
