Skip to main content

First version of the PyMELib (Python Minimal Enumeration Library) library

Project description

PyMELib (Python Minimal Enumeration Library) - A Python Library for Fixed-Parameter-Linear Delay Enumeration of Minimal Dominating Sets and Minimal Hitting Sets

This repository houses a Python implementation of the algorithm presented in the research paper "Enumeration of Minimal Hitting Sets Parameterized by Treewidth" by Kenig and Mizrahi (2024) [1]. The library aims to provide an efficient way to enumerate minimal hitting sets (and minimal dominating sets) of hypergraphs (of graphs), leveraging the theoretical guarantees outlined in the paper, particularly focusing on fixed-parameter tractability with respect to treewidth. This library also includes an implementation of the novel disjoint branch nice tree decomposition data structure.

Introduction

Minimal Hitting Set (MHS) enumeration is a fundamental problem with applications in various domains like databases, AI, bioinformatics, and constraint satisfaction. Given a hypergraph (a collection of sets), the goal is to find all minimal subsets of the hypergraph's vertex set that have a non-empty intersection with every hyperedge (set in the collection). While generally it is not known if the problem has computationally tractable delay-time solution, the algorithm by Kenig and Mizrahi provides an efficient approach for hypergraphs of bounded treewidth, achieving fixed-parameter-linear delay after an FPT preprocessing phase. This library implements that approach.

Files

  • PyMELib/
    • utils/: Directory containing helper functions for:
      • readHypergraphFromFile.py: Reading input files and transforming them to the reduction graph.
      • addConstraints.py: Adding constraints on vertices inclusion/exclusion in the output.
      • graph_utils.py: General graph utilities.
      • comb_utils.py: Combinatorial utilities.
      • labels_utils.py: Utilities related to labels.
    • EnumerationAlgorithms.py: Contains the enumeration algorithms.
    • TreeDecompositions.py: Handles tree decomposition related tasks.
    • PreprocessingAlgorithms.py: Contains preprocessing algorithms.
    • Factors.py: Contains classes used as factors.
    • labels.py: Handling Labels' definition and logic as defined in [1].
  • tests/: Contains unit tests for various components (e.g., test_basicEnumDSAlgo.py, test_basicEnumHSAlgo.py).
  • setup.py: Build and installation script.

Functionality

The following list include the main functionalities of the library, more detailed descriptions can be found in the rest of this README and inside the docstrings of the code:
  • Tree Decompositions: Includes different classes for different types of tree decompositions and functions to work with them. You can look at TreeDecompositions.py which includes:
    1. RootedTreeDecomposition: Class that represents some rooted junction tree of a graph. Its __init__ gets a NetworkX graph and construct a rooted tree decomposition of it using NetworkX.junction_tree. This class also includes a basic visualisation option of the tree decomposition, self.draw.
    2. RootedNiceTreeDecomposition: Class that represents a nice tree decomposition (it inherits from RootedNiceTreeDecomposition). It has two modes for "niceness": 1. regular "niceness" - when semi_nice=False it transforms the regular rooted tree decomposition to a nice one using the function self.transform_to_nice_rec. 2. semi-nice "niceness" - when semi_nice=True it transforms the regular rooted tree decomposition to another one using self.transform_to_semi_nice_rec, if the rooted tree decomposition can be transformed into a disjoint branch one, it will transform it, if not, the function will transform it to the closest tree to a disjoint one, i.e. the tree won;t have the same vertex in both branches if it doesn't have to be there. Additionally, this class also provide a complete order of the vertices Q as described in [1].
    3. NodeType: This IntEnum represents the types of nodes (bags) in nice tree decomposition, i.e. Forget, Join, ...
    4. RootedDisjointBranchNiceTreeDecomposition: This new data structure introduced in [1] is implemented using this class. Like before it includes a regular option using semi_dntd = False and a "semi" one using semi_dntd = True.
  • Preprocessing: Implements necessary preprocessing algorithm as described in [1], constructing the appropriate factors before enumeration for the RootedDisjointBranchNiceTreeDecomposition. You can look at PreprocessingAlgorithms.py which includes:
    1. create_factors(td: RootedDisjointBranchNiceTreeDecomposition): Creates empty factors for the nodes of the TD before the preprocessing phase.
    2. calculate_factors_for_mds_enum(td: RootedDisjointBranchNiceTreeDecomposition, current_node: int, options_for_labels=False): This is a dynamic programming algorithm that calculates the factors of the TD as described in [1].
    3. calculate_factors_for_mds_enum_iterative(td: RootedDisjointBranchNiceTreeDecomposition, options_for_labels=False): This is a for loop version of calculate_factors_for_mds_enum, using a stack. Much better in terms of both memory and time consumption.
  • Core Enumeration Algorithms: The main implementation of the minimal dominating set enumeration with fixed-parameter-linear delay. You can look at EnumerationAlgorithms.py which includes:
    1. IsExtendable(td: RootedDisjointBranchNiceTreeDecomposition, theta, i): IsExtendable procedure described in [1].
    2. IncrementLabeling(td: RootedDisjointBranchNiceTreeDecomposition, theta: Dict[str, Label], i, c: Label, V_label_S, V_label_W): An efficient version of IncrementLabeling procedure described in [1].
    3. EnumMDS(td: RootedDisjointBranchNiceTreeDecomposition, theta: Dict[str, Label] = dict(), i=0, debug_flag=False, options_for_labels=False): The algorithm for minimal dominating sets enumeration described in [1]. Theoretically the time delay is bounded by $O(nw)$, $n$ being the number of nodes in the graph and $w$ is the graph's treewidth.
    4. EnumMDS_iterative(td: RootedDisjointBranchNiceTreeDecomposition, debug_flag=False, options_for_labels=False): This is a for loop version of EnumMDS, using a stack.
    5. EnumMHS(td: RootedDisjointBranchNiceTreeDecomposition, theta: Dict[str, Label] = dict(), i=0, debug_flag=False): This is a version of EnumMDS designated for minimal hitting set enumeration.
    6. EnumMHS_iterative(td: RootedDisjointBranchNiceTreeDecomposition, debug_flag=False): This is a for loop version of EnumMHS, using a stack.
  • Input: Utilities for reading hypergraph data in the input format of the dualization repository [2], constructing the proper reduction to enumeration of minimal dominating sets described in [1] and adding other inclusion/exclusion constraints on the problem. You can look at utils/readHypergraphFromFile.py and utils/addConstraints.py which include:
    1. add_constraints_on_graph(G: nx.Graph, include_in_ds: Iterable = [], exclude_from_ds: Iterable = []): This function is a helper function to add constraints on a graph, before the running of the preprocessing phase or the enumeration phase. There are two kinds of constraints: include_in_ds and exclude_from_ds. After using this function, remember to call the preprocessing phase and the enumeration phase with options_for_labels=True.
    2. read_hypergraph(path_to_file: str) -> nx.Graph: Reads a hypergraph from a file in the designated format (see Data - Input Format) and transform it to the reduction graph described in [1].

Getting Started

Prerequisites

  • Python 3.9.6 or higher (as specified by you)
  • Dependencies:
    • NetworkX
    • Matplotlib
    • EoN (Epidemics on Networks) [3]
    • Plotly
    • tqdm

Installation

Using pip: pip install PyMELib

or clone the repository and install it locally:

  1. Clone the repository:
    git clone https://github.com/Dan551Mizrahi/PyMELib.git
    cd PyMELib
  2. Install the package (preferably in a virtual environment):
    pip install .
    or for development:
    pip install -e .

Running

Import the library contents:


from PyMELib.TreeDecompositions import RootedDisjointBranchNiceTreeDecomposition
from PyMELib.PreprocessingAlgorithms import create_factors, calculate_factors_for_mds_enum_iterative
from PyMELib.EnumerationAlgorithms import EnumMHS
from PyMELib.utils.readHypergraphFromFile import read_hypergraph from PyMELib.utils.addConstraints import add_constraints_on_graph

Read your hypergraph:


H = read_hypergraph("path/to/your/hypergraph.hg") # Replace with the actual path to your hypergraph file

Add constraints:


include_in_ds = [1, 2] # Example vertices to include
exclude_from_ds = [3, 4] # Example vertices to exclude
add_constraints_on_graph(H, include_in_ds, exclude_from_ds)

Build tree decomposition:


td = RootedDisjointBranchNiceTreeDecomposition(H, semi_dntd=True)

Create factors and run the preprocessing phase:


create_factors(td)
calculate_factors_for_mds_enum_iterative(td, options_for_labels=True)

Run the enumeration algorithm:


results = EnumMHS_iterative(td)

results is now a Python generator capable of generating all the minimal hitting sets of the hypergraph.

Examples


from PyMELib.TreeDecompositions import RootedDisjointBranchNiceTreeDecomposition
from PyMELib.PreprocessingAlgorithms import create_factors, calculate_factors_for_mds_enum_iterative
from PyMELib.EnumerationAlgorithms import EnumMHS
from PyMELib.utils.readHypergraphFromFile import read_hypergraph
from PyMELib.utils.addConstraints import add_constraints_on_graph

# 1. Load hypergraph
# Replace with the actual path to your hypergraph file
H = read_hypergraph("path/to/your/hypergraph.hg") 

# 2. create tree decomposition
td = RootedDisjointBranchNiceTreeDecomposition(H, semi_dntd=True)

# 3. Preprocess hypergraph
create_factors(td)
calculate_factors_for_mds_enum_iterative(td, options_for_labels=True)

# 4. Run enumeration
results = EnumMHS_iterative(td)

# 5. Print results
for mhs in results:
    print(mhs)

Data - Input Format

If you want to run the code in order to enumerate minimal dominating sets of a regular graph you can simply pass to the constructor of the tree decomposition a NetworkX graph object.
In the case of hypergraphs, we are consistent with the input format of the dualization repository [2]. The input file should be in the following format, each line representing a hyperedge (set) of the hypergraph, vertices are non-negative integers and are separated by spaces:

vertex1 vertex2 vertex3
vertex2 vertex4
...

Results - Output Format

The minimal hitting sets/dominating sets return as Python frozensets:


# Example output
frozenset({1, 2, 3})
frozenset({2, 4})
frozenset({1, 4})
...

Limitations

  • The efficiency relies on the input hypergraph having a small treewidth. The preprocessing phase complexity depends exponentially on the treewidth.

Authors

  • Dan S. Mizrahi
  • Batya Kenig

AI usage

This code was created with the help of GitHub Copilot and Gemini.

Further Reading

[1] Kenig, Batya, and Dan Shlomo Mizrahi. "Enumeration of Minimal Hitting Sets Parameterized by Treewidth." arXiv preprint arXiv:2408.15776 (2024).

[2] Keisuke Murakami & Takeaki Uno (uno@nii.jp). Hypergraph Dualization Repository - Program Codes and Instances for Hypergraph Dualization (minimal hitting set enumeration). https://research.nii.ac.jp/~uno/dualization.html.

[3] Miller et al., (2019). EoN (Epidemics on Networks): a fast, flexible Python package for simulation, analytic approximation, and analysis of epidemics on networks. Journal of Open Source Software, 4(44), 1731, https://doi.org/10.21105/joss.01731

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pymelib-0.6.0.tar.gz (27.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pymelib-0.6.0-py3-none-any.whl (24.9 kB view details)

Uploaded Python 3

File details

Details for the file pymelib-0.6.0.tar.gz.

File metadata

  • Download URL: pymelib-0.6.0.tar.gz
  • Upload date:
  • Size: 27.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for pymelib-0.6.0.tar.gz
Algorithm Hash digest
SHA256 78fee10d4e2b9cf474fe2dc60d25fe30bfc201034613a52dc699da5eb6ccb5df
MD5 89ed01f4e80c72a1eb9e05088a6fd3b7
BLAKE2b-256 480b320144fc59f82c7e86d7b308c3616f670153cda791759b40bd9b47a678c7

See more details on using hashes here.

File details

Details for the file pymelib-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: pymelib-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 24.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for pymelib-0.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 36aae809b6ac45f598947d3a6cd339f8777841cd719ef83cbd54de55b933066d
MD5 65306bb7f8a4a351fcba402d6649964b
BLAKE2b-256 f4b0a649c6f924cd3026386151067665cbcddaa31df579b004a333609de2481c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page