A python package for calculating topological graph features on cpu/gpu

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Topological Graph Features

Topological feature calculators infrastructure.

Calculating Features

This repository helps one to calculate features for a given graph. All features are implemented in python codes, and some features have also an accelerated version written in C++. Among the accelerated features, one can find a code for calculating 3- and 4-motifs using VDMC, a distributed algorithm to calculate 3- and 4-motifs in a GPU-parallelized way.

What Features Can Be Calculated Here?

The set of all vertex features implemented in graph-measures is the following. The features that have an accelerated version are written in bold:

Average neighbor degree
General (i.e. degree if undirected, else or (in-degree, out-degree))
Louvain (i.e. implement Louvain community detection method, then associate to each vertex the number of vertices in its community)
Hierarchy energy
Motifs
K core
Attraction basin
Page Rank
Fiedler vector
Closeness centrality
Eccentricity
Load centrality
BFS moments
Flow
Betweenness centrality
Communicability betweenness centrality
Eigenvector centrality
Clustering coefficient
Square clustering coefficient
Generalized degree
All pairs shortest path length
All pairs shortest path

Aside from those, there are some other edge features. Some more information regarding the features can be found in the files of features_meta.

NOTE: For codes relating the motifs and their calculations, one might need to create the motif variation pickle files in features_algorithms/motif_variations. To do so, one needs to run isomorphic.py.

Requirements

networkx>=2.8
pandas>=1.4.2
numpy>=1.22.3
matplotlib>=3.5.1
scipy>=1.8.0
scikit-learn>=1.0.2
python-louvain>=0.16
bitstring>=3.1.9

Calculating Features

There are two main methods to calculate features:

Using features_for_any_graph.py - A file for calculating any requested features on a given graph. The class for calculating features in this file is FeatureCalculator.
The graph is input to this file as a text-like file of edges, with a comma delimiter. For example, the graph example_graph.txt is the following file:

0,1
0,2
1,3
3,2

Now, an implementation of feature calculations on this graph looks like this:

import os
from features_for_any_graph import FeatureCalculator
feats = ["motif3", "louvain"]  # Any set of features
path = os.path.join("measure_tests", "example_graph.txt") 
head = "" # The path in which one would like to keep the pickled features calculated in the process. 
# More options are shown here. For infomation about them, refer to the file.
ftr_calc = FeatureCalculator(path, head, feats, acc=True, directed=False, gpu=True, device=0, verbose=True)
ftr_calc.calculate_features()

More information can be found in features_for_any_graph.py.

By the calculations as below: The calculations require an input graph in NetworkX format, later referred as gnx, and a logger. For this example, we build a gnx and define a logger:

import networkx as nx
from loggers import PrintLogger

gnx = nx.DiGraph()  # should be a subclass of Graph
gnx.add_edges_from([(0, 1), (0, 2), (1, 3), (3, 2)])

logger = PrintLogger("MyLogger")

On the gnx we have, we will want to calculate the topological features. There are two options to calculate topological features here, depending on the number of features we want to calculate:

Calculate a specific feature:

import numpy as np
# Import the feature. 
# If simple, import it from vertices folder, otherwise from accelerated_graph_features: 
from features_algorithms.vertices.louvain import LouvainCalculator  

feature = LouvainCalculator(gnx, logger=logger)  
feature.build()  # The building happens here

mx = feature.to_matrix(mtype=np.matrix)  # After building, one can request to get features the a matrix

Calculate a set of features (one feature can as well be calculated as written here):

import numpy as np
from features_infra.graph_features import GraphFeatures

from features_algorithms.vertices.louvain import LouvainCalculator
from features_algorithms.vertices.betweenness_centrality import BetweennessCentralityCalculator

features_meta = {
   "louvain": FeatureMeta(LouvainCalculator, {"lov"}),
  "betweenness_centrality": FeatureMeta(BetweennessCentralityCalculator, {"betweenness"}),
}  # Hold the set of features as written here. 

features = GraphFeatures(gnx, features_meta, logger=logger) 
features.build()

mx = features.to_matrix(mtype=np.matrix)

How to use accelerated features?

The accelerated feature calculation option requires some prior work, since its files are C++ files which require making. The complete manual can be found here.

In short, for the GPU accelerated version using existing anaconda 3:

In case of problems with pickle files (e.g. inability to download), one can use the following file to recreate the motif variation pickle files.

Move into the path features_algorithms/accelerated_graph_features/src.
Create the conda environment required in which the work will be done: "conda env create -f env.yml"
Activate the new environment: "conda activate boost".
Then, make the files required for accelerated graph measures including motifs on GPU:
"make -f Makefile-gpu" (this might take a while).
Now, the accelerated graph features (including the GPU based motif calculations) can be used.

VDMC Results directory

The directory called vdmc_results includes results used for plotting some figures in the VDMC paper. The code files for receiving these results are located there, as well as the results files.

List-of-Lists (CSR) Format

Our format is aimed at increasing the efficiency of our code by leveraging the computer's internal cache mechanism. We call this structure a List Of Lists Graph. The LOL Graph is composed of two arrays:

An array which is composed of all the neighborhood lists placed back to back (Neighbors).
An array which holds the starting index of each node's neighbor list (In- dices).

There follows an example of the conversion routine for a simple graph. The given graph is the one composed of these edges: (0->1, 0->2,0->3,2->0,3->1,3->2). The behavior of the conversion now depends on whether the graph is directed or undirected.

• If the graph is directed, the result is as follows: Indices: [0, 3, 3, 4, 6], Neighbors: [1, 2, 3, 0, 1, 2]

• Else, the results for the undirected graph are: Indices: [0, 3, 5, 7, 10], Neighbors: [1, 2, 3, 0, 3, 0, 3, 0, 1, 2]

The LOL Graph object is designed around the principle of cache-awareness. The most important thing to remember when accessing the graph is that we are aiming to accelerate the computations by loading sections of the graph into the cache ahead of time for quick access. When using the LOL Graph, this comes into effect when we iterate over the ofset vector first and then access the blocks of neighbor nodes in the graph vector. By doing this, we are pulling the entire list of a certain node's neighbors into the cache, allowing us to iterate over them extremely quickly.

How to use?

For undirected graph:


undirected_graph = LolGraph(directed=False, weighted=True)

undirected_graph.convert(edgeList) //For example: undirected_graph.convert([[1,2,3], [4,6,0.1]])

For directed graph:

directed_graph= DLGW(weighted=True)

directed_graph.convert(edgeList) //For example: directed_graph.convert([[1,2,3], [4,6,0.1]])

Basically, you can use LolGraph class to create a directed graph, but some functions (in_degrees, predecessors) are not implemented in the most effecient way. Therefore, use LolGraph for undirected graph, and DLGW for directed graph.

The name of the implemented functions are the same as if you were using Networkx library.

Dockers - Optional

The docker is designed to have all the things needed to run the code, including the boost environment and a proper Cuda version. The docker file may be found in the dockers directory.

Please note: It's needed to compile the code before using the docker.

To use it:

Go to the dockers directory

Build:

docker build -t <Enter any wanted name>:01 .

Run:

docker run -it --rm --gpus device=<The ordinal number of the gpu> -v <Path to graph-measures directory>:<Path to graph-measures directory> <The chosen name from the building>:01 bash

Inside the docker, find the graph-measures directory and run the code :smiley:

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.1.57

Dec 11, 2022

0.1.56

Dec 11, 2022

0.1.55

Dec 8, 2022

0.1.54

Dec 8, 2022

0.1.53

Dec 8, 2022

0.1.52

Dec 4, 2022

0.1.51

Nov 28, 2022

0.1.50

Nov 28, 2022

0.1.49

Nov 28, 2022

0.1.48

Nov 28, 2022

0.1.47

Nov 8, 2022

0.1.46

Nov 8, 2022

0.1.45

Nov 6, 2022

0.1.44

Oct 20, 2022

0.1.43

Jul 14, 2022

0.1.42

Jul 4, 2022

0.1.41

Jul 3, 2022

0.1.40

Jul 3, 2022

0.1.39

Jul 3, 2022

0.1.38

Jun 30, 2022

0.1.37

Jun 30, 2022

0.1.36

Jun 30, 2022

0.1.35

Jun 26, 2022

0.1.34

Jun 26, 2022

0.1.33

Jun 26, 2022

0.1.32

Jun 26, 2022

0.1.31

Jun 26, 2022

0.1.29

Jun 26, 2022

0.1.28

Jun 26, 2022

0.1.27

Jun 25, 2022

0.1.26

Jun 25, 2022

0.1.25

Jun 25, 2022

0.1.24

Jun 25, 2022

0.1.23

Jun 25, 2022

0.1.22

Jun 21, 2022

0.1.20

Jun 20, 2022

0.1.19

Jun 20, 2022

0.1.18

Jun 20, 2022

0.1.17

Jun 20, 2022

0.1.16

Jun 20, 2022

0.1.15

Jun 20, 2022

0.1.14

Jun 20, 2022

0.1.13

Jun 20, 2022

This version

0.1.12

Jun 15, 2022

0.1.10

Jun 9, 2022

0.1.9

Jun 9, 2022

0.1.8

May 26, 2022

0.1.7

May 24, 2022

0.1.6

May 24, 2022

0.1.5

May 24, 2022

0.1.4

May 24, 2022

0.1.3

May 23, 2022

0.1.2

May 23, 2022

0.1.1

May 23, 2022

0.1

May 23, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

graph-measures-0.1.12.tar.gz (374.9 kB view hashes)

Uploaded Jun 15, 2022 Source

Built Distribution

graph_measures-0.1.12-py2.py3-none-any.whl (462.1 kB view hashes)

Uploaded Jun 15, 2022 Python 2 Python 3

Hashes for graph-measures-0.1.12.tar.gz

Hashes for graph-measures-0.1.12.tar.gz
Algorithm	Hash digest
SHA256	`122d79498d5840b3f52044c120d80f4d4d9c9f7890c75dddc1d59c3f04e0c0c5`
MD5	`68c7ae3e4c8c23255ea2fdcf37322feb`
BLAKE2b-256	`2b06e14bb22dd9af285374d1016eb3eafe4d985090c3f12cf572e4a72e170534`

Hashes for graph_measures-0.1.12-py2.py3-none-any.whl

Hashes for graph_measures-0.1.12-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`9f29d69eb9f4c7f33a249812aaa835a023492ee0c47ff289899ce2bc5d98fc62`
MD5	`af69d969ed1536162a11c131ddcc5de1`
BLAKE2b-256	`2ceaf2628eb6eb95221b909c700404d626ac51df84a19d99ffe465def77afce6`