A python package for calculating topological graph features on cpu/gpu

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Topological Graph Features

Topological feature calculators infrastructure.

Calculating Features

This package helps one to calculate features for a given graph. All features are implemented in python codes, and some features have also an accelerated version written in C++. Among the accelerated features, one can find a code for calculating 3- and 4-motifs using VDMC, a distributed algorithm to calculate 3- and 4-motifs in a GPU-parallelized way.

Versions

Last version: 0.1.46
Last stable version: 0.1.22

What Features Can Be Calculated Here?

The set of all vertex features implemented in graph-measures is the following.

Feature	Feature's name in code	Is available in gpu?	Output size for directed graph	Output size for undirected graph
Average neighbor degree	average_neighbor_degree	NO	N x 1	N x 1
Degree^	degree	NO	N x 2	N x 1
In degree	in_degree	NO	N x 1	- - - - - - -
Out degree	out_degree	NO	N x 1	- - - - - - -
Louvain^^	louvain	NO	- - - - - - -	N x 1
Hierarchy energy	hierarchy_energy	NO
Motifs3	motif3	YES	N x 13	N x 2
Motifs4	motif4	YES	N x 199	N x 6
K core	k_core	YES	N x 1	N x 1
Attraction basin	attractor_basin	YES	N x 1	- - - - - - -
Page Rank	page_rank	YES	N x 1	N x 1
Fiedler vector	fiedler_vector	NO	- - - - - - -	N x 1
Closeness centrality	closeness_centrality	NO	N x 1	N x 1
Eccentricity	eccentricity	NO	N x 1	N x 1
Load centrality	load_centrality	NO	N x 1	N x 1
BFS moments	bfs_moments	YES	N x 2	N x 2
Flow	flow	YES	N x 1	- - - - - - -
Betweenness centrality	betweenness_centrality	NO	N x 1	N x 1
Communicability betweenness centrality	communicability_betweenness_centrality	NO	- - - - - - -	N x ?
Eigenvector centrality	eigenvector_centrality	NO	N x 1	N x 1
Clustering coefficient	clustering_coefficient	NO	N x 1	N x 1
Square clustering coefficient	square_clustering_coefficient	NO	N x 1	N x 1
Generalized degree	generalized_degree	NO	- - - - - - -	N x 16
All pairs shortest path length	all_pairs_shortest_path_length	NO	N x N	N x N

^ Degree - In the undirected case return the sum of the in degree and the out degree.
^^Louvain - Implement Louvain community detection method, then associate to each vertex the number of vertices in its community.

Aside from those, there are some other edge features. Some more information regarding the features can be found in the files of features_meta.

Dependencies

setuptools
networkx==2.6.3
pandas
numpy
matplotlib
scipy
scikit-learn
python-louvain
bitstring
future
torch

How To Use The Accelerated Version (CPU/GPU)?

Both versions currently are not supported with the pip installation.
To use the accelerated version, one must use Linux operation system and Anaconda distribution, with the follow the next steps:

Go to the package's GitHub website and manually download:
- The directory graphMeasures.
- The python file runMakefileACC.py.
You might need to download a zip of the repository and extract the necessary files.
Place both the file and the directory inside your project, and run runMakefileACC.py.
Move to the boost environment: conda activate boost (The environment was created in step 2).
Use the package as explained in the section How To Use?

Installation Through pip

The full functionality of the package is currently available on a Linux machine, with a Conda environment.

Linux + Conda
1. Go to base environment
2. If pip is not installed on your env, install it. Then, use pip to install the package
Otherwise, pip must be installed.

pip install graph-measures

Note: On Linux+Conda the installation might take longer (about 5-10 minuets) due to the compilation of the c++ files.

How To Use?

Even though one has installed the package as graph-measures, The package should be imported from the code as graphMesaures. Hence, use:

import graphMeasures

Calculating Features

There are two main methods to calculate features:

Using FeatureCalculator (recommended):
A class for calculating any requested features on a given graph.
The graph is input to this class as a text-like file of edges, with a comma delimiter, or a networkx Graph object. For example, the graph example_graph.txt is the following file:

0,1
0,2
1,3
3,2

Now, an implementation of feature calculations on this graph looks like this:

import os
from graphMeasures import FeatureCalculator
# set of features to be calculated
feats = ["motif3", "louvain"]
# path to the graph's edgelist or nx.Graph object
graph = os.path.join("measure_tests", "example_graph.txt")
# The path in which one would like to save the pickled features calculated in the process. 
dir_path = "" 
# More options are shown here. For infomation about them, refer to the file.
ftr_calc = FeatureCalculator(path, feats, dir_path=dir_path, acc=True, directed=False, gpu=True, device=0, verbose=True)
# calculate the features. If one do not want the features to be saved,
# one should set the parameter 'should_dump' to False (set to True by default).
ftr_calc.calculate_features()
mx = ftr_calc.feature_matrix

More information can be found in features_for_any_graph.py.
Note: If one set acc=True without using a Linux+Conda machine, an exception will be thrown.
Note: If one set gpu=True without using a Linux+Conda machine that has cuda available on it, an exception will be thrown.

By the calculations as below (less recommended):
The calculations require an input graph in NetworkX format, later referred as gnx, and a logger. For this example, we build a gnx and define a logger:

import networkx as nx
from graphMeasures import PrintLogger

gnx = nx.DiGraph()  # should be a subclass of Graph
gnx.add_edges_from([(0, 1), (0, 2), (1, 3), (3, 2)])

logger = PrintLogger("MyLogger")

On the gnx we have, we will want to calculate the topological features. There are two options to calculate topological features here, depending on the number of features we want to calculate:

Calculate a specific feature:

import numpy as np
# Import the feature. 
# If simple, import it from vertices folder, otherwise from accelerated_graph_features: 
from graphMeasures.features_algorithms.vertices import LouvainCalculator  

feature = LouvainCalculator(gnx, logger=logger)  
feature.build()  # The building happens here

mx = feature.to_matrix(mtype=np.matrix)  # After building, one can request to get features the a matrix

Calculate a set of features (one feature can as well be calculated as written here):

import numpy as np
from graphMeasures.features_infra import GraphFeatures
from graphMeasures.features_infra.feature_calculators import FeatureMeta
from graphMeasures.features_algorithms.vertices import LouvainCalculator
from graphMeasures.features_algorithms.vertices import BetweennessCentralityCalculator

features_meta = {
   "louvain": FeatureMeta(LouvainCalculator, {"lov"}),
   "betweenness_centrality": FeatureMeta(BetweennessCentralityCalculator, {"betweenness"}),
}  # Hold the set of features as written here. 

features = GraphFeatures(gnx, features_meta, logger=logger) 
features.build()

mx = features.to_matrix(mtype=np.matrix)

Note: All the keys-values options that can be set in the features_meta variable can be found in graphMeasures.features_meta or graphMeasures.accelerated_features_meta

from graphMeasures import FeaturesMeta
# if one uses the accelerated calculation:
# from graphMeasures.accelerated_features_meta import FeaturesMeta
all_possible_features_meta = FeaturesMeta().NODE_LEVEL

# all possible features
print(all_possible_features_meta.keys())   
# get the value for louvain
louvain = all_possible_features_meta['louvain']   
# get the value for betweenness_centrality
betweenness_centrality = all_possible_features_meta['betweenness_centrality']

Ouptput

graphMeasures uses networkx's networkx.convert_node_labels_to_integers function, and then calculates the features. That's why the output matrix is ordered by the new nodes labels (from 0 to n-1) and not by the original labels.

graphMeasures sometimes return the output columns not in the order they were inserted. The

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.1.57

Dec 11, 2022

0.1.56

Dec 11, 2022

0.1.55

Dec 8, 2022

0.1.54

Dec 8, 2022

0.1.53

Dec 8, 2022

0.1.52

Dec 4, 2022

This version

0.1.51

Nov 28, 2022

0.1.50

Nov 28, 2022

0.1.49

Nov 28, 2022

0.1.48

Nov 28, 2022

0.1.47

Nov 8, 2022

0.1.46

Nov 8, 2022

0.1.45

Nov 6, 2022

0.1.44

Oct 20, 2022

0.1.43

Jul 14, 2022

0.1.42

Jul 4, 2022

0.1.41

Jul 3, 2022

0.1.40

Jul 3, 2022

0.1.39

Jul 3, 2022

0.1.38

Jun 30, 2022

0.1.37

Jun 30, 2022

0.1.36

Jun 30, 2022

0.1.35

Jun 26, 2022

0.1.34

Jun 26, 2022

0.1.33

Jun 26, 2022

0.1.32

Jun 26, 2022

0.1.31

Jun 26, 2022

0.1.29

Jun 26, 2022

0.1.28

Jun 26, 2022

0.1.27

Jun 25, 2022

0.1.26

Jun 25, 2022

0.1.25

Jun 25, 2022

0.1.24

Jun 25, 2022

0.1.23

Jun 25, 2022

0.1.22

Jun 21, 2022

0.1.20

Jun 20, 2022

0.1.19

Jun 20, 2022

0.1.18

Jun 20, 2022

0.1.17

Jun 20, 2022

0.1.16

Jun 20, 2022

0.1.15

Jun 20, 2022

0.1.14

Jun 20, 2022

0.1.13

Jun 20, 2022

0.1.12

Jun 15, 2022

0.1.10

Jun 9, 2022

0.1.9

Jun 9, 2022

0.1.8

May 26, 2022

0.1.7

May 24, 2022

0.1.6

May 24, 2022

0.1.5

May 24, 2022

0.1.4

May 24, 2022

0.1.3

May 23, 2022

0.1.2

May 23, 2022

0.1.1

May 23, 2022

0.1

May 23, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

graph-measures-0.1.51.tar.gz (342.3 kB view hashes)

Uploaded Nov 28, 2022 Source

Built Distribution

graph_measures-0.1.51-cp310-cp310-win_amd64.whl (101.2 kB view hashes)

Uploaded Nov 28, 2022 CPython 3.10 Windows x86-64

Hashes for graph-measures-0.1.51.tar.gz

Hashes for graph-measures-0.1.51.tar.gz
Algorithm	Hash digest
SHA256	`ea2a1bab93dc67f3f58d77772baacf059b536a5bbc5234b7389127a251127ea1`
MD5	`edb4462f204602aa1d83d1ccce93daeb`
BLAKE2b-256	`f1862c1a76e4383577fcf13128ab5874bce75e9185f19ed4be2466af21d91fc5`

Hashes for graph_measures-0.1.51-cp310-cp310-win_amd64.whl

Hashes for graph_measures-0.1.51-cp310-cp310-win_amd64.whl
Algorithm	Hash digest
SHA256	`ce44e00d349d268ced8e27e4cb92d7856f2afeabba20d26c92b04501c7b9c961`
MD5	`037ef2e6984af1e90bedd8d60ed10f7c`
BLAKE2b-256	`09a3c19e815d2f2d5b79a7da48d85a81460fa2652b471e3291f26a1c030eef7a`