Skip to main content

A library of high-performant graph algorithms.

Project description

graph_mate

Python binding for the set of graph crates.

graph_mate is a Python API that provides a collection of high-performant graph algorithms. It provides implementations for directed and undirected graphs.

Graphs can be created programatically or read from the Graph500 input format.

The library is implemented in Rust and uses rayon for running graph creation and algorithm execution in parallel without holding on to the Global Interpreter Lock or using multiprocessing.

The graph itself is implemented as a Compressed-Sparse-Row (CSR) data structure which is tailored for fast and concurrent access to the graph topology.

graph_mate provides a few graph algorithms. The algorithm implementations are designed to run efficiently on large-scale graphs with billions of nodes and edges.

Note: The development is mainly driven by Neo4j developers. However, the library is not an official product of Neo4j.

Usage

Installation

graph_mate is available on PyPI:

pip install graph-mate

It is currently build for x86_64 on Windows/Mac/Linux and as universal for Apple Silicon Macs.

If you need to use graph_mate for a different architecture or system, please refer to the manual installation.

Usage

What is a graph?

A graph consists of nodes and edges where edges connect exactly two nodes. A graph can be either directed, i.e., an edge has a source and a target node or undirected where there is no such distinction.

In a directed graph, each node u has outgoing and incoming neighbors. An outgoing neighbor of node u is any node v for which an edge (u, v) exists. An incoming neighbor of node u is any node v for which an edge (v, u) exists.

In an undirected graph there is no distinction between source and target node. A neighbor of node u is any node v for which either an edge (u,v) or (v, u) exists.

How to create graphs

Currently there are two ways to build a graph. You can either load graphs in the Graph500 format, for example by downloading them from the LDBC Graphalytics site. Alternatively, you can provide a numpy edge list using from_numpy.

import graph_mate as gm
import numpy as np

# Let's load a small graph:
#    (a)-->(b)-->(c)-->(d), (a)-->(c), (b)-->(d)
# To load from an edge list, we need to create a 2d numpy array of `uint32`s.
edge_list = np.array([
    # (a)-->(b)
    [0, 1],
    # (a)-->(c)
    [0, 2],
    # (b)-->(c)
    [1, 2],
    # (b)-->(d)
    [1, 3],
    # (c)-->(d)
    [2, 3]
], dtype=np.uint32)

To build a directed graph, you would create a graph_mate.DiGraph and an undirected on with graph_mate.Graph.

# We can load a directed graph from the edge list
directed = gm.DiGraph.from_numpy(edge_list)

# Or we can load an undirected graph
undirected = gm.Graph.from_numpy(edge_list)

To make assertions easier, we can create graphs with a sorted adjacency list by providing an optional second argument of type graph_mate.Layout.

directed = gm.DiGraph.from_numpy(edge_list, gm.Layout.Sorted)

undirected = gm.Graph.from_numpy(edge_list, gm.Layout.Sorted)

When loading from a numpy edge list, the data is not shared but copied into the graph. The numpy arrays can be deleted afterwards.

We can inspect the graph with a few methods.

assert directed.node_count() == 4
assert directed.edge_count() == 5

assert directed.out_degree(1) == 2
assert directed.in_degree(1) == 1

assert np.array_equal(directed.out_neighbors(1), [2, 3])
assert np.array_equal(directed.in_neighbors(1), [0])

Neighbors are returned as a numpy array view into the graph, which means we cannot modify the array.

neighbors = directed.out_neighbors(1)
try:
    neighbors[0] = 42
except ValueError as e:
    assert str(e) == "assignment destination is read-only"

In order to use the neighbors as a Python list and not a numpy array, we can use copy_ methods.

neighbors = directed.copy_out_neighbors(1)

assert neighbors == [2, 3]
assert type(neighbors) == list

For undirected graphs, we don't have directional methods for the degree or the neighbors.

assert undirected.node_count() == 4
assert undirected.edge_count() == 5

assert undirected.degree(1) == 3

assert np.array_equal(undirected.neighbors(1), [0, 2, 3])

How to run algorithms

In the following we will demonstrate running Page Rank, a graph algorithm to determine the importance of nodes in a graph based on the number and quality of their incoming edges. Page Rank requires a directed graph and returns the rank value for each node.

# https://en.wikipedia.org/wiki/PageRank#/media/File:PageRanks-Example.svg

graph = gm.DiGraph.from_numpy(np.array([
    (1,2), # B->C
    (2,1), # C->B
    (4,0), # D->A
    (4,1), # D->B
    (5,4), # E->D
    (5,1), # E->B
    (5,6), # E->F
    (6,1), # F->B
    (6,5), # F->E
    (7,1), # G->B
    (7,5), # F->E
    (8,1), # G->B
    (8,5), # G->E
    (9,1), # H->B
    (9,5), # H->E
    (10,1), # I->B
    (10,5), # I->E
    (11,5), # J->B
    (12,5), # K->B
], dtype=np.uint32))

pr_result = graph.page_rank(max_iterations=10, tolerance=1e-4, damping_factor=0.85)

assert pr_result.ran_iterations == 10

expected = np.array([
    0.024064068,
    0.3145448,
    0.27890152,
    0.01153846,
    0.029471997,
    0.06329483,
    0.029471997,
    0.01153846,
    0.01153846,
    0.01153846,
    0.01153846,
    0.01153846,
    0.01153846,
], dtype=np.float32)

assert np.array_equal(pr_result.scores(), expected)

Example Notebooks

For more examples and demos, please refer to the notebooks in the notebooks directory.

Developing

The Python extension is written using PyO3 together with maturin.

One-time setup

# Run the command from the extension directory, not the git root
# cd crates/mate

python -m venv .env
source .env/bin/activate
pip install -r requirements/dev.txt

Once-per-new-terminal setup

Make sure that you're activating the venv in every new terminal where you want to develop.

source .env/bin/activate

Building the extension

Build in debug mode.

maturin develop

Build in release mode.

maturin develop --release

Rebuild the extension in release mode 2 seconds after the last file change. This is an optional step.

cargo watch --shell 'maturin develop --release' --delay 2

Testing

Running the tests

pytest tests

Formatting and linting

# Runs code formatter https://pypi.org/project/black/
black tests

# Sort imports using https://pypi.org/project/isort/
isort tests

# Verify with https://pypi.org/project/flake8/
flake8 tests

# Very types using http://mypy-lang.org
mypy .

License: MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

graph_mate-0.1.1.tar.gz (83.6 kB view details)

Uploaded Source

Built Distributions

graph_mate-0.1.1-cp38-abi3-win_amd64.whl (345.9 kB view details)

Uploaded CPython 3.8+ Windows x86-64

graph_mate-0.1.1-cp38-abi3-win32.whl (323.6 kB view details)

Uploaded CPython 3.8+ Windows x86

graph_mate-0.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.17+ x86-64

graph_mate-0.1.1-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl (1.4 MB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.12+ i686

graph_mate-0.1.1-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl (952.7 kB view details)

Uploaded CPython 3.8+ macOS 10.9+ universal2 (ARM64, x86-64) macOS 10.9+ x86-64 macOS 11.0+ ARM64

graph_mate-0.1.1-cp38-abi3-macosx_10_7_x86_64.whl (492.8 kB view details)

Uploaded CPython 3.8+ macOS 10.7+ x86-64

File details

Details for the file graph_mate-0.1.1.tar.gz.

File metadata

  • Download URL: graph_mate-0.1.1.tar.gz
  • Upload date:
  • Size: 83.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.10

File hashes

Hashes for graph_mate-0.1.1.tar.gz
Algorithm Hash digest
SHA256 e7ad8c16d5b5eaa00dc3aabdaa130ab00deffe85393a442ac42d6a61ec902290
MD5 20a648d718d53bda9dbc0fa6b8330d71
BLAKE2b-256 75915a95c795b81422ef8d9831db9069a52ab93e0175731fbbe1588b510e3012

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.1-cp38-abi3-win_amd64.whl.

File metadata

  • Download URL: graph_mate-0.1.1-cp38-abi3-win_amd64.whl
  • Upload date:
  • Size: 345.9 kB
  • Tags: CPython 3.8+, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.10

File hashes

Hashes for graph_mate-0.1.1-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 e0d4a684d91e8518dc746b8f4a60c86b91e3e13bfd8dca6945ad2952a75334db
MD5 6fb31a6f68bed4f2960517167e16ad66
BLAKE2b-256 396d4dab1aca22eadcf64acf659ad993671b4a9e2bc44174ebae5eb2e96f33a6

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.1-cp38-abi3-win32.whl.

File metadata

  • Download URL: graph_mate-0.1.1-cp38-abi3-win32.whl
  • Upload date:
  • Size: 323.6 kB
  • Tags: CPython 3.8+, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.10

File hashes

Hashes for graph_mate-0.1.1-cp38-abi3-win32.whl
Algorithm Hash digest
SHA256 bb89d656b871979da47289ffb6c9f57d6d4c3dc1445b8bbe21db1de078a699ce
MD5 ba4d00e39ca57f3b60bf52a739ed953b
BLAKE2b-256 a8a89950175a97bdca3c59dd0bc42db646011a63ab0c608eacd7b4322f6080f5

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for graph_mate-0.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b0f6e4d094c5bb867ae3e247d6487409438456d03ba0cf385cacebc571fb7c7a
MD5 98d8aa9633a30483eba2ad5705a6e6c8
BLAKE2b-256 30eb1458458a34a5b7edf6299ba75aacc24876cfc71d21ba949bcdb9b2208463

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.1-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl.

File metadata

File hashes

Hashes for graph_mate-0.1.1-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl
Algorithm Hash digest
SHA256 2048743560f5bf92110758ce84851bd0d32eaf56091639c09d42d341973dc440
MD5 8e0970052cf7182d20538e9345562bd6
BLAKE2b-256 4aca65faba79e82575f71a72a9cf6221060ce8938bfa457fd19ca7b7ff256fb4

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.1-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl.

File metadata

File hashes

Hashes for graph_mate-0.1.1-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 fe773052487e0de3c06fe49be5368051dc55004be72901d250e21bf05a18ca45
MD5 d6525c12337a714df6e5c111efce9ce5
BLAKE2b-256 25d441ae8b2b7b16c98884931257b9f0fc60d63bfb326f4f78caeb4113dfd621

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.1-cp38-abi3-macosx_10_7_x86_64.whl.

File metadata

File hashes

Hashes for graph_mate-0.1.1-cp38-abi3-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 0392e41da8f03ebb86cdcbe8a49ee3f29060982a54b5f0ef988c602ad0894f56
MD5 83e7bdb34bbdb76d62e2003b320b5ae0
BLAKE2b-256 488d4ddcd47f5bf11627bd733b984a7de9a3e02e54aa29abea00980ec51c8f52

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page