Skip to main content

Convert ArangoDB graphs to cuGraph & vice-versa.

Project description

ArangoDB-cuGraph Adapter

CircleCI CodeQL Coverage Status Last commit

PyPI version badge Python versions badge

License Code style: black Downloads

The ArangoDB-cuGraph Adapter exports Graphs from ArangoDB, the multi-model database for graph & beyond, into RAPIDS cuGraph, a library of collective GPU-accelerated graph algorithms, and vice-versa.

About RAPIDS cuGraph

While offering a similar API and set of graph algorithms to NetworkX, RAPIDS cuGraph library is GPU-based. Especially for large graphs, this results in a significant performance improvement of cuGraph compared to NetworkX. Please note that storing node attributes is currently not supported by cuGraph. In order to run cuGraph, an Nvidia-CUDA-enabled GPU is required.

Installation

Prerequisites: A CUDA-capable GPU

Latest Release

pip install --extra-index-url=https://pypi.nvidia.com cudf-cu11 cugraph-cu11
pip install adbcug-adapter

Current State

pip install --extra-index-url=https://pypi.nvidia.com cudf-cu11 cugraph-cu11
pip install git+https://github.com/arangoml/cugraph-adapter.git

Quickstart

Open In Colab

import cudf
import cugraph

from arango import ArangoClient
from adbcug_adapter import ADBCUG_Adapter, ADBCUG_Controller

# Connect to ArangoDB
db = ArangoClient().db()

# Instantiate the adapter
adbcug_adapter = ADBCUG_Adapter(db)

ArangoDB to cuGraph

#######################
# 1.1: via Graph name #
#######################

cug_g = adbcug_adapter.arangodb_graph_to_cugraph("fraud-detection")

#############################
# 1.2: via Collection names #
#############################

cug_g = adbcug_adapter.arangodb_collections_to_cugraph(
    "fraud-detection",
    {"account", "bank", "branch", "Class", "customer"},  #  Vertex collections
    {"accountHolder", "Relationship", "transaction"},  # Edge collections
)

cuGraph to ArangoDB

#################################
# 2.1: with a Homogeneous Graph #
#################################

edges = [("Person/A", "Person/B", 1), ("Person/B", "Person/C", -1)]
cug_g = cugraph.MultiGraph(directed=True)
cug_g.from_cudf_edgelist(cudf.DataFrame(edges, columns=["src", "dst", "weight"]), source="src", destination="dst", edge_attr="weight")

edge_definitions = [
    {
        "edge_collection": "knows",
        "from_vertex_collections": ["Person"],
        "to_vertex_collections": ["Person"],
    }
]

adb_g = adbcug_adapter.cugraph_to_arangodb("Knows", cug_g, edge_definitions, edge_attr="weight")

##############################################################
# 2.2: with a Homogeneous Graph & a custom ADBCUG Controller #
##############################################################

class Custom_ADBCUG_Controller(ADBCUG_Controller):
    """ArangoDB-cuGraph controller.

    Responsible for controlling how nodes & edges are handled when
    transitioning from ArangoDB to cuGraph & vice-versa.
    """

    def _prepare_cugraph_node(self, cug_node: dict, col: str) -> None:
        """Prepare a cuGraph node before it gets inserted into the ArangoDB
        collection **col**.

        :param cug_node: The cuGraph node object to (optionally) modify.
        :param col: The ArangoDB collection the node belongs to.
        """
        cug_node["foo"] = "bar"

    def _prepare_cugraph_edge(self, cug_edge: dict, col: str) -> None:
        """Prepare a cuGraph edge before it gets inserted into the ArangoDB
        collection **col**.

        :param cug_edge: The cuGraph edge object to (optionally) modify.
        :param col: The ArangoDB collection the edge belongs to.
        """
        cug_edge["bar"] = "foo"

adb_g = ADBCUG_Adapter(db, Custom_ADBCUG_Controller()).cugraph_to_arangodb("Knows", cug_g, edge_definitions)

###################################
# 2.3: with a Heterogeneous Graph #
###################################

edges = [
   ('student:101', 'lecture:101'), 
   ('student:102', 'lecture:102'), 
   ('student:103', 'lecture:103'), 
   ('student:103', 'student:101'), 
   ('student:103', 'student:102'),
   ('teacher:101', 'lecture:101'),
   ('teacher:102', 'lecture:102'),
   ('teacher:103', 'lecture:103'),
   ('teacher:101', 'teacher:102'),
   ('teacher:102', 'teacher:103')
]
cug_g = cugraph.MultiGraph(directed=True)
cug_g.from_cudf_edgelist(cudf.DataFrame(edges, columns=["src", "dst"]), source='src', destination='dst')

# ...

# Learn how this example is handled in Colab:
# https://colab.research.google.com/github/arangoml/cugraph-adapter/blob/master/examples/ArangoDB_cuGraph_Adapter.ipynb#scrollTo=nuVoCZQv6oyi

Development & Testing

Prerequisite: arangorestore, CUDA-capable GPU

  1. git clone https://github.com/arangoml/cugraph-adapter.git
  2. cd cugraph-adapter
  3. (create virtual environment of choice)
  4. pip install --extra-index-url=https://pypi.nvidia.com cudf-cu11 cugraph-cu11
  5. pip install -e .[dev]
  6. (create an ArangoDB instance with method of choice)
  7. pytest --url <> --dbName <> --username <> --password <>

Note: A pytest parameter can be omitted if the endpoint is using its default value:

def pytest_addoption(parser):
    parser.addoption("--url", action="store", default="http://localhost:8529")
    parser.addoption("--dbName", action="store", default="_system")
    parser.addoption("--username", action="store", default="root")
    parser.addoption("--password", action="store", default="")

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

adbcug_adapter-2.0.2.tar.gz (33.5 kB view hashes)

Uploaded Source

Built Distribution

adbcug_adapter-2.0.2-py3-none-any.whl (21.7 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page