Skip to main content

NebulaGraph Data Intelligence Suite

Project description

NebulaGraph Data Intelligence Suite(ngdi)

Data Intelligence Suite with 4 line code to run Graph Algo on NebulaGraph

License PyPI version Python pdm-managed


Documentation: https://github.com/wey-gu/nebulagraph-di#documentation

Source Code: https://github.com/wey-gu/nebulagraph-di


NebulaGraph Data Intelligence Suite for Python (ngdi) is a powerful Python library that offers APIs for data scientists to effectively read, write, analyze, and compute data in NebulaGraph.

With the support of single-machine engine(NetworkX), or distributed computing environment using Spark we could perform Graph Analysis and Algorithms on top of NebulaGraph in less than 10 lines of code, in unified and intuitive API.

Quick Start in 5 Minutes

Installation

pip install ngdi

Usage

Call from nGQL

See more details in docs

RETURN ngdi("pagerank", ["follow"], ["degree"], "spark",
    {space: "basketballplayer", max_iter: 10}, {write_mode: "insert"})

Spark Engine Examples

See also: examples/spark_engine.ipynb

Run Algorithm on top of NebulaGraph:

Note, there is also query mode, refer to examples or docs for more details.

from ngdi import NebulaReader

# read data with spark engine, scan mode
reader = NebulaReader(engine="spark")
reader.scan(edge="follow", props="degree")
df = reader.read()

# run pagerank algorithm
pr_result = df.algo.pagerank(reset_prob=0.15, max_iter=10)

Write back to NebulaGraph:

from ngdi import NebulaWriter
from ngdi.config import NebulaGraphConfig

config = NebulaGraphConfig()

properties = {"louvain": "cluster_id"}

writer = NebulaWriter(
    data=df_result, sink="nebulagraph_vertex", config=config, engine="spark")
writer.set_options(
    tag="louvain", vid_field="_id", properties=properties,
    batch_size=256, write_mode="insert",)
writer.write()

Then we could query the result in NebulaGraph:

MATCH (v:louvain)
RETURN id(v), v.louvain.cluster_id LIMIT 10;

NebulaGraph Engine Examples(not yet implemented)

Basically the same as Spark Engine, but with engine="nebula".

- reader = NebulaReader(engine="spark")
+ reader = NebulaReader(engine="nebula")

Documentation

Environment Setup

API Reference

How it works

ngdi is an unified abstraction layer for different engines, the current implementation is based on Spark, NetworkX, DGL and NebulaGraph, but it's easy to extend to other engines like Flink, GraphScope, PyG etc.

          ┌───────────────────────────────────────────────────┐
          │   Spark Cluster                                   │
          │    .─────.    .─────.    .─────.    .─────.       │
          │   ;       :  ;       :  ;       :  ;       :      │
       ┌─▶│   :       ;  :       ;  :       ;  :       ;      │
       │  │    ╲     ╱    ╲     ╱    ╲     ╱    ╲     ╱       │
       │  │     `───'      `───'      `───'      `───'        │
  Algo Spark                                                  │
    Engine└───────────────────────────────────────────────────┘
       │  ┌────────────────────────────────────────────────────┬──────────┐
       └──┤                                                    │          │
          │   NebulaGraph Data Intelligence Suite(ngdi)        │ ngdi-api │◀─┐
          │                                                    │          │  │
          │                                                    └──────────┤  │
          │     ┌────────┐    ┌──────┐    ┌────────┐   ┌─────┐            │  │
          │     │ Reader │    │ Algo │    │ Writer │   │ GNN │            │  │
 ┌───────▶│     └────────┘    └──────┘    └────────┘   └─────┘            │  │
 │        │          │            │            │          │               │  │
 │        │          ├────────────┴───┬────────┴─────┐    └──────┐        │  │
 │        │          ▼                ▼              ▼           ▼        │  │
 │        │   ┌─────────────┐ ┌──────────────┐ ┌──────────┐┌──────────┐   │  │
 │     ┌──┤   │ SparkEngine │ │ NebulaEngine │ │ NetworkX ││ DGLEngine│   │  │
 │     │  │   └─────────────┘ └──────────────┘ └──────────┘└──────────┘   │  │
 │     │  └──────────┬────────────────────────────────────────────────────┘  │
 │     │             │        Spark                                          │
 │     │             └────────Reader ────────────┐                           │
 │  Spark                   Query Mode           │                           │
 │  Reader                                       │                           │
 │Scan Mode                                      ▼                      ┌─────────┐
 │     │  ┌───────────────────────────────────────────────────┬─────────┤ ngdi-udf│◀─────────────┐
 │     │  │                                                   │         └─────────┤              │
 │     │  │  NebulaGraph Graph Engine         Nebula-GraphD   │   ngdi-GraphD     │              │
 │     │  ├──────────────────────────────┬────────────────────┼───────────────────┘              │
 │     │  │                              │                    │                                  │
 │     │  │  NebulaGraph Storage Engine  │                    │                                  │
 │     │  │                              │                    │                                  │
 │     └─▶│  Nebula-StorageD             │    Nebula-Metad    │                                  │
 │        │                              │                    │                                  │
 │        └──────────────────────────────┴────────────────────┘                                  │
 │                                                                                               │
 │    ┌───────────────────────────────────────────────────────────────────────────────────────┐  │
 │    │ RETURN ngdi("pagerank", ["follow"], ["degree"], "spark", {space: "basketballplayer"}) │──┘
 │    └───────────────────────────────────────────────────────────────────────────────────────┘
 │  ┌─────────────────────────────────────────────────────────────┐
 │  │ from ngdi import NebulaReader                               │
 │  │                                                             │
 │  │ # read data with spark engine, scan mode                    │
 │  │ reader = NebulaReader(engine="spark")                       │
 │  │ reader.scan(edge="follow", props="degree")                  │
 └──│ df = reader.read()                                          │
    │                                                             │
    │ # run pagerank algorithm                                    │
    │ pr_result = df.algo.pagerank(reset_prob=0.15, max_iter=10)  │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘  

Spark Engine Prerequisites

NebulaGraph Engine Prerequisites

License

This project is licensed under the terms of the Apache License 2.0.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ngdi-0.2.6.tar.gz (18.6 kB view details)

Uploaded Source

Built Distribution

ngdi-0.2.6-py3-none-any.whl (20.2 kB view details)

Uploaded Python 3

File details

Details for the file ngdi-0.2.6.tar.gz.

File metadata

  • Download URL: ngdi-0.2.6.tar.gz
  • Upload date:
  • Size: 18.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.4.6 CPython/3.8.10

File hashes

Hashes for ngdi-0.2.6.tar.gz
Algorithm Hash digest
SHA256 19888c8488d37421196fd10aaa07accd75531dc0e416a7ed3ee86cff845e07b0
MD5 42b40959e8308623c0387d670803edb0
BLAKE2b-256 167b73df577a25d2fa316418ce9dd9ea7ad9d82e48923f158a94dec888174b38

See more details on using hashes here.

File details

Details for the file ngdi-0.2.6-py3-none-any.whl.

File metadata

  • Download URL: ngdi-0.2.6-py3-none-any.whl
  • Upload date:
  • Size: 20.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.4.6 CPython/3.8.10

File hashes

Hashes for ngdi-0.2.6-py3-none-any.whl
Algorithm Hash digest
SHA256 e9530d66cbd3516c0d252db74fc5f4df19ac87e4a218db173d25853532abc1da
MD5 be6926ef072d7556ce0a2c9c03cd08ac
BLAKE2b-256 b44ad5651593e68ebccc1d6a55e9f641eb40b731293e07127aa3a0d65c8ef4fb

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page