NebulaGraph Data Intelligence Suite

ngdi

API

NebulaReader

ngdi.NebulaReader reads data from NebulaGraph and constructs a NebulaDataFrameObject or a NebulaGraphObject. It supports multiple engines, including Spark and NebulaGraph. The default engine is Spark.

Each engine supports its own set of read modes. For now, the Spark Engine supports the query, scan, and load modes, while the NebulaGraph Engine supports the query and scan modes.
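
The engine and mode combinations described above can be written down as a small lookup table. The sketch below is illustrative only: SUPPORTED_MODES and is_supported are hypothetical names that are not part of ngdi, and the engine keys "spark" and "nebula" are taken from the examples in this document.

```python
# Hypothetical lookup of the read modes each engine supports,
# transcribed from the description above (not part of the ngdi API).
SUPPORTED_MODES = {
    "spark": {"query", "scan", "load"},
    "nebula": {"query", "scan"},
}


def is_supported(engine: str, mode: str) -> bool:
    """Return True if `mode` is listed for `engine` in SUPPORTED_MODES."""
    return mode in SUPPORTED_MODES.get(engine, set())


if __name__ == "__main__":
    # Print the full engine/mode matrix.
    for engine in SUPPORTED_MODES:
        for mode in ("query", "scan", "load"):
            print(engine, mode, is_supported(engine, mode))
```

A check like this could run before building a reader, so that an unsupported combination such as load mode on the NebulaGraph Engine is caught early.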

In the Spark Engine, all modes are implemented on top of the NebulaGraph Spark Connector nGQL Reader. In the future, query mode may optionally be handled by opencypher/morpheus to bypass Nebula-GraphD.

With the Spark Engine, the NebulaDataFrameObject returned by ngdi.NebulaReader.read() is a Spark DataFrame, which can be further processed with Spark SQL or Spark MLlib. The graph returned by ngdi.NebulaReader.to_graph() is a GraphX Graph, which can be further processed with GraphX. In the future, ngdi.NebulaReader.to_graph() will also support GraphFrame.

In the NebulaGraph Engine, query mode is implemented with the Python Nebula-GraphD client, while scan mode uses the Nebula-StorageD client.

With the NebulaGraph Engine, the NebulaDataFrameObject returned by ngdi.NebulaReader.read() is a Pandas DataFrame, which can be further processed with Pandas. The graph returned by ngdi.NebulaReader.to_graph() is a NetworkX Graph, which can be further processed with NetworkX.

Functions

  • ngdi.NebulaReader.query() sets the query statement.
  • ngdi.NebulaReader.scan() sets the scan statement.
  • ngdi.NebulaReader.load() sets the load statement.
  • ngdi.NebulaReader.read() executes the read operation and returns a DataFrame or NebulaGraphObject.
  • ngdi.NebulaReader.to_graph() converts the DataFrame returned by ngdi.NebulaReader.read() to a NebulaGraphObject.
  • ngdi.NebulaReader.get_graph() returns the NebulaGraphObject.
  • ngdi.NebulaReader.get_dataframe() returns the DataFrame object.
  • ngdi.NebulaReader.show() shows the DataFrame returned by ngdi.NebulaReader.read().
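
The functions above follow a configure-then-execute order: query(), scan(), and load() only record what to read, and read() performs the work. The toy class below is a minimal sketch of that call order and assumes nothing about ngdi's internals. ToyReader, its in-memory source dictionary, and the Edge alias are hypothetical stand-ins, and no NebulaGraph or Spark connection is involved.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str, float]  # hypothetical row shape: (src, dst, degree)


class ToyReader:
    """Hypothetical stand-in that mimics the set-then-read call order."""

    def __init__(self, source: Dict[str, List[Edge]]):
        self._source = source  # edge type -> rows; replaces a real database
        self._edge: Optional[str] = None
        self._rows: Optional[List[Edge]] = None

    def scan(self, edge: str) -> None:
        # Only records which edge type to read; nothing is fetched yet.
        self._edge = edge

    def read(self) -> List[Edge]:
        # Executes the recorded statement and caches the result.
        if self._edge is None:
            raise RuntimeError("call scan() before read()")
        self._rows = list(self._source.get(self._edge, []))
        return self._rows

    def get_dataframe(self) -> List[Edge]:
        # Returns the rows produced by the last read().
        if self._rows is None:
            raise RuntimeError("call read() first")
        return self._rows


if __name__ == "__main__":
    reader = ToyReader({"follow": [("a", "b", 90.0), ("b", "c", 75.0)]})
    reader.scan(edge="follow")
    print(reader.read())
```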

Examples

  • Spark Engine Examples

from ngdi import NebulaReader

# read data with spark engine, query mode
reader = NebulaReader(engine="spark")
query = """
    MATCH ()-[e:follow]->()
    RETURN e LIMIT 100000
"""
reader.query(query=query, edge="follow", props="degree")
df = reader.read() # this will take some time
df.show(10)

# read data with spark engine, scan mode
reader = NebulaReader(engine="spark")
reader.scan(edge="follow", props="degree")
df = reader.read() # this will take some time
df.show(10)

# read data with spark engine, load mode
reader = NebulaReader(engine="spark")
reader.load(source="hdfs://path/to/edge.csv", format="csv", header=True, schema="src: string, dst: string, rank: int")
df = reader.read() # this will take some time
df.show(10)

# run connected components algorithm
cc_result = df.algo.connected_components() # this will take some time

# convert dataframe to NebulaGraphObject
graph = reader.to_graph() # this will take some time
graph.vertices.show(10)
graph.edges.show(10)

# run pagerank algorithm
pr_result = graph.algo.pagerank(reset_prob=0.15, max_iter=10) # this will take some time
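
To make the connected components step in the Spark example concrete, here is a minimal pure-Python sketch that uses union-find on an edge list. It is not how ngdi or Spark computes the result; the function name connected_components and the sample edges are illustrative assumptions, and edge direction is ignored.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def connected_components(
    edges: Iterable[Tuple[Hashable, Hashable]]
) -> List[Set[Hashable]]:
    """Group vertices into weakly connected components with union-find."""
    parent: Dict[Hashable, Hashable] = {}

    def find(v: Hashable) -> Hashable:
        # Register unseen vertices as their own root.
        parent.setdefault(v, v)
        # Walk up to the root, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for src, dst in edges:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_src] = root_dst  # merge the two components

    # Collect vertices under their final root.
    groups: Dict[Hashable, Set[Hashable]] = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


if __name__ == "__main__":
    follow = [("a", "b"), ("b", "c"), ("x", "y")]  # made-up sample edges
    print(connected_components(follow))
```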

  • NebulaGraph Engine Examples

from ngdi import NebulaReader

# read data with nebula engine, query mode
reader = NebulaReader(engine="nebula")
reader.query("""
    MATCH ()-[e:follow]->()
    RETURN e.src, e.dst, e.degree LIMIT 100000
""")
df = reader.read() # this will take some time
df.show(10)

# read data with nebula engine, scan mode
reader = NebulaReader(engine="nebula")
reader.scan(edge_types=["follow"])
df = reader.read() # this will take some time
df.show(10)

# convert dataframe to NebulaGraphObject
graph = reader.to_graph() # this will take some time
graph.nodes.show(10)
graph.edges.show(10)

# run pagerank algorithm
pr_result = graph.algo.pagerank(reset_prob=0.15, max_iter=10) # this will take some time
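
The reset_prob and max_iter arguments in the pagerank calls above match the usual PageRank parameters: the probability of jumping to a random vertex, and the number of iteration rounds. The sketch below is a plain-Python illustration of that iteration, not the implementation that ngdi delegates to. The function pagerank, its normalization (ranks sum to 1), and the sample edges are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def pagerank(
    edges: Iterable[Tuple[Hashable, Hashable]],
    reset_prob: float = 0.15,
    max_iter: int = 10,
) -> Dict[Hashable, float]:
    """Run `max_iter` rounds of power iteration; ranks sum to 1."""
    out_links: Dict[Hashable, List[Hashable]] = {}
    for src, dst in edges:
        out_links.setdefault(src, []).append(dst)
        out_links.setdefault(dst, [])  # make sure sinks are registered
    n = len(out_links)
    if n == 0:
        return {}
    ranks = {v: 1.0 / n for v in out_links}

    for _ in range(max_iter):
        # Rank held by vertices with no out-edges is spread evenly.
        dangling = sum(ranks[v] for v, outs in out_links.items() if not outs)
        base = reset_prob / n + (1.0 - reset_prob) * dangling / n
        new_ranks = {v: base for v in out_links}
        # Each vertex passes the rest of its rank along its out-edges.
        for v, outs in out_links.items():
            for dst in outs:
                new_ranks[dst] += (1.0 - reset_prob) * ranks[v] / len(outs)
        ranks = new_ranks
    return ranks


if __name__ == "__main__":
    follow = [("a", "c"), ("b", "c"), ("c", "a")]  # made-up sample edges
    for vertex, rank in sorted(pagerank(follow).items()):
        print(vertex, round(rank, 4))
```

A larger reset_prob pulls all ranks toward the uniform value 1/n, while a larger max_iter lets the ranks settle closer to their fixed point.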
