NebulaGraph Data Intelligence Suite

ngdi

API

NebulaReader

ngdi.NebulaReader reads data from NebulaGraph and constructs a NebulaDataFrameObject or a NebulaGraphObject. It supports two engines, Spark and NebulaGraph. The default engine is Spark.

For each engine, ngdi.NebulaReader supports a set of read modes: query, scan, and load. For now, Spark Engine supports the query, scan, and load modes, while NebulaGraph Engine supports the query and scan modes.

In Spark Engine, all modes are implemented on top of the NebulaGraph Spark Connector nGQL Reader. In the future, query mode will optionally be handled by opencypher/morpheus to bypass Nebula-GraphD.

In Spark Engine, the NebulaDataFrameObject returned by ngdi.NebulaReader.read() is a Spark DataFrame, which can be further processed with Spark SQL or Spark MLlib. The graph returned by ngdi.NebulaReader.to_graph() is a GraphX Graph, which can be further processed with GraphX. In the future, the graph returned by ngdi.NebulaReader.to_graph() will support GraphFrame, too.

In NebulaGraph Engine, query mode is implemented by the Python Nebula-GraphD Client, while scan mode is implemented by the Nebula-StorageD Client.

In NebulaGraph Engine, the NebulaDataFrameObject returned by ngdi.NebulaReader.read() is a Pandas DataFrame, which can be further processed with Pandas. The graph returned by ngdi.NebulaReader.to_graph() is a NetworkX Graph, which can be further processed with NetworkX.
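The engine and mode combinations described above can be summarized as a small lookup table. The following is only an illustrative sketch: `ENGINE_CAPABILITIES`, `DEFAULT_ENGINE`, and `check_mode` are hypothetical names that are not part of ngdi, and the table simply restates what this section says about each engine.

```python
# Hypothetical summary table; none of these names are part of the ngdi API.
# Keys reuse the engine strings from the examples ("spark", "nebula").
ENGINE_CAPABILITIES = {
    "spark": {
        "modes": ("query", "scan", "load"),
        "dataframe": "Spark DataFrame",
        "graph": "GraphX Graph",
    },
    "nebula": {
        "modes": ("query", "scan"),
        "dataframe": "Pandas DataFrame",
        "graph": "NetworkX Graph",
    },
}

DEFAULT_ENGINE = "spark"  # the section states that Spark is the default engine


def check_mode(mode, engine=DEFAULT_ENGINE):
    """Return the capability entry for `engine`, or raise if `mode` is unsupported."""
    try:
        caps = ENGINE_CAPABILITIES[engine]
    except KeyError:
        raise ValueError(f"unknown engine: {engine!r}") from None
    if mode not in caps["modes"]:
        raise ValueError(f"engine {engine!r} does not support mode {mode!r}")
    return caps


# Load mode on the default (Spark) engine, then scan mode on the NebulaGraph engine.
print(check_mode("load")["graph"])
print(check_mode("scan", "nebula")["dataframe"])
```

Calling `check_mode("load", "nebula")` raises `ValueError` in this sketch, mirroring the statement that NebulaGraph Engine does not currently support load mode.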

Functions

  • ngdi.NebulaReader.query() sets the query statement.
  • ngdi.NebulaReader.scan() sets the scan statement.
  • ngdi.NebulaReader.load() sets the load statement.
  • ngdi.NebulaReader.read() executes the read operation and returns a DataFrame or NebulaGraphObject.
  • ngdi.NebulaReader.to_graph() converts the DataFrame returned by ngdi.NebulaReader.read() to a NebulaGraphObject.
  • ngdi.NebulaReader.get_graph() returns the NebulaGraphObject.
  • ngdi.NebulaReader.get_dataframe() returns the DataFrame object.
  • ngdi.NebulaReader.show() shows the DataFrame returned by ngdi.NebulaReader.read().

Examples

  • Spark Engine Examples

from ngdi import NebulaReader

# read data with spark engine, query mode
reader = NebulaReader(engine="spark")
query = """
    MATCH ()-[e:follow]->()
    RETURN e LIMIT 100000
"""
reader.query(query=query, edge="follow", props="degree")
df = reader.read() # this will take some time
df.show(10)

# read data with spark engine, scan mode
reader = NebulaReader(engine="spark")
reader.scan(edge="follow", props="degree")
df = reader.read() # this will take some time
df.show(10)

# read data with spark engine, load mode
reader = NebulaReader(engine="spark")
reader.load(source="hdfs://path/to/edge.csv", format="csv", header=True, schema="src: string, dst: string, rank: int")
df = reader.read() # this will take some time
df.show(10)

# run connected components algorithm
cc_result = df.algo.connected_components() # this will take some time

# convert dataframe to NebulaGraphObject
graph = reader.to_graph() # this will take some time
graph.vertices.show(10)
graph.edges.show(10)

# run pagerank algorithm
pr_result = graph.algo.pagerank(reset_prob=0.15, max_iter=10) # this will take some time

  • NebulaGraph Engine Examples

from ngdi import NebulaReader

# read data with nebula engine, query mode
reader = NebulaReader(engine="nebula")
reader.query("""
    MATCH ()-[e:follow]->()
    RETURN e.src, e.dst, e.degree LIMIT 100000
""")
df = reader.read() # this will take some time
df.show(10)

# read data with nebula engine, scan mode
reader = NebulaReader(engine="nebula")
reader.scan(edge_types=["follow"])
df = reader.read() # this will take some time
df.show(10)

# convert dataframe to NebulaGraphObject
graph = reader.to_graph() # this will take some time
graph.nodes.show(10)
graph.edges.show(10)

# run pagerank algorithm
pr_result = graph.algo.pagerank(reset_prob=0.15, max_iter=10) # this will take some time
