Skip to main content

jupyter notebook extension to connect to graph databases

Project description

Graph Notebook: easily query and visualize graphs

The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Using this open-source Python package, you can connect to any graph database that supports the Apache TinkerPop, openCypher or the RDF SPARQL graph models. These databases could be running locally on your desktop or in the cloud. Graph databases can be used to explore a variety of use cases including knowledge graphs and identity graphs.

A colorful graph picture

Visualizing Gremlin queries:

Gremlin query and graph

Visualizing openCypher queries

openCypher query and graph

Visualizing SPARQL queries:

SPARL query and graph

Instructions for connecting to the following graph databases:

Endpoint Graph model Query language
Gremlin Server property graph Gremlin
Blazegraph RDF SPARQL
Amazon Neptune property graph or RDF Gremlin or SPARQL

We encourage others to contribute configurations they find useful. There is an additional-databases folder where more information can be found.

Features

Notebook cell 'magic' extensions in the IPython 3 kernel

%%sparql - Executes a SPARQL query against your configured database endpoint.

%%gremlin - Executes a Gremlin query against your database using web sockets. The results are similar to those a Gremlin console would return.

%%opencypher or %%oc Executes an openCypher query against your database.

%%graph_notebook_config - Sets the executing notebook's database configuration to the JSON payload provided in the cell body.

%%graph_notebook_vis_options - Sets the executing notebook's vis.js options to the JSON payload provided in the cell body.

%%neptune_ml - Set of commands to integrate with NeptuneML functionality. Documentation

TIP :point_right: There is syntax highlighting for both %%sparql and %%gremlin queries to help you structure your queries more easily.

Notebook line 'magic' extensions in the IPython 3 kernel

%gremlin_status - Obtain the status of Gremlin queries. Documentation

%sparql_status - Obtain the status of SPARQL queries. Documentation

%opencypher_status or %oc_status - Obtain the status of openCypher queries.

%load - Generate a form to submit a bulk loader job. Documentation

%load_ids - Get ids of bulk load jobs. Documentation

%load_status - Get the status of a provided load_id. Documentation

%neptune_ml - Set of commands to integrate with NeptuneML functionality. You can find a set of tutorial notebooks here. Documentation

%status - Check the Health Status of the configured host endpoint. Documentation

%seed - Provides a form to add data to your graph without the use of a bulk loader. Supports both RDF and Property Graph data models.

%graph_notebook_config - Returns a JSON payload that contains connection information for your host.

%graph_notebook_host - Set the host endpoint to send queries to.

%graph_notebook_version - Print the version of the graph-notebook package

%graph_notebook_vis_options - Print the Vis.js options being used for rendered graphs

TIP :point_right: You can list all the magics installed in the Python 3 kernel using the %lsmagic command.

TIP :point_right: Many of the magic commands support a --help option in order to provide additional information.

Example notebooks

This project includes many example Jupyter notebooks. It is recommended to explore them. All of the commands and features supported by graph-notebook are explained in detail with examples within the sample notebooks. You can find them here. As this project has evolved, many new features have been added. If you are already familiar with graph-notebook but want a quick summary of new features added, a good place to start is the Air-Routes notebooks in the 02-Visualization folder.

Keeping track of new features

It is recommended to check the ChangeLog.md file periodically to keep up to date as new features are added.

Prerequisites

You will need:

Installation

# pin specific versions of Jupyter and Tornado dependency
pip install notebook==5.7.10
pip install tornado==4.5.3

# install the package
pip install graph-notebook

# install and enable the visualization widget
jupyter nbextension install --py --sys-prefix graph_notebook.widgets
jupyter nbextension enable  --py --sys-prefix graph_notebook.widgets

# copy static html resources
python -m graph_notebook.static_resources.install
python -m graph_notebook.nbextensions.install

# copy premade starter notebooks
python -m graph_notebook.notebooks.install --destination ~/notebook/destination/dir  

# start jupyter
jupyter notebook ~/notebook/destination/dir

Connecting to a graph database

Gremlin Server

In a new cell in the Jupyter notebook, change the configuration using %%graph_notebook_config and modify the fields for host, port, and ssl. For a local Gremlin server (HTTP or WebSockets), you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 8182,
  "auth_mode": "DEFAULT",
  "load_from_s3_arn": "",
  "ssl": false,
  "aws_region": "us-east-1"
}

To setup a new local Gremlin Server for use with the graph notebook, check out additional-databases/gremlin server

Blazegraph

Change the configuration using %%graph_notebook_config and modify the fields for host, port, and ssl. For a local Blazegraph database, you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 9999,
  "auth_mode": "DEFAULT",
  "load_from_s3_arn": "",
  "ssl": false,
  "aws_region": "us-east-1"
}

You can also make use of namespaces for Blazegraph by specifying the path graph-notebook should use when querying your SPARQL like below:

%%graph_notebook_config

{
  "host": "localhost",
  "port": 9999,
  "auth_mode": "DEFAULT",
  "load_from_s3_arn": "",
  "ssl": false,
  "aws_region": "us-west-2",
  "sparql": {
    "path": "blazegraph/namespace/foo/sparql"
  }
}

This will result in the url localhost:9999/blazegraph/namespace/foo/sparql being used when executing any %%sparql magic commands.

To setup a new local Blazegraph database for use with the graph notebook, check out the Quick Start from Blazegraph.

Amazon Neptune

Change the configuration using %%graph_notebook_config and modify the defaults as they apply to your Neptune cluster:

%%graph_notebook_config
{
  "host": "your-neptune-endpoint",
  "port": 8182,
  "auth_mode": "DEFAULT",
  "load_from_s3_arn": "",
  "ssl": true,
  "aws_region": "your-neptune-region"
}

To setup a new Amazon Neptune cluster, check out the AWS documentation.

When connecting the graph notebook to Neptune, make sure you have a network setup to communicate to the VPC that Neptune runs on. If not, you can follow this guide.

Authentication (Amazon Neptune)

If you are running a SigV4 authenticated endpoint, ensure that your configuration has auth_mode set to IAM:

%%graph_notebook_config
{
  "host": "your-neptune-endpoint",
  "port": 8182,
  "auth_mode": "IAM",
  "load_from_s3_arn": "",
  "ssl": true,
  "aws_region": "your-neptune-region"
}

Additionally, you should have the following AWS credentials available in a location accessible to Boto3:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_REGION
  • AWS_SESSION_TOKEN (OPTIONAL. Use if you are using temporary credentials)

A list of all locations checked for AWS credentials can be found in the Boto3 documentation.

Contributing Guidelines

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

graph-notebook-3.0.1.tar.gz (15.5 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

graph_notebook-3.0.1-py3-none-any.whl (13.4 MB view details)

Uploaded Python 3

File details

Details for the file graph-notebook-3.0.1.tar.gz.

File metadata

  • Download URL: graph-notebook-3.0.1.tar.gz
  • Upload date:
  • Size: 15.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.0 pkginfo/1.7.0 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.6.11

File hashes

Hashes for graph-notebook-3.0.1.tar.gz
Algorithm Hash digest
SHA256 8e5a552bcd19b4ba25365fa3a340a957027b158890bcef452c2071b39d5f12ff
MD5 9b4c8361a0391006c2260b4238b3424b
BLAKE2b-256 2a5774fdf94b91e66df15422298bece882de3b264404dc5ac300d9e390e2984d

See more details on using hashes here.

File details

Details for the file graph_notebook-3.0.1-py3-none-any.whl.

File metadata

  • Download URL: graph_notebook-3.0.1-py3-none-any.whl
  • Upload date:
  • Size: 13.4 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.0 pkginfo/1.7.0 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.6.11

File hashes

Hashes for graph_notebook-3.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 26fa9bd2474137d620ea7b39629ac64854d9e400f8c5ef52a860aa8c89d47927
MD5 8af96212b99e3ddae8c52280b173793c
BLAKE2b-256 2369494309cf9129633acd71c93670df89a171f064370a1df350c44ac0535749

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page