Graph analytics for data lake sources using Neptune Analytics
nx_neptune
nx-neptune is a Python library that brings graph analytics to your data lake. Project your data from Amazon S3 Tables, S3 Vectors, Databricks, Snowflake, OpenSearch, and other sources into Neptune Analytics for graph analysis — with results exported back to S3 or persisted as Iceberg tables.
Graph Over Data Lake
Run graph algorithms on data that lives in your data lake — without moving it permanently into a graph database. nx-neptune projects your data into a Neptune Analytics graph using SQL queries (via Amazon Athena), so you can run graph algorithms, explore relationships with openCypher queries, and visualize connections that are invisible in tabular form. When you're done, export the results back to S3 or destroy the graph.
What previously required complex ETL pipelines to get data into a graph database is now a streamlined workflow: define your Athena query, give the module the right permissions, and it handles the projection for you.
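To make the idea of projection concrete, here is a minimal sketch in plain Python of how tabular rows can be read as nodes and edges. It does not use nx-neptune or Athena; the rows, column names, and helper functions are hypothetical and only illustrate the shape of the data before and after projection.

```python
from collections import defaultdict

# Hypothetical rows, shaped like the result of a SQL query over a transactions table.
rows = [
    {"sender": "acct-1", "receiver": "acct-2", "amount": 120.0},
    {"sender": "acct-2", "receiver": "acct-3", "amount": 75.5},
    {"sender": "acct-1", "receiver": "acct-3", "amount": 10.0},
]


def project_rows(rows, source_col, target_col):
    """Turn tabular rows into a node set and an edge list (illustrative helper)."""
    nodes = set()
    edges = []
    for row in rows:
        src, dst = row[source_col], row[target_col]
        nodes.update((src, dst))
        # Every column other than the two endpoints becomes an edge property.
        props = {k: v for k, v in row.items() if k not in (source_col, target_col)}
        edges.append((src, dst, props))
    return nodes, edges


def degree(edges):
    """Count how many edges touch each node, ignoring direction."""
    counts = defaultdict(int)
    for src, dst, _ in edges:
        counts[src] += 1
        counts[dst] += 1
    return dict(counts)


nodes, edges = project_rows(rows, "sender", "receiver")
print(sorted(nodes))
print(degree(edges))
```

Once rows are in this node-and-edge form, relationships such as shared counterparties can be followed directly instead of being reconstructed with repeated joins.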
Supported data sources:
- Amazon S3 Tables — query Iceberg tables via Athena SQL (demo)
- Amazon S3 Vectors — project vector embeddings via a custom Athena connector (connector, demo)
- Databricks Unity Catalog — query Databricks tables via a JDBC-based Athena connector (connector, demo)
- Snowflake — query tables via the Athena Snowflake connector (demo)
- Amazon OpenSearch — project embeddings from OpenSearch indices (demo)
- Any source accessible through Athena federated queries — 25+ connectors available
For data already in S3-compatible formats (CSV, Parquet), Neptune Analytics also supports native S3 import without Athena.
Use cases demonstrated in the notebooks:
- Fraud detection — project financial transactions as a graph, run community detection (Louvain) to identify fraud rings (S3 Tables demo, Databricks demo)
- Product recommendation — project product catalogs with vector embeddings, run similarity search to find related items (S3 Vectors demo, OpenSearch demo)
Vector embeddings are a natural add-on to graph analytics — import them alongside your graph data to combine structural traversal with semantic similarity search.
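As a rough illustration of what similarity search over embeddings computes, the sketch below ranks items by cosine similarity using only the standard library. It is not the Neptune Analytics vector search API; the product names, the three-dimensional vectors, and the function names are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_id, embeddings, k=2):
    """Return the k items most similar to query_id, best match first."""
    query = embeddings[query_id]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical product embeddings; real embeddings have many more dimensions.
embeddings = {
    "laptop": [0.9, 0.1, 0.0],
    "tablet": [0.8, 0.2, 0.1],
    "blender": [0.0, 0.1, 0.9],
}

print(top_k_similar("laptop", embeddings, k=1))
```

In a combined workflow, a ranking like this would narrow the candidates semantically, and graph traversal would then filter or expand them by structure.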
Session management:
The SessionManager API manages the full lifecycle of a Neptune Analytics graph: create, import, analyze, export, and destroy. See the session manager demo and instance lifecycle demo.
NetworkX Backend
nx-neptune also serves as a NetworkX-compatible backend for Neptune Analytics, enabling you to offload graph algorithm workloads to AWS with no code changes. Use familiar NetworkX APIs to seamlessly scale graph computations on-demand. This combines the simplicity of local development with the performance and scalability of a fully managed AWS graph analytics service. For more on NetworkX backends, see the NetworkX backends documentation.
import networkx as nx
G = nx.Graph()
G.add_edge("Bill", "John")
r = nx.pagerank(G, backend="neptune")
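nx.pagerank returns a dictionary mapping each node to its score. For intuition about what to expect on the two-node graph above, here is a small power-iteration sketch in plain Python. It is a textbook formulation with the usual damping factor of 0.85, not the implementation used by NetworkX or Neptune Analytics, and the adjacency dictionary is a stand-in for the graph G.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of neighbors."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, neighbors in adjacency.items():
            if neighbors:
                # Split this node's damped rank evenly among its neighbors.
                share = damping * rank[node] / len(neighbors)
                for neighbor in neighbors:
                    new_rank[neighbor] += share
            else:
                # A node with no neighbors spreads its rank over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# The undirected Bill-John edge, written as two directed links.
adjacency = {"Bill": ["John"], "John": ["Bill"]}
scores = pagerank(adjacency)
print(scores)
```

Because the graph is symmetric, both nodes end up with the same score, and the scores sum to 1.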
Supported Algorithms
For details of all supported NetworkX algorithms, see algorithms.md.
Preview Status
This project is in Alpha Preview. We welcome questions, suggestions, and contributions. It is recommended for testing purposes only — we're tracking production readiness on the roadmap.
Installation
Install it from PyPI
pip install nx_neptune
Build and install from package wheel
# Package the project from source:
python -m pip wheel -w dist .
# Install the built wheel (append [jupyter] to the wheel path to include the Jupyter extras):
pip install "dist/nx_neptune-0.6.0-py3-none-any.whl"
Install from source
git clone git@github.com:awslabs/nx-neptune.git
cd nx-neptune
# install from source directly
make install
Dependencies are pinned in lock files (requirements.txt, requirements-dev.txt, requirements-jupyter.txt) generated by pip-tools. If you update pyproject.toml, regenerate them with:
make lock
Note: make lock must be run with a Python 3.11 interpreter to match CI. If your default Python is a different version, create a 3.11 venv:
python3.11 -m venv .venv-lock
source .venv-lock/bin/activate
pip install pip-tools
make lock
CI only verifies requirements.txt and requirements-dev.txt. The Jupyter lock file (requirements-jupyter.txt) is not checked because it may contain platform-specific packages (e.g., appnope on macOS) that differ between local and CI environments.
Prerequisites
Before using this backend, ensure the following prerequisites are met:
AWS IAM Permissions
The IAM role or user accessing Neptune Analytics must have the following permissions:
These permissions are required to read, write, and manage graph data via queries on Neptune Analytics:
- neptune-graph:ReadDataViaQuery
- neptune-graph:WriteDataViaQuery
- neptune-graph:DeleteDataViaQuery
These permissions are required to start/stop a Neptune Analytics graph:
- neptune-graph:StartGraph
- neptune-graph:StopGraph
These permissions are required to save/restore a Neptune Analytics snapshot:
- neptune-graph:CreateGraphSnapshot (for save)
- neptune-graph:RestoreGraphFromSnapshot (for restore)
- neptune-graph:DeleteGraphSnapshot (for delete)
- neptune-graph:TagResource
These permissions are required to import/export between S3 and Neptune Analytics:
- s3:GetObject (for import)
- s3:PutObject (for export)
- s3:ListBucket (for export)
- s3:DeleteBucket (for delete)
- kms:Decrypt
- kms:GenerateDataKey
- kms:DescribeKey
In addition to the S3 import/export permissions, reading from or writing to an existing S3 Tables data lake requires:
- athena:StartQueryExecution
- athena:GetQueryExecution
The ARN of the role with the above permissions must be set in your environment variables (see NETWORKX_ARN_IAM_ROLE under Running locally).
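The Neptune Analytics permissions listed above can be collected into an IAM policy document. The sketch below builds one as a Python dictionary and prints it as JSON. The statement IDs and the helper name are illustrative, and the "*" resource is a placeholder assumption: in practice you would scope it to your own graph and snapshot ARNs. The S3, KMS, and Athena actions could be added as further groups in the same way.

```python
import json

# Action groups copied from the permission lists above.
ACTION_GROUPS = {
    "GraphQueries": [
        "neptune-graph:ReadDataViaQuery",
        "neptune-graph:WriteDataViaQuery",
        "neptune-graph:DeleteDataViaQuery",
    ],
    "GraphLifecycle": [
        "neptune-graph:StartGraph",
        "neptune-graph:StopGraph",
    ],
    "GraphSnapshots": [
        "neptune-graph:CreateGraphSnapshot",
        "neptune-graph:RestoreGraphFromSnapshot",
        "neptune-graph:DeleteGraphSnapshot",
        "neptune-graph:TagResource",
    ],
}


def build_policy(action_groups, resource="*"):
    """Build an IAM policy document with one Allow statement per action group.

    The default resource "*" is a placeholder; narrow it before real use.
    """
    statements = [
        {"Sid": sid, "Effect": "Allow", "Action": actions, "Resource": resource}
        for sid, actions in action_groups.items()
    ]
    return {"Version": "2012-10-17", "Statement": statements}


policy = build_policy(ACTION_GROUPS)
print(json.dumps(policy, indent=2))
```

Grouping actions by purpose keeps each statement easy to audit and makes it simple to drop a group, such as snapshots, when a role does not need it.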
Python Runtime
- Python 3.11 is required.
- Ensure your environment uses Python 3.11 to maintain compatibility with dependencies and API integrations.
Note: While the project is in preview, we recommend running the library on Python 3.11.
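If you want a script to flag a mismatched interpreter early, a check along these lines is one option. This is a sketch rather than something nx-neptune provides; the function name and the warning text are illustrative.

```python
import sys

REQUIRED_VERSION = (3, 11)


def is_supported(version_info=sys.version_info, required=REQUIRED_VERSION):
    """Return True when the major.minor version matches the required one."""
    return tuple(version_info[:2]) == required


if not is_supported():
    # Warn instead of exiting so the check can sit at the top of any script.
    found = ".".join(str(part) for part in sys.version_info[:2])
    print(f"Warning: Python 3.11 is expected, found {found}")
```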
Usage
import networkx as nx
G = nx.Graph()
G.add_node("Bill")
G.add_node("John")
G.add_edge("Bill", "John")
r = nx.pagerank(G, backend="neptune")
And run with:
# Set the NETWORKX_GRAPH_ID environment variable
export NETWORKX_GRAPH_ID=your-neptune-analytics-graph-id
python ./nx_example.py
Alternatively, you can pass the NETWORKX_GRAPH_ID directly:
NETWORKX_GRAPH_ID=your-neptune-analytics-graph-id python ./nx_example.py
Without a valid NETWORKX_GRAPH_ID, the examples will fail to connect to your Neptune
Analytics instance. Make sure your AWS credentials are properly configured and
your IAM role/user has the required permissions (ReadDataViaQuery,
WriteDataViaQuery, DeleteDataViaQuery).
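To fail fast with a clear message rather than a connection error, a script can check the variable before calling any algorithm. The helper below is a hypothetical sketch, not part of nx-neptune; it only confirms that the variable is present and non-empty, and the demo call passes a sample mapping instead of reading the real environment.

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of an environment variable, or raise if it is missing or blank."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the example")
    return value


# Demo with a sample mapping; in a real script, call require_env("NETWORKX_GRAPH_ID").
graph_id = require_env("NETWORKX_GRAPH_ID", {"NETWORKX_GRAPH_ID": "g-test12345"})
print(graph_id)
```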
Running tests
Run the unit tests with make, which runs every test in the test folder:
make test
Integration tests live in the integ_test folder and run examples against an existing Neptune Analytics instance. Pass the identifier of a graph available in your AWS account:
export NETWORKX_GRAPH_ID=g-test12345
make integ-test
You can set BACKEND=False to run the test suite using NetworkX without nx-neptune as the backend.
CloudFormation Deployment
A CloudFormation template is provided to deploy a complete Neptune Analytics + SageMaker notebook environment with a single command. The stack creates a Neptune Analytics graph, a SageMaker notebook instance with nx_neptune pre-installed, an S3 staging bucket with KMS encryption, and all required IAM permissions.
Quick deploy
By default, the stack installs nx_neptune from PyPI:
./cloudformation-templates/deploy.sh # defaults: nx-neptune-demo, us-west-1
./cloudformation-templates/deploy.sh my-stack us-east-1 # custom stack name and region
To deploy with a locally built wheel instead, pass true as the third argument:
./cloudformation-templates/deploy.sh nx-neptune-demo us-west-1 true
Teardown
./cloudformation-templates/teardown.sh # defaults: nx-neptune-demo, us-west-1
./cloudformation-templates/teardown.sh my-stack us-east-1
For full parameter reference, manual deploy steps, and environment variable details, see cloudformation-templates/README.md.
Jupyter Notebook Integration
For interactive exploration and visualization, you can use the Jupyter notebook integration. To deploy a pre-configured SageMaker notebook environment, see CloudFormation Deployment above.
Notebooks
The notebooks directory contains interactive demonstrations:
Data lake integration:
- import_s3_table_demo.ipynb: Project S3 Tables data into a graph, run Louvain, export results back to Iceberg
- import_s3_vector_embedding_demo.ipynb: Project S3 Vector embeddings via Athena federated query
- import_databricks_demo.ipynb: Project Databricks tables via Athena federated query
- import_snowflake_table_demo.ipynb: Project Snowflake tables via Athena federated query
- import_open_search_embedding_demo.ipynb: Project OpenSearch embeddings via Athena federated query
- s3_import_export_demo.ipynb: Import from and export to S3
Session and lifecycle management:
- session_manager_comprehensive_demo.ipynb: SessionManager API — create, import, analyze, export, destroy
- instance_mgmt_lifecycle_demo.ipynb: Explicit instance lifecycle management
- instance_mgmt_with_configuration.ipynb: Configuration-based instance management
Algorithm demos:
- pagerank_demo.ipynb: Focused demonstration of the PageRank algorithm
- bfs_demo.ipynb: Demonstration of Breadth-First Search traversal
- degree_demo.ipynb: Demonstration of Degree Centrality algorithm
- closeness_centrality_demo.ipynb: Focused demonstration of the Closeness Centrality algorithm
- louvain_demo.ipynb: Demonstration of Louvain algorithm
- label_propagation_demo.ipynb: Demonstration of Label Propagation algorithm
Running locally
A full tutorial is available to run in Neptune Jupyter Notebooks.
To install the library together with the Jupyter dependencies:
pip install "nx_neptune[jupyter]"
To run the Jupyter notebooks:
- Set your Neptune Analytics Graph ID as an environment variable:
export NETWORKX_GRAPH_ID=your-neptune-analytics-graph-id
- You will also need to specify the IAM roles that will execute S3 import or export:
export NETWORKX_ARN_IAM_ROLE=arn:aws:iam::AWS_ACCOUNT:role/IAM_ROLE_NAME
export NETWORKX_S3_IMPORT_BUCKET_PATH=s3://S3_BUCKET_PATH
export NETWORKX_S3_EXPORT_BUCKET_PATH=s3://S3_BUCKET_PATH
- Launch Jupyter Notebook:
jupyter notebook notebooks/
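Before launching the notebooks, it can help to check that the import/export variables have the expected shape. The sketch below is a hypothetical pre-flight check, not part of nx-neptune: it tests that the role looks like an IAM role ARN and that both bucket paths start with s3://. The sample values are placeholders.

```python
import re

# IAM role ARNs have the form arn:<partition>:iam::<12-digit account>:role/<name>.
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")

S3_PATH_VARIABLES = (
    "NETWORKX_S3_IMPORT_BUCKET_PATH",
    "NETWORKX_S3_EXPORT_BUCKET_PATH",
)


def check_notebook_env(environ):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    role = environ.get("NETWORKX_ARN_IAM_ROLE", "")
    if not ROLE_ARN_PATTERN.match(role):
        problems.append("NETWORKX_ARN_IAM_ROLE does not look like an IAM role ARN")
    for name in S3_PATH_VARIABLES:
        if not environ.get(name, "").startswith("s3://"):
            problems.append(f"{name} should start with s3://")
    return problems


# Placeholder values for illustration; pass os.environ in a real check.
example_env = {
    "NETWORKX_ARN_IAM_ROLE": "arn:aws:iam::123456789012:role/my-import-role",
    "NETWORKX_S3_IMPORT_BUCKET_PATH": "s3://my-bucket/import/",
    "NETWORKX_S3_EXPORT_BUCKET_PATH": "s3://my-bucket/export/",
}

print(check_notebook_env(example_env))
```

Returning a list of problems instead of raising on the first one lets you report every misconfigured variable in a single pass.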
Security
See CONTRIBUTING for more information.
License
This project is licensed under the Apache-2.0 License.