Graph-ND

Knowledge in Graphs, Not Documents

Graph-nd is a Python project for building powerful end-to-end agentic GraphRAG systems with a simple, intuitive API.

Example Usage: From data sources to agentic GraphRAG in 3 steps:

# Instantiate graphrag
graphrag = GraphRAG(db_client, llm, embedding_model)

# 1) Get the graph schema. Can also define exactly via json/pydantic spec
graphrag.schema.infer("a simple graph of hardware components "
                     "where components (with id, name, and description properties)  "
                     "can be types of or inputs to other components.")

# 2) Merge data into your graph store - both structured (tables) and unstructured (docs)
graphrag.data.merge_csvs(['component-types.csv', 'component-input-output.csv'])
graphrag.data.merge_pdf('component-catalog.pdf') 

# 3) GraphRAG agent for better answers.
graphrag.agent("what sequence of components depend on silicon wafers?")

""" Output:
================================ Human Message =================================
what sequence of components depend on silicon wafers?
================================== Ai Message ==================================
Tool Calls:
 node_search (call_mhh8tLl0WbvwXxEyrS7zBxKN)
 Args:
   search_query: silicon wafers
   top_k: 5
   search_config: {'search_type': 'SEMANTIC', 'node_label': 'Component', 'search_prop': 'name'}
================================= Tool Message =================================
Name: node_search
[   {  
       "id": "N26",
       "name": "Wafer",
       "description": "Silicon wafers are the basic building block....
================================== Ai Message ==================================
Tool Calls:
 query (call_jirfr5ZDmKTL2TQRp7xx1VYJ)
 Args:
   query_instructions: Find the sequence of components that depend on the component with id 'N26' (Silicon Wafer) through 'INPUT_TO' relationships. Return the sequence of component names.
Running Query:
MATCH (start:Component {id: 'N26'})-[:INPUT_TO*]->(end:Component)
RETURN end.name AS componentName
================================= Tool Message =================================
Name: query
[   
   {"componentName": "Deposition"},
   {"componentName": "Photolithography"}, 
   ...
================================== Ai Message ==================================
The sequence of components that depend on silicon wafers is as follows:
1. Deposition
2. Photolithography
3. Etch and clean
4. Chemical mechanical planarization
5. Assembly and packaging
6. Testing
7. Finished logic chip
"""
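
To make the variable-length pattern `[:INPUT_TO*]` in the generated query above more concrete, here is a rough pure-Python sketch of the same idea: follow `INPUT_TO` edges forward from a start node and collect everything reachable. The edge list and the `downstream` helper are hypothetical and exist only for illustration; in practice the traversal runs inside Neo4j. Unlike the Cypher query, which returns one row per matching path, this sketch reports each component once.

```python
from collections import deque

# Hypothetical INPUT_TO edges as (source, target) pairs; the real graph lives in Neo4j.
INPUT_TO = [
    ("Wafer", "Deposition"),
    ("Deposition", "Photolithography"),
    ("Photolithography", "Etch and clean"),
    ("Photoresist", "Photolithography"),
]


def downstream(start, edges):
    """Return every node reachable from `start` by following edges forward."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    seen, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


print(downstream("Wafer", INPUT_TO))
# ['Deposition', 'Photolithography', 'Etch and clean']
```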

Why Graph-ND?

  1. Designed to get you started with GraphRAG easily in 5 minutes. No prior graph expertise required!
  2. Built with intent to extend to production - not just a demo tool. While geared for simplicity, users can customize schemas, data loading, indexes, etc. for precision & control.
  3. Prioritizes support for mixed data. Seamlessly integrates both structured (CSV, tables) and unstructured data (PDFs, text) into your knowledge graph.

How It Works in More Detail

Here’s a step-by-step example of using graph-nd:

  1. Setup: Instantiate and configure the GraphRAG class. GraphRAG uses LangChain under the hood, so you can use any model(s) with LangChain support.
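import os

from graph_nd import GraphRAG
from neo4j import GraphDatabase
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

# Connection details; assumes the NEO4J_* variables from the Configuration section are set in the environment
uri = os.environ["NEO4J_URI"]
username = os.environ["NEO4J_USERNAME"]
password = os.environ["NEO4J_PASSWORD"]

db_client = GraphDatabase.driver(uri, auth=(username, password))  # Neo4j connection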
embedding_model = OpenAIEmbeddings(model='text-embedding-ada-002')  # Embeddings
llm = ChatOpenAI(model="gpt-4o", temperature=0.0)  # Language model

graphrag = GraphRAG(db_client, llm, embedding_model)
  2. Get the Graph Schema: When experimenting, you can describe the desired graph structure in natural language and GraphRAG will infer the schema automatically. When you need more precision, use the schema.define method to specify the schema exactly by passing a Pydantic GraphSchema object. You can also .export and .load the schema to/from JSON files, which lets you iterate on the schema and keep it under version control.
graphrag.schema.infer("""
   A simple graph of hardware components where components 
   (with id, name, and description properties) can be types of or inputs to other components.
   """)
  3. Merge Data into the Graph: Merge both structured (e.g., CSV) and unstructured (e.g., PDFs) data. The data.merge_csvs, data.merge_pdf and data.merge_text methods use LLMs to automatically map data to your graph following the graph schema. For cases where you need to control the mapping yourself (instead of relying on the LLM in GraphRAG), you can format your own node and relationship dict records and merge directly via the data.merge_nodes and data.merge_relationships methods.
graphrag.data.merge_csvs(['component-types.csv', 'component-input-output.csv'])  # Structured data
graphrag.data.merge_pdf('component-catalog.pdf')  # Unstructured data
  4. Answer Questions with the Auto-Configured Agent: The agent includes advanced tools for node search (full-text and semantic), graph traversals (multi-hops, paths, etc.), and aggregation queries. These are auto-configured based on the graph schema. For advanced use cases, graphrag.schema.prompt_str() serializes the graph schema with simplified query patterns. You can use this as a prompt parameter when creating your own custom chains and agent workflows.
# Example queries
graphrag.agent("What sequence of components depend on silicon wafers?")
graphrag.agent("Can you describe what GPUs do?")
graphrag.agent("What components have the most inputs?")
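
For the manual mapping path mentioned in the merge step above, the sketch below shows one way to turn CSV rows into node and relationship dict records yourself. Everything in it is an assumption made for illustration: the CSV contents, the column names, the `start_node_id`/`end_node_id` keys, and the helper functions are hypothetical. This README does not show the parameters of data.merge_nodes or data.merge_relationships, so check the package documentation before passing records to them.

```python
import csv
import io

# Hypothetical CSV content; ids, names, and column names are made up for illustration.
COMPONENTS_CSV = """id,name,description
C1,Wafer,Example description of a wafer
C2,Deposition,Example description of deposition
"""

INPUTS_CSV = """source_id,target_id
C1,C2
"""


def node_records(csv_text):
    """Turn each CSV row into a plain dict of node properties."""
    return [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]


def relationship_records(csv_text):
    """Turn each CSV row into a dict naming the start and end node ids.

    The key names are assumptions, not graph-nd's documented record format.
    """
    return [
        {"start_node_id": row["source_id"], "end_node_id": row["target_id"]}
        for row in csv.DictReader(io.StringIO(csv_text))
    ]


nodes = node_records(COMPONENTS_CSV)
rels = relationship_records(INPUTS_CSV)

# These lists are what you would hand to data.merge_nodes and
# data.merge_relationships, using whatever arguments those methods require.
print(nodes[0]["name"], len(rels))
```

Building the records in plain Python keeps the mapping deterministic and easy to unit test, which is the main reason to bypass the LLM-based mapping.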

Installing and Running Graph-ND

Installation

pip install graph-nd

Alternatively, for development you can clone and install locally:

  1. Clone the repository:

    git clone https://github.com/zach-blumenfeld/graph-nd.git
    cd graph-nd
    
  2. Install with Poetry:

    # Install Poetry if you haven't already
    # curl -sSL https://install.python-poetry.org | python3 -
    
    # Install dependencies
    poetry install
    
    # Activate the virtual environment
    poetry shell
    

Configuration

  1. Start a free Neo4j (Aura) instance at console.neo4j.io/

  2. Configure your .env file with the following:

    NEO4J_URI=<your_neo4j_uri>
    NEO4J_USERNAME=<your_neo4j_username>
    NEO4J_PASSWORD=<your_neo4j_password>
    
    OPEN_AI_API_KEY = ... # or substitute your preferred LLM/Embedding provider(s)
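
If you want to read these values in Python without extra dependencies, a minimal standard-library sketch follows. The `parse_env_text` and `load_env_file` helpers are hypothetical and not part of graph-nd; they handle only simple `KEY=VALUE` lines, optional spaces around `=`, quotes, and trailing `# comments`. A dedicated .env library is likely a better fit for anything more complex.

```python
from pathlib import Path


def parse_env_text(text):
    """Parse simple KEY=VALUE lines into a dict, ignoring blanks and comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as `value # note`.
        value = value.split(" #", 1)[0]
        # Remove surrounding whitespace and optional quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_file(path=".env"):
    """Read a .env file from disk and parse it with parse_env_text."""
    return parse_env_text(Path(path).read_text())


# Example usage (assumes a .env file exists in the working directory):
# config = load_env_file()
# uri, username, password = (
#     config["NEO4J_URI"], config["NEO4J_USERNAME"], config["NEO4J_PASSWORD"]
# )
```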
    

Getting Started

Explore our example notebooks to learn how to use graph-nd:

Feedback & Contributions

We welcome your feedback and contributions to make GraphRAG better and more accessible for everyone!

If you'd like to contribute:

  1. Fork the repository and create a new branch for your feature or fix
  2. Squash your commits for a clean history
  3. Ensure all unit tests pass (running functional and integration tests is also highly recommended)
  4. Submit a pull request with a clear description of your changes

For bugs, feature requests, or questions, please open an issue on our GitHub repository.

Running Tests

This project uses pytest for testing. Tests are organized in three categories:

  • Unit tests: Basic component testing
  • Integration tests: Testing interaction between components - you will need an Aura free instance and OpenAI key configured in a .env file.
  • Functional tests: More comprehensive end-to-end testing of features - you will need an Aura free instance and OpenAI key configured in a .env file.
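
Because integration and functional tests need credentials, one option is to skip them when the required variables are not set. The sketch below is an assumption about how such a guard could look, not how this repository's tests are written. It uses the standard-library unittest module, whose TestCase classes pytest also collects; the variable names come from the Configuration section, and the test class is a placeholder.

```python
import os
import unittest

# Variable names taken from the Configuration section of this README.
REQUIRED_ENV_VARS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "OPEN_AI_API_KEY")


def missing_env_vars(env=os.environ):
    """List the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


@unittest.skipIf(missing_env_vars(), "Neo4j/OpenAI credentials not configured")
class ExampleIntegrationTest(unittest.TestCase):
    """Placeholder test class; skipped entirely when credentials are missing."""

    def test_credentials_present(self):
        # A real test would build a GraphRAG instance and query it here.
        self.assertEqual(missing_env_vars(), [])
```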

Running Tests with Poetry

# Run all tests
poetry run pytest tests -v

# Run only unit tests
poetry run pytest tests/unit -v

# Run only integration tests
poetry run pytest tests/integration -v

# Run only functional tests
poetry run pytest tests/functional -v

# Run a specific test file
poetry run pytest tests/unit/test_specific_file.py -v

# Run a specific test function
poetry run pytest tests/unit/test_file.py::test_function -v

# Run tests with specific pattern matching
poetry run pytest tests/unit -k "pattern" -v

Additional pytest options:

  • -v: Verbose output
  • -k "pattern": Only run tests matching the pattern
  • --tb=native: Format tracebacks in Python's standard style (use --tb=long for the most detailed output)
  • -x: Stop after first failure
