
AWS GraphRAG Toolkit, lexical graph


Lexical Graph

The lexical-graph package provides a framework for automating the construction of a hierarchical lexical graph from unstructured data, and for composing question-answering strategies that query this graph to answer user questions.

Features

Installation

The lexical-graph package requires Python 3.10 or later and pip.
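
The version requirement can be checked from Python before installing. This is a minimal sketch; the helper name meets_minimum_python is hypothetical and is not part of the lexical-graph package.

```python
import sys

# Minimum interpreter version, taken from the requirement stated above.
MINIMUM_PYTHON = (3, 10)


def meets_minimum_python(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the given (or currently running) interpreter is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == '__main__':
    if meets_minimum_python():
        print('Python version OK')
    else:
        print('Python 3.10 or later is required')
```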

Install the latest stable release from PyPI:

$ pip install graphrag-lexical-graph

To install a specific version from PyPI:

$ pip install graphrag-lexical-graph==3.18.5

Or install from a release zip file:

$ pip install https://github.com/awslabs/graphrag-toolkit/archive/refs/tags/graphrag-lexical-graph/v3.18.5.zip#subdirectory=lexical-graph

If you're running on AWS, run your application in an AWS Region that offers the Amazon Bedrock foundation models used by the lexical graph, and enable access to those models before running any part of the solution. The configuration section of the documentation lists the default models.
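
One way to check the Region requirement is to compare the model IDs you intend to use against those Amazon Bedrock lists for a Region. The sketch below assumes boto3 is installed and AWS credentials are configured. The helper names are hypothetical, and the model IDs passed in are whatever your configuration uses; none are hard-coded here. Listing a model only shows that the Region offers it; enabling access to the model is a separate step.

```python
def missing_models(required_ids, available_ids):
    """Return the required model IDs that are absent from `available_ids`, in order."""
    available = set(available_ids)
    return [model_id for model_id in required_ids if model_id not in available]


def list_region_model_ids(region_name):
    """List the foundation model IDs Amazon Bedrock offers in a Region.

    Needs boto3 and AWS credentials. boto3 is imported lazily so that
    missing_models() can be used without it.
    """
    import boto3

    client = boto3.client('bedrock', region_name=region_name)
    summaries = client.list_foundation_models()['modelSummaries']
    return [summary['modelId'] for summary in summaries]
```

For example, missing_models(my_model_ids, list_region_model_ids('us-east-1')) returns an empty list when every model in my_model_ids is offered in that Region.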

Additional dependencies

You will need to install additional dependencies for specific graph and vector store backends:

Amazon OpenSearch Serverless

$ pip install opensearch-py llama-index-vector-stores-opensearch

Postgres with pgvector

$ pip install psycopg2-binary pgvector

Neo4j

$ pip install neo4j
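
To see which of these optional dependencies are already present in the current environment, a small check with importlib can help. This is a sketch, not part of the toolkit. The mapping from pip distribution names to import names is an assumption based on each project's usual top-level module, and llama-index-vector-stores-opensearch is left out because it installs into a namespace package.

```python
from importlib.util import find_spec

# Assumed mapping: pip distribution name -> top-level import name.
OPTIONAL_MODULES = {
    'opensearch-py': 'opensearchpy',
    'psycopg2-binary': 'psycopg2',
    'pgvector': 'pgvector',
    'neo4j': 'neo4j',
}


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def check_optional_dependencies(modules=OPTIONAL_MODULES):
    """Return {distribution name: True/False} for each optional dependency."""
    return {dist: is_importable(mod) for dist, mod in modules.items()}


if __name__ == '__main__':
    for dist, present in check_optional_dependencies().items():
        print(f"{dist}: {'installed' if present else 'missing'}")
```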

Connection strings

Pass a connection string to GraphStoreFactory.for_graph_store() or VectorStoreFactory.for_vector_store() to select a backend:

| Store | Connection string |
| --- | --- |
| Neptune Analytics (graph) | neptune-graph://<graph-id> |
| Neptune Database (graph) | neptune-db://<hostname> or any hostname ending .neptune.amazonaws.com |
| Neo4j (graph) | bolt://, bolt+ssc://, bolt+s://, neo4j://, neo4j+ssc://, or neo4j+s:// URLs |
| OpenSearch Serverless (vector) | aoss://<url> |
| Neptune Analytics (vector) | neptune-graph://<graph-id> |
| pgvector (vector) | constructed via PGVectorIndexFactory |
| S3 Vectors (vector) | constructed via S3VectorIndexFactory |
| Dummy / no-op | None or any unrecognised string; falls back to DummyGraphStore / DummyVectorIndex |
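
The selection rules in the table above can be illustrated with a small standalone classifier. This is a sketch of the rules as the table states them, not the toolkit's actual factory code. The function names are hypothetical, the order in which the rules are checked is an assumption, and the hostname check is a plain suffix match that ignores ports and paths. The pgvector and S3 Vectors rows are omitted because the table describes them only as constructed via their factories.

```python
NEO4J_SCHEMES = (
    'bolt://', 'bolt+ssc://', 'bolt+s://',
    'neo4j://', 'neo4j+ssc://', 'neo4j+s://',
)


def describe_graph_store(connection):
    """Name the graph backend a connection string would select, per the table."""
    if not isinstance(connection, str):
        return 'Dummy'  # None (or any non-string) falls back to the no-op store
    if connection.startswith('neptune-graph://'):
        return 'Neptune Analytics'
    if connection.startswith('neptune-db://'):
        return 'Neptune Database'
    if connection.startswith(NEO4J_SCHEMES):
        return 'Neo4j'
    if connection.endswith('.neptune.amazonaws.com'):
        return 'Neptune Database'
    return 'Dummy'  # unrecognised strings also fall back


def describe_vector_store(connection):
    """Name the vector backend a connection string would select, per the table."""
    if not isinstance(connection, str):
        return 'Dummy'
    if connection.startswith('aoss://'):
        return 'OpenSearch Serverless'
    if connection.startswith('neptune-graph://'):
        return 'Neptune Analytics'
    return 'Dummy'
```

Because unrecognised strings fall back silently to the dummy stores, a check like this can help catch a mistyped scheme before an indexing run.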

Example of use

Indexing

from graphrag_toolkit.lexical_graph import LexicalGraphIndex
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory

# requires pip install llama-index-readers-web
from llama_index.readers.web import SimpleWebPageReader

def run_extract_and_build():

    with (
        GraphStoreFactory.for_graph_store(
            'neptune-db://my-graph.cluster-abcdefghijkl.us-east-1.neptune.amazonaws.com'
        ) as graph_store,
        VectorStoreFactory.for_vector_store(
            'aoss://https://abcdefghijkl.us-east-1.aoss.amazonaws.com'
        ) as vector_store
    ):

        graph_index = LexicalGraphIndex(
            graph_store,
            vector_store
        )

        doc_urls = [
            'https://docs.aws.amazon.com/neptune/latest/userguide/intro.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/what-is-neptune-analytics.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/neptune-analytics-features.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/neptune-analytics-vs-neptune-database.html'
        ]

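        # Fetch each page as plain text and record its source URL as metadata
        docs = SimpleWebPageReader(
            html_to_text=True,
            metadata_fn=lambda url: {'url': url}
        ).load_data(doc_urls)

        # Run extraction and build the graph and vector indexes
        graph_index.extract_and_build(docs, show_progress=True)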

if __name__ == '__main__':
    run_extract_and_build()

Querying

from graphrag_toolkit.lexical_graph import LexicalGraphQueryEngine
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory

def run_query():

    with (
        GraphStoreFactory.for_graph_store(
            'neptune-db://my-graph.cluster-abcdefghijkl.us-east-1.neptune.amazonaws.com'
        ) as graph_store,
        VectorStoreFactory.for_vector_store(
            'aoss://https://abcdefghijkl.us-east-1.aoss.amazonaws.com'
        ) as vector_store
    ):

        query_engine = LexicalGraphQueryEngine.for_traversal_based_search(
            graph_store,
            vector_store
        )

        response = query_engine.query('''What are the differences between Neptune Database
                                         and Neptune Analytics?''')

        print(response.response)

if __name__ == '__main__':
    run_query()

Documentation

Release

Release instructions are in RELEASE.md.

License

This project is licensed under the Apache-2.0 License.
