AWS GraphRAG Toolkit, lexical graph

Project description

Lexical Graph

The lexical-graph package provides a framework for automating the construction of a hierarchical lexical graph from unstructured data, and composing question-answering strategies that query this graph when answering user questions.

Installation

The lexical-graph package requires Python 3.10 or later and pip.

Install the latest stable release from PyPI:

$ pip install graphrag-lexical-graph

To install a specific version from PyPI:

$ pip install graphrag-lexical-graph==3.18.3

Or install from a release zip file:

$ pip install https://github.com/awslabs/graphrag-toolkit/archive/refs/tags/graphrag-lexical-graph/v3.18.3.zip#subdirectory=lexical-graph

If you're running on AWS, your application must run in an AWS Region that contains the Amazon Bedrock foundation models used by the lexical graph (see the configuration section of the documentation for the default models), and you must enable access to those models before running any part of the solution.

Additional dependencies

You will need to install additional dependencies for specific graph and vector store backends:

Amazon OpenSearch Serverless

$ pip install opensearch-py llama-index-vector-stores-opensearch

Postgres with pgvector

$ pip install psycopg2-binary pgvector

Neo4j

$ pip install neo4j
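Because these backend dependencies are optional, a quick pre-flight check can confirm that the packages for your chosen backend are importable before you construct a store. The helper below is an illustrative sketch, not part of the toolkit; the module names (opensearchpy, psycopg2, pgvector, neo4j) are assumptions based on the pip packages listed above.

```python
import importlib.util

# Illustrative mapping of backends to importable module names; these names
# are assumptions based on the pip packages above, not a toolkit API.
BACKEND_MODULES = {
    'opensearch': ['opensearchpy', 'llama_index.vector_stores.opensearch'],
    'pgvector': ['psycopg2', 'pgvector'],
    'neo4j': ['neo4j'],
}

def missing_backend_deps(backend: str) -> list[str]:
    """Return the modules for a backend that cannot currently be imported."""
    def importable(name: str) -> bool:
        try:
            return importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # find_spec raises when a parent package of a dotted name is absent
            return False
    return [m for m in BACKEND_MODULES.get(backend, []) if not importable(m)]
```

An unknown backend yields an empty list; installing the packages shown above clears the corresponding entries.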

Connection strings

Pass a connection string to GraphStoreFactory.for_graph_store() or VectorStoreFactory.for_vector_store() to select a backend:

  • Neptune Analytics (graph): neptune-graph://<graph-id>
  • Neptune Database (graph): neptune-db://<hostname>, or any hostname ending in .neptune.amazonaws.com
  • Neo4j (graph): bolt://, bolt+ssc://, bolt+s://, neo4j://, neo4j+ssc://, or neo4j+s:// URLs
  • OpenSearch Serverless (vector): aoss://<url>
  • Neptune Analytics (vector): neptune-graph://<graph-id>
  • pgvector (vector): connection constructed via PGVectorIndexFactory
  • S3 Vectors (vector): connection constructed via S3VectorIndexFactory
  • Dummy / no-op: None or any unrecognised string (falls back to DummyGraphStore / DummyVectorIndex)

Example of use

Indexing

from graphrag_toolkit.lexical_graph import LexicalGraphIndex
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory

# requires pip install llama-index-readers-web
from llama_index.readers.web import SimpleWebPageReader

def run_extract_and_build():

    with (
        GraphStoreFactory.for_graph_store(
            'neptune-db://my-graph.cluster-abcdefghijkl.us-east-1.neptune.amazonaws.com'
        ) as graph_store,
        VectorStoreFactory.for_vector_store(
            'aoss://https://abcdefghijkl.us-east-1.aoss.amazonaws.com'
        ) as vector_store
    ):

        graph_index = LexicalGraphIndex(
            graph_store,
            vector_store
        )

        doc_urls = [
            'https://docs.aws.amazon.com/neptune/latest/userguide/intro.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/what-is-neptune-analytics.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/neptune-analytics-features.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/neptune-analytics-vs-neptune-database.html'
        ]

        docs = SimpleWebPageReader(
            html_to_text=True,
            metadata_fn=lambda url: {'url': url}
        ).load_data(doc_urls)

        graph_index.extract_and_build(docs, show_progress=True)

if __name__ == '__main__':
    run_extract_and_build()

Querying

from graphrag_toolkit.lexical_graph import LexicalGraphQueryEngine
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory

def run_query():

    with (
        GraphStoreFactory.for_graph_store(
            'neptune-db://my-graph.cluster-abcdefghijkl.us-east-1.neptune.amazonaws.com'
        ) as graph_store,
        VectorStoreFactory.for_vector_store(
            'aoss://https://abcdefghijkl.us-east-1.aoss.amazonaws.com'
        ) as vector_store
    ):

        query_engine = LexicalGraphQueryEngine.for_traversal_based_search(
            graph_store,
            vector_store
        )

        response = query_engine.query(
            'What are the differences between Neptune Database and Neptune Analytics?'
        )

        print(response.response)

if __name__ == '__main__':
    run_query()

Release

Release instructions can be found in RELEASE.md.

License

This project is licensed under the Apache-2.0 License.

Download files

Download the file for your platform.

Source Distribution

graphrag_lexical_graph-3.18.3.tar.gz (423.5 kB)

Uploaded Source

Built Distribution

graphrag_lexical_graph-3.18.3-py3-none-any.whl (480.1 kB)

Uploaded Python 3

File details

Details for the file graphrag_lexical_graph-3.18.3.tar.gz.

File metadata

  • Download URL: graphrag_lexical_graph-3.18.3.tar.gz
  • Size: 423.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for graphrag_lexical_graph-3.18.3.tar.gz:

  • SHA256: 62139d60c4c58bf1f0b99f023dd9255ed598fcfeadab6dfbb25c77152fdac283
  • MD5: bd7c95006b42cc2e324396e6c4ccbe6f
  • BLAKE2b-256: e8a44d2ccebb4a26b936b6d5b7b7a934fb7e6152067d81a94d7d8f8bcde0a0ad

Provenance

The following attestation bundles were made for graphrag_lexical_graph-3.18.3.tar.gz:

Publisher: lexical-graph-release.yml on awslabs/graphrag-toolkit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file graphrag_lexical_graph-3.18.3-py3-none-any.whl.

File hashes

Hashes for graphrag_lexical_graph-3.18.3-py3-none-any.whl:

  • SHA256: a5434f5e8c889eb2d1903782048735c7aac0879b697219c3bc2fcd2a8e545362
  • MD5: 27acca61f886e9f0d192c87ca79b9189
  • BLAKE2b-256: 69365a40f93f20c9c7524fdaa61cdbac7e3dea5a5da036282e3e44e7cd059438

Provenance

The following attestation bundles were made for graphrag_lexical_graph-3.18.3-py3-none-any.whl:

Publisher: lexical-graph-release.yml on awslabs/graphrag-toolkit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
