
AWS GraphRAG Toolkit, lexical graph

Project description

Lexical Graph

The lexical-graph package provides a framework for automating the construction of a hierarchical lexical graph from unstructured data, and for composing question-answering strategies that query this graph to answer user questions.

Installation

The lexical-graph package requires Python 3.10 or greater, and pip.

Install from the latest release tag:

$ pip install https://github.com/awslabs/graphrag-toolkit/archive/refs/tags/v3.18.2.zip#subdirectory=lexical-graph

Or install from the main branch to get the latest changes:

$ pip install https://github.com/awslabs/graphrag-toolkit/archive/refs/heads/main.zip#subdirectory=lexical-graph

If you're running on AWS, your application must run in an AWS Region that contains the Amazon Bedrock foundation models used by the lexical graph (see the configuration section of the documentation for the default models), and you must enable access to these models before running any part of the solution.

Additional dependencies

You will need to install additional dependencies for specific graph and vector store backends:

Amazon OpenSearch Serverless

$ pip install opensearch-py llama-index-vector-stores-opensearch

Postgres with pgvector

$ pip install psycopg2-binary pgvector

Neo4j

$ pip install neo4j

Connection strings

Pass a connection string to GraphStoreFactory.for_graph_store() or VectorStoreFactory.for_vector_store() to select a backend:

Store                           Connection string
Neptune Analytics (graph)       neptune-graph://<graph-id>
Neptune Database (graph)        neptune-db://<hostname>, or any hostname ending in .neptune.amazonaws.com
Neo4j (graph)                   bolt://, bolt+ssc://, bolt+s://, neo4j://, neo4j+ssc://, or neo4j+s:// URLs
OpenSearch Serverless (vector)  aoss://<url>
Neptune Analytics (vector)      neptune-graph://<graph-id>
pgvector (vector)               constructed via PGVectorIndexFactory
S3 Vectors (vector)             constructed via S3VectorIndexFactory
Dummy / no-op                   None or any unrecognised string (falls back to DummyGraphStore / DummyVectorIndex)
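As a rough illustration of the routing rules in the table, the graph-store side might be sketched like this. Note that `infer_graph_backend` is a hypothetical helper written for this example; the real `GraphStoreFactory` performs its own parsing and registration:

```python
NEO4J_SCHEMES = ('bolt://', 'bolt+ssc://', 'bolt+s://',
                 'neo4j://', 'neo4j+ssc://', 'neo4j+s://')

def infer_graph_backend(connection):
    """Map a graph-store connection string to a backend name (illustrative only)."""
    if connection is None:
        return 'dummy'
    if connection.startswith('neptune-graph://'):
        return 'neptune-analytics'
    if connection.startswith('neptune-db://') or connection.endswith('.neptune.amazonaws.com'):
        return 'neptune-database'
    if connection.startswith(NEO4J_SCHEMES):
        return 'neo4j'
    # Unrecognised strings fall back to a no-op store.
    return 'dummy'
```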

Example of use

Indexing

from graphrag_toolkit.lexical_graph import LexicalGraphIndex
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory

# requires pip install llama-index-readers-web
from llama_index.readers.web import SimpleWebPageReader

def run_extract_and_build():

    with (
        GraphStoreFactory.for_graph_store(
            'neptune-db://my-graph.cluster-abcdefghijkl.us-east-1.neptune.amazonaws.com'
        ) as graph_store,
        VectorStoreFactory.for_vector_store(
            'aoss://https://abcdefghijkl.us-east-1.aoss.amazonaws.com'
        ) as vector_store
    ):

        graph_index = LexicalGraphIndex(
            graph_store,
            vector_store
        )

        doc_urls = [
            'https://docs.aws.amazon.com/neptune/latest/userguide/intro.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/what-is-neptune-analytics.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/neptune-analytics-features.html',
            'https://docs.aws.amazon.com/neptune-analytics/latest/userguide/neptune-analytics-vs-neptune-database.html'
        ]

        docs = SimpleWebPageReader(
            html_to_text=True,
            metadata_fn=lambda url: {'url': url}
        ).load_data(doc_urls)

        graph_index.extract_and_build(docs, show_progress=True)

if __name__ == '__main__':
    run_extract_and_build()

Querying

from graphrag_toolkit.lexical_graph import LexicalGraphQueryEngine
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory

def run_query():

    with (
        GraphStoreFactory.for_graph_store(
            'neptune-db://my-graph.cluster-abcdefghijkl.us-east-1.neptune.amazonaws.com'
        ) as graph_store,
        VectorStoreFactory.for_vector_store(
            'aoss://https://abcdefghijkl.us-east-1.aoss.amazonaws.com'
        ) as vector_store
    ):

        query_engine = LexicalGraphQueryEngine.for_traversal_based_search(
            graph_store,
            vector_store
        )

        response = query_engine.query('''What are the differences between Neptune Database
                                         and Neptune Analytics?''')

        print(response.response)

if __name__ == '__main__':
    run_query()

Release

Release instructions can be found in RELEASE.md.

License

This project is licensed under the Apache-2.0 License.


Download files

Download the file for your platform.

Source Distribution

graphrag_lexical_graph-3.18.2.tar.gz (372.0 kB)

Uploaded Source

Built Distribution

graphrag_lexical_graph-3.18.2-py3-none-any.whl (414.9 kB)

Uploaded Python 3

File details

Details for the file graphrag_lexical_graph-3.18.2.tar.gz.

File metadata

  • Download URL: graphrag_lexical_graph-3.18.2.tar.gz
  • Upload date:
  • Size: 372.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphrag_lexical_graph-3.18.2.tar.gz
Algorithm Hash digest
SHA256 1f74c4558962b4fd6f5dfee35fa1f70938b357630c9f3ea33cb21a6d3e1c4173
MD5 a9cfb0a885861dbaba58f787b1ef4542
BLAKE2b-256 d6041e69fe454631210ca45fa32c551c8d56e660ac4bb496c0cb94d230cdaf99
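To check a download against the published digests, the standard-library hashlib module is enough. `sha256_of_file` below is an illustrative helper, not part of the toolkit:

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Compute a file's SHA256 hex digest, streaming so large archives
    don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the SHA256 digest published above, e.g.:
# expected = '1f74c4558962b4fd6f5dfee35fa1f70938b357630c9f3ea33cb21a6d3e1c4173'
# assert sha256_of_file('graphrag_lexical_graph-3.18.2.tar.gz') == expected
```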

Provenance

The following attestation bundles were made for graphrag_lexical_graph-3.18.2.tar.gz:

Publisher: lexical-graph-release.yml on awslabs/graphrag-toolkit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file graphrag_lexical_graph-3.18.2-py3-none-any.whl.

File metadata

File hashes

Hashes for graphrag_lexical_graph-3.18.2-py3-none-any.whl
Algorithm Hash digest
SHA256 4d87934148d99e92ed7bc09ad46a287cafb3af527b587fd55a0427733b691933
MD5 7de49cc77d0b0d777d8b0a8a7da712e2
BLAKE2b-256 394f43c53fbbecffc35af86cec6a0f4e33571ca2ed2e5706992c6f03afc0ff80

Provenance

The following attestation bundles were made for graphrag_lexical_graph-3.18.2-py3-none-any.whl:

Publisher: lexical-graph-release.yml on awslabs/graphrag-toolkit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
