Implementation of GraphRAG (https://arxiv.org/pdf/2404.16130)
GraphRAG
** WORK IN PROGRESS **
This is an implementation of GraphRAG as described in the paper
"From Local to Global: A Graph RAG Approach to Query-Focused Summarization"
(https://arxiv.org/pdf/2404.16130).
The official implementation by the authors of the paper is available at:
https://github.com/microsoft/graphrag/
Why re-implementation 🤔?
The primary reasons for re-implementing:
- Develop a better understanding of the intricacies of the paper by implementing it
- The official implementation
  - is not built upon popular frameworks such as langchain, LlamaIndex, etc.
  - is a bit difficult to understand because of its reliance on the datashaper package
  - does not support models other than OpenAI or AzureOpenAI
Install (Not Recommended yet!)
Note: this is a work in progress, so installing the published package is not recommended yet. The package was published mainly to reserve the name. It is better to clone the repo, install the package locally, and try out the current state of the code. See below for more details.
pip install langchain-graphrag
Projects
There are 2 projects in the repo:
langchain_graphrag
This is the core library that implements the GraphRAG paper. It is built on top of the langchain library.
The concepts described in the GraphRAG paper are implemented in a modular fashion, with easy extensibility and replacement in mind.
To use the development version (Recommended as it is under active development):
Clone the repo
git clone https://github.com/ksachdeva/langchain-graphrag.git
Open in VSCode devcontainer (Recommended)
The devcontainer will install all the dependencies.
If not using the devcontainer
Make sure you have rye installed. See https://rye.astral.sh/
# sync all the dependencies
rye sync
examples/simple-app
This is a simple typer-based CLI app.
In terms of configuration, it is limited by the number of command line options exposed.
That said, the core library is written so that you can easily replace any component with your own implementation, e.g. your choice of LLM or embedding model. You can even replace some of the classes, as long as your replacements implement the required interface.
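As a rough illustration of what replacing a component behind an interface can look like, the sketch below defines an interface with a Python `Protocol` and swaps in a toy implementation. Every name in it (`TextEmbedder`, `CharCountEmbedder`, `ToyIndexer`) is hypothetical and made up for this example. None of them is part of the langchain_graphrag API, so check the library source for the actual interfaces.

```python
from typing import Protocol


class TextEmbedder(Protocol):
    """Hypothetical interface: anything with a matching embed() method fits."""

    def embed(self, text: str) -> list[float]:
        ...


class CharCountEmbedder:
    """Toy stand-in for a real embedding model (not a real embedding)."""

    def embed(self, text: str) -> list[float]:
        # Two crude features: total length and number of spaces.
        return [float(len(text)), float(text.count(" "))]


class ToyIndexer:
    """Hypothetical consumer that depends only on the interface."""

    def __init__(self, embedder: TextEmbedder) -> None:
        self._embedder = embedder

    def index(self, documents: list[str]) -> dict[str, list[float]]:
        # Map each document to the vector produced by the injected embedder.
        return {doc: self._embedder.embed(doc) for doc in documents}


if __name__ == "__main__":
    indexer = ToyIndexer(CharCountEmbedder())
    print(indexer.index(["a b", "hello"]))
```

Because `ToyIndexer` only relies on the `embed` method, any other class with the same method signature could be passed in without changing the indexer itself.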
# To generate the index
# defaults are azure_openai/gpt4-o/text-embedding-3-small
# you can change the model and other parameters from command line
rye run simple-app-indexer
# To see more options
rye run simple-app-indexer --help
# To do global search/query
# defaults are azure_openai/gpt4-o/text-embedding-3-small
# you can change the model and other parameters from command line
rye run simple-app-global-search --query "What are the top themes in this story?"
# To do local search/query
# defaults are azure_openai/gpt4-o/text-embedding-3-small
# you can change the model and other parameters from command line
rye run simple-app-local-search --query "Who is Scrooge, and what are his main relationships?"
See examples/simple-app/README.md for more details.
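If you want to script several queries, one option is to build the search commands shown above from Python. This is a minimal sketch: `build_search_command` is a hypothetical helper name, and it assumes only the `simple-app-global-search` and `simple-app-local-search` entry points with their `--query` option as listed above.

```python
import shlex


def build_search_command(mode: str, query: str) -> list[str]:
    """Build the argv list for a global or local search run via rye."""
    if mode not in ("global", "local"):
        raise ValueError(f"mode must be 'global' or 'local', got {mode!r}")
    return ["rye", "run", f"simple-app-{mode}-search", "--query", query]


if __name__ == "__main__":
    queries = [
        ("global", "What are the top themes in this story?"),
        ("local", "Who is Scrooge, and what are his main relationships?"),
    ]
    for mode, query in queries:
        command = build_search_command(mode, query)
        # Print a shell-safe version of the command. To actually run it,
        # pass the list to subprocess.run(command, check=True) from the
        # repo root after generating the index.
        print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems when a query contains quotes or other special characters.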