Extract a knowledge graph using LLMs from any text or messages array
kg-gen: Knowledge Graph Generation from Any Text
Welcome! kg-gen helps you generate knowledge graphs from any source text using AI. It can process both small and large text inputs, and it can also handle messages in a conversation format.
Why generate knowledge graphs? kg-gen is great if you want to:
- Create a graph to assist with RAG (Retrieval-Augmented Generation)
- Create synthetic graph data for model training and testing
- Structure any text into a graph
- Analyze the relationships between concepts in your source text
We support all model providers supported by LiteLLM. We also use DSPy for structured output generation.
Quick start
Install the module:
pip install kg-gen
Then import and use kg-gen. You can provide your text input in one of two formats:
- A single string
- A list of Message objects (each with a role and content)
Below are some example snippets:
from kg_gen import KGGen
# Initialize the KGGen
kg = KGGen()
# EXAMPLE 1: Single string with model
text_input = "Linda is Josh's mother. Ben is Josh's brother. Andrew is Josh's father. Judy is Andrew's sister. Josh is Judy's nephew. Judy is Josh's aunt."
graph_1 = kg.generate(
input_data=text_input,
model="openai/gpt-4o",
api_key="<OPENAI_API_KEY>" # Optional if this is set in your environment
)
# Output:
# entities={'Linda', 'Judy', 'Ben', 'Andrew', 'Josh'}
# edges={'is sister of', 'is father of', 'is aunt of', 'is brother of',
# 'is mother of', 'is nephew of'}
# relations={('Judy', 'is aunt of', 'Josh'), ('Josh', 'is nephew of', 'Judy'),
# ('Andrew', 'is father of', 'Josh'), ('Ben', 'is brother of', 'Josh'),
# ('Judy', 'is sister of', 'Andrew'), ('Linda', 'is mother of', 'Josh')}
# EXAMPLE 2: Messages array
messages = [
{"role": "user", "content": "What is the capital of France?"},
{"role": "assistant", "content": "The capital of France is Paris."}
]
graph_2 = kg.generate(
input_data=messages,
model="openai/gpt-4o-mini"
)
# Output:
# entities={'Paris', 'France'}
# edges={'has capital'}
# relations={('France', 'has capital', 'Paris')}
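Once relations are available as (subject, predicate, object) triples, they can be indexed for lookups, for instance when retrieving facts about an entity for RAG. The following is a minimal sketch in plain Python, not part of kg-gen: it assumes the triples have already been copied into an ordinary set of tuples (the values come from the Example 1 output above), and `build_index` is a hypothetical helper name.

```python
from collections import defaultdict

# Triples copied by hand from the Example 1 output above.
# Assumption: they are held in a plain Python set of 3-tuples.
relations = {
    ("Judy", "is aunt of", "Josh"),
    ("Josh", "is nephew of", "Judy"),
    ("Andrew", "is father of", "Josh"),
    ("Ben", "is brother of", "Josh"),
    ("Judy", "is sister of", "Andrew"),
    ("Linda", "is mother of", "Josh"),
}


def build_index(triples):
    """Map each subject to a sorted list of (predicate, object) pairs."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    # Sort so the result does not depend on set iteration order.
    return {subject: sorted(pairs) for subject, pairs in index.items()}


index = build_index(relations)

# Outgoing facts about one entity.
print(index["Judy"])
# [('is aunt of', 'Josh'), ('is sister of', 'Andrew')]

# Entities that point at "Josh", found by scanning the triples directly.
related_to_josh = sorted(s for s, _, o in relations if o == "Josh")
print(related_to_josh)
# ['Andrew', 'Ben', 'Judy', 'Linda']
```

A dictionary keyed by subject keeps lookups cheap for small graphs; a graph library would be a more natural fit for larger ones.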
Message Array Processing
When processing message arrays, kg-gen:
- Preserves the role information from each message
- Maintains message order and boundaries
- Can extract entities and relationships:
  - Between concepts mentioned in messages
  - Between speakers (roles) and concepts
  - Across multiple messages in a conversation
For example, given this conversation:
messages = [
{"role": "user", "content": "What is the capital of France?"},
{"role": "assistant", "content": "The capital of France is Paris."}
]
The generated graph might include entities like:
- "user"
- "assistant"
- "France"
- "Paris"
And relations like:
- ("user", "asks about", "France")
- ("assistant", "states", "Paris")
- ("Paris", "is capital of", "France")
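To make "preserves role information" and "maintains message order" concrete, here is one way a messages array could be flattened into a single role-tagged transcript before extraction. This is only an illustrative sketch: `format_transcript` is a hypothetical helper, and nothing here describes how kg-gen itself handles messages internally.

```python
def format_transcript(messages):
    """Join messages into one string, one role-tagged line per message.

    Hypothetical helper for illustration; not a kg-gen function.
    """
    lines = []
    for message in messages:  # iterate in order so the conversation flow is kept
        role = message["role"]
        content = message["content"].strip()
        lines.append(f"{role}: {content}")
    # A newline between messages keeps message boundaries visible.
    return "\n".join(lines)


messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
]

transcript = format_transcript(messages)
print(transcript)
# user: What is the capital of France?
# assistant: The capital of France is Paris.
```

With the role prefix kept on each line, a speaker such as "user" can surface as an entity alongside the concepts mentioned in the text.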
License
The MIT License.