A powerful tool for transforming documents into graph-based structures using Large Language Models (LLMs).
🧠 LLMGraphTransformer
LLMGraphTransformer is a Python library designed to extract structured knowledge graphs from unstructured text using LLMs. It allows users to define schemas for nodes and relationships, ensuring that the extracted graph follows a strict format. 🔗📊
🚀 Installation
Install LLMGraphTransformer from PyPI:
pip install LLMGraphTransformer
🛠️ Usage
📥 Importing the Required Modules
import os

from dotenv import load_dotenv
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI

from LLMGraphTransformer import LLMGraphTransformer
from LLMGraphTransformer.schema import NodeSchema, RelationshipSchema

# Load API_KEY, BASE_URL, and MODEL_NAME from a local .env file
load_dotenv(".env")
🏗️ Defining the Schema
🏷️ Node Schemas
Node schemas define the types of entities that can be extracted from the text. Each node has:
- A type (e.g., "Person", "Organization")
- A list of properties that store additional information (e.g., "name", "birth_year")
- An optional description of the node type
📌 Example:
node_schemas = [
    NodeSchema("Person", ["name", "birth_year", "death_year", "nationalitie", "profession"], "Represents an individual"),
    NodeSchema("Organization", ["name", "founding_year", "industrie"], "Represents a group, company, or institution"),
    NodeSchema("Location", ["name"], "Represents a geographical area such as a city, country, or region"),
    NodeSchema("Award", ["name", "field"], "Represents an honor, prize, or recognition")
]
🔗 Relationship Schemas
Relationship schemas define the allowed connections between entities. Each relationship has, in the order used in the example:
- A source node type
- A relationship type
- A target node type
- An optional list of properties (e.g., "year")
📌 Example:
relationship_schemas = [
    RelationshipSchema("Person", "SPOUSE_OF", "Person"),
    RelationshipSchema("Person", "MEMBER_OF", "Organization", ["start_year", "end_year", "year"]),
    RelationshipSchema("Person", "AWARDED", "Award", ["year"]),
    RelationshipSchema("Person", "LOCATED_IN", "Location"),
    RelationshipSchema("Organization", "LOCATED_IN", "Location")
]
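A relationship schema is only useful if both of its endpoint types are also declared as node schemas. The check below is a minimal sketch of that consistency rule. It uses plain strings and tuples rather than the library's classes, and `undeclared_node_types` is a hypothetical helper, not part of LLMGraphTransformer.

```python
from typing import Iterable, List, Set, Tuple

# Plain-Python mirror of the schemas above (illustrative stand-in only).
NODE_TYPES: Set[str] = {"Person", "Organization", "Location", "Award"}
RELATIONSHIP_TRIPLES: List[Tuple[str, str, str]] = [
    ("Person", "SPOUSE_OF", "Person"),
    ("Person", "MEMBER_OF", "Organization"),
    ("Person", "AWARDED", "Award"),
    ("Person", "LOCATED_IN", "Location"),
    ("Organization", "LOCATED_IN", "Location"),
]


def undeclared_node_types(
    node_types: Set[str], triples: Iterable[Tuple[str, str, str]]
) -> List[str]:
    """Return endpoint types used in relationships but missing from node_types."""
    missing = set()
    for source, _relationship, target in triples:
        for endpoint in (source, target):
            if endpoint not in node_types:
                missing.add(endpoint)
    return sorted(missing)


print(undeclared_node_types(NODE_TYPES, RELATIONSHIP_TRIPLES))  # → []
```

Running a check like this before calling the LLM catches schema typos early, which is cheaper than discovering them in the extracted graph.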
⚙️ Defining Additional Instructions
You can specify additional rules for extraction:
additional_instructions="""- all names must be extracted as uppercase"""
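When there are several rules, it can be convenient to keep them in a list and build the string from it. The sketch below produces the same "- rule" format as the example above, one rule per line. Whether the library treats multi-line instructions any differently from a single line is an assumption here, and `build_instructions` is a hypothetical helper.

```python
from typing import Iterable


def build_instructions(rules: Iterable[str]) -> str:
    """Join non-empty rules into a dash-prefixed block, one rule per line."""
    cleaned = [rule.strip() for rule in rules if rule.strip()]
    return "\n".join(f"- {rule}" for rule in cleaned)


additional_instructions = build_instructions(
    ["all names must be extracted as uppercase"]
)
print(additional_instructions)  # → - all names must be extracted as uppercase
```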
📜 Defining the Input Text
Provide the text from which the knowledge graph should be extracted:
text="""Marie Curie, born in 1867, was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity.
She was the first woman to win a Nobel Prize, the first person to win a Nobel Prize twice, and the only person to win a Nobel Prize in two scientific fields.
Her husband, Pierre Curie, was a co-winner of her first Nobel Prize, making them the first-ever married couple to win the Nobel Prize and launching the Curie family legacy of five Nobel Prizes.
She was, in 1906, the first woman to become a professor at the University of Paris."""
🤖 Initializing the LLM Model
Use OpenAI's API (or any OpenAI-compatible endpoint) to process the text. The API key, base URL, and model name are read from environment variables:
api_key = os.getenv("API_KEY")
base_url = os.getenv("BASE_URL")
model_name = os.getenv("MODEL_NAME")
llm = ChatOpenAI(
    api_key=api_key,
    base_url=base_url,
    model=model_name,
    temperature=0,
)
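Because `os.getenv` returns `None` for a variable that is not set, a missing entry in `.env` only surfaces later as a confusing client error. The sketch below checks the three variable names used above before the client is built. Treating all three as required is an assumption (drop `BASE_URL` from the tuple if you rely on the default endpoint), and `missing_settings` is a hypothetical helper.

```python
import os
from typing import List, Mapping, Optional

# Names taken from the snippet above; adjust to match your own .env file.
REQUIRED_VARS = ("API_KEY", "BASE_URL", "MODEL_NAME")


def missing_settings(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_settings()
if missing:
    print(f"Missing settings: {', '.join(missing)}")
```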
🔄 Initializing the Transformer
Create an instance of LLMGraphTransformer:
llm_transformer = LLMGraphTransformer(
    llm=llm,
    allowed_nodes=node_schemas,
    allowed_relationships=relationship_schemas,
    additional_instructions=additional_instructions
)
🔍 Converting Text to a Knowledge Graph
Process the text into a structured knowledge graph:
document = Document(page_content=text)
graph_document = llm_transformer.convert_to_graph_document(document)
print(f"Nodes: {graph_document.nodes}")
print(f"Relationships: {graph_document.relationships}")
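When several documents need converting, one failed LLM call should not discard the results of the others. The sketch below relies only on the `convert_to_graph_document` method shown above and accepts any object that provides it. `convert_all` is a hypothetical helper, not part of the library.

```python
from typing import Any, Iterable, List, Tuple


def convert_all(
    transformer: Any, documents: Iterable[Any]
) -> Tuple[List[Any], List[Tuple[int, Exception]]]:
    """Convert each document, collecting results and per-document failures."""
    results: List[Any] = []
    failures: List[Tuple[int, Exception]] = []
    for index, document in enumerate(documents):
        try:
            results.append(transformer.convert_to_graph_document(document))
        except Exception as exc:  # network errors, malformed LLM output, etc.
            failures.append((index, exc))
    return results, failures
```

A call such as `graph_documents, failures = convert_all(llm_transformer, [document])` returns the successful conversions alongside the positions of any documents that failed, so those can be retried separately.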
📊 Output Format
The extracted knowledge graph will be represented in JSON format with nodes and relationships:
{
    "nodes": [
        {
            "id": "Marie Curie",
            "type": "Person",
            "properties": {
                "name": "Marie Curie",
                "birth_year": "1867",
                "nationalitie": ["Polish", "French"],
                "profession": ["physicist", "chemist"]
            }
        },
        ...
    ],
    "relationships": [
        {
            "source": "Marie Curie",
            "target": "Pierre Curie",
            "type": "SPOUSE_OF"
        },
        ...
    ]
}
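Since LLM output can drift from the schema, it may be worth checking a result in this JSON shape before loading it elsewhere. The sketch below assumes the dictionary layout shown above and mirrors the example schemas as plain Python data. `validate_graph` is a hypothetical helper, and the Pierre Curie node is filled in for illustration only.

```python
from typing import Any, Dict, List

ALLOWED_NODE_PROPERTIES = {
    "Person": {"name", "birth_year", "death_year", "nationalitie", "profession"},
    "Organization": {"name", "founding_year", "industrie"},
    "Location": {"name"},
    "Award": {"name", "field"},
}
ALLOWED_TRIPLES = {
    ("Person", "SPOUSE_OF", "Person"),
    ("Person", "MEMBER_OF", "Organization"),
    ("Person", "AWARDED", "Award"),
    ("Person", "LOCATED_IN", "Location"),
    ("Organization", "LOCATED_IN", "Location"),
}


def validate_graph(graph: Dict[str, Any]) -> List[str]:
    """Return a list of schema violations; an empty list means the graph conforms."""
    errors: List[str] = []
    node_types: Dict[str, str] = {}
    for node in graph.get("nodes", []):
        node_types[node["id"]] = node["type"]
        allowed = ALLOWED_NODE_PROPERTIES.get(node["type"])
        if allowed is None:
            errors.append(f"unknown node type: {node['type']}")
            continue
        for prop in sorted(set(node.get("properties", {})) - allowed):
            errors.append(f"unexpected property on {node['id']}: {prop}")
    for rel in graph.get("relationships", []):
        # Endpoints that are not listed as nodes resolve to None and are rejected.
        triple = (node_types.get(rel["source"]), rel["type"], node_types.get(rel["target"]))
        if triple not in ALLOWED_TRIPLES:
            errors.append(f"disallowed relationship: {triple}")
    return errors


sample_graph = {
    "nodes": [
        {
            "id": "Marie Curie",
            "type": "Person",
            "properties": {
                "name": "Marie Curie",
                "birth_year": "1867",
                "nationalitie": ["Polish", "French"],
                "profession": ["physicist", "chemist"],
            },
        },
        # Illustrative node so that the relationship below has a known target.
        {"id": "Pierre Curie", "type": "Person", "properties": {"name": "Pierre Curie"}},
    ],
    "relationships": [
        {"source": "Marie Curie", "target": "Pierre Curie", "type": "SPOUSE_OF"},
    ],
}

print(validate_graph(sample_graph))  # → []
```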
📜 License
This project is licensed under the MIT License.
🤝 Contributing
Pull requests and feature suggestions are welcome! Open an issue for bug reports or improvements. 🚀