Extract a knowledge graph using LLMs from any text or messages array

kg-gen: Knowledge Graph Generation from Any Text

Welcome! kg-gen helps you generate knowledge graphs from any source text using AI. It can process both small and large text inputs, and it can also handle messages in a conversation format.

Why generate knowledge graphs? kg-gen is great if you want to:

  • Create a graph to assist with RAG (Retrieval-Augmented Generation)
  • Create synthetic graph data for model training and testing
  • Structure any text into a graph
  • Analyze the relationships between concepts in your source text

kg-gen works with any model provider supported by LiteLLM and uses DSPy for structured output generation.

Quick start

Install the module:

pip install kg-gen

Then import and use kg-gen. You can provide your text input in one of two formats:

  1. A single string
  2. A list of Message objects (each with a role and content)
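
The two input shapes can be told apart before the library is called. The sketch below is a hypothetical pre-check, not part of kg-gen: `describe_input` is an illustrative name, and it assumes messages are plain dicts with string `role` and `content` keys, as in the examples in this README.

```python
from typing import Union


def describe_input(input_data: Union[str, list]) -> str:
    """Return "text" or "messages"; raise ValueError for anything else.

    Hypothetical helper for illustration only. kg-gen handles its own
    input and may accept shapes this check rejects.
    """
    if isinstance(input_data, str):
        return "text"
    if isinstance(input_data, list) and input_data:
        for i, msg in enumerate(input_data):
            if not isinstance(msg, dict):
                raise ValueError(f"message {i} is not a dict")
            for key in ("role", "content"):
                # Assumed shape: both keys present and holding strings.
                if not isinstance(msg.get(key), str):
                    raise ValueError(f"message {i} lacks a string {key!r}")
        return "messages"
    raise ValueError("expected a string or a non-empty list of messages")
```

Running a check like this first turns a malformed message into an immediate `ValueError` rather than a failed model call.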

Below are some example snippets:

from kg_gen import KGGen

# Initialize KGGen
kg = KGGen()

# EXAMPLE 1: Single string with model
text_input = "Linda is Josh's mother. Ben is Josh's brother. Andrew is Josh's father. Judy is Andrew's sister. Josh is Judy's nephew. Judy is Josh's aunt."
graph_1 = kg.generate(
  input_data=text_input,
  model="openai/gpt-4o",
  api_key="<OPENAI_API_KEY>"  # Optional if this is set in your environment
)
# Output: 
# entities={'Linda', 'Judy', 'Ben', 'Andrew', 'Josh'} 
# edges={'is sister of', 'is father of', 'is aunt of', 'is brother of', 
# 'is mother of', 'is nephew of'} 
# relations={('Judy', 'is aunt of', 'Josh'), ('Josh', 'is nephew of', 'Judy'), 
# ('Andrew', 'is father of', 'Josh'), ('Ben', 'is brother of', 'Josh'), 
# ('Judy', 'is sister of', 'Andrew'), ('Linda', 'is mother of', 'Josh')}

# EXAMPLE 2: Messages array
messages = [
  {"role": "user", "content": "What is the capital of France?"}, 
  {"role": "assistant", "content": "The capital of France is Paris."}
]
graph_2 = kg.generate(
  input_data=messages,
  model="openai/gpt-4o-mini"
)
# Output: 
# entities={'Paris', 'France'} 
# edges={'has capital'} 
# relations={('France', 'has capital', 'Paris')}
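
Once a graph is generated, the relation triples can be indexed for lookup, for instance to pull an entity's neighbors into a RAG prompt. The following is a minimal sketch that works on a plain set of `(subject, predicate, object)` tuples copied from the Example 1 output. `build_index` is a hypothetical helper. The printed output suggests the returned graph exposes the triples as a `relations` attribute, but this README does not state it, so treat that as an assumption.

```python
from collections import defaultdict

# Triples copied from the Example 1 output above.
relations = {
    ("Judy", "is aunt of", "Josh"),
    ("Josh", "is nephew of", "Judy"),
    ("Andrew", "is father of", "Josh"),
    ("Ben", "is brother of", "Josh"),
    ("Judy", "is sister of", "Andrew"),
    ("Linda", "is mother of", "Josh"),
}


def build_index(triples):
    """Map each subject to a sorted list of (predicate, object) pairs."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    # Sort so the result does not depend on set iteration order.
    return {subject: sorted(pairs) for subject, pairs in index.items()}


index = build_index(relations)
print(index["Judy"])  # [('is aunt of', 'Josh'), ('is sister of', 'Andrew')]
```

The index only follows edges in their stated direction. A lookup by object would need a second index built the same way, keyed on the third element.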

Message Array Processing

When processing message arrays, kg-gen:

  1. Preserves the role information from each message
  2. Maintains message order and boundaries
  3. Can extract entities and relationships:
    • Between concepts mentioned in messages
    • Between speakers (roles) and concepts
    • Across multiple messages in a conversation

For example, given this conversation:

messages = [
  {"role": "user", "content": "What is the capital of France?"},
  {"role": "assistant", "content": "The capital of France is Paris."}
]

The generated graph might include entities like:

  • "user"
  • "assistant"
  • "France"
  • "Paris"

And relations like:

  • ("user", "asks about", "France")
  • ("assistant", "states", "Paris")
  • ("Paris", "is capital of", "France")
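
Because speakers can appear as entities, it may be useful to separate relations about who said what from relations between concepts. The sketch below does this on the example triples listed above. `split_by_speaker` and `ROLES` are hypothetical names, and the role set is an assumption based on the two roles used in this conversation.

```python
# Roles used in the example conversation (assumed, not a kg-gen constant).
ROLES = {"user", "assistant"}

# Example relations from the list above.
relations = [
    ("user", "asks about", "France"),
    ("assistant", "states", "Paris"),
    ("Paris", "is capital of", "France"),
]


def split_by_speaker(triples, roles=ROLES):
    """Split triples into (speaker_relations, concept_relations).

    A triple counts as a speaker relation when its subject is a role name.
    """
    speaker, concept = [], []
    for triple in triples:
        if triple[0] in roles:
            speaker.append(triple)
        else:
            concept.append(triple)
    return speaker, concept


speaker_rels, concept_rels = split_by_speaker(relations)
print(concept_rels)  # [('Paris', 'is capital of', 'France')]
```

Dropping the speaker relations leaves a graph of the conversation's subject matter alone, which is often what a downstream retrieval step needs.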

License

The MIT License.
