A package for summarizing RDF graphs for Question Answering pipelines

Gilgamesh Summarization Tool

Overview

Gilgamesh is an LLM-based ontology summarization tool. It was developed as an optimization step for Knowledge Graph Question Answering (KGQA) pipelines: it reduces a KG's complexity and prunes possible multi-hop questions, with the aim of increasing the accuracy of the QA system. The pipeline uses LLMs to create concise summaries and to locate potentially redundant key-value patterns in the target KG. It is distributed as a PyPI package.

Install

Basic requirements:

  • Python 3.10 or later.

    pip install gilgamesh-summarizer
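To confirm the environment before running the pipeline, a quick check along these lines may help. It is a minimal sketch using only the standard library. The helper name `check_environment` is hypothetical and not part of the package; the import name `gilgamesh_summarizer` is taken from the usage examples in this description.

```python
import sys
from importlib.util import find_spec


def check_environment(min_version=(3, 10), module="gilgamesh_summarizer"):
    """Return (python_ok, package_found) for the current interpreter.

    Hypothetical helper for illustration, not part of the package.
    """
    # Compare only (major, minor) against the required minimum version.
    python_ok = sys.version_info[:2] >= min_version
    # find_spec returns None when a top-level module cannot be located.
    package_found = find_spec(module) is not None
    return python_ok, package_found


if __name__ == "__main__":
    python_ok, package_found = check_environment()
    print(f"Python >= 3.10: {python_ok}")
    print(f"gilgamesh_summarizer importable: {package_found}")
```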
    

🔍 Basic Usage: Creating summaries based on Key-Value Pairs

Currently, the tool's main functionality is creating ontology summaries by discovering and condensing key-value pair formations in Knowledge Graph ontologies.
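To illustrate what condensing a key-value pair formation might look like, the sketch below collapses an intermediate node that carries one "key" property and one "value" property into a single direct triple, turning a two-hop path into one hop. Everything in it is an assumption made for illustration: the predicate names `ex:key` and `ex:value`, the toy triples, and the `condense_key_value` helper are hypothetical, and this is not the package's implementation, which uses an LLM to locate such patterns.

```python
def condense_key_value(triples, key_pred="ex:key", value_pred="ex:value"):
    """Collapse (s, p, n), (n, key, k), (n, value, v) into (s, k, v).

    Illustrative only; assumes each intermediate node has exactly one key,
    one value, and no other outgoing properties.
    """
    keys = {s: o for s, p, o in triples if p == key_pred}
    values = {s: o for s, p, o in triples if p == value_pred}
    # Intermediate nodes are those that carry both a key and a value.
    kv_nodes = keys.keys() & values.keys()

    condensed = []
    for s, p, o in triples:
        if s in kv_nodes:
            continue  # drop the key/value triples themselves
        if o in kv_nodes:
            # Replace the hop through the intermediate node with a direct triple.
            condensed.append((s, keys[o], values[o]))
        else:
            condensed.append((s, p, o))
    return condensed


# Hypothetical toy data: ex:attr1 is an intermediate key-value node.
triples = [
    ("ex:sensor1", "ex:hasAttribute", "ex:attr1"),
    ("ex:attr1", "ex:key", "ex:unit"),
    ("ex:attr1", "ex:value", "celsius"),
    ("ex:sensor1", "rdf:type", "ex:Sensor"),
]

for triple in condense_key_value(triples):
    print(triple)
```

In this toy case, a question about the sensor's unit would need one hop after condensing instead of two.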

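The pipeline first builds clusters from the knowledge graph:

  1. Parse the ontology file and the knowledge graph data.
  2. Create clusters from the initial knowledge graph data.
  3. Optionally reduce the number of clusters by removing high-degree nodes.
  4. Optionally reduce cluster sizes further by splitting and re-clustering large clusters (powered by PyJedAI).

    from gilgamesh_summarizer.KnowledgeGraph import KnowledgeGraph

    # path_to_rdf_data and path_to_ontology are the paths to your RDF data and ontology files.
    kg = KnowledgeGraph(path_to_rdf_data, path_to_ontology)

    # prune_top_nodes relates to step 3 and max_cluster_size to step 4 above.
    clusters, triples_dict = kg.create_clusters(prune_top_nodes=16, max_cluster_size=200)
    clusters  # display the clusters, e.g. in a notebook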
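The clustering steps listed above can be sketched in plain Python to show the general idea. This is a minimal sketch under stated assumptions, not the package's implementation: it assumes clusters are connected components of an undirected graph built from the triples, and it replaces the PyJedAI-based re-clustering with naive fixed-size chunking that ignores connectivity. The function name `sketch_clusters` and the toy triples are hypothetical; the parameter names only echo those of `create_clusters`, and their exact semantics in the package may differ.

```python
from collections import defaultdict


def sketch_clusters(triples, prune_top_nodes=1, max_cluster_size=3):
    """Toy clustering: prune hubs, take connected components, chunk large ones."""
    # Build an undirected adjacency map from (subject, predicate, object) triples.
    adjacency = defaultdict(set)
    for s, _, o in triples:
        adjacency[s].add(o)
        adjacency[o].add(s)

    # Select the highest-degree nodes (ties broken by name) and exclude them.
    ranked = sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))
    hubs = set(ranked[:prune_top_nodes])

    # Connected components over the remaining nodes form the initial clusters.
    clusters, seen = [], set()
    for start in sorted(adjacency):
        if start in hubs or start in seen:
            continue
        seen.add(start)
        component, stack = [], [start]
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in hubs and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))

    # Naive stand-in for re-clustering: cut oversized clusters into chunks.
    return [
        cluster[i:i + max_cluster_size]
        for cluster in clusters
        for i in range(0, len(cluster), max_cluster_size)
    ]


# Hypothetical toy data; "Person" is the high-degree node.
triples = [
    ("a", "type", "Person"),
    ("b", "type", "Person"),
    ("c", "type", "Person"),
    ("d", "type", "Person"),
    ("a", "knows", "b"),
    ("c", "livesIn", "x"),
    ("d", "knows", "e"),
]

for cluster in sketch_clusters(triples, prune_top_nodes=1, max_cluster_size=3):
    print(cluster)
```

Chunking by position is only a placeholder for the splitting step; a real re-clustering pass would group nodes by similarity rather than by sort order.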

The package also provides a fine-tuned model, built with Unsloth, that locates meaningful information in the clusters, which can then be used to create summaries:

    from gilgamesh_summarizer.Summarizer import Summarizer

    classifier = Summarizer(kg)
    results = classifier.classify_clusters(clusters, triples_dict)

Notebook Demo

An end-to-end example of the tool's summarization pipeline is presented here: Notebook

GitHub Repository

The package's official GitHub repo
