GraphRAG-driven AI Ontology Generation Tool

GraphragKM - AI Ontology Generation Tool Driven by GraphRAG

GraphragKM is an AI ontology generation tool built on GraphRAG. It automatically extracts knowledge from PDF documents and generates OWL ontologies and UML models. It combines text extraction, OCR, graph construction, and inference to provide a one-stop solution for building knowledge graphs and ontologies.

Features

  • PDF Document Processing and Text Extraction: Supports extracting text from PDF documents to obtain key information.
  • Image OCR: Extracts text from images, so that content in scanned documents and pictures can be recognized.
  • GraphRAG-Based Knowledge Graph Construction: Automatically constructs knowledge graphs and visualizes knowledge in graph form.
  • Entity and Relationship Inference: Infers entities and their relationships from extracted text and images to build a more complete knowledge graph.
  • Automatic OWL Ontology Generation: Automatically constructs an OWL ontology from the extracted information, supporting semantic reasoning and knowledge storage.
  • Automatic StarUML Class Diagram Generation: Converts ontology structures into UML class diagrams for easy visualization and editing.

Installation

pip install GraphragKM

Usage

Command Line Usage

# Interactive run
graphragkm run

# Specify input file
graphragkm run -i input.pdf
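
To process several PDFs in one go, the command above can be driven from a short script. The following is a minimal sketch that relies only on the `graphragkm run -i <file>` invocation shown above; the helper names `build_command` and `run_all` are hypothetical and not part of GraphragKM.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path):
    """Return the argv list for one documented `graphragkm run -i` call."""
    return ["graphragkm", "run", "-i", str(pdf_path)]


def run_all(folder):
    """Run the CLI once per PDF in `folder`, stopping at the first failure."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(build_command(pdf), check=True)
```

Calling `run_all(".")` would process every PDF in the current directory. Whether consecutive runs overwrite each other's files in the output folder is not stated here, so copy the results elsewhere between runs if in doubt.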

Generated Files

After a run, the program writes the following files to the output folder in the current directory:

  • ontology.owl: The generated OWL ontology file.
  • uml_model.puml: The UML class diagram file (StarUML format).
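
A quick way to confirm that a run produced both files is to check for them from Python. This is a minimal sketch, assuming the output folder is literally named `output` (the text above only calls it "the output folder"); `missing_outputs` is a hypothetical helper, not part of GraphragKM.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("ontology.owl", "uml_model.puml")


def missing_outputs(output_dir="output"):
    """Return the expected file names that are absent or empty in output_dir."""
    folder = Path(output_dir)
    return [
        name
        for name in EXPECTED_FILES
        if not (folder / name).is_file() or (folder / name).stat().st_size == 0
    ]
```

An empty list from `missing_outputs()` means both files exist and are non-empty; it says nothing about whether their contents are valid.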

Configuration

On the first run, the program will create a config.yaml configuration file template in the current directory. You need to edit this file and fill in the correct API keys and other configuration information.

api:
  # Mineru API settings
  mineru_upload_url: "https://mineru.net/api/v4/file-urls/batch"
  mineru_results_url_template: "https://mineru.net/api/v4/extract-results/batch/{}"
  mineru_token: "YOUR_MINERU_TOKEN"

  # Chat model settings
  chat_model_api_key: "YOUR_CHAT_MODEL_API_KEY"
  chat_model_api_base: "https://api.deepseek.com"
  chat_model_name: "deepseek-chat"

  # Embedding model settings
  embedding_model_api_key: "YOUR_EMBEDDING_MODEL_API_KEY"
  embedding_model_api_base: "https://open.bigmodel.cn/api/paas/v4/"
  embedding_model_name: "embedding-3"

app:
  # OWL Namespace
  owl_namespace: "https://example.com/"

  # Maximum concurrent requests
  max_concurrent_requests: 25
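
Before running the tool, it can help to check that no template placeholders are left in config.yaml. The sketch below scans the file text with the standard library only, so it needs no YAML parser. It assumes placeholders keep the quoted `YOUR_...` form shown above and that the `{}` in the results URL template is filled with a batch identifier through `str.format`; both helper names are hypothetical.

```python
import re

# Matches lines such as:  mineru_token: "YOUR_MINERU_TOKEN"
PLACEHOLDER = re.compile(r'^\s*(\w+):\s*"(YOUR_[A-Z_]+)"\s*$')


def unfilled_keys(config_text):
    """Return the keys whose values are still YOUR_... placeholders."""
    keys = []
    for line in config_text.splitlines():
        match = PLACEHOLDER.match(line)
        if match:
            keys.append(match.group(1))
    return keys


def results_url(template, batch_id):
    """Fill the single {} slot of the results URL template (assumed usage)."""
    return template.format(batch_id)
```

For example, `unfilled_keys(open("config.yaml", encoding="utf-8").read())` lists the keys that still need real values.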

Dependencies

  • Python 3.11+
  • graphrag
  • easyocr
  • openai
  • pandas
  • rdflib
  • rich
  • click
  • scikit-learn
  • For a full list of dependencies, see pyproject.toml
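
To see whether an environment meets these requirements, the interpreter version and the installed distributions can be checked with the standard library. This is a minimal sketch; the names in `REQUIRED` are copied from the list above, and `python_ok` and `missing_packages` are hypothetical helpers.

```python
import sys
from importlib import metadata

# Distribution names copied from the list above.
REQUIRED = ["graphrag", "easyocr", "openai", "pandas",
            "rdflib", "rich", "click", "scikit-learn"]


def python_ok(version=None, minimum=(3, 11)):
    """Return True if the given (or running) Python version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def missing_packages(names=REQUIRED):
    """Return the distributions that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Installing GraphragKM with pip should pull these in automatically, so a non-empty result from `missing_packages()` usually points to a broken or partial install.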

Download files

Source Distribution

graphragkm-0.2.0.tar.gz (27.1 kB)

Built Distribution

graphragkm-0.2.0-py3-none-any.whl (31.8 kB)

File details

Details for the file graphragkm-0.2.0.tar.gz.

File metadata

  • Download URL: graphragkm-0.2.0.tar.gz
  • Size: 27.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.13.5 Darwin/24.5.0

File hashes

Hashes for graphragkm-0.2.0.tar.gz
Algorithm Hash digest
SHA256 ba9f55e7c3321c91c0b0cac46912299c079de593185a4072185af60a84732810
MD5 09e6ef89e25480853241765f116031d8
BLAKE2b-256 a748eba77723058ae1f0a2413ab1666af67a298bc188326b4216df2f6d20ec8b
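
A downloaded file can be checked against the SHA256 digest listed above with Python's `hashlib`. This is a minimal sketch; `sha256_of` is a hypothetical helper, and the file is read in chunks so large archives do not have to fit in memory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the result of `sha256_of("graphragkm-0.2.0.tar.gz")` with the SHA256 value in the table; any difference means the file is not the one that was published.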

File details

Details for the file graphragkm-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: graphragkm-0.2.0-py3-none-any.whl
  • Size: 31.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.13.5 Darwin/24.5.0

File hashes

Hashes for graphragkm-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e5244c325ea5d9ec4be730674e83882c506bcad3d3fff6e982bedbeffae82998
MD5 838ffdd71db6324ba790c06a1d7e3e36
BLAKE2b-256 3e71ba88e9c5a595e2111b95c7c3b28ff0f01fb334174968e1a7d1aecea2f4fa
