A Python package to parse text into a knowledge graph using LLMs.

Knowledge Graph Parser

Python 3.10 | MIT License | Project Status: Alpha

Overview

kg-parser is a Python package and CLI application that extracts knowledge graph triples from unstructured text using large language models (LLMs). It supports three backends (HuggingFace, OpenAI, and a local Jan server) and can write each triple either as a list of strings or as a dictionary.

Features

  • Multi-backend Support: Use HuggingFace, OpenAI, or a local Jan server.
  • Batch Processing: Process multiple texts efficiently.
  • Flexible Output: Choose between list or dict representations for triples.
  • CLI & API: Easily run as a command-line tool or integrate into your Python projects.

Installation

Using Conda

Create and activate the conda environment with the provided configuration:

conda env create -f environment.yml
conda activate kg-parser

Using Pip

Install the package directly from PyPI:

pip install kg-parser

To work from a local copy of the source instead, install the minimal dependencies from requirements.txt:

pip install -r requirements.txt

Alternatively, install the package in editable mode from the root of the source tree:

pip install -e .

Usage

As a CLI Application

Run the CLI tool from the command line:

python -m kg_parser.cli \
  --model-type huggingface \
  --model-name-or-path "google/flan-t5-small" \
  --input-file test_input.json \
  --output-file output_kg.json \
  --triple-format list

Arguments:

  • --model-type: Choose from huggingface, openai, or jan_local.
  • --model-name-or-path: Specify the model name or path.
  • --input-file: Path to a JSON file containing an array of text strings.
  • --output-file: Path where the output JSON will be saved.
  • --triple-format: Output format for triples (list for arrays or dict for dictionaries).
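The --input-file argument expects a JSON file whose top-level value is an array of text strings. A minimal sketch of producing such a file with the standard library follows; the file name test_input.json matches the command above, and the second sample sentence is only an illustration.

```python
import json
from pathlib import Path

# Example texts; any list of plain strings works.
texts = [
    "Mount Everest is the highest mountain in the world. It's located in Nepal.",
    "The Nile flows through Egypt.",
]

# Write the texts as a single top-level JSON array.
input_path = Path("test_input.json")
input_path.write_text(
    json.dumps(texts, indent=2, ensure_ascii=False),
    encoding="utf-8",
)

# Read the file back to confirm it has the expected shape.
loaded = json.loads(input_path.read_text(encoding="utf-8"))
```

The resulting file can then be passed to the CLI through --input-file.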

As a Python Package

Import and use kg-parser in your own Python scripts:

from kg_parser.config import ModelConfig, ModelType
from kg_parser.core import KGParser

# Configure the model
model_config = ModelConfig(
    model_type=ModelType.HUGGINGFACE,
    model_name_or_path="google/flan-t5-small"
)
parser = KGParser(model_config)

# Process texts
texts = [
    "Mount Everest is the highest mountain in the world. It's located in Nepal."
]
results = parser.parse_batch(texts)

# Save results with triples as lists
parser.save_to_json(results, "output_kg.json", triple_format="list")
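The difference between the list and dict triple formats can be pictured with a small standalone sketch. It does not call kg-parser, and the key names subject, predicate, and object are assumptions made for illustration; the keys the package actually writes may differ, so check a real output file before relying on them.

```python
# Hypothetical key names; the keys kg-parser actually writes may differ.
TRIPLE_KEYS = ("subject", "predicate", "object")


def triple_to_dict(triple):
    """Convert a three-item list into a dictionary keyed by TRIPLE_KEYS."""
    if len(triple) != len(TRIPLE_KEYS):
        raise ValueError(f"expected {len(TRIPLE_KEYS)} items, got {len(triple)}")
    return dict(zip(TRIPLE_KEYS, triple))


def triple_to_list(triple):
    """Convert a dictionary keyed by TRIPLE_KEYS back into a three-item list."""
    return [triple[key] for key in TRIPLE_KEYS]


# The same illustrative triple in both representations.
as_list = ["Mount Everest", "located in", "Nepal"]
as_dict = triple_to_dict(as_list)
```

The list form is compact and easy to load into tabular tools, while the dict form is self-describing, which helps when triples are passed between systems that do not share a positional convention.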

License

This project is licensed under the MIT License. See the LICENSE file for details.
