
An Open Source Library for Uncertain Knowledge Reasoning


unKR: An Open Source Toolkit for Uncertain Knowledge Graph Representation Learning


English | 中文

unKR is an open-source toolkit for Uncertain Knowledge Graph Representation Learning (UKRL). It is built on the PyTorch Lightning framework and decouples the workflow of UKRL models so that multiple Uncertain Knowledge Graph Embedding (UKGE) methods can be implemented, which in turn support knowledge graph completion, inference, and other tasks. The toolkit provides code implementations and results for a range of existing UKGE models, along with detailed technical documentation for users.

🔖 Overview

(Overview figure to be updated)

The unKR toolkit is an efficient implementation of Uncertain Knowledge Graph Representation Learning (UKRL) based on the PyTorch Lightning framework. It provides a refined, modular pipeline that supports a variety of Uncertain Knowledge Graph Embedding (UKGE) models, covering UKG data preprocessing (a Sampler for negative sampling), a base module for model implementation, and modules for model training, validation, and testing. These modules are shared across different UKGE models, making it easy for users to quickly build their own models.
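To illustrate this decoupled design, the sketch below shows how a UKGE-style confidence regression model can be organized as a PyTorch Lightning module, separating the scoring function from the training, validation, and optimization logic. The class and attribute names are illustrative assumptions and do not correspond to the actual unKR API.

# Minimal sketch of the decoupled PyTorch Lightning pattern that unKR builds on.
# Class and attribute names are illustrative, NOT the actual unKR API.
import torch
import torch.nn as nn
import pytorch_lightning as pl

class ToyUKGEScorer(nn.Module):
    """DistMult-style scorer whose output is squashed to a confidence in [0, 1]."""
    def __init__(self, num_ents, num_rels, emb_dim=128):
        super().__init__()
        self.ent_emb = nn.Embedding(num_ents, emb_dim)
        self.rel_emb = nn.Embedding(num_rels, emb_dim)

    def forward(self, h, r, t):
        score = (self.ent_emb(h) * self.rel_emb(r) * self.ent_emb(t)).sum(dim=-1)
        return torch.sigmoid(score)          # logistic mapping of the plausibility score

class LitUKGE(pl.LightningModule):
    """Training/validation logic lives here, decoupled from the scorer itself."""
    def __init__(self, scorer, lr=1e-3):
        super().__init__()
        self.scorer = scorer
        self.lr = lr
        self.loss_fn = nn.MSELoss()          # regress predicted vs. gold confidence

    def training_step(self, batch, batch_idx):
        h, r, t, conf = batch                # one quadruple per sample
        loss = self.loss_fn(self.scorer(h, r, t), conf)
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        h, r, t, conf = batch
        self.log("val_mse", self.loss_fn(self.scorer(h, r, t), conf))

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)

# trainer = pl.Trainer(max_epochs=100)
# trainer.fit(LitUKGE(ToyUKGEScorer(num_ents, num_rels)), train_loader, val_loader)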

Nine models are currently available, grouped by whether or not they are few-shot models. unKR has been validated on three datasets with seven evaluation metrics; the models are described in detail in the following sections.

The unKR core development team will provide long-term technical support for the toolkit. Developers are welcome to discuss the work and raise questions via GitHub issues.

Detailed technical documentation and results for unKR are available at 📋.


📝 Models

unKR implements nine UKGE methods, grouped according to whether or not they are few-shot models. The available models are listed below.

Category        Model
Normal model    BEUrRE, FocusE, GTransE, PASSLEAF, UKGE, UKGsE, UPGAT
Few-shot model  GMUC, GMUC+

Datasets

unKR provides UKG datasets drawn from three different sources: CN15K, NL27K, and PPI5K. The following table shows the source of each dataset and the number of entities, relations, and quadruples it contains.

Dataset  Source      Entities  Relations  Quadruples
CN15K    ConceptNet  15000     36         241158
NL27K    NELL        27221     404        175412
PPI5K    STRING      4999      7          271666

Across the three datasets, the following three data files are common to all models.

train.tsv: all triples used for training and their corresponding confidence scores, in the format (ent1, rel, ent2, score), one quadruple per line.

val.tsv: all triples used for validation and their corresponding confidence scores, in the format (ent1, rel, ent2, score), one quadruple per line.

test.tsv: all triples used for testing and their corresponding confidence scores, in the format (ent1, rel, ent2, score), one quadruple per line.
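Each of these files is a tab-separated list of quadruples, so it can be read with a few lines of standard Python. The helper below is a minimal sketch; the file path is hypothetical and this is not unKR's own data loader.

# Sketch of reading a (ent1, rel, ent2, score) quadruple file such as train.tsv.
# Illustrative only: the path is hypothetical and this is not unKR's loader.
import csv

def read_quadruples(path):
    quadruples = []
    with open(path, newline="", encoding="utf-8") as f:
        for ent1, rel, ent2, score in csv.reader(f, delimiter="\t"):
            quadruples.append((ent1, rel, ent2, float(score)))
    return quadruples

train = read_quadruples("dataset/NL27K/train.tsv")
print(len(train), train[0])   # number of training quadruples and the first one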

UKGE additionally requires the softlogic.tsv file.

softlogic.tsv: all triples inferred by PSL rules and their inferred confidence scores, in the format (ent1, rel, ent2, score), one quadruple per line.

GMUC and GMUC+ additionally require the following five data files.

train/dev/test_tasks.json: few-shot datasets with one task per relation, in the format {rel: [[ent1, rel, ent2, score], ...]}. The dictionary key is the task name and the values are all the quadruples belonging to that task.

path_graph: all data other than the training, validation, and testing tasks, i.e. the background knowledge, in the format (ent1, rel, ent2, score), one quadruple per line.

ontology.csv: ontology knowledge required by the GMUC+ model, in the format (number, h, rel, t), one piece of ontology knowledge per line. There are four types of rel: is_A, domain, range, and type.

  • c1 is_A c2: c1 is a subclass of c2;
  • c1 domain c2: the domain of c1 is c2;
  • c1 range c2: the range of c1 is c2;
  • c1 type c2: the type of c1 is c2.
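To make the few-shot file layout concrete, the snippet below sketches how the task file and the ontology file might be loaded; the file paths are hypothetical and this is not code shipped with unKR.

# Sketch of loading the GMUC/GMUC+ few-shot files (paths are hypothetical).
import csv
import json

with open("dataset/NL27K/train_tasks.json", encoding="utf-8") as f:
    train_tasks = json.load(f)       # {relation: [[ent1, rel, ent2, score], ...]}

for rel, quadruples in train_tasks.items():
    print(rel, len(quadruples))      # task name and number of quadruples in the task

with open("dataset/NL27K/ontology.csv", newline="", encoding="utf-8") as f:
    for number, h, rel, t in csv.reader(f):
        assert rel in {"is_A", "domain", "range", "type"}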

Reproduced Results

unKR evaluates models on confidence prediction and link prediction tasks with seven metrics: MSE and MAE for confidence prediction, and Hits@k (k=1, 3, 10), MRR, MR, WMRR, and WMR for link prediction, under both raw and filter settings. In addition, unKR applies high-confidence filtering (filter threshold 0.7) during evaluation.
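For reference, the snippet below sketches how the confidence prediction metrics (MSE, MAE) and the ranking metrics (MR, MRR, Hits@k) are conventionally computed; WMR and WMRR are assumed here to be confidence-weighted averages of the rank and reciprocal rank. This is an illustrative computation, not unKR's evaluator.

# Illustrative metric computations (not unKR's evaluation code).
# WMR/WMRR are assumed to weight each test quadruple by its confidence score.
import numpy as np

def confidence_metrics(pred, gold):
    pred, gold = np.asarray(pred), np.asarray(gold)
    return {"MSE": np.mean((pred - gold) ** 2),
            "MAE": np.mean(np.abs(pred - gold))}

def ranking_metrics(ranks, confidences, ks=(1, 3, 10)):
    ranks = np.asarray(ranks, dtype=float)       # rank of the true entity per query
    w = np.asarray(confidences, dtype=float)     # confidence of each test quadruple
    out = {"MR": ranks.mean(),
           "MRR": (1.0 / ranks).mean(),
           "WMR": (w * ranks).sum() / w.sum(),
           "WMRR": (w / ranks).sum() / w.sum()}
    for k in ks:
        out[f"Hits@{k}"] = (ranks <= k).mean()
    return out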

The reproduced results on the NL27K dataset are shown below. More results can be found here.

Confidence prediction

Category Model MSE MAE
Normal model BEUrRE 0.08920 0.22194
PASSLEAF_ComplEx 0.02434 0.05176
PASSLEAF_DistMult 0.02309 0.05107
PASSLEAF_RotatE 0.01949 0.06253
UKGElogi 0.02861 0.05967
UKGElogiPSL 0.02868 0.05966
UKGErect 0.03344 0.07052
UKGErectPSL 0.03326 0.07015
UKGsE 0.12202 0.27065
UPGAT 0.02922 0.10107
Few-shot model GMUC 0.01300 0.08200
GMUC+ 0.01300 0.08600

Link prediction (filter on high-confidence test data)

Category Model Hits@1 Hits@3 Hits@10 MRR MR WMRR WMR
Normal model BEUrRE 0.156 0.385 0.543 0.299 488.051 0.306 471.784
FocusE 0.814 0.918 0.957 0.870 384.471 0.871 379.761
GTransE 0.222 0.366 0.493 0.316 1377.564 0.319 1378.505
PASSLEAF_ComplEx 0.669 0.786 0.876 0.741 138.808 0.753 138.477
PASSLEAF_DistMult 0.627 0.754 0.856 0.707 138.781 0.717 137.864
PASSLEAF_RotatE 0.687 0.816 0.884 0.762 50.776 0.774 50.194
UKGElogi 0.526 0.670 0.805 0.622 153.632 0.630 152.314
UKGElogiPSL 0.525 0.673 0.812 0.623 168.029 0.632 167.344
UKGErect 0.509 0.662 0.807 0.609 126.011 0.614 124.424
UKGErectPSL 0.500 0.647 0.800 0.599 125.233 0.604 124.189
UKGsE 0.038 0.073 0.130 0.069 2329.501 0.069 2288.222
UPGAT 0.618 0.751 0.862 0.701 69.120 0.708 69.364
Few-shot model GMUC 0.335 0.465 0.592 0.425 58.312 0.426 58.097
GMUC+ 0.338 0.486 0.636 0.438 45.774 0.438 45.682

🛠️ Deployment

Installation

Step 1: Create a virtual environment with Anaconda and activate it.

conda create -n unKR python=3.8
conda activate unKR
pip install -r requirements.txt

Step 2: Install the package.

  • Install from source
git clone https://github.com/seucoin/unKR.git
cd unKR
python setup.py install
  • Install from PyPI
pip install unKR

Step 3: Train a model.

python main.py

Parameter Adjustment

The config files provide the parameter profiles used for the reproduced results; the following parameters can be adjusted for specific use cases.

parameters:
  confidence_filter:  # whether to perform high-confidence filtering
    values: [0, 0.7]
  emb_dim:
    values: [128, 256, 512...]
  lr:
    values: [1.0e-03, 3.0e-04, 5.0e-06...]
  num_neg:
    values: [1, 10, 20...]
  train_bs:
    values: [64, 128, 256...]
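As an illustration, the snippet below shows one way such a parameter profile could be loaded and expanded into concrete hyperparameter combinations with PyYAML; the file path is hypothetical, the trailing ellipses in the value lists are assumed to be replaced with concrete values, and this is not an official unKR utility.

# Sketch of reading a parameter profile like the one above and expanding the grids.
# Hypothetical path and structure; not an official unKR utility.
from itertools import product
import yaml

with open("config/nl27k_example.yaml", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)["parameters"]

names = list(cfg)                                    # confidence_filter, emb_dim, lr, ...
for combo in product(*(cfg[name]["values"] for name in names)):
    settings = dict(zip(names, combo))               # one concrete hyperparameter setting
    print(settings)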

✉️ Citation

If you find unKR useful for your research, please consider citing the following paper:

@article{
}

😊 unKR Core Team

Southeast University: Jingting Wang, Tianxing Wu, Shilin Chen, Yunchang Liu, Shutong Zhu, Wei Li, Jingyi Xu, Guilin Qi.

