A package for training and evaluating multimodal knowledge graph embeddings

PyKEEN

PyKEEN (Python KnowlEdge EmbeddiNgs) is a Python package designed to train and evaluate knowledge graph embedding models (incorporating multi-modal information). It is part of the KEEN Universe.

Installation • Quickstart • Datasets • Models • Support

Installation

The latest stable version of PyKEEN can be downloaded and installed from PyPI on Python 3.7+ with:

$ pip install pykeen

The development version of PyKEEN can be downloaded and installed from GitHub on Python 3.7+ with:

$ git clone https://github.com/pykeen/pykeen.git pykeen
$ cd pykeen
$ pip install -e .
$ # Install pre-commit
$ pip install pre-commit
$ pre-commit install

PyKEEN has several extras for installation that are defined in the [options.extras_require] section of the setup.cfg. They can be included at installation time using bracket notation, as in pip install pykeen[docs] or pip install -e .[docs]. Several extras can be listed, comma-delimited, as in pip install pykeen[docs,plotting].

Name	Description
`plotting`	Plotting with `seaborn` and generation of word clouds
`mlflow`	Tracking of results with `mlflow`
`docs`	Building of the documentation
`templating`	Building of templated documentation, like the README

Contributing

Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.

Quickstart

This example shows how to train a model on a dataset and evaluate it on that dataset's held-out test triples.

The fastest way to get up and running is to use the pipeline function. It provides a high-level entry into the extensible functionality of this package. The following example shows how to train and evaluate the TransE model on the Nations dataset. By default, the training loop uses the stochastic local closed world assumption (sLCWA) training approach and evaluates with rank-based evaluation.

from pykeen.pipeline import pipeline
result = pipeline(
    model='TransE',
    dataset='nations',
)

The results are returned in a dataclass that has attributes for the trained model, the training loop, and the evaluation.

PyKEEN is extensible such that:

Each model has the same API, so anything from pykeen.models can be dropped in
Each training loop has the same API, so pykeen.training.LCWATrainingLoop can be dropped in
Triples factories can be generated by the user with pykeen.triples.TriplesFactory
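
As a rough illustration of what preparing custom triples involves, the sketch below maps labeled triples to integer IDs in plain Python. It is not PyKEEN's implementation: `build_mappings` and `map_triples` are hypothetical helper names, the example labels are made up, and assigning IDs in sorted label order is an assumption chosen only to make the result reproducible.

```python
from typing import Dict, List, Tuple

LabeledTriple = Tuple[str, str, str]
IdTriple = Tuple[int, int, int]


def build_mappings(triples: List[LabeledTriple]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Assign consecutive integer IDs to entity and relation labels (sorted order)."""
    entities = sorted({label for h, _, t in triples for label in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    entity_to_id = {label: i for i, label in enumerate(entities)}
    relation_to_id = {label: i for i, label in enumerate(relations)}
    return entity_to_id, relation_to_id


def map_triples(
    triples: List[LabeledTriple],
    entity_to_id: Dict[str, int],
    relation_to_id: Dict[str, int],
) -> List[IdTriple]:
    """Translate (head, relation, tail) labels into integer ID triples."""
    return [(entity_to_id[h], relation_to_id[r], entity_to_id[t]) for h, r, t in triples]


# Toy, made-up triples used only for illustration.
labeled = [("brazil", "exports_to", "usa"), ("usa", "exports_to", "uk")]
entity_to_id, relation_to_id = build_mappings(labeled)
mapped = map_triples(labeled, entity_to_id, relation_to_id)
```

Integer ID triples of this kind are what embedding models index into; the real class also handles file loading and splitting, which this sketch leaves out.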

Implementation

Below are the models, data sets, training modes, evaluators, and metrics implemented in pykeen.

Datasets (13)

Name	Reference	Description
fb15k	`pykeen.datasets.FB15k`	The FB15k data set.
fb15k237	`pykeen.datasets.FB15k237`	The FB15k-237 data set.
hetionet	`pykeen.datasets.Hetionet`	The Hetionet dataset is a large biological network.
kinships	`pykeen.datasets.Kinships`	The Kinships data set.
nations	`pykeen.datasets.Nations`	The Nations data set.
openbiolink	`pykeen.datasets.OpenBioLink`	The OpenBioLink dataset.
openbiolinkf1	`pykeen.datasets.OpenBioLinkF1`	The PyKEEN First Filtered OpenBioLink 2020 Dataset.
openbiolinkf2	`pykeen.datasets.OpenBioLinkF2`	The PyKEEN Second Filtered OpenBioLink 2020 Dataset.
openbiolinklq	`pykeen.datasets.OpenBioLinkLQ`	The low-quality variant of the OpenBioLink dataset.
umls	`pykeen.datasets.UMLS`	The UMLS data set.
wn18	`pykeen.datasets.WN18`	The WN18 data set.
wn18rr	`pykeen.datasets.WN18RR`	The WN18-RR data set.
yago310	`pykeen.datasets.YAGO310`	The YAGO3-10 data set is a subset of YAGO3 that only contains entities with at least 10 relations.

Models (23)

Name	Reference	Citation
ComplEx	`pykeen.models.ComplEx`	Trouillon et al., 2016
ComplExLiteral	`pykeen.models.ComplExLiteral`	Kristiadi et al., 2018
ConvE	`pykeen.models.ConvE`	Dettmers et al., 2018
ConvKB	`pykeen.models.ConvKB`	Nguyen et al., 2018
DistMult	`pykeen.models.DistMult`	Yang et al., 2014
DistMultLiteral	`pykeen.models.DistMultLiteral`	Kristiadi et al., 2018
ERMLP	`pykeen.models.ERMLP`	Dong et al., 2014
ERMLPE	`pykeen.models.ERMLPE`	Sharifzadeh et al., 2019
HolE	`pykeen.models.HolE`	Nickel et al., 2016
KG2E	`pykeen.models.KG2E`	He et al., 2015
NTN	`pykeen.models.NTN`	Socher et al., 2013
ProjE	`pykeen.models.ProjE`	Shi et al., 2017
RESCAL	`pykeen.models.RESCAL`	Nickel et al., 2011
RGCN	`pykeen.models.RGCN`	Schlichtkrull et al., 2018
RotatE	`pykeen.models.RotatE`	Sun et al., 2019
SimplE	`pykeen.models.SimplE`	Kazemi et al., 2018
StructuredEmbedding	`pykeen.models.StructuredEmbedding`	Bordes et al., 2011
TransD	`pykeen.models.TransD`	Ji et al., 2015
TransE	`pykeen.models.TransE`	Bordes et al., 2013
TransH	`pykeen.models.TransH`	Wang et al., 2014
TransR	`pykeen.models.TransR`	Lin et al., 2015
TuckER	`pykeen.models.TuckER`	Balazevic et al., 2019
UnstructuredModel	`pykeen.models.UnstructuredModel`	Bordes et al., 2014

Losses (7)

Name	Reference	Description
bce	`pykeen.losses.BCELoss`	A wrapper around the PyTorch binary cross entropy loss.
bceaftersigmoid	`pykeen.losses.BCEAfterSigmoidLoss`	A loss function which uses the numerically unstable version of explicit Sigmoid + BCE.
crossentropy	`pykeen.losses.CrossEntropyLoss`	Evaluate cross entropy after softmax output.
marginranking	`pykeen.losses.MarginRankingLoss`	A wrapper around the PyTorch margin ranking loss.
mse	`pykeen.losses.MSELoss`	A wrapper around the PyTorch mean square error loss.
nssa	`pykeen.losses.NSSALoss`	An implementation of the self-adversarial negative sampling loss function proposed by Sun et al., 2019.
softplus	`pykeen.losses.SoftplusLoss`	A loss function for the softplus.
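
The following sketch illustrates two ideas from the table in plain Python: the margin ranking loss for a single positive/negative score pair, and why an explicit sigmoid followed by BCE is numerically unstable compared with a formulation that works on raw scores (logits). The function names are hypothetical; the formulas follow the standard definitions and are not copied from `pykeen.losses`.

```python
import math


def margin_ranking_loss(pos_score: float, neg_score: float, margin: float = 1.0) -> float:
    """Hinge on the score gap: zero once the positive beats the negative by `margin`."""
    return max(0.0, margin + neg_score - pos_score)


def bce_after_sigmoid(logit: float, label: float) -> float:
    """Explicit sigmoid, then BCE. Breaks down when the probability saturates at 0 or 1."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def bce_with_logits(logit: float, label: float) -> float:
    """Algebraically equivalent form that stays finite for large |logit|."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))
```

For moderate scores both BCE variants agree. For a very large logit with label 0, the sigmoid rounds to exactly 1.0 and the explicit version ends up taking the logarithm of zero, while the logit-based version simply returns the logit.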

Regularizers (5)

Name	Reference	Description
combined	`pykeen.regularizers.CombinedRegularizer`	A convex combination of regularizers.
lp	`pykeen.regularizers.LpRegularizer`	A simple L_p norm based regularizer.
no	`pykeen.regularizers.NoRegularizer`	A regularizer which does not perform any regularization.
powersum	`pykeen.regularizers.PowerSumRegularizer`	A simple x^p based regularizer.
transh	`pykeen.regularizers.TransHRegularizer`	A regularizer for the soft constraints in TransH.
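
As a minimal sketch of the difference between the `lp` and `powersum` entries, the code below computes an L_p norm and a plain sum of p-th powers for one embedding vector and adds a weighted penalty to a loss value. The function names and the `weight` parameter are illustrative assumptions; PyKEEN's regularizers work on tensors and may apply additional normalization that is not shown here.

```python
from typing import Sequence


def lp_norm(x: Sequence[float], p: float = 2.0) -> float:
    """L_p norm: the p-th root of the sum of absolute values raised to p."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def power_sum(x: Sequence[float], p: float = 2.0) -> float:
    """Sum of absolute values raised to p, without taking the p-th root."""
    return sum(abs(v) ** p for v in x)


def regularized_loss(loss: float, embedding: Sequence[float], weight: float = 0.01, p: float = 2.0) -> float:
    """Add a weighted L_p penalty on one embedding to a base loss value."""
    return loss + weight * lp_norm(embedding, p)
```

Skipping the root makes the power-sum penalty grow faster for large embeddings, which is one reason the two are offered as separate choices.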

Optimizers (6)

Name	Reference	Description
adadelta	`torch.optim.Adadelta`	Implements Adadelta algorithm.
adagrad	`torch.optim.Adagrad`	Implements Adagrad algorithm.
adam	`torch.optim.Adam`	Implements Adam algorithm.
adamax	`torch.optim.Adamax`	Implements Adamax algorithm (a variant of Adam based on infinity norm).
adamw	`torch.optim.AdamW`	Implements AdamW algorithm.
sgd	`torch.optim.SGD`	Implements stochastic gradient descent (optionally with momentum).

Training Loops (2)

Name	Reference	Description
lcwa	`pykeen.training.LCWATrainingLoop`	A training loop that uses the local closed world assumption training approach.
slcwa	`pykeen.training.SLCWATrainingLoop`	A training loop that uses the stochastic local closed world assumption training approach.
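
To make the distinction concrete: under the local closed world assumption, every tail that was not observed with a given (head, relation) pair is treated as a negative, whereas the stochastic variant only samples some of those negatives. The sketch below builds the dense LCWA targets in plain Python. `lcwa_targets` is a hypothetical helper, not the data structure used by `pykeen.training.LCWATrainingLoop`.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[int, int, int]


def lcwa_targets(triples: List[Triple], num_entities: int) -> Dict[Tuple[int, int], List[float]]:
    """For each (head, relation) pair, label known tails 1.0 and every other entity 0.0."""
    known_tails: Dict[Tuple[int, int], Set[int]] = defaultdict(set)
    for h, r, t in triples:
        known_tails[(h, r)].add(t)
    return {
        pair: [1.0 if entity in tails else 0.0 for entity in range(num_entities)]
        for pair, tails in known_tails.items()
    }


targets = lcwa_targets([(0, 0, 1), (0, 0, 2), (1, 0, 2)], num_entities=3)
```

Each target vector has one entry per entity, so the cost of an LCWA step scales with the number of entities, while sLCWA scales with the number of sampled negatives.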

Negative Samplers (2)

Name	Reference	Description
basic	`pykeen.sampling.BasicNegativeSampler`	A basic negative sampler.
bernoulli	`pykeen.sampling.BernoulliNegativeSampler`	An implementation of the Bernoulli negative sampling approach proposed by Wang et al., 2014.
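
The two samplers differ in how they decide which side of a triple to corrupt. A basic sampler picks head or tail with equal probability; the Bernoulli approach of Wang et al., 2014 corrupts the head with probability tph / (tph + hpt), where tph is the average number of tails per head and hpt the average number of heads per tail for that relation. The sketch below is a plain-Python approximation with hypothetical function names. It does not filter out corrupted triples that happen to be true, and it is not PyKEEN's implementation.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[int, int, int]


def head_corruption_probs(triples: List[Triple]) -> Dict[int, float]:
    """Per relation, the probability tph / (tph + hpt) of corrupting the head."""
    heads, tails, counts = defaultdict(set), defaultdict(set), defaultdict(int)
    for h, r, t in set(triples):  # ignore duplicate triples
        heads[r].add(h)
        tails[r].add(t)
        counts[r] += 1
    probs = {}
    for r, n in counts.items():
        tph = n / len(heads[r])  # average number of tails per distinct head
        hpt = n / len(tails[r])  # average number of heads per distinct tail
        probs[r] = tph / (tph + hpt)
    return probs


def corrupt(triple: Triple, num_entities: int, rng: random.Random, p_head: float = 0.5) -> Triple:
    """Replace the head with probability p_head, otherwise the tail, by a different entity."""
    h, r, t = triple
    if rng.random() < p_head:
        h = rng.choice([e for e in range(num_entities) if e != h])
    else:
        t = rng.choice([e for e in range(num_entities) if e != t])
    return h, r, t


probs = head_corruption_probs([(0, 0, 1), (0, 0, 2), (0, 0, 3)])
negative = corrupt((0, 0, 1), num_entities=4, rng=random.Random(0), p_head=probs[0])
```

Leaving `p_head` at 0.5 gives the basic behaviour. For the one-to-many relation in the example, the Bernoulli probability favours corrupting the head, which lowers the chance of generating a false negative.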

Stoppers (2)

Name	Reference	Description
early	`pykeen.stoppers.EarlyStopper`	A harness for early stopping.
nop	`pykeen.stoppers.NopStopper`	A stopper that does nothing.
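
As a rough sketch of the idea behind early stopping, the class below tracks the best value of a validation metric and signals a stop once the metric has failed to improve for a given number of consecutive checks. `PatienceStopper` and its parameters `patience`, `delta`, and `larger_is_better` are names chosen for this illustration; the constructor of `pykeen.stoppers.EarlyStopper` may differ.

```python
from typing import Optional


class PatienceStopper:
    """Signal a stop after `patience` consecutive checks without sufficient improvement."""

    def __init__(self, patience: int = 2, delta: float = 0.0, larger_is_better: bool = True):
        self.patience = patience
        self.delta = delta
        self.larger_is_better = larger_is_better
        self.best: Optional[float] = None
        self.bad_checks = 0

    def should_stop(self, value: float) -> bool:
        """Record one evaluation result and report whether training should stop."""
        if self.best is None:
            improved = True
        elif self.larger_is_better:
            improved = value > self.best + self.delta
        else:
            improved = value < self.best - self.delta
        if improved:
            self.best = value
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience


stopper = PatienceStopper(patience=2)
decisions = [stopper.should_stop(v) for v in [0.10, 0.20, 0.20, 0.19]]
```

In this run the stop signal fires on the fourth check, after two evaluations in a row that did not beat the best value of 0.20.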

Evaluators (2)

Name	Reference	Description
rankbased	`pykeen.evaluation.RankBasedEvaluator`	A rank-based evaluator for KGE models.
sklearn	`pykeen.evaluation.SklearnEvaluator`	An evaluator that uses a Scikit-learn metric.
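
Rank-based evaluation boils down to ranking the true entity among all candidate entities by score. The plain-Python sketch below assumes that higher scores mean more plausible triples and reports an optimistic ("best"), pessimistic ("worst"), and averaged rank to account for ties. The helper name and dictionary keys are illustrative, and the filtering of other known true triples is omitted.

```python
from typing import Dict, Sequence


def compute_ranks(scores: Sequence[float], true_index: int) -> Dict[str, float]:
    """Rank of the true candidate among all scores, with explicit tie handling."""
    true_score = scores[true_index]
    best = 1 + sum(s > true_score for s in scores)   # ties resolved in our favour
    worst = sum(s >= true_score for s in scores)     # ties resolved against us
    return {"best": best, "worst": worst, "avg": (best + worst) / 2}


ranks = compute_ranks([0.9, 0.5, 0.5, 0.1], true_index=1)
```

Here the true candidate ties with one other entity, so its rank is 2 in the optimistic case and 3 in the pessimistic case.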

Metrics (6)

Metric	Description	Evaluator	Reference
Adjusted Mean Rank	The mean over all chance-adjusted ranks: mean_i (2r_i / (num_entities+1)). Lower is better.	rankbased	`pykeen.evaluation.RankBasedMetricResults`
Average Precision Score	The area under the precision-recall curve, between [0.0, 1.0]. Higher is better.	sklearn	`pykeen.evaluation.SklearnMetricResults`
Hits At K	The hits at k for different values of k, i.e. the relative frequency of ranks not larger than k. Higher is better.	rankbased	`pykeen.evaluation.RankBasedMetricResults`
Mean Rank	The mean over all ranks: mean_i r_i. Lower is better.	rankbased	`pykeen.evaluation.RankBasedMetricResults`
Mean Reciprocal Rank	The mean over all reciprocal ranks: mean_i (1/r_i). Higher is better.	rankbased	`pykeen.evaluation.RankBasedMetricResults`
ROC AUC Score	The area under the ROC curve, between [0.0, 1.0]. Higher is better.	sklearn	`pykeen.evaluation.SklearnMetricResults`
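
The rank-based metrics in the table can be computed directly from a list of ranks. The sketch below follows the formulas given in the descriptions above, using hypothetical function names and a single `num_entities` value for all ranks as a simplifying assumption. It is not the code behind `pykeen.evaluation.RankBasedMetricResults`.

```python
from typing import Sequence


def mean_rank(ranks: Sequence[float]) -> float:
    """Mean over all ranks. Lower is better."""
    return sum(ranks) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[float]) -> float:
    """Mean over all reciprocal ranks 1 / r_i. Higher is better."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[float], k: int) -> float:
    """Relative frequency of ranks not larger than k. Higher is better."""
    return sum(r <= k for r in ranks) / len(ranks)


def adjusted_mean_rank(ranks: Sequence[float], num_entities: int) -> float:
    """Mean of 2 * r_i / (num_entities + 1). Lower is better."""
    return sum(2.0 * r / (num_entities + 1) for r in ranks) / len(ranks)


ranks = [1, 2, 4]
summary = {
    "mean_rank": mean_rank(ranks),
    "mean_reciprocal_rank": mean_reciprocal_rank(ranks),
    "hits_at_3": hits_at_k(ranks, 3),
    "adjusted_mean_rank": adjusted_mean_rank(ranks, num_entities=9),
}
```

Because a uniformly random ranking has an expected rank of (num_entities + 1) / 2, an adjusted mean rank near 1 indicates chance-level performance and values well below 1 indicate a useful model.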

Hyper-parameter Optimization

Samplers (2)

Name	Reference	Description
random	`optuna.samplers.RandomSampler`	Sampler using random sampling.
tpe	`optuna.samplers.TPESampler`	Sampler using TPE (Tree-structured Parzen Estimator) algorithm.

Experimentation

Reproduction

PyKEEN includes a set of curated experimental settings for reproducing past landmark experiments. They can be accessed and run like:

pykeen experiments reproduce tucker balazevic2019 fb15k

The three arguments are the model name, the reference, and the dataset. The output directory can optionally be set with -d.

Ablation

PyKEEN includes the ability to specify ablation studies using the hyper-parameter optimization module. They can be run like:

pykeen experiments ablation ~/path/to/config.json

Acknowledgements

Supporters

This project has been supported by several organizations (in alphabetical order):

Logo

The PyKEEN logo was designed by Carina Steinborn.

Download files

Source Distribution

pykeen-1.0.2.tar.gz (692.3 kB)

Uploaded Jul 10, 2020 Source

Built Distribution

pykeen-1.0.2-py3-none-any.whl (306.0 kB)

Uploaded Jul 10, 2020 Python 3

File details

Details for the file pykeen-1.0.2.tar.gz.

File metadata

Download URL: pykeen-1.0.2.tar.gz
Upload date: Jul 10, 2020
Size: 692.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for pykeen-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`b6ecf983a3598d8628d969cc47f104c7030bb593a646683d2230d2d8c8f67d87`
MD5	`8cc82d66b077b6702018629031d9fbb0`
BLAKE2b-256	`c0ceb8d5a104167e67d49daf556a153bb582924bd564135de309d4ae919a953f`

File details

Details for the file pykeen-1.0.2-py3-none-any.whl.

File metadata

Download URL: pykeen-1.0.2-py3-none-any.whl
Upload date: Jul 10, 2020
Size: 306.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for pykeen-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`129542b9fa90db747f7176bb2f4a21899429dd1ec19c2343d19a68230512328e`
MD5	`111d8b5bf279447c32f03a2005c9f396`
BLAKE2b-256	`6548390602d657e92ec2b018d8000718b68bf766606ef8e0d992ad8c6cc4fe3f`
