Python wrapper for knowledge graph abstraction layer

kglab

The kglab library provides a simple abstraction layer in Python for building and using knowledge graphs.

SPECIAL REQUEST: Which features would you most like to see in an open source Python library for building and using knowledge graphs? Please add suggestions to this online survey, which will help us prioritize our roadmap for kglab: https://forms.gle/FMHgtmxHYWocprMn6

Background

For several KG projects, we kept reusing a similar working set of libraries:

Each of these libraries provides a useful piece of the puzzle when you need to leverage knowledge representation, graph algorithms, entity linking, interactive visualization, metadata queries, axioms, etc. However, some of them are relatively low-level (e.g., rdflib) or perhaps not as actively maintained (e.g., skosify), and integrating them poses challenges for which we kept having to reinvent workarounds.

There are general operations that one must perform on knowledge graphs:

  • building triples
  • quality assurance (e.g., axioms)
  • managing a mix of namespaces
  • serialization to/from multiple formats
  • parallel processing across a cluster
  • interactive visualization
  • queries
  • graph algorithms
  • transitivity and other forms of enriching a graph
  • embedding (deep learning integration)
  • inference (e.g., PSL, Bayesian Networks, Causal, MLN, etc.)
  • other ML integrations

The kglab library provides a reasonably "Pythonic" abstraction layer for these operations on KGs. The class definitions can be subclassed and extended to handle specific needs.
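As a rough illustration of a few of the operations listed above (building triples, managing namespaces, serialization, and a simple quality check), here is a minimal sketch in plain Python. It does not use kglab or rdflib, and it is not how kglab is implemented; `TinyGraph`, `CheckedGraph`, and the `ex` namespace are hypothetical names invented for this example. The subclass shows the general idea of extending a base class to handle a specific need.

```python
# Illustrative sketch only: plain Python, not the kglab or rdflib API.
# TinyGraph, CheckedGraph, and the "ex" namespace are hypothetical.
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


class TinyGraph:
    """A toy in-memory triple store with namespace prefixes."""

    def __init__(self) -> None:
        self.namespaces: Dict[str, str] = {}
        self.triples: Set[Triple] = set()

    def bind(self, prefix: str, iri: str) -> None:
        """Register a prefix such as "ex" for a base IRI."""
        self.namespaces[prefix] = iri

    def expand(self, curie: str) -> str:
        """Turn "ex:flour" into a full IRI using the bound prefixes."""
        prefix, _, local = curie.partition(":")
        return self.namespaces[prefix] + local

    def add(self, s: str, p: str, o: str) -> None:
        """Build one triple from three prefixed names."""
        self.triples.add((self.expand(s), self.expand(p), self.expand(o)))

    def to_ntriples(self) -> str:
        """Serialize to N-Triples, one sorted line per triple."""
        return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(self.triples))


class CheckedGraph(TinyGraph):
    """Subclass that adds a simple quality check on predicates."""

    def __init__(self, allowed_predicates: Iterable[str]) -> None:
        super().__init__()
        self.allowed_predicates = set(allowed_predicates)

    def add(self, s: str, p: str, o: str) -> None:
        # Reject any predicate that was not declared up front.
        if p not in self.allowed_predicates:
            raise ValueError(f"unexpected predicate: {p}")
        super().add(s, p, o)


kg = CheckedGraph(allowed_predicates={"ex:uses"})
kg.bind("ex", "http://example.org/")
kg.add("ex:pancake", "ex:uses", "ex:flour")
kg.add("ex:pancake", "ex:uses", "ex:egg")
print(kg.to_ntriples())
```

In real projects, a library such as rdflib handles IRIs, literals, and serialization formats far more thoroughly than this toy class does.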

Meanwhile, we're also extending some of the key components with distributed versions based on ray, for better use of horizontal scale-out and parallelization.

NB: this repo is UNDER CONSTRUCTION and will undergo much iteration prior to the "KG 101" tutorial at https://www.knowledgeconnexions.world/talks/kg-101/

See wiki for further details.

Installation

Dependencies:

To install from PyPI:

pip install kglab

If you work directly from this Git repo, be sure to install the dependencies as well:

pip install -r requirements.txt

If you would like to run a local notebook, install JupyterLab.

If you use conda, you can install it with:

conda install -c conda-forge jupyterlab

If you use pip, you can install it with:

pip install jupyterlab

If installing via pip install --user, you must add the user-level bin directory to your PATH environment variable in order to launch JupyterLab.

If you are using a Unix derivative (FreeBSD, GNU / Linux, OS X), you can achieve this by using the export PATH="$HOME/.local/bin:$PATH" command.
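To check whether the user-level bin directory is already on your PATH, a small Python sketch along these lines may help. It assumes a Unix-style layout in which scripts land in the bin directory under the user base reported by `site.getuserbase()`; the helper name `user_bin_on_path` is made up for this example.

```python
import os
import site


def user_bin_on_path(path_value=None):
    """Return (user_bin, found) for the pip --user scripts directory.

    Assumes a Unix-style layout where scripts go in <user base>/bin.
    """
    user_bin = os.path.join(site.getuserbase(), "bin")
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Normalize each PATH entry so trailing slashes do not matter.
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return user_bin, os.path.normpath(user_bin) in entries


user_bin, found = user_bin_on_path()
print(user_bin, "is on PATH" if found else "is NOT on PATH")
```

If the directory is reported as missing, add it to PATH as described above and open a new shell.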

Once installed, launch JupyterLab with:

jupyter-lab

Tutorial Outline

  1. Building a graph in RDF using rdflib
    • ex01_0.ipynb
      • examine the dataset
    • ex01_1.ipynb
      • construct a graph from RDF triples
      • using multiple namespaces
      • proper handling of literals
      • serialization to strings and files using Turtle and JSON-LD
  2. Leveraging the kglab abstraction layer
    • ex01_2.ipynb
      • construct and serialize the same graph using kglab
  3. Interactive graph visualization with pyvis
  4. Build a medium size KG from a CSV dataset
    • ex01_4.ipynb
      • iterate through a dataset, representing a recipe for each row
      • compare relative file sizes for different serialization formats
  5. Running SPARQL queries
    • ex01_5.ipynb
      • load the medium size KG from the earlier example
      • run a SPARQL query to identify recipes with special ingredients and cooking times
      • use SPARQL queries and post-processing to create annotations
  6. Graph algorithms with networkx
    • ex01_6.ipynb
      • load the medium size KG from the earlier example
      • run graph algorithms in networkx to analyze properties of the KG
  7. Statistical relational learning with pslpython
    • ex01_7.ipynb
      • use RDF to represent the "simple acquaintance" PSL example graph
      • load the graph into a KG
      • visualize the KG
      • run PSL to infer uncertainty in the knows relation for grounded nodes
  8. Vector embedding with gensim
    • ex01_8.ipynb
      • curating annotations
      • analyze ingredient labels from 250K recipes
      • use vector embedding to rank relatedness for labels
      • add string similarity for an approximate pareto archive
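
To give a feel for the CSV step in the outline (iterating through a dataset and representing a recipe for each row, with typed literals), here is a minimal standard-library sketch. It is not the code from the notebooks: the two-row dataset, the column names, and the `http://example.org/` namespace are all assumptions made up for illustration, and the notebooks themselves use rdflib and kglab instead.

```python
import csv
import io

# Hypothetical two-row dataset; the tutorial's real CSV is not shown here.
CSV_TEXT = """id,name,minutes
r1,pancakes,20
r2,omelette,10
"""

BASE = "http://example.org/"  # placeholder namespace
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def rows_to_triples(csv_text):
    """Yield (subject, predicate, object) tuples, two per CSV row."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + "recipe/" + row["id"]
        yield (subject, BASE + "name", row["name"])
        yield (subject, BASE + "minutes", int(row["minutes"]))


def to_ntriples(triples):
    """Serialize triples whose objects are str or int literals."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, int):
            # Integers become typed literals.
            literal = f'"{o}"^^<{XSD_INTEGER}>'
        else:
            # Escape backslashes and quotes in plain string literals.
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            literal = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {literal} .")
    return "\n".join(lines) + "\n"


triples = list(rows_to_triples(CSV_TEXT))
ntriples = to_ntriples(triples)
print(ntriples)
print(len(ntriples.encode("utf-8")), "bytes as N-Triples")
```

Printing the byte count is a small nod to the file-size comparison in the outline; comparing real formats such as Turtle and JSON-LD requires a full RDF library.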

Production Use Cases

  • Derwen and its client projects

Kudos

Many thanks to our contributors: @jake-aft, @dmoore247, plus general support from Derwen, Inc. and The Knowledge Graph Conference.
