
Library to access Wacom's Personal Knowledge graph.


Wacom Personal Knowledge Library

The library and the cloud services are still under development. The required access tokens are only available for selected partner companies.

:danger: It is still under development, so we do not recommend using it in production environments yet. Moreover, it does not yet follow any formal QA and release process.

Introduction

In knowledge management there is a distinction between data, information and knowledge. In the domain of digital ink this means:

  • Data - The equivalent would be the ink strokes
  • Information - After using handwriting-, shape-, math-, or other recognition processes, ink strokes are converted into machine-readable content, such as text, shapes, math representations, or other digital content
  • Knowledge / Semantics - Beyond recognition, content needs to be semantically analysed to become semantically understood based on shared common knowledge.

The following illustration shows the different layers of knowledge: Levels of ink knowledge layers

For handling semantics, Wacom introduced the Wacom Personal Knowledge (WPK) cloud service to manage personal ontologies and their associated personal knowledge graph.

This library provides simplified access to Wacom's personal knowledge cloud service. It contains:

  • Basic data structures for ontology objects and entities from the knowledge graph
  • Clients for the REST APIs
  • Connector for Wikidata public knowledge graph

Ontology service:

  • List all Ontology structures
  • Modify Ontology structures
  • Delete Ontology structures

Entity service:

  • List all entities
  • Add entities to knowledge graph
  • Access object properties

Technology stack

Domain Knowledge

The task of the ontology within Wacom's personal knowledge system is to formalise the domain the technology is used in, such as the education, smart home, or creative domain. The domain model is the foundation for the entities collected within the knowledge graph, describing real-world concepts in a formal language understood by artificial intelligence systems:

  • Foundation for structured data, knowledge representation as concepts and relations among concepts
  • Being explicit definitions of shared vocabularies for interoperability
  • Being actionable fragments of explicit knowledge that engines can use for inferencing (Reasoning)
  • Can be used for problem solving

An ontology defines (specifies) the concepts, relationships, and other distinctions that are relevant for modelling a domain.
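As a rough illustration of how explicit ontology knowledge can drive inferencing, the sketch below derives inverse relations from asserted triples. The property names `wacom:core#created` and `wacom:core#isCreatedBy` appear in the samples of this document; the `urn:example:` URIs, the `INVERSE_PROPERTIES` table, and the function name are assumptions made for illustration and say nothing about how the WPK backend implements reasoning.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject URI, property, object URI)

# Hypothetical ontology fragment: each property mapped to its inverse property.
INVERSE_PROPERTIES: Dict[str, str] = {
    'wacom:core#created': 'wacom:core#isCreatedBy',
    'wacom:core#isCreatedBy': 'wacom:core#created',
}


def infer_inverse_relations(triples: List[Triple]) -> Set[Triple]:
    """Return the asserted triples plus every triple implied by an inverse property."""
    inferred: Set[Triple] = set(triples)
    for subject, prop, obj in triples:
        inverse = INVERSE_PROPERTIES.get(prop)
        if inverse is not None:
            # Swap subject and object for the inverse direction.
            inferred.add((obj, inverse, subject))
    return inferred


asserted: List[Triple] = [
    ('urn:example:leonardo', 'wacom:core#created', 'urn:example:mona-lisa'),
]
for triple in sorted(infer_inverse_relations(asserted)):
    print(triple)
```

Only the `created` relation is asserted here; the `isCreatedBy` triple is derived from the inverse table.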

Knowledge Graph

  • Knowledge graph is generated from unstructured and structured knowledge sources
  • Contains all structured knowledge gathered from all sources
  • Foundation for all semantic algorithms

Semantic Technology

  • Extract knowledge from various sources (Connectors)
  • Link words in a given text to knowledge entities from the graph (Ontology-based Named Entity Linking)
  • Enable smart search functionality that understands the context and finds related documents (Semantic Search)

Functionality

Access API

The personal knowledge graph backend is implemented as a multi-tenancy system. Thus, several tenants can be logically separated from each other, and different organisations can build their own knowledge graph.

Tenant concept

In general, a tenant's users, groups, and entities are logically separated from those of other tenants. Physically, the entities are stored in the same instance of the Wacom Personal Knowledge (WPK) backend database system.
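To make the difference between logical and physical separation concrete, here is a minimal sketch of a single store whose records are keyed by tenant. The class and method names are hypothetical and do not reflect the actual WPK backend; the sketch only shows the general idea.

```python
from typing import Dict, List, Tuple


class SharedEntityStore:
    """One physical store; every record is keyed by its tenant (hypothetical)."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], dict] = {}

    def put(self, tenant_id: str, uri: str, entity: dict) -> None:
        # The same URI may exist for several tenants without colliding.
        self._records[(tenant_id, uri)] = entity

    def list_entities(self, tenant_id: str) -> List[dict]:
        # A tenant only ever sees records stored under its own id.
        return [e for (t, _), e in self._records.items() if t == tenant_id]


store = SharedEntityStore()
store.put('tenant-a', 'urn:example:1', {'label': 'Entity of tenant A'})
store.put('tenant-b', 'urn:example:1', {'label': 'Entity of tenant B'})
print(store.list_entities('tenant-a'))
```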

User management is rather limited: each organisation must provide its own authentication service and user management. The backend only holds a reference to the user (“shadow user”) by an external user id.

The management of tenants is limited to the system owner, Wacom, as it requires a tenant management API key. Users for each tenant, by contrast, can be created by the owner of the Tenant API Key. You will receive this key from the system owner after the creation of the tenant.

:warning: Store the Tenant API Key in a secure key store, as attackers can use the key to harm your system.

The Tenant API Key should only be used by your authentication service to create shadow users and to log your users in to the WPK backend. After a successful user login, you will receive a token which can be used by the user to create, update, or delete entities and relations.

The following illustration summarizes the flows for the creation of tenants and users:

Tenant and user creation

The organisation itself needs to implement its own authentication service, which:

  • handles the users and their passwords,
  • controls the personal data of the users,
  • connects the users with the WPK backend and shares the user token with them.

The WPK backend only manages the access levels of the entities and the group management for users. The illustration shows how the access token is received from the WPK endpoint:

Access token request.
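The token flow described above could be wrapped in an authentication service roughly as follows. This is a sketch under assumptions: the environment variable name `WPK_TENANT_API_KEY`, the `TokenClient` protocol, and the function name are hypothetical. The only call taken from this document is `request_user_token(tenant_key, external_user_id)`, as used in the samples.

```python
import os
from typing import Protocol


class TokenClient(Protocol):
    """Anything exposing request_user_token, e.g. the service client used in the samples."""

    def request_user_token(self, tenant_key: str, external_id: str) -> str: ...


def issue_user_token(client: TokenClient, external_user_id: str) -> str:
    """Exchange the Tenant API Key for a user token; only the user token leaves the service."""
    # Hypothetical variable name; in practice the key should come from a secure key store.
    tenant_key = os.environ.get('WPK_TENANT_API_KEY')
    if not tenant_key:
        raise RuntimeError('Tenant API Key is not configured')
    return client.request_user_token(tenant_key, external_user_id)
```

The design point is that the Tenant API Key never reaches the end user: the service hands out only the user token returned by the backend.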

Entity API

The entities used within the knowledge graph and the relationships among them are defined within an ontology that is managed with the Wacom Ontology Management System (WOMS).

An entity within the personal knowledge graph consists of these major parts:

  • Icon - a visual representation of the entity, for instance a portrait of a person.
  • URI - a unique resource identifier of an entity in the graph.
  • Type - the type links to the defined concept class in the ontology.
  • Labels - labels are the word(s) used in a language for the concept.
  • Description - a short abstract that describes the entity.
  • Literals - literals are properties of an entity, such as first name of a person. The ontology defines all literals of the concept class as well as its data type.
  • Relations - the relationship among different entities is described using relations.

The following illustration provides an example for an entity:

Entity description
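The parts listed above can be pictured as a simple record. The following sketch uses plain dataclasses; `EntitySketch` and `LocalizedText` are hypothetical names chosen for illustration and are not the library's own classes (the samples use `ThingObject`, `Label`, and `Description`). The literal name `firstName` and the `urn:example:` URIs are likewise assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LocalizedText:
    """A piece of text together with the language it is written in."""
    content: str
    language_code: str


@dataclass
class EntitySketch:
    """Illustrative container mirroring the major parts of an entity."""
    uri: str                                   # unique identifier in the graph
    concept_type: str                          # concept class defined in the ontology
    labels: List[LocalizedText] = field(default_factory=list)
    descriptions: List[LocalizedText] = field(default_factory=list)
    literals: Dict[str, str] = field(default_factory=dict)
    relations: Dict[str, List[str]] = field(default_factory=dict)
    icon_url: Optional[str] = None             # visual representation, if any


leonardo = EntitySketch(
    uri='urn:example:leonardo',
    concept_type='wacom:core#Person',
    labels=[LocalizedText('Leonardo da Vinci', 'en')],
    literals={'firstName': 'Leonardo'},
    relations={'wacom:core#created': ['urn:example:mona-lisa']},
)
print(leonardo.labels[0].content)
```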

Entity content

Entities are generally language-independent: across nationalities and cultures, only the scripts and words used for a shared instance of a concept differ.

Let's take Leonardo da Vinci as an example. The ontology defines the concept of a Person, a human being. Now, in English its label would be Leonardo da Vinci, while in Japanese レオナルド・ダ・ヴィンチ. Moreover, he is also known as Leonardo di ser Piero da Vinci or ダ・ビンチ.

Labels

Now, in the given example all words that are assigned to the concept are labels. The label Leonardo da Vinci is stored in the backend with an additional language code, e.g. en.

There is always a main label, which refers to the most common or official name of the entity. Another example would be Wacom, where Wacom Co., Ltd. is the official name while Wacom is commonly used and is considered an alias.

:info: Language codes follow ISO 639-1:2002, Codes for the representation of names of languages, Part 1: Alpha-2 code. Read more, here
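The distinction between a main label and its aliases can be sketched as a lookup by language code. `LabelSketch` and both helper functions are hypothetical; how the backend and the library actually mark main labels is not specified in this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LabelSketch:
    content: str
    language_code: str
    main: bool = False  # True for the main label, False for an alias


def main_label(labels: List[LabelSketch], language_code: str) -> Optional[str]:
    """Return the main label for a language, or None if there is none."""
    for label in labels:
        if label.main and label.language_code == language_code:
            return label.content
    return None


def aliases(labels: List[LabelSketch], language_code: str) -> List[str]:
    """Return all alias labels for a language."""
    return [l.content for l in labels if not l.main and l.language_code == language_code]


wacom_labels = [
    LabelSketch('Wacom Co., Ltd.', 'en', main=True),
    LabelSketch('Wacom', 'en'),
]
print(main_label(wacom_labels, 'en'), aliases(wacom_labels, 'en'))
```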

Samples

Entity handling

This sample shows how to work with the graph service.

import urllib3
from typing import Optional, List, Dict

from knowledge.base.entity import LanguageCode, Description, Label
from knowledge.base.ontology import OntologyClassReference, OntologyPropertyReference, ThingObject, ObjectProperty
from knowledge.services.graph import WacomKnowledgeService
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# ------------------------------- User credential ----------------------------------------------------------------------
TENANT_KEY: str = '<TENANT_ID>'
EXTERNAL_USER_ID: str = '<EXTERNAL_USER_ID>'
# ------------------------------- Knowledge entities -------------------------------------------------------------------
LEONARDO_DA_VINCI: str = 'Leonardo da Vinci'
SELF_PORTRAIT_STYLE: str = 'self-portrait'
# ------------------------------- Ontology class names -----------------------------------------------------------------
THING_OBJECT: OntologyClassReference = OntologyClassReference('wacom', 'core', 'Thing')
"""
The ontology contains a Thing class, which is the root class in the hierarchy.
"""
ARTWORK_CLASS: OntologyClassReference = OntologyClassReference('wacom', 'creative', 'VisualArtwork')
PERSON_CLASS: OntologyClassReference = OntologyClassReference('wacom', 'core', 'Person')
ART_STYLE_CLASS: OntologyClassReference = OntologyClassReference.parse('wacom:creative#ArtStyle')
IS_CREATOR: OntologyPropertyReference = OntologyPropertyReference('wacom', 'core', 'created')
HAS_TOPIC: OntologyPropertyReference = OntologyPropertyReference.parse('wacom:core#hasTopic')
CREATED: OntologyPropertyReference = OntologyPropertyReference.parse('wacom:core#created')
HAS_ART_STYLE: OntologyPropertyReference = OntologyPropertyReference.parse('wacom:creative#hasArtstyle')

if __name__ == '__main__':
    # Wacom personal knowledge REST API Client
    knowledge_client: WacomKnowledgeService = WacomKnowledgeService(
        application_name="Wacom Knowledge Listing",
        service_url='https://semantic-ink-private.wacom.com')
    knowledge_client.verify_calls = False  # TODO: Remove if it is officially deployed
    # Use special tenant for testing:  Unit-test tenant
    user_token: str = knowledge_client.request_user_token(TENANT_KEY, EXTERNAL_USER_ID)
    page_id: Optional[str] = None
    page_number: int = 1
    entity_count: int = 0
    print('-----------------------------------------------------------------------------------------------------------')
    print(' First step: Find Leonardo da Vinci in the knowledge graph.')
    print('-----------------------------------------------------------------------------------------------------------')
    res_entities, next_search_page = knowledge_client.search_labels(auth_key=user_token, search_term=LEONARDO_DA_VINCI,
                                                                    language_code=LanguageCode('en_US'), limit=1000)
    leo: Optional[ThingObject] = None
    s_idx: int = 1
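    for entity in res_entities:
        #  Entity must be a person and one of its labels must match the full string
        if entity.concept_type == PERSON_CLASS and LEONARDO_DA_VINCI in [l.content for l in entity.label]:
            leo = entity
            break
    if leo is None:
        # Without this guard, the later accesses to leo.uri would fail with an AttributeError
        raise SystemExit(f'{LEONARDO_DA_VINCI} was not found in the knowledge graph.')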

    print('-----------------------------------------------------------------------------------------------------------')
    print(' What artwork exists in the knowledge graph.')
    print('-----------------------------------------------------------------------------------------------------------')
    relations_dict: Dict[OntologyPropertyReference, ObjectProperty] = knowledge_client.relations(auth_key=user_token,
                                                                                                 uri=leo.uri)
    print(f' Artwork of {leo.label}')
    print('-----------------------------------------------------------------------------------------------------------')
    idx: int = 1
    if CREATED in relations_dict:
        for e in relations_dict[CREATED].outgoing_relations:
            print(f' [{idx}] {e.uri}: {e.label}')
            idx += 1
    print('-----------------------------------------------------------------------------------------------------------')
    print(' Let us create a new piece of artwork.')
    print('-----------------------------------------------------------------------------------------------------------')

    # Main labels for entity
    artwork_labels: List[Label] = [
        Label('Ginevra Gherardini', LanguageCode('en_US')),
        Label('Ginevra Gherardini', LanguageCode('de_DE'))
    ]
    # Alias labels for entity
    artwork_alias: List[Label] = [
        Label("Ginevra", LanguageCode('en_US')),
        Label("Ginevra", LanguageCode('de_DE'))
    ]
    # Topic description
    artwork_description: List[Description] = [
        Description('Oil painting of Mona Lisa\'s sister', LanguageCode('en_US')),
        Description('Ölgemälde von Mona Lisas Schwester', LanguageCode('de_DE'))
    ]
    # Topic
    artwork_object: ThingObject = ThingObject(label=artwork_labels, concept_type=ARTWORK_CLASS,
                                              description=artwork_description)
    artwork_object.alias = artwork_alias
    print(f' Create: {artwork_object}')
    # Create artwork
    artwork_entity_uri: str = knowledge_client.create_entity(user_token, artwork_object)
    print(f' Entity URI: {artwork_entity_uri}')

    # Create relation between Leonardo da Vinci and artwork
    knowledge_client.create_relation(auth_key=user_token, source=leo.uri, relation=IS_CREATOR,
                                     target=artwork_entity_uri)

    relations_dict = knowledge_client.relations(auth_key=user_token, uri=artwork_entity_uri)
    for ontology_property, object_property in relations_dict.items():
        print(f'  {object_property}')
    # You will see that wacom:core#isCreatedBy is automatically inferred as relation as it is the inverse property of
    # wacom:core#created.

    # Now, more search options
    res_entities, next_search_page = knowledge_client.search_description(user_token, 'Michelangelo\'s Sistine Chapel',
                                                                         LanguageCode('en_US'), limit=1000)
    print('-----------------------------------------------------------------------------------------------------------')
    print(' Search results.  Description: "Michelangelo\'s Sistine Chapel"')
    print('-----------------------------------------------------------------------------------------------------------')
    s_idx: int = 1
    for e in res_entities:
        print(e)

    # Now, let's search all artwork that has the art style self-portrait
    res_entities, next_search_page = knowledge_client.search_labels(auth_key=user_token,
                                                                    search_term=SELF_PORTRAIT_STYLE,
                                                                    language_code=LanguageCode('en_US'), limit=1000)
    art_style: Optional[ThingObject] = None
    s_idx: int = 1
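    for entity in res_entities:
        #  Entity must be an art style and one of its labels must match the full string
        if entity.concept_type == ART_STYLE_CLASS and SELF_PORTRAIT_STYLE in [l.content for l in entity.label]:
            art_style = entity
            break
    if art_style is None:
        # Without this guard, the access to art_style.uri below would fail with an AttributeError
        raise SystemExit(f'Art style "{SELF_PORTRAIT_STYLE}" was not found in the knowledge graph.')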
    res_entities, next_search_page = knowledge_client.search_relation(auth_key=user_token,
                                                                      subject_uri=None,
                                                                      relation=HAS_ART_STYLE,
                                                                      object_uri=art_style.uri,
                                                                      language_code=LanguageCode('en_US'))
    print('-----------------------------------------------------------------------------------------------------------')
    print(f' Search results.  Relation: relation:=wacom:creative#hasArtstyle  object_uri:={art_style.uri}')
    print('-----------------------------------------------------------------------------------------------------------')
    s_idx: int = 1
    for e in res_entities:
        print(e)
        s_idx += 1

    # Finally, the activation function retrieves the related entities up to a pre-defined depth.
    entities, relations = knowledge_client.activations(auth_key=user_token,
                                                       uris=[leo.uri],
                                                       depth=1)
    print('-----------------------------------------------------------------------------------------------------------')
    print(f'Activation.  URI: {leo.uri}')
    print('-----------------------------------------------------------------------------------------------------------')
    s_idx: int = 1
    for e in entities:
        print(e)
        s_idx += 1
    # All relations
    print('-----------------------------------------------------------------------------------------------------------')
    for r in relations:
        print(f'Subject: {r[0]} Predicate: {r[1]} Object: {r[2]}')
    print('-----------------------------------------------------------------------------------------------------------')
    page_id = None

    # Listing all entities which have the type
    idx: int = 1
    while True:
        # pull
        entities, total_number, next_page_id = knowledge_client.listing(user_token, ART_STYLE_CLASS, page_id=page_id,
                                                                        limit=100)
        pulled_entities: int = len(entities)
        entity_count += pulled_entities
        print('-------------------------------------------------------------------------------------------------------')
        print(f' Page: {page_number} Number of entities: {len(entities)}  ({entity_count}/{total_number}) '
              f'Next page id: {next_page_id}')
        print('-------------------------------------------------------------------------------------------------------')
        for e in entities:
            print(e)
            idx += 1
        if pulled_entities == 0:
            break
        page_number += 1
        page_id = next_page_id
    print()
    # Delete all personal entities for this user; restart paging from the first page
    page_id = None
    while True:
        # pull
        entities, total_number, next_page_id = knowledge_client.listing(user_token, THING_OBJECT, page_id=page_id,
                                                                        limit=100)
        pulled_entities: int = len(entities)
        if pulled_entities == 0:
            break
        delete_uris: List[str] = [e.uri for e in entities]
        print(f'Cleanup. Delete entities: {delete_uris}')
        knowledge_client.delete_entities(auth_key=user_token, uris=delete_uris, force=True)
        page_number += 1
        page_id = next_page_id
    print('-----------------------------------------------------------------------------------------------------------')
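The paging loops in the sample above follow a common pattern that can be factored into a generator. The sketch below is generic: `fetch_page` is a hypothetical callable standing in for a call such as `listing(...)`, reduced to returning `(items, next_page_id)`. Stopping when the next page id is `None` is an assumption added here; the sample itself only stops on an empty page.

```python
from typing import Callable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar('T')


def iterate_pages(
    fetch_page: Callable[[Optional[str]], Tuple[List[T], Optional[str]]]
) -> Iterator[T]:
    """Yield items page by page until a page comes back empty."""
    page_id: Optional[str] = None
    while True:
        items, next_page_id = fetch_page(page_id)
        if not items:
            break
        yield from items
        if next_page_id is None:  # assumption: no id means there is no further page
            break
        page_id = next_page_id


# Fake backend with two filled pages followed by an empty one.
fake_pages = {
    None: (['entity-1', 'entity-2'], 'page-2'),
    'page-2': (['entity-3'], 'page-3'),
    'page-3': ([], None),
}
for item in iterate_pages(lambda page_id: fake_pages[page_id]):
    print(item)
```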

Named Entity Linking

Performing Named Entity Linking (NEL) on text and Universal Ink Model.

from typing import List, Dict

import urllib3

from knowledge.services.graph import WacomKnowledgeService
from knowledge.base.entity import LanguageCode
from knowledge.nel.base import KnowledgeGraphEntity
from knowledge.nel.engine import WacomEntityLinkingEngine
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# Constants
LANGUAGE_CODE: LanguageCode = LanguageCode("en_US")
TEXT: str = "Leonardo da Vinci painted the Mona Lisa."
# User credential
TENANT_KEY: str = '<TENANT_ID>'
EXTERNAL_USER_ID: str = '<EXTERNAL_USER_ID>'

if __name__ == '__main__':
    # Wacom personal knowledge REST API Client
    knowledge_client: WacomKnowledgeService = WacomKnowledgeService(
        application_name="Named Entity Linking Knowledge access",
        service_url='https://semantic-ink-private.wacom.com')
    knowledge_client.verify_calls = False  # TODO: Remove if it is officially deployed

    #  Wacom Named Entity Linking
    nel_client: WacomEntityLinkingEngine = WacomEntityLinkingEngine(
        service_url=WacomEntityLinkingEngine.SERVICE_URL,
        service_endpoint=WacomEntityLinkingEngine.SERVICE_ENDPOINT
    )
    nel_client.verify_calls = False  # TODO: Remove if it is officially deployed
    # Use special tenant for testing:  Unit-test tenant
    user_token: str = nel_client.request_user_token(TENANT_KEY, EXTERNAL_USER_ID)
    entities: List[KnowledgeGraphEntity] = nel_client.\
        link_personal_entities(auth_key=user_token, text=TEXT,
                               language_code=LANGUAGE_CODE)
    idx: int = 1
    print('-----------------------------------------------------------------------------------------------------------')
    print(f'Text: "{TEXT}"@{LANGUAGE_CODE}')
    print('-----------------------------------------------------------------------------------------------------------')
    for e in entities:
        print(e)
        idx += 1

Tools

The following samples show how to utilize the library to work with Wacom's Personal Knowledge.

Listing script

Lists the entities for a tenant.

>> python listing.py [-h] [-u USER] [-t TENANT] [-r]

Parameters:

  • -u USER, --user USER - External ID to identify user of the Wacom Personal Knowledge
  • -t TENANT, --tenant TENANT - Tenant key to identify tenant
  • -r, --relations (optional) - Requesting the relations for each entity
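A command-line interface with the parameters listed above could be declared with `argparse` as sketched below. This is an assumption about how such a script might parse its arguments, not the actual source of `listing.py`; the help texts are taken from the parameter list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented options of the listing script."""
    parser = argparse.ArgumentParser(prog='listing.py')
    parser.add_argument('-u', '--user',
                        help='External ID to identify user of the Wacom Personal Knowledge')
    parser.add_argument('-t', '--tenant', help='Tenant key to identify tenant')
    parser.add_argument('-r', '--relations', action='store_true',
                        help='Requesting the relations for each entity')
    return parser


args = build_parser().parse_args(['-u', 'user-1', '-t', 'tenant-key', '-r'])
print(args.user, args.tenant, args.relations)
```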

Dump script

Dumps all entities of a user to an ndjson file.

>> python  dump.py [-h] [-u USER] [-t TENANT] [-r] [-d DUMP]

Parameters:

  • -u USER, --user USER - External ID to identify user of the Wacom Personal Knowledge
  • -t TENANT, --tenant TENANT - Tenant key to identify tenant
  • -r, --relations (optional) - Requesting the relations for each entity
  • -d DUMP, --dump DUMP - Defines the location of an ndjson file
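An ndjson file holds one JSON document per line. The sketch below writes and reads such a stream with the standard `json` module; the entity dictionaries are made-up placeholders, and the exact record layout written by `dump.py` is not specified in this document.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def dump_ndjson(entities: Iterable[dict], stream: TextIO) -> int:
    """Write one JSON object per line and return the number of records written."""
    count = 0
    for entity in entities:
        stream.write(json.dumps(entity, ensure_ascii=False) + '\n')
        count += 1
    return count


def load_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one dictionary per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
written = dump_ndjson([{'uri': 'urn:example:1'}, {'uri': 'urn:example:2'}], buffer)
buffer.seek(0)
restored = list(load_ndjson(buffer))
print(written, restored)
```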

Push entities script

Pushes entities to the knowledge graph.

>> python push_entities.py [-h] [-u USER] [-t TENANT] [-i CACHE]

Parameters:

  • -u USER, --user USER - External ID to identify user of the Wacom Personal Knowledge
  • -t TENANT, --tenant TENANT - Tenant key to identify tenant
  • -i CACHE, --cache CACHE - Path to entities that must be imported.

Documentation

You can find more detailed technical documentation, here. API documentation is available here.

Contributing

Contribution guidelines are still a work in progress.

License

Apache License 2.0

