pyderivationagent is a python wrapper for derivation agents as part of The World Avatar project.
Project description
Description
The pyderivationagent
package provides a python wrapper for derivation agents as part of TheWorldAvatar project. It is a python equivalent of uk.ac.cam.cares.jps.base.agent.DerivationAgent.java
but based on Flask application behind a gunicorn server, inspired by the Example: Python agent. pyderivationagent
uses py4jps>=1.0.21
to access derivation operations provided in uk.ac.cam.cares.jps.base.derivation.DerivationClient.java
. For technical details, below are a few useful links:
py4jps
- python wrapper for jps-base-libDerivationClient
- derivation frameworkDerivationAgent.java
- derivation agent java class that uses methods provided inDerivationClient
DerivationAsynExample
- example on directed acyclic graph (DAG) of derivations operated by derivation agents
Installation
For development and testing reasons, follow below instructions to get a copy of the project up and running on your local system.
Virtual environment setup
It is highly recommended to install pyderivationagent
packages in a virtual environment. The following steps can be taken to build a virtual environment:
(Windows)
$ python -m venv <venv_name>
$ <venv_name>\Scripts\activate.bat
(<venv_name>) $
(Linux)
$ python3 -m venv <venv_name>
$ source <venv_name>/bin/activate
(<venv_name>) $
The above commands will create and activate the virtual environment <venv_name>
in the current directory.
Installation via pip
The following command can be used to install the pyderivationagent
package and agentlogging
package. This is a workaround as PyPI does NOT allow install_requires
direct links, so we could NOT add package agentlogging
from 'agentlogging @ git+https://github.com/cambridge-cares/TheWorldAvatar@main#subdirectory=Agents/utils/python-utils'
as dependency, therefore, in order to make the semi-automated release process working, we here introduce a workaround to install agentlogging to the virtual environment but NOT as dependency in the setup.py of pyderivationagent
. A long term solution could be that we publish agentlogging
in PyPI as well.
(<venv_name>) $ pip install pyderivationagent
(<venv_name>) $ pip install "git+https://github.com/cambridge-cares/TheWorldAvatar@main#subdirectory=Agents/utils/python-utils"
How to use
Develop derivation agent
When creating a new derivation python agent, it's strongly advised to stick to the folder structure below:
Recommended python agent folder layout
.
├── ... # other project files (README, LICENSE, etc..)
├── youragent # project source files for your agent
│ ├── agent # module that contains the agent logic
│ │ ├── __init__.py
│ │ └── your_agent.py # your derivation agent
│ ├── data_model # module that contains the dataclasses for concepts
│ │ ├── __init__.py
│ │ └── your_onto.py # dataclasses for your ontology concepts
│ ├── kg_operations # module that handles knowledge graph operations
│ │ ├── __init__.py
│ │ └── your_sparql.py # sparql query and update strings for your agent
│ ├── other_modules # other necessary modules for your agent
│ │ ├── __init__.py
│ │ ├── module1.py
│ │ └── module2.py
│ ├── entry_point.py # agent entry point
│ └── __init__.py
├── youragent.env.example # example environment variables for configuration
├── docker-compose.yml # docker compose file
├── Dockerfile # Dockerfile
└── tests # tests files
Example code
your_agent.py
from pyderivationagent import DerivationAgent
from pyderivationagent import DerivationInputs
from pyderivationagent import DerivationOutputs
from youragent.kg_operations import YourSparqlClient
from youragent.data_model import YOUR_CONCEPT
from rdflib import Graph
class YourAgent(DerivationAgent):
def process_request_parameters(self, derivation_inputs: DerivationInputs, derivation_outputs: DerivationOutputs):
# Provide your agent logic that converts the agent inputs to triples of new created instances
# The derivation_inputs will be in the format of key-value pairs with the concept as key and instance iri as value
# For example:
# {
# "https://www.example.com/triplestore/repository/Ontology.owl#Concept_1": ["https://www.example.com/triplestore/repository/Concept_1/Instance_1"],
# "https://www.example.com/triplestore/repository/Ontology.owl#Concept_2": ["https://www.example.com/triplestore/repository/Concept_2/Instance_2"],
# "https://www.example.com/triplestore/repository/Ontology.owl#Concept_3":
# ["https://www.example.com/triplestore/repository/Concept_3/Instance_3_1", "https://www.example.com/triplestore/repository/Concept_3/Instance_3_2"],
# "https://www.example.com/triplestore/repository/Ontology.owl#Concept_4": ["https://www.example.com/triplestore/repository/Concept_4/Instance_4"]
# }
# The instance IRIs of the interested class (rdf:type) can be accessed via derivation_inputs.getIris(clz)
# e.g. assume we will take the first iri that has rdf:type YOUR_CONCEPT
instance_iri = derivation_inputs.getIris(YOUR_CONCEPT)[0]
# You may want to create instance of YourSparqlClient for specific queries/updates you would like to perform
# YourSparqlClient should be defined in your_sparql.py that will be introduced later in this README.md file
# This client can be initialised with the configuration you already initialised in YourAgent.__init__ method
self.sparql_client = YourSparqlClient(
self.kgUrl, self.kgUpdateUrl, self.kgUser, self.kgPassword,
self.fs_url, self.fs_user, self.fs_password
)
# Please note here we are using instance_iri of class YOUR_CONCEPT within the provided derivation_inputs
response = self.sparql_client.your_sparql_query(instance_iri)
# You may want to log something during agent execution
self.logger.info("YourAgent has done something.")
self.logger.debug("And here are some details...")
# The new created instances should be added to the derivation_outputs in the form of triples
# e.g. <http://example/ExampleClass_UUID> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example/ExampleClass>.
# <http://example/ExampleClass_UUID> <http://example/hasValue> 5.
derivation_outputs.createNewEntity("http://example/ExampleClass_UUID", "http://example/ExampleClass")
derivation_outputs.addTriple("http://example/ExampleClass_UUID", "http://example/hasValue", 5)
# Alternatively, you may create an instance of rdflib.Graph and add the whole graph to derivation_outputs
# In which case, the above triples can be added by:
g = Graph()
g.add((URIRef("http://example/ExampleClass_UUID"), RDF.type, URIRef("http://example/ExampleClass")))
g.add((URIRef("http://example/ExampleClass_UUID"), URIRef("http://example/hasValue"), Literal(5)))
derivation_outputs.addGraph(g)
your_onto.py
from __future__ import annotations # imported to enable pydantic postponed annotations
import pydantic
from typing import Optional
# Module pyderivationagent.data_model.iris.py provides some IRI of basic/common concepts and relationships
# In case you are defining new concepts in your project, you may put it here
YOUR_CONCEPT = 'https://www.example.com/triplestore/repository/Ontology.owl#YOUR_CONCEPT'
# You may also want to define dataclasses for some concepts you are working on
# Please note normally you need to define the concept before it gets referenced by others, as python is a scripting language
class YourOtherConcept(pydantic.BaseModel):
instance_iri: str
# However, if you have to reference other concepts that will have to appear at a later point in your script, for example, in the situation of self-referencing or circular referencing (both of which are quite common in ontologies), you may do so like:
class AnotherConcept(pydantic.BaseModel):
instance_iri: str
refers_to_oneself: AnotherConcept # Here the AnotherConcept self-reference
refers_to_yourconcept: YourConcept # This object relationship refers to YourConcept which will be defined in next lines
# For both reference to work, one must call update_forward_refs() after the definition of YourConcept to make it resolvable
class YourConcept(pydantic.BaseModel):
instance_iri: str
# Here you may want to declare an object relationship that points to YourOtherConcept
# However, you may want to make it Optional with a default None
example_object_relationship_to: Optional[YourOtherConcept] = None
circular_reference_to: AnotherConcept # Here we complete the circular-reference situation
# Call update_forward_refs() to make AnotherConcept resolvable
# For more details, please see https://pydantic-docs.helpmanual.io/usage/postponed_annotations/
AnotherConcept.update_forward_refs()
your_sparql.py
from pyderivationagent import PySparqlClient
from youragent.data_model import YourConcept
class YourSparqlClient(PySparqlClient):
# PySparqlClient class provides a few utility functions that developer can call within their own functions:
# - checkInstanceClass(self, instance, instance_class)
# - getAmountOfTriples(self)
# - performQuery(self, query)
# - performUpdate(self, update)
# - uploadOntology(self, filePath)
# - uploadFile(self, local_file_path)
# - downloadFile(self, remote_file_path, downloaded_file_path)
def your_sparql_query(self, your_instance_iri: str) -> YourConcept:
# Construct your SPARQL query string
query = """SELECT ?target WHERE {?target ?predicate ?object.}"""
# Perform SPARQL query with provided method performQuery in PySparqlClient class
response = self.performQuery(query)
# Instantiate TargetClass instances with query results
new_target_instance = YourConcept(response[0]['target'])
return new_target_instance
def your_sparql_update(self, your_instance_iri: str):
# Construct your SPARQL update string
update = """INSERT DATA {<http://subject> <http://predicate> <http://object>.}"""
# Perform SPARQL update with provided method performUpdate in PySparqlClient class
self.performUpdate(update)
entry_point.py
from pyderivationagent.conf import config_derivation_agent
from youragent.agent import YourAgent
def create_app():
# If you would like to deploy your agent within a docker container (using docker-compose.yml and youragent.env which will be introduced later in this README.md file), then you may use:
agent_config = config_derivation_agent()
# Else, if you would like to create agent to run in your memory, then you may want to provide the path to youragent.env file as argument to function config_derivation_agent(), i.e.,
# agent_config = config_derivation_agent("/path/to/youragent.env")
# Create agent instance
agent = YourAgent(
agent_iri = agent_config.ONTOAGENT_SERVICE_IRI,
time_interval = agent_config.DERIVATION_PERIODIC_TIMESCALE,
derivation_instance_base_url = agent_config.DERIVATION_INSTANCE_BASE_URL,
kg_url = agent_config.SPARQL_QUERY_ENDPOINT,
kg_update_url = agent_config.SPARQL_UPDATE_ENDPOINT,
kg_user = agent_config.KG_USERNAME,
kg_password = agent_config.KG_PASSWORD,
fs_url = agent_config.FILE_SERVER_ENDPOINT,
fs_user = agent_config.FILE_SERVER_USERNAME,
fs_password = agent_config.FILE_SERVER_PASSWORD,
agent_endpoint = agent_config.ONTOAGENT_OPERATION_HTTP_URL,
app = Flask(__name__)
flask_config = FlaskConfig(),
logger_name = "dev"
)
# Start listening sync/monitoring async derivations
agent.start_monitoring_derivations()
# Expose flask app of agent
app = agent.app
return app
youragent.env.example
ONTOAGENT_SERVICE_IRI=http://www.example.com/resources/agents/Service__Example#Service
DERIVATION_PERIODIC_TIMESCALE=60
DERIVATION_INSTANCE_BASE_URL=http://www.example.com/triplestore/repository/
SPARQL_QUERY_ENDPOINT=http://www.example.com/blazegraph/namespace/kb/sparql
SPARQL_UPDATE_ENDPOINT=http://www.example.com/blazegraph/namespace/kb/sparql
KG_USERNAME=
KG_PASSWORD=
FILE_SERVER_ENDPOINT=http://www.example.com/FileServer/
FILE_SERVER_USERNAME=
FILE_SERVER_PASSWORD=
ONTOAGENT_OPERATION_HTTP_URL=/Example
You may want to commit this example file without credentials to git as a template for your agent configuration. At deployment, you can make a copy of this file, rename it to youragent.env
and populate the credentials information. It is suggested to add *.env
entry to your .gitignore
of the agent folder, thus the renamed youragent.env
(including credentials) will NOT be committed to git.
NOTE: you may want to provide SPARQL_QUERY_ENDPOINT
and SPARQL_UPDATE_ENDPOINT
as the internal port of the triple store (most likely blazegraph) docker container, e.g., http://blazegraph:8080/blazegraph/namespace/kb/sparql
(blazegraph:8080
depends on your specification in the docker-compose.yml
which will be introduced later in this README.md file), if you would like to deploy your derivation agent and the triple store within the same docker stack. At deployment, configurations in this file will be picked up by config_derivation_agent()
when instantiating the agent in create_app()
of entry_point.py
.
docker-compose.yml
version: '3.8'
services:
# Your derivation agent
your_agent:
image: your_agent:1.0.0
container_name: your_agent
environment:
LOG4J_FORMAT_MSG_NO_LOOKUPS: "true"
build:
context: .
dockerfile: ./Dockerfile
ports:
- 7000:5000
env_file:
- ./youragent.env
# Blazegraph
blazegraph:
image: docker.cmclinnovations.com/blazegraph:1.0.0-SNAPSHOT
container_name: "blazegraph_test"
ports:
- 27149:8080
environment:
BLAZEGRAPH_PASSWORD_FILE: /run/secrets/blazegraph_password
# Add a secret to set the password for BASIC authentication
secrets:
- blazegraph_password
# Secrets used to set runtime passwords
secrets:
blazegraph_password:
file: tests/dummy_services_secrets/blazegraph_passwd.txt
You may refer to DoEAgent for a concrete implementation of the above suggested folder structure based on pyderivationagent
. The design of pyderivationagent
is continually evolving, and as the project grows, we hope to make it more accessible to developers and users.
Dockerised integration test
The pyderivationagent
package also provides two sets of dockerised integration tests, following the same context as DerivationAsynExample
. Interested developer may refer to the README of the Java example for more context, or TheWorldAvatar/JPS_BASE_LIB/python_derivation_agent/tests
for more technical details.
One may execute below commands for each set of dockerised integration test:
-
Agents instantiated and run in memory, operating on a blazegraph docker container with dynamic port decided on deployment
(Linux)
$ cd /absolute_path_to/TheWorldAvatar/JPS_BASE_LIB/python_derivation_agent $ pytest -s ./tests/test_derivation_agent.py
-
All agents and blazegraph deployed within the same docker stack and they communicate via internal port address
(Linux)
$ cd /absolute_path_to/TheWorldAvatar/JPS_BASE_LIB/python_derivation_agent $ pytest -s --docker-compose=./docker-compose.test.yml ./tests/test_docker_integration.py
Ideally, we would like to provide this set of dockerised integration test to demo how one may develop integration test for derivation agents. Any ideas/discussions/issues/PRs on how to make this more standardised and accessible to developers are more than welcome.
New features and package release
Developers who add new features to the python agent wrapper handle the distribution of package pyderivationagent
on PyPI and Test-PyPI. If you want to add new features that suit your project and release the package independently, i.e. become a developer/maintainer, please contact the repository's administrator to indicate your interest.
The release procedure is currently semi-automated and requires a few items:
- Your Test-PyPI and PyPI account and password
- The version number x.x.x for the release
- Clone of
TheWorldAvatar
repository on your local machine - Docker-desktop is installed and running on your local machine
- You have access to the docker.cmclinnovations.com registry on your local machine, for more information regarding the registry, see: https://github.com/cambridge-cares/TheWorldAvatar/wiki/Docker%3A-Image-registry
Please create and checkout to a new branch from your feature branch once you are happy with the feature and above details are ready. The release process can then be started by using the commands below, depending on the operating system you're using. (REMEMBER TO CHANGE THE CORRECT VALUES IN THE COMMANDS BELOW!)
WARNING: at the moment, releasing package from Windows OS has an issue that the package will not be installed correctly while executing "$PIPPATH --disable-pip-version-check install -e $SPATH"[dev]"" in install_script_pip.sh. Please use Linux to release future versions before the issue is fixed.
(Windows)
$ cd \absolute_path_to\TheWorldAvatar\JPS_BASE_LIB\python_derivation_agent
$ release_pyderivationagent_to_pypi.sh -v x.x.x
(Linux)
$ cd /absolute_path_to/TheWorldAvatar/JPS_BASE_LIB/python_derivation_agent
$ ./release_pyderivationagent_to_pypi.sh -v x.x.x
Please follow the instructions presented in the console once the process has begun. If everything goes well, change the version number in JPS_BASE_LIB/python_derivation_agent/release_pyderivationagent_to_pypi.sh
to the one you used for the script release.
echo "./release_pyderivationagent_to_pypi.sh -v 0.0.1 - release version 0.0.1"
The changes mentioned above should then be committed with the changes performed automatically during the release process, specifically in python script JPS_BASE_LIB/python_derivation_agent/pyderivationagent/__init__.py
__version__ = "0.0.1"
and JPS_BASE_LIB/python_derivation_agent/setup.py
version='0.0.1',
Finally, merge the release branch back to the feature branch and make a Pull Request for the feature branch to be merged back into the main
branch.
Authors
Jiaru Bai (jb2197@cam.ac.uk)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for pyderivationagent-1.0.0-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 96af0f6ff5af7ff320fa1ce0b711a805ebe3ff1720135713bd431947c2a882e5 |
|
MD5 | f3160f2a906de24e968fa1722c2ba834 |
|
BLAKE2b-256 | a07c717039e49259fa646ac9bcddfb494ad5beffde3983c663b0cdf776f38150 |