Python-based implementation of PSyKE, i.e. a Platform for Symbolic Knowledge Extraction

PSyKE

Latest Releases

  • PSyKE 1.0: Compatibility with Python 3.11.x
  • PSyKE 0.10: New genetic algorithms for knowledge extraction
  • PSyKE 0.9: Fairness mitigation support for knowledge extractors
  • PSyKE 0.8: New features: local explainability and counterfactual support
  • PSyKE 0.7: New SKE algorithms implemented

Intro

PSyKE (Platform for Symbolic Knowledge Extraction) is intended as a library for extracting symbolic knowledge (in the form of logic rule lists) out of sub-symbolic predictors.

More precisely, PSyKE offers a general purpose API for knowledge extraction, and a number of different algorithms implementing it, supporting both classification and regression problems. The extracted knowledge consists of a Prolog theory (i.e., a list of Horn clauses) or an OWL ontology containing SWRL rules.

PSyKE relies on 2ppy (tuProlog in Python) for logic support, which in turn is based on the 2p-Kt logic ecosystem.

Class diagram overview:

PSyKE class diagram

PSyKE is designed around the notion of extractor. More precisely, an Extractor is any object capable of extracting a logic Theory out of a trained sub-symbolic regressor or classifier. Accordingly, an Extractor is composed of (i) a trained predictor (i.e., a black box used as an oracle) and (ii) a set of feature descriptors, and it provides two methods:

  • extract: returns a logic theory given a dataset;
  • predict: predicts a value using the extracted rules (instead of the original predictor).
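The extractor contract described above can be sketched in plain Python. This is only an illustration of the interface shape, not PSyKE's code: the `ToyExtractor` class, the `ThresholdOracle` black box, and the tuple-based rule format are all hypothetical, whereas a real PSyKE extractor returns a logic theory.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical rule format: (feature index, threshold, label if value <= threshold).
Rule = Tuple[int, float, str]


class ThresholdOracle:
    """Stand-in for a trained black box; it only needs a predict method."""

    def predict(self, rows):
        return ["small" if row[0] <= 2.0 else "large" for row in rows]


@dataclass
class ToyExtractor:
    """Hypothetical extractor: learns one threshold rule on feature 0."""

    predictor: object
    default: str = "unknown"
    rules: List[Rule] = field(default_factory=list)

    def extract(self, rows):
        # Query the oracle, then scan the samples in increasing order of feature 0
        # until the oracle's answer changes.
        labels = self.predictor.predict(rows)
        pairs = sorted(zip((row[0] for row in rows), labels))
        threshold, first_label = pairs[0]
        for value, label in pairs:
            if label != first_label:
                self.default = label
                break
            threshold = value
        self.rules = [(0, threshold, first_label)]
        return self.rules

    def predict(self, rows):
        # Use the extracted rules only; the original predictor is not consulted.
        predictions = []
        for row in rows:
            for index, threshold, label in self.rules:
                if row[index] <= threshold:
                    predictions.append(label)
                    break
            else:
                predictions.append(self.default)
        return predictions
```

Calling `extract` on a dataset first and `predict` afterwards mirrors the two-method contract: the rules become a stand-in for the black box.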

Currently, the supported extraction algorithms are:

  • CART, straightforwardly extracts rules from both classification and regression decision trees;
  • Classification:
    • REAL (Rule Extraction As Learning), generates and generalizes rules starting from dataset samples;
    • Trepan, generates rules by inducing a decision tree and possibly exploiting m-of-n expressions;
  • Regression:
    • ITER, builds and iteratively expands hypercubes in the input space. Each cube holds a constant value, which is the estimated output for the samples inside the cube;
    • GridEx, extension of the ITER algorithm that produces shorter rule lists retaining higher fidelity w.r.t. the predictor;
    • GridREx, extension of GridEx where the output of each hypercube is a linear combination of the input variables and not a constant value.
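The idea shared by the hypercube-based regression extractors, namely that each region of the input space is associated with one constant output, can be illustrated with a deliberately simplified sketch. It uses a fixed, equally spaced grid and omits the iterative expansion and the refinements of the real algorithms; the function names `fit_grid`, `cube_index`, and `predict_grid` are hypothetical and not part of PSyKE.

```python
from collections import defaultdict
from statistics import mean


def cube_index(x, bounds, splits):
    """Return the grid cell (one integer per dimension) that contains x."""
    index = []
    for value, (low, high) in zip(x, bounds):
        width = (high - low) / splits
        cell = int((value - low) / width)
        # Clamp so that points on or beyond the upper bound fall in the last cell.
        index.append(min(max(cell, 0), splits - 1))
    return tuple(index)


def fit_grid(samples, oracle, bounds, splits):
    """Assign to each non-empty cube the mean oracle output of its samples."""
    buckets = defaultdict(list)
    for x, y in zip(samples, oracle(samples)):
        buckets[cube_index(x, bounds, splits)].append(y)
    return {cell: mean(outputs) for cell, outputs in buckets.items()}


def predict_grid(cubes, x, bounds, splits):
    """Return the constant of the cube containing x, or None for an empty cube."""
    return cubes.get(cube_index(x, bounds, splits))


if __name__ == "__main__":
    # Toy black box: the sum of the two input features.
    oracle = lambda rows: [a + b for a, b in rows]
    samples = [(0.1, 0.1), (0.2, 0.3), (0.8, 0.9), (0.9, 0.7)]
    bounds = [(0.0, 1.0), (0.0, 1.0)]
    cubes = fit_grid(samples, oracle, bounds, splits=2)
    print(predict_grid(cubes, (0.25, 0.25), bounds, splits=2))
```

Each non-empty cube corresponds to one rule of the form "if the inputs lie in this cube, the output is this constant". A GridREx-style variant would store a fitted linear model per cube instead of a mean.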

Users may exploit the PEDRO algorithm, included in PSyKE, to tune the GridEx and GridREx hyper-parameters.

We are working on extending PSyKE to encompass explainable clustering tasks, as well as on making the supported extraction algorithms more general-purpose (e.g., by adding classification support to GridEx and GridREx).

Users

End users

PSyKE is deployed as a library on PyPI. It can be installed as a Python package by running:

pip install psyke

Requirements

Please refer to the requirements file

Test requirements
  • skl2onnx
  • onnxruntime
  • parameterized

Once installed, it is possible to create an extractor from a predictor (e.g. Neural Network, Support Vector Machine, K-Nearest Neighbours, Random Forest, etc.) and from the data set used to train the predictor.

Note: the predictor must expose a method named predict to be properly used as an oracle.
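When a model is only available as a plain function, a thin wrapper can give it the required `predict` method. The sketch below is an assumption-laden illustration: `PredictAdapter` and `is_usable_oracle` are hypothetical helpers, and the row-by-row calling convention is chosen for simplicity rather than taken from PSyKE.

```python
class PredictAdapter:
    """Wrap a per-sample callable so that it exposes a predict method."""

    def __init__(self, model_fn):
        self._model_fn = model_fn

    def predict(self, rows):
        # Apply the wrapped function to every sample and collect the outputs.
        return [self._model_fn(row) for row in rows]


def is_usable_oracle(obj):
    """Check the only requirement stated above: a callable attribute named predict."""
    return callable(getattr(obj, "predict", None))


if __name__ == "__main__":
    oracle = PredictAdapter(lambda row: sum(row) > 1)
    print(is_usable_oracle(oracle), oracle.predict([[0.2, 0.3], [0.9, 0.8]]))
```

Estimators that already follow the scikit-learn convention expose `predict` and need no such wrapper.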

A brief example is presented in the demo.py script in the demo/ folder. Using sklearn's Iris data set, we train a K-Nearest Neighbours classifier to predict the correct output class. Before training, we discretize the dataset. After that, we create two different extractors, REAL and Trepan, and output the theory extracted by each.

REAL extracted rules:

iris(PetalLength, PetalWidth, SepalLength, SepalWidth, setosa) :- PetalWidth =< 1.0.
iris(PetalLength1, PetalWidth1, SepalLength1, SepalWidth1, versicolor) :- PetalLength1 > 4.9, SepalWidth1 in [2.9, 3.2].
iris(PetalLength2, PetalWidth2, SepalLength2, SepalWidth2, versicolor) :- PetalWidth2 > 1.6.
iris(PetalLength3, PetalWidth3, SepalLength3, SepalWidth3, virginica) :- SepalWidth3 =< 2.9.
iris(PetalLength4, PetalWidth4, SepalLength4, SepalWidth4, virginica) :- SepalLength4 in [5.4, 6.3].
iris(PetalLength5, PetalWidth5, SepalLength5, SepalWidth5, virginica) :- PetalWidth5 in [1.0, 1.6].
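To make the reading of such a rule list concrete, the REAL rules above can be transcribed into Python. The sketch rests on two assumptions that the text does not state: rules are tried from top to bottom with the first match winning, and `X in [a, b]` denotes the closed interval from a to b. The names `REAL_RULES` and `classify` are hypothetical.

```python
# Each entry pairs the predicted class with a condition over a sample,
# transcribed from the REAL rules listed above (closed intervals assumed).
REAL_RULES = [
    ("setosa", lambda s: s["PetalWidth"] <= 1.0),
    ("versicolor", lambda s: s["PetalLength"] > 4.9 and 2.9 <= s["SepalWidth"] <= 3.2),
    ("versicolor", lambda s: s["PetalWidth"] > 1.6),
    ("virginica", lambda s: s["SepalWidth"] <= 2.9),
    ("virginica", lambda s: 5.4 <= s["SepalLength"] <= 6.3),
    ("virginica", lambda s: 1.0 <= s["PetalWidth"] <= 1.6),
]


def classify(sample, rules=REAL_RULES):
    """Return the class of the first rule whose condition holds (assumed order)."""
    for label, condition in rules:
        if condition(sample):
            return label
    return None  # no rule covers the sample


if __name__ == "__main__":
    flower = {"PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 5.1, "SepalWidth": 3.5}
    print(classify(flower))
```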

Trepan extracted rules:

iris(PetalLength6, PetalWidth6, SepalLength6, SepalWidth6, virginica) :- PetalLength6 > 3.0, PetalLength6 in [3.0, 4.9].
iris(PetalLength7, PetalWidth7, SepalLength7, SepalWidth7, versicolor) :- PetalLength7 > 3.0.
iris(PetalLength8, PetalWidth8, SepalLength8, SepalWidth8, setosa) :- true.

Developers

Working with PSyKE codebase requires a number of tools to be installed:

  • Python 3.11
    • Python versions >= 3.12.x are currently not supported
  • JDK 11+ (please ensure the JAVA_HOME environment variable is properly configured)
  • Git 2.20+
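A quick way to spot a misconfigured machine is a small pre-flight script. The one below is a sketch, not part of the PSyKE repository: `check_environment` is a hypothetical helper, and it only checks that the interpreter is Python 3.11, that JAVA_HOME is set, and that a git executable is on the PATH. It does not verify the JDK or Git version numbers.

```python
import os
import shutil
import sys


def check_environment(version_info=sys.version_info, environ=os.environ, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(version_info[:2]) != (3, 11):
        problems.append("Python 3.11 is required")
    if not environ.get("JAVA_HOME"):
        problems.append("JAVA_HOME is not set")
    if which("git") is None:
        problems.append("git was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The three parameters default to the real interpreter, environment, and PATH lookup; passing substitutes makes the function easy to exercise without changing the machine.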

Develop PSyKE with PyCharm

To participate in the development of PSyKE, we suggest the PyCharm IDE.

Importing the project

  1. Clone this repository into a folder of your choice using git clone
  2. Open PyCharm
  3. Select Open
  4. Navigate your file system and find the folder where you cloned the repository
  5. Click Open

Developing the project

Contributions to this project are welcome. Just some rules:

  • We use git flow, so if you write new features, please do so in a separate feature/ branch
  • We recommend forking the project, developing your code, then contributing back via pull request
  • Commit often
  • Stay in sync with the develop (or master) branch (pull frequently if the build passes)
  • Do not introduce low quality or untested code

Issue tracking

If you encounter problems in using or developing PSyKE, you are encouraged to report them through the project "Issues" section on GitHub.
