Python-based implementation of PSyKE, i.e. a Platform for Symbolic Knowledge Extraction

PSyKE

Reference paper

Federico Sabbatini, Giovanni Ciatto, Roberta Calegari, Andrea Omicini. "On the Design of PSyKE: A Platform for Symbolic Knowledge Extraction", in: WOA 2021 – 22nd Workshop “From Objects to Agents”, Aachen, Sun SITE Central Europe, RWTH Aachen University, 2021, 2963, pp. 29 - 48.

Bibtex:

@inproceedings{psyke-woa2021,
	articleno = 3,
	author = {Sabbatini, Federico and Ciatto, Giovanni and Calegari, Roberta and Omicini, Andrea},
	booktitle = {WOA 2021 -- 22nd Workshop ``From Objects to Agents''},
	editor = {Calegari, Roberta and Ciatto, Giovanni and Denti, Enrico and Omicini, Andrea and Sartor, Giovanni},
	issn = {1613-0073},
	keywords = {explainable AI, knowledge extraction, interpretable prediction, PSyKE},
	location = {Bologna, Italy},
	month = oct,
	note = {22nd Workshop ``From Objects to Agents'' (WOA 2021), Bologna, Italy, 1--3~} # sep # {~2021. Proceedings},
	numpages = 20,
	pages = {29--48},
	publisher = {Sun SITE Central Europe, RWTH Aachen University},
	series = {CEUR Workshop Proceedings},
	subseries = {AI*IA Series},
	title = {On the Design of {PSyKE}: A Platform for Symbolic Knowledge Extraction},
	url = {http://ceur-ws.org/Vol-2963/paper14.pdf},
	volume = 2963,
	year = 2021
}

Intro

PSyKE (Platform for Symbolic Knowledge Extraction) is intended as a library for extracting symbolic knowledge (in the form of logic rules) out of sub-symbolic predictors.

More precisely, PSyKE offers a general purpose API for knowledge extraction, and a number of different algorithms implementing it, supporting both classification and regression problems. The extracted knowledge consists of a Prolog theory (i.e. a list of Horn clauses).

PSyKE relies on 2ppy (tuProlog in Python) for logic support, which is in turn based on the 2p-Kt logic ecosystem.

Class diagram overview: [figure: PSyKE class diagram]

PSyKE is designed around the notion of extractor. More precisely, an Extractor is any object capable of extracting a logic Theory out of a trained sub-symbolic regressor or classifier. Accordingly, any Extractor is composed of (i) a trained predictor (i.e., black-box used as an oracle) and (ii) a set of discrete feature descriptors (as some algorithms require a discrete dataset), and it provides two methods:

  • extract: given a dataset it returns a logic theory;
  • predict: predicts a value using the extracted rules instead of the original predictor.
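
The two-method contract described above can be sketched in plain Python. This is only an illustration, not PSyKE's actual class: `ExtractorSketch`, `IntervalExtractor`, `StepOracle` and the string-based rule format are hypothetical names and simplifications introduced here, and the toy extractor handles a single numeric feature.

```python
from abc import ABC, abstractmethod
from bisect import bisect_left


class ExtractorSketch(ABC):
    """Hypothetical interface: a trained oracle plus optional feature descriptors."""

    def __init__(self, predictor, discretization=None):
        self.predictor = predictor                  # black box exposing predict()
        self.discretization = discretization or []  # discrete feature descriptors

    @abstractmethod
    def extract(self, dataset):
        """Return the extracted rules (here: a list of Prolog-like strings)."""

    @abstractmethod
    def predict(self, dataset):
        """Predict with the extracted rules instead of the oracle."""


class IntervalExtractor(ExtractorSketch):
    """Toy extractor: one rule per interval where the oracle's label is constant."""

    def extract(self, dataset):
        xs = sorted(set(dataset))
        labels = self.predictor.predict([[x] for x in xs])
        self._cuts, self._labels = [], [labels[0]]
        for i in range(1, len(xs)):
            if labels[i] != labels[i - 1]:
                # Place the cut halfway between the two neighbouring samples.
                self._cuts.append((xs[i - 1] + xs[i]) / 2)
                self._labels.append(labels[i])
        bounds = [None] + self._cuts + [None]
        rules = []
        for low, high, label in zip(bounds, bounds[1:], self._labels):
            conditions = []
            if low is not None:
                conditions.append(f"X > {low}")
            if high is not None:
                conditions.append(f"X =< {high}")
            rules.append(f"class(X, {label}) :- {', '.join(conditions) or 'true'}.")
        return rules

    def predict(self, dataset):
        # bisect_left returns the index of the first cut >= x, i.e. x's interval.
        return [self._labels[bisect_left(self._cuts, x)] for x in dataset]


class StepOracle:
    """Stand-in for a trained predictor (assumption: any object with predict())."""

    def predict(self, rows):
        return ["low" if row[0] < 3 else "high" for row in rows]


extractor = IntervalExtractor(StepOracle())
rules = extractor.extract([1.0, 2.0, 4.0, 5.0])
predictions = extractor.predict([0.5, 4.5])
```

Note that `predict` here never calls the oracle again: once `extract` has run, predictions come from the stored intervals only, which is the point of replacing the black box with rules.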

Currently, the supported extraction algorithms are:

  • CART, straightforwardly extracts rules from both classification and regression decision trees;
  • Classification:
    • REAL (Rule Extraction As Search), generates a rule for each sample in the dataset that is not yet covered by an existing rule. Before the extraction ends, the rule set is optimized;
    • Trepan, first generates a decision tree using m-of-n expressions, then extracts rules from it;
  • Regression:
    • ITER, builds and iteratively expands hypercubes in the input space. Each cube holds the estimated value of the regression for the inputs that are inside the cube. Rules are generated from the cubes' dimensions;
    • Gridex, coming soon.
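
To give a feel for the tree-based approach (start from a decision tree, then read rules off it), the sketch below emits one rule per root-to-leaf path. It is not PSyKE's implementation: `Node`, `tree_to_rules`, the feature names, thresholds and labels are all made up for illustration, and it uses plain threshold tests rather than Trepan's m-of-n expressions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical binary tree node; internal nodes test `feature =< threshold`."""
    feature: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None    # branch taken when feature =< threshold
    right: Optional["Node"] = None   # branch taken when feature > threshold
    label: Optional[str] = None      # set on leaves only


def tree_to_rules(node: Node, path: Optional[List[str]] = None) -> List[str]:
    """Emit one Prolog-like rule per leaf, built from the tests along its path."""
    path = path or []
    if node.label is not None:  # leaf: the accumulated tests become the rule body
        body = ", ".join(path) if path else "true"
        return [f"class({node.label}) :- {body}."]
    left_rules = tree_to_rules(node.left, path + [f"{node.feature} =< {node.threshold}"])
    right_rules = tree_to_rules(node.right, path + [f"{node.feature} > {node.threshold}"])
    return left_rules + right_rules


# A hand-built two-level tree with invented thresholds.
tree = Node("X1", 0.8,
            left=Node(label="a"),
            right=Node("X2", 4.9, left=Node(label="b"), right=Node(label="c")))

for rule in tree_to_rules(tree):
    print(rule)
```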

Users

End users

PSyKE is deployed as a library on PyPI, and it can therefore be installed as a Python package by running:

pip install psyke

Requirements

  • numpy 1.21.3+
  • pandas 1.3.4+
  • scikit-learn 1.0.1+
  • 2ppy 0.3.3+

Test requirements

  • skl2onnx 1.10.0+
  • onnxruntime 1.9.0+
  • parameterized 0.8.1+
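
One way to see which of the runtime requirements are present in the current environment is to query the installed distributions with the standard library. The helper name `installed_versions` is hypothetical, and the sketch only reports versions; it does not compare them against the minimums listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the requirements list above.
REQUIRED = ["numpy", "pandas", "scikit-learn", "2ppy"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'not installed'}")
```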

Once installed, one can create an extractor from a predictor (e.g. Neural Networks, Support Vector Machines, K-Nearest Neighbor, Random Forests, etc.) and from the dataset used to train the predictor.

Note: the predictor must expose a method named predict to be properly used as oracle.
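
If a model is available only as a plain function, a thin wrapper can give it the required `predict` method. The sketch below is an assumption-laden illustration: `PredictAdapter`, `can_act_as_oracle` and `sign` are hypothetical names, and it assumes that `predict` takes a sequence of samples and returns one prediction per sample, without checking which container types PSyKE itself passes in.

```python
class PredictAdapter:
    """Hypothetical wrapper giving a per-sample callable a predict() method."""

    def __init__(self, function):
        self._function = function

    def predict(self, samples):
        # One prediction per input sample, in the same order.
        return [self._function(sample) for sample in samples]


def can_act_as_oracle(candidate):
    """True if the object exposes a callable attribute named predict."""
    return callable(getattr(candidate, "predict", None))


def sign(sample):
    """Toy model: classify a sample by the sign of the sum of its features."""
    return "positive" if sum(sample) >= 0 else "negative"


oracle = PredictAdapter(sign)
print(can_act_as_oracle(sign))    # the bare function has no predict method
print(can_act_as_oracle(oracle))  # the wrapped object does
```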

Demo

A brief example is presented in the demo.py script. Using the iris dataset from scikit-learn, we train a K-Nearest Neighbor classifier to predict the correct iris class. Before training, we discretize the dataset. We then create two different extractors, REAL and Trepan, and print the theory extracted by each.

REAL extracted rules:

iris(PetalLength_0, PetalWidth_0, SepalLength_0, SepalWidth_0, setosa) :- '=<'(PetalWidth_0, 0.65).
iris(PetalLength_1, PetalWidth_1, SepalLength_1, SepalWidth_1, versicolor) :- ('>'(PetalLength_1, 4.87), '>'(SepalLength_1, 6.26)).
iris(PetalLength_2, PetalWidth_2, SepalLength_2, SepalWidth_2, versicolor) :- '>'(PetalWidth_2, 1.64).
iris(PetalLength_3, PetalWidth_3, SepalLength_3, SepalWidth_3, virginica) :- '=<'(SepalWidth_3, 2.87).
iris(PetalLength_4, PetalWidth_4, SepalLength_4, SepalWidth_4, virginica) :- in(SepalLength_4, [5.39, 6.26]).
iris(PetalLength_5, PetalWidth_5, SepalLength_5, SepalWidth_5, virginica) :- in(PetalWidth_5, [0.65, 1.64]).

Trepan extracted rules:

iris(PetalLength_6, PetalWidth_6, SepalLength_6, SepalWidth_6, virginica) :- ('>'(PetalLength_6, 2.28), in(PetalLength_6, [2.28, 4.87])).
iris(PetalLength_7, PetalWidth_7, SepalLength_7, SepalWidth_7, versicolor) :- '>'(PetalLength_7, 2.28).
iris(PetalLength_8, PetalWidth_8, SepalLength_8, SepalWidth_8, setosa) :- true.
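
To make the Trepan theory above easier to read, here is a hand translation of its three rules into Python. It rests on two assumptions that the rules themselves do not state: that they are tried top to bottom with the first match winning, and that `in(X, [a, b])` denotes the closed interval from a to b. The function name `classify` is hypothetical.

```python
def classify(petal_length: float) -> str:
    """Apply the three Trepan rules in order; the first one that matches wins."""
    # Rule 1: '>'(PetalLength, 2.28), in(PetalLength, [2.28, 4.87])
    if petal_length > 2.28 and 2.28 <= petal_length <= 4.87:
        return "virginica"
    # Rule 2: '>'(PetalLength, 2.28)
    if petal_length > 2.28:
        return "versicolor"
    # Rule 3: true (default rule, matches everything left over)
    return "setosa"


for value in (1.4, 3.0, 5.5):
    print(value, classify(value))
```

Under these assumptions only PetalLength affects the outcome; the other three arguments of `iris/5` are unconstrained in every rule.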

Developers

Working with the PSyKE codebase requires a number of tools to be installed:

  • Python 3.9+
  • JDK 11+ (please ensure the JAVA_HOME environment variable is properly configured)
  • Git 2.20+
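
A quick, partial check of these prerequisites can be scripted with the standard library. The helper name `check_environment` is hypothetical; the sketch verifies the Python version, whether JAVA_HOME is set and whether a `git` executable is on the PATH, but it does not verify the JDK or Git version numbers.

```python
import os
import shutil
import sys


def check_environment():
    """Return a pass/fail report for the checks this sketch can perform."""
    return {
        "python>=3.9": sys.version_info >= (3, 9),
        "JAVA_HOME set": bool(os.environ.get("JAVA_HOME")),
        "git on PATH": shutil.which("git") is not None,
    }


for check, ok in check_environment().items():
    print(f"{check}: {'ok' if ok else 'missing'}")
```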

Develop PSyKE with PyCharm

To participate in the development of PSyKE, we suggest the PyCharm IDE.

Importing the project

  1. Clone this repository into a folder of your choice using git clone
  2. Open PyCharm
  3. Select Open
  4. Navigate your file system and find the folder where you cloned the repository
  5. Click Open

Developing the project

Contributions to this project are welcome. Just some rules:

  • We use git flow, so if you write new features, please do so in a separate feature/ branch
  • We recommend forking the project, developing your changes, then contributing back via pull request
  • Commit often
  • Stay in sync with the develop (or master) branch (pull frequently if the build passes)
  • Do not introduce low quality or untested code

Issue tracking

If you run into a problem while using or developing PSyKE, you are encouraged to report it through the project "Issues" section on GitHub.
