A state of the art knowledge base
ZincBase is a state-of-the-art knowledge base and complex simulation suite. It does the following:
- Store and retrieve graph structured data efficiently.
- Provide ways to query the graph, including via bleeding-edge graph neural networks.
- Simulate complex effects playing out across the graph and see how predictions change.
Zincbase exists to answer questions like "what is the probability that Tom likes LARPing", or "who likes LARPing", or "classify people into LARPers vs normies", or simulations like "what happens if all the LARPers become normies".
It combines the latest in neural networks with symbolic logic (think expert systems and prolog), graph search, and complexity theory.
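To make the symbolic side concrete, here is a toy sketch of how a Prolog-style query with a variable (an uppercase name such as Food) can be answered by pattern matching against stored facts. It is an illustration only, not Zincbase's implementation: parse_term, is_variable and query are hypothetical helpers invented for this example, and there are no rules, only flat facts.

```python
# Toy illustration only: a flat fact store queried by pattern matching.
# NOT Zincbase's implementation; all helper names here are hypothetical.
import re

def parse_term(text):
    """Split 'eats(tom, rice)' into ('eats', ['tom', 'rice'])."""
    match = re.fullmatch(r"\s*(\w+)\((.*)\)\s*", text)
    if match is None:
        raise ValueError(f"not a predicate term: {text!r}")
    args = [arg.strip() for arg in match.group(2).split(",")]
    return match.group(1), args

def is_variable(symbol):
    # Prolog convention: variables start with an uppercase letter.
    return symbol[:1].isupper()

def query(facts, pattern):
    """Yield one binding dict per stored fact that matches the pattern."""
    pred, args = parse_term(pattern)
    for fact in facts:
        fact_pred, fact_args = parse_term(fact)
        if fact_pred != pred or len(fact_args) != len(args):
            continue
        bindings = {}
        for want, have in zip(args, fact_args):
            if is_variable(want):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            # No mismatch found: this fact satisfies the pattern.
            yield bindings

facts = ["eats(tom, rice)", "eats(tom, beans)", "eats(ann, rice)"]
foods = [b["Food"] for b in query(facts, "eats(tom, Food)")]
eaters = [b["Who"] for b in query(facts, "eats(Who, rice)")]
```

The same pattern answers both "what does tom eat" and "who eats rice", which is the appeal of the logic-programming style: one stored fact format, many query shapes.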
View full documentation here.
Quickstart
pip3 install zincbase
from zincbase import KB
kb = KB()
kb.store('eats(tom, rice)')
for ans in kb.query('eats(tom, Food)'):
    print(ans['Food']) # prints 'rice'
...
# The included assets/countries_s1_train.csv contains triples like:
# (namibia, locatedin, africa)
# (lithuania, neighbor, poland)
kb = KB()
kb.from_csv('./assets/countries_s1_train.csv', delimiter='\t')
kb.build_kg_model(cuda=False, embedding_size=40)
kb.train_kg_model(steps=8000, batch_size=1, verbose=False)
kb.estimate_triple_prob('fiji', 'locatedin', 'melanesia')
0.9607
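For intuition about where a number like that can come from, here is a rough sketch of a RotatE-style triple score (the Validation section compares against the RotatE paper). In RotatE, entities are complex vectors and each relation is an element-wise rotation; a triple scores well when the rotated head lands close to the tail. Whether estimate_triple_prob computes exactly this is an assumption, and the margin gamma=6.0, the tiny 2-dimensional embeddings and the function names are all made up for illustration.

```python
# Sketch of RotatE-style scoring in plain Python; not Zincbase's code.
# gamma and the embeddings below are arbitrary illustrative values.
import cmath
import math

def rotate_score(head, phases, tail, gamma=6.0):
    """Return gamma minus the distance between the rotated head and the tail.

    head, tail: lists of complex numbers (entity embeddings).
    phases: angles in radians; each relation element is exp(i * phase),
            so it has modulus 1 and acts as a pure rotation.
    """
    distance = 0.0
    for h, phase, t in zip(head, phases, tail):
        rotated = h * cmath.exp(1j * phase)
        distance += abs(rotated - t)  # sum of complex moduli
    return gamma - distance

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

head = [1 + 0j, 0 + 1j]
quarter_turn = [math.pi / 2, math.pi / 2]
good_tail = [0 + 1j, -1 + 0j]  # exactly the head rotated by 90 degrees
bad_tail = [3 + 0j, 0 - 3j]    # far from the rotated head

p_good = sigmoid(rotate_score(head, quarter_turn, good_tail))
p_bad = sigmoid(rotate_score(head, quarter_turn, bad_tail))
```

Squashing the score through a sigmoid gives a value between 0 and 1, so a well-fitting triple ends up near 1 and a poorly fitting one drops below 0.5.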
Requirements
- Python 3
- Libraries from requirements.txt
- GPU preferable for large graphs but not required
Installation
pip install -r requirements.txt
Note: Requirements might differ for PyTorch depending on your system.
Web UI
Zincbase can serve live-updating force-directed graphs in 3D to a web browser. The command python -m zincbase.web will set up a static file server and a websocket server for live updates. Visit http://localhost:5000/ in your browser and you'll see the graph UI. As you build a graph in Python, you can visualize it (and changes to it) in real time through this UI.
Here are a couple of examples (source code here):
Complexity (Graph/Network) Examples
Two such examples are included for now: Conway's Game of Life and the Abelian Sandpile. Both are basic; we intend to include more soon, such as virus spread and neural nets that communicate. Here are some screencaps; source code is here. Performance can be very fast, depending on how you tweak Zincbase's recursion and propagation settings.
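If you haven't met the Abelian Sandpile before, the rule is short enough to sketch in plain Python. This is a standalone illustration on a 2D grid and does not use Zincbase's graph or propagation machinery; topple and THRESHOLD are names chosen for this example only.

```python
# Standalone Abelian sandpile on a grid; not the Zincbase example's code.
THRESHOLD = 4  # a cell with this many grains is unstable (one grain per neighbour)

def topple(grid):
    """Relax the sandpile in place and return the number of topplings.

    grid: list of lists of non-negative ints (grains per cell).
    An unstable cell sends one grain to each of its four neighbours;
    grains pushed past the edge of the grid are lost.
    """
    rows, cols = len(grid), len(grid[0])
    unstable = [(r, c) for r in range(rows) for c in range(cols)
                if grid[r][c] >= THRESHOLD]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < THRESHOLD:
            continue  # stale entry: this cell was already relaxed
        grid[r][c] -= THRESHOLD
        topplings += 1
        if grid[r][c] >= THRESHOLD:
            unstable.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                grid[nr][nc] += 1
                if grid[nr][nc] >= THRESHOLD:
                    unstable.append((nr, nc))
    return topplings

grid = [[0, 0, 0],
        [0, 4, 0],
        [0, 0, 0]]
count = topple(grid)
```

The "Abelian" part is that the final grid does not depend on the order in which unstable cells are toppled, which is why a simple work list is enough here.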
Required for the UI
- You should pip install zincbase[web] to get the optional web extra.
- You should have Redis running; by default, at localhost:6379. This is easily achievable, just do docker run -p 6379:6379 -d redis
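Before starting the UI, it can save some head-scratching to check that something is actually listening on the Redis port. The sketch below uses only the standard library; port_is_open is a hypothetical helper, and it only confirms that a TCP connection succeeds, not that the listener is really Redis.

```python
# Quick reachability check using only the standard library.
# It confirms a TCP listener exists; it does not speak the Redis protocol.
import socket

def port_is_open(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False

if __name__ == "__main__":
    print("Redis port reachable:", port_is_open())
```

If this prints False, start the Redis container first and try again.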
Testing
python test/test_main.py
python test/test_graph.py
... etc ... all the test files there
python -m doctest zincbase/zincbase.py
Validation
"Countries" and "FB15k" datasets are included in this repo.
There is a script to evaluate that ZincBase gets at least as good performance on the Countries dataset as the original (2019) RotatE paper. From the repo's root directory:
python examples/eval_countries_s3.py
It tests the hardest Countries task and prints out the AUC ROC, which should be ~ 0.95 to match the paper. It takes about 30 minutes to run on a modern GPU.
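As a reminder of what that number means: AUC ROC is the probability that a randomly chosen true triple gets a higher score than a randomly chosen false one. Here is a minimal pure-Python sketch of that definition; the labels and scores are invented toy data, and the evaluation script itself may well compute the metric differently (for example with a library routine).

```python
# AUC ROC from its pairwise definition; toy data, not the script's code.
def auc_roc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
auc = auc_roc(labels, scores)  # 5 of the 6 positive/negative pairs are ordered correctly
```

A value of 0.5 means the scores are no better than chance and 1.0 means every true triple outranks every false one, so ~0.95 indicates a nearly clean separation.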
There is also a script to evaluate performance on FB15k: python examples/fb15k_mrr.py
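Going by the script's name, it reports mean reciprocal rank (MRR), a standard link-prediction metric: for each test triple, rank the correct entity among all candidates and average 1/rank. A minimal sketch follows, assuming that reading of the name; the helper names and candidate scores are made up for illustration.

```python
# Mean reciprocal rank on toy data; helper names and scores are illustrative only.
def rank_of(correct, scored_candidates):
    """1-based position of `correct` when candidates are sorted by score, best first."""
    ordered = sorted(scored_candidates, key=scored_candidates.get, reverse=True)
    return ordered.index(correct) + 1

def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the correct entity for each test triple."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)

# Pretend model scores for the tail of (fiji, locatedin, ?).
candidates = {"melanesia": 0.96, "polynesia": 0.40, "africa": 0.02}
ranks = [rank_of(name, candidates) for name in ("melanesia", "polynesia", "africa")]
mrr = mean_reciprocal_rank(ranks)
```

MRR rewards putting the right answer at the very top: a rank of 1 contributes 1.0, while a rank of 10 contributes only 0.1.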
Running the web UI
There are a couple of extra requirements -- install with pip3 install zincbase[web].
You also need an accessible Redis instance somewhere. This one-liner will get it running locally: docker run -p 6379:6379 -d redis (requires Docker, of course).
You then need a Zincbase server instance running; as described under Web UI, python -m zincbase.web starts one.
Building documentation
From docs/ dir: make html
If something changed a lot: sphinx-apidoc -o . ..
Pushing to pypi
NOTE: This is now all automatic via CircleCI, but here are the manual steps for reference:
- Edit setup.py as appropriate (probably not necessary)
- Edit the version in zincbase/__init__.py
- From the top project directory, python setup.py sdist bdist_wheel --universal
- twine upload dist/*
TODO
- Query all edges by attribute
- to_csv method
- utilize postgres as backend triple store
- The to_csv/from_csv methods do not yet support node attributes.
- Reinforcement learning for graph traversal.
References & Acknowledgements
L334: Computational Syntax and Semantics -- Introduction to Prolog, Steve Harlow
Open Book Project: Prolog in Python, Chris Meyers
Prolog Interpreter in Javascript
Citing
If you use this software, please consider citing:
@software{zincbase,
author = {{Tom Grek}},
title = {ZincBase: A state of the art knowledge base},
url = {https://github.com/tomgrek/zincbase},
version = {0.1.1},
date = {2019-05-12}
}
Contributing
See CONTRIBUTING. And please do!