Web application for building, clustering, and visualizing knowledge graphs from scientific text.
SciKGraph
A web application that helps researchers explore and understand scientific fields by building, clustering, and visualizing knowledge graphs from corpora of scientific text.
Documents are semantically annotated through the Babelfy API, the resulting concepts are connected by co-occurrence, and the graph is clustered with the OClustR algorithm. Visualization happens entirely in the browser via Cytoscape.js.
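The co-occurrence step can be pictured with a minimal sketch. It assumes that annotation has already produced, for each document, a list of sentences, each given as a list of concept identifiers, and that two concepts co-occur when they appear in the same sentence. The function name, the input shape, and the sentence-level window are assumptions made for illustration; they are not taken from SciKGraph.py.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(documents):
    """Build vertex and edge weights from annotated documents.

    `documents` is a list of documents; each document is a list of
    sentences; each sentence is a list of concept identifiers.
    (Hypothetical input shape, chosen for this sketch.)
    """
    vertex_weights = Counter()  # concept -> number of sentences containing it
    edge_weights = Counter()    # (concept_a, concept_b) -> co-occurrence count

    for sentences in documents:
        for concepts in sentences:
            # Count each concept once per sentence and order the pair keys
            # so that (a, b) and (b, a) map to the same undirected edge.
            unique = sorted(set(concepts))
            vertex_weights.update(unique)
            for a, b in combinations(unique, 2):
                edge_weights[(a, b)] += 1

    return vertex_weights, edge_weights
```

Calling `build_cooccurrence_graph([[["a", "b", "c"], ["a", "b"]]])` counts the pair `("a", "b")` twice and every other pair once, which is the kind of weighted graph a clustering step could then consume.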
Requirements
- Python 3.10+
- Linux x86-64, required by the bundled static/onmi binary used for cluster comparison (see Native dependency: oNMI)
- A free Babelfy API key (used during graph construction)
Installation
From PyPI
python -m venv scikgraph_env
source scikgraph_env/bin/activate
pip install scikgraph
From source
git clone https://github.com/your-org/SciKGraph.git
cd SciKGraph
python -m venv scikgraph_env
source scikgraph_env/bin/activate
pip install -e .
pip install -e . installs the package in editable mode from the metadata
in pyproject.toml, together with the dependencies declared there, so a
separate pip install -r requirements.txt is not needed.
Running the application
After either install path, with the virtualenv activated, run:
scikgraph
This starts the app on http://localhost:8080/. Use --host and --port
to override the defaults, e.g. scikgraph --port 9000.
Equivalent commands, if you prefer them:
python -m scikgraph
waitress-serve --call 'scikgraph:create_app'
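For readers who want to start the server from their own Python script, the following is a rough sketch of what such a launcher could look like. Only scikgraph:create_app, the --host and --port options, and port 8080 come from the text above; the function names build_parser and main, and the default host value "localhost", are assumptions for illustration and may differ from the packaged entry point.

```python
import argparse


def build_parser():
    """Parse the two documented options; the defaults are assumed."""
    parser = argparse.ArgumentParser(prog="scikgraph")
    parser.add_argument("--host", default="localhost",
                        help="interface to bind (assumed default)")
    parser.add_argument("--port", type=int, default=8080,
                        help="TCP port to listen on")
    return parser


def main(argv=None):
    """Serve the app with waitress using the parsed host and port."""
    args = build_parser().parse_args(argv)
    # Imported here so the parser can be exercised without the
    # application or waitress being installed.
    from waitress import serve
    from scikgraph import create_app

    serve(create_app(), host=args.host, port=args.port)
```

With this sketch, `main(["--port", "9000"])` would serve the application on port 9000, mirroring scikgraph --port 9000.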
Using the web app
The app is organised around three pages:
- /create: upload .txt documents, paste your Babelfy key, and click Construct Graph. Once a graph exists you can iterate via Pre-process (vertex/edge thresholds) and Cluster Graph (OClustR).
- /analyze: modularity, key-concept and key-phrase extraction, cluster reduction, and the cluster-relation graph view.
- /evolution: load two saved .sckg sessions to compute their cover similarity (oNMI) and visualise overlapping clusters.
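To give a feel for what vertex and edge thresholds do in a pre-processing step, here is a small sketch. It assumes the graph is held as two dictionaries of weights and that a threshold is a minimum weight, kept inclusively; the function name, parameter names, and data layout are hypothetical and not taken from the application.

```python
def preprocess(vertex_weights, edge_weights, min_vertex=2, min_edge=2):
    """Drop light vertices and edges (assumed semantics: keep weight >= threshold)."""
    kept_vertices = {
        v: w for v, w in vertex_weights.items() if w >= min_vertex
    }
    kept_edges = {
        (a, b): w
        for (a, b), w in edge_weights.items()
        # An edge survives only if it is heavy enough and both of its
        # endpoints survived the vertex threshold.
        if w >= min_edge and a in kept_vertices and b in kept_vertices
    }
    return kept_vertices, kept_edges
```

Raising either threshold shrinks the graph before clustering. Note that a heavy edge is still removed when one of its endpoints falls below the vertex threshold.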
Visualization settings
- Layout: choose one of spring, cose, concentric, circle, grid.
- Render graph: turn the visualization on or off.
- Update visualization setting: apply the current settings.
The Construct Graph form has its own Render after submit checkbox that controls whether the visualization is updated as part of that specific action.
Native dependency: oNMI
static/onmi is a precompiled Linux x86-64 build of
Aaron McDaid's Overlapping NMI tool,
used by the Covers Similarity feature on the Track Evolution page to compare
two cluster covers.
- It is not a Python package and cannot be installed via pip.
- It only runs on Linux x86-64. On other platforms, recompile the binary from the upstream source and replace the file; otherwise the Covers Similarity button will fail.
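A quick way to check in advance whether the bundled binary is likely to run on the current machine is to inspect the platform from Python. The helper below is a sketch only; its name is hypothetical, and it checks the platform strings rather than trying to execute static/onmi.

```python
import platform


def onmi_binary_supported(system=None, machine=None):
    """Return True when the platform looks like Linux x86-64.

    `system` and `machine` default to the values reported by the
    standard `platform` module; they can be passed explicitly for testing.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # Linux usually reports "x86_64"; "amd64" is accepted as an alias.
    return system == "Linux" and machine.lower() in {"x86_64", "amd64"}
```

If the function returns False, the recompilation step described above would be needed before using Covers Similarity.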
Architecture
- Backend: Flask application factory in __init__.py.
- Graph construction: SciKGraph.py handles text parsing, Babelfy disambiguation, co-occurrence graph building, and saving/opening .sckg files.
- Clustering: OClustR.py implements the OClustR overlapping clustering algorithm.
- Analyses: Analyses.py provides modularity, key-phrase extraction (NLTK), and cover comparison (oNMI).
- Frontend: vanilla JS in static/main.js and static/render-panel.js; the per-render assets static/networks.js and static/styles.js are regenerated on every render and should not be hand-edited.
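As an illustration of the modularity analysis listed above, the sketch below computes the standard Newman modularity of a weighted, undirected graph for a disjoint partition. This is a simplification: OClustR produces overlapping clusters, which need an extended definition that is not shown here, and the function and its data layout are assumptions rather than the code in Analyses.py.

```python
def modularity(edges, communities):
    """Newman modularity Q for a disjoint partition.

    `edges` maps (node_a, node_b) to a positive weight, one entry per
    undirected edge; `communities` is a list of disjoint node sets that
    together cover every node appearing in `edges`.
    """
    membership = {
        node: idx for idx, comm in enumerate(communities) for node in comm
    }
    total = sum(edges.values())          # m: total edge weight
    intra = [0.0] * len(communities)     # weight of edges inside each community
    degree = [0.0] * len(communities)    # summed weighted degree per community

    for (a, b), w in edges.items():
        degree[membership[a]] += w
        degree[membership[b]] += w
        if membership[a] == membership[b]:
            intra[membership[a]] += w

    # Q = sum over communities of (L_c / m) - (d_c / 2m)^2
    return sum(i / total - (d / (2 * total)) ** 2
               for i, d in zip(intra, degree))
```

For two triangles joined by a single edge, splitting the graph into the two triangles gives Q = 5/14, while putting every node in one community gives Q = 0.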
License
This project (the SciKGraph code, templates, and documentation) is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0). You may share it with attribution for non-commercial purposes only and may not distribute modified versions. See https://creativecommons.org/licenses/by-nc-nd/4.0/ for the full terms.
Third-Party Notices
This repository bundles the following third-party libraries inside
static/vendor.js:
- Cytoscape.js 2.4.6 — Copyright © The Cytoscape Consortium. Licensed under the MIT License.
- jQuery — Copyright © JS Foundation and other contributors. Licensed under the MIT License.
The MIT License requires that its copyright and permission notice be included
when the software is redistributed. These notices are reproduced via the references
above and apply to the bundled minified code in static/vendor.js.
static/onmi is a precompiled binary of
Overlapping NMI by
Aaron McDaid, distributed under its upstream license. It is invoked as a
separate process and not linked into SciKGraph at the source level.
Citing SciKGraph
If you use SciKGraph in academic work, please cite both:
@article{scikgraphApp-Tosi,
author = {Mauro Dalle Lucca Tosi and Julio Cesar dos Reis},
title = {Understanding the evolution of a scientific field by clustering and visualizing knowledge graphs},
journal = {Journal of Information Science},
year = {2020},
doi = {10.1177/0165551520937915},
url = {https://doi.org/10.1177/0165551520937915}
}
@article{tosi2021scikgraph,
title = {SciKGraph: A knowledge graph approach to structure a scientific field},
author = {Tosi, Mauro Dalle Lucca and dos Reis, Julio Cesar},
journal = {Journal of Informetrics},
volume = {15},
number = {1},
pages = {101109},
year = {2021},
publisher = {Elsevier},
doi = {10.1016/j.joi.2020.101109},
url = {https://doi.org/10.1016/j.joi.2020.101109}
}