Web application for building, clustering, and visualizing knowledge graphs from scientific text.

SciKGraph

A web application that helps researchers explore and understand scientific fields by building, clustering, and visualizing knowledge graphs from corpora of scientific text.

Documents are semantically annotated through the Babelfy API, the resulting concepts are connected by co-occurrence, and the graph is clustered with the OClustR algorithm. Visualization happens entirely in the browser via Cytoscape.js.
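The co-occurrence step can be illustrated with a minimal sketch. It assumes each document has already been reduced to a list of concept identifiers (the role Babelfy plays in SciKGraph) and that two concepts are linked whenever they appear in the same document. The co-occurrence window SciKGraph actually uses is not stated here, and the names `build_cooccurrence_graph` and `docs` are hypothetical, not part of the package's API.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Return (vertex_weights, edge_weights) for a list of concept lists.

    Assumption: concepts co-occur when they share a document. A vertex weight
    counts the documents containing a concept; an edge weight counts the
    documents containing both endpoints.
    """
    vertex_weights = Counter()
    edge_weights = Counter()
    for concepts in docs:
        unique = sorted(set(concepts))  # ignore repeats within one document
        vertex_weights.update(unique)
        # Sorting makes each pair canonical, so (a, b) and (b, a) are merged.
        edge_weights.update(combinations(unique, 2))
    return vertex_weights, edge_weights


# Toy input standing in for Babelfy-annotated documents.
docs = [
    ["graph", "cluster", "ontology"],
    ["graph", "cluster"],
    ["ontology", "graph"],
]
vertices, edges = build_cooccurrence_graph(docs)
```

The resulting weights are the kind of input that vertex and edge thresholds or a clustering algorithm such as OClustR would then operate on.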

Requirements

Installation

From PyPI

python -m venv scikgraph_env
source scikgraph_env/bin/activate
pip install scikgraph

From source

git clone https://github.com/your-org/SciKGraph.git
cd SciKGraph
python -m venv scikgraph_env
source scikgraph_env/bin/activate
pip install -e .

pip install -e . installs the package in editable mode using the metadata in pyproject.toml and the dependencies declared there (no need for a separate pip install -r requirements.txt).

Running the application

After either install path, with the virtualenv activated, run:

scikgraph

This starts the app on http://localhost:8080/. Use --host and --port to override the defaults, e.g. scikgraph --port 9000.

Equivalent commands, if you prefer them:

python -m scikgraph
waitress-serve --call 'scikgraph:create_app'
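The waitress command above can also be approximated from Python. This is a sketch under assumptions: it relies on `scikgraph.create_app` being callable without arguments (which `--call` implies), and the wrapper name `serve_scikgraph` and the default host `127.0.0.1` are choices made for this example rather than anything the package defines.

```python
def serve_scikgraph(host="127.0.0.1", port=8080):
    """Serve SciKGraph with waitress; blocks until the process is interrupted."""
    # Imported lazily so this function can be defined without either package.
    from waitress import serve
    from scikgraph import create_app  # factory named in the command above

    # Assumption: create_app() needs no arguments, as `--call` suggests.
    serve(create_app(), host=host, port=port)
```

Calling `serve_scikgraph(port=9000)` inside the activated virtualenv would then mirror `scikgraph --port 9000`.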

Using the web app

The app is organised around three pages:

  • /create — upload .txt documents, paste your Babelfy key, and click Construct Graph. Once a graph exists you can iterate via Pre-process (vertex/edge thresholds) and Cluster Graph (OClustR).
  • /analyze — modularity, key-concept and key-phrase extraction, cluster reduction, and the cluster-relation graph view.
  • /evolution — load two saved .sckg sessions to compute their cover similarity (oNMI) and visualise overlapping clusters.
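To make the Pre-process step more concrete, here is a minimal sketch of threshold filtering on a weighted graph. It assumes the vertex and edge thresholds are minimum weights and that an edge is dropped when either endpoint is removed; the exact semantics in SciKGraph may differ, and `apply_thresholds` is a hypothetical helper, not a function of the package.

```python
def apply_thresholds(vertex_weights, edge_weights, min_vertex, min_edge):
    """Keep vertices and edges whose weight reaches the given minimums."""
    kept_vertices = {v: w for v, w in vertex_weights.items() if w >= min_vertex}
    kept_edges = {
        (a, b): w
        for (a, b), w in edge_weights.items()
        # An edge survives only if it is heavy enough and both ends remain.
        if w >= min_edge and a in kept_vertices and b in kept_vertices
    }
    return kept_vertices, kept_edges


# Toy weights: "ontology" falls below the vertex threshold of 2.
vertex_weights = {"graph": 3, "cluster": 2, "ontology": 1}
edge_weights = {("cluster", "graph"): 2, ("graph", "ontology"): 1}
vertices, edges = apply_thresholds(
    vertex_weights, edge_weights, min_vertex=2, min_edge=2
)
```

Raising either threshold shrinks the graph before clustering, which is why the page lets you iterate on these values once a graph exists.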

Visualization settings

  • Layout — choose one of: spring, cose, concentric, circle, grid.
  • Render graph — turn the visualization on or off.
  • Update visualization setting — apply the current settings.

The Construct Graph form has its own Render after submit checkbox that controls whether the visualization is updated as part of that specific action.

Native dependency: oNMI

static/onmi is a precompiled Linux x86-64 build of Aaron McDaid's Overlapping NMI tool, used by the Covers Similarity feature on the Track Evolution page to compare two cluster covers.

  • It is not a Python package and cannot be installed via pip.
  • It only runs on Linux x86-64. On other platforms, recompile the binary from the upstream source and replace the file; otherwise the Covers Similarity button will fail.
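A quick way to tell whether the bundled binary matches your machine is to inspect the platform from Python. The sketch below uses only the standard library; the helper name `onmi_binary_supported` is hypothetical, and treating `amd64` as equivalent to `x86_64` is an assumption about how some systems report the architecture.

```python
import platform


def onmi_binary_supported(system=None, machine=None):
    """Return True if the platform looks like Linux x86-64.

    Arguments default to the current machine; pass explicit values to
    check another platform.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Linux" and machine.lower() in {"x86_64", "amd64"}
```

If `onmi_binary_supported()` returns `False`, the precompiled static/onmi will not run and needs to be rebuilt from the upstream source as described above.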

Architecture

License

This project (the SciKGraph code, templates, and documentation) is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0). You may share it with attribution for non-commercial purposes only and may not distribute modified versions. See https://creativecommons.org/licenses/by-nc-nd/4.0/ for the full terms.

Third-Party Notices

This repository bundles the following third-party libraries inside static/vendor.js:

  • Cytoscape.js 2.4.6 — Copyright © The Cytoscape Consortium. Licensed under the MIT License.
  • jQuery — Copyright © JS Foundation and other contributors. Licensed under the MIT License.

The MIT License requires that its copyright and permission notices be included whenever the software is redistributed. The notices listed above serve that purpose and apply to the bundled minified code in static/vendor.js.

static/onmi is a precompiled binary of Overlapping NMI by Aaron McDaid, distributed under its upstream license. It is invoked as a separate process and not linked into SciKGraph at the source level.

Citing SciKGraph

If you use SciKGraph in academic work, please cite both:

@article{scikgraphApp-Tosi,
  author  = {Mauro Dalle Lucca Tosi and Julio Cesar dos Reis},
  title   = {Understanding the evolution of a scientific field by clustering and visualizing knowledge graphs},
  journal = {Journal of Information Science},
  year    = {2020},
  doi     = {10.1177/0165551520937915},
  url     = {https://doi.org/10.1177/0165551520937915}
}

@article{tosi2021scikgraph,
  title     = {SciKGraph: A knowledge graph approach to structure a scientific field},
  author    = {Tosi, Mauro Dalle Lucca and dos Reis, Julio Cesar},
  journal   = {Journal of Informetrics},
  volume    = {15},
  number    = {1},
  pages     = {101109},
  year      = {2021},
  publisher = {Elsevier},
  doi       = {10.1016/j.joi.2020.101109},
  url       = {https://doi.org/10.1016/j.joi.2020.101109}
}
