Web application for building, clustering, and visualizing knowledge graphs from scientific text.

SciKGraph

A web application that helps researchers explore and understand scientific fields by building, clustering, and visualizing knowledge graphs from corpora of scientific text.

Documents are semantically annotated through the Babelfy API, the resulting concepts are connected by co-occurrence, and the graph is clustered with the OClustR algorithm. Visualization happens entirely in the browser via Cytoscape.js.
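The co-occurrence step can be pictured with a small standalone sketch. It is not SciKGraph's own code: the function name `cooccurrence_graph`, the input shape (one list of concept identifiers per document), and the choice to count co-occurrence once per document are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(annotated_docs):
    """Build weighted vertices and edges from per-document concept lists.

    annotated_docs: iterable of concept-ID lists, one list per document
    (a hypothetical stand-in for the annotation output).
    """
    vertex_weight = Counter()  # concept -> number of documents mentioning it
    edge_weight = Counter()    # (concept_a, concept_b) -> number of shared documents
    for concepts in annotated_docs:
        unique = sorted(set(concepts))  # count each concept once per document
        vertex_weight.update(unique)
        # every unordered pair of concepts in the same document co-occurs once
        edge_weight.update(combinations(unique, 2))
    return vertex_weight, edge_weight


docs = [
    ["graph", "cluster", "ontology"],
    ["graph", "cluster"],
    ["ontology", "graph"],
]
vertices, edges = cooccurrence_graph(docs)
print(vertices["graph"])            # 3
print(edges[("cluster", "graph")])  # 2
```

Sorting the concepts before pairing keeps each edge key in a single canonical order, so `("cluster", "graph")` and `("graph", "cluster")` never end up as separate edges.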

Requirements

Installation

From PyPI

python -m venv scikgraph_env
source scikgraph_env/bin/activate
pip install scikgraph

From source

git clone https://github.com/your-org/SciKGraph.git
cd SciKGraph
python -m venv scikgraph_env
source scikgraph_env/bin/activate
pip install -e .

pip install -e . installs the package in editable mode using the metadata in pyproject.toml and the dependencies declared there (no need for a separate pip install -r requirements.txt).

Running the application

After either install path, with the virtualenv activated, run:

scikgraph

This starts the app on http://localhost:8080/. Use --host and --port to override the defaults, e.g. scikgraph --port 9000.

Equivalent commands, if you prefer them:

python -m scikgraph
waitress-serve --call 'scikgraph:create_app'
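The same thing can probably be done from Python, which may be convenient when embedding the app in another script. The sketch below assumes only what the waitress-serve command above implies, namely that scikgraph exposes a create_app factory callable without arguments. The wrapper name serve_scikgraph and its defaults are hypothetical.

```python
def serve_scikgraph(host="127.0.0.1", port=8080):
    """Serve the app with waitress on the given host and port."""
    # Imported lazily so the function can be defined before the packages are installed.
    from waitress import serve
    from scikgraph import create_app  # factory named in the waitress-serve command

    # create_app() is called with no arguments, as waitress-serve --call does.
    serve(create_app(), host=host, port=port)
```

Calling serve_scikgraph(port=9000) should then block and serve on http://127.0.0.1:9000/ until interrupted.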

Using the web app

The app is organized around three pages:

  • /create — upload .txt documents, paste your Babelfy key, and click Construct Graph. Once a graph exists you can iterate via Pre-process (vertex/edge thresholds) and Cluster Graph (OClustR).
  • /analyze — modularity, key-concept and key-phrase extraction, cluster reduction, and the cluster-relation graph view.
  • /evolution — load two saved .sckg sessions to compute their cover similarity (oNMI) and visualize overlapping clusters.
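As a rough illustration of what the vertex/edge thresholds in Pre-process might do, the sketch below filters a weighted graph held in plain dictionaries. The function name apply_thresholds, the parameter names, and the "keep if weight is at least the threshold" rule are assumptions, not the app's actual implementation.

```python
def apply_thresholds(vertex_weight, edge_weight, min_vertex=1, min_edge=1):
    """Drop light vertices, then drop edges that are light or left dangling."""
    kept_vertices = {v: w for v, w in vertex_weight.items() if w >= min_vertex}
    kept_edges = {
        (a, b): w
        for (a, b), w in edge_weight.items()
        # an edge survives only if it is heavy enough and both endpoints survived
        if w >= min_edge and a in kept_vertices and b in kept_vertices
    }
    return kept_vertices, kept_edges


vertices = {"graph": 3, "cluster": 2, "ontology": 1}
edges = {("cluster", "graph"): 2, ("graph", "ontology"): 1}
v, e = apply_thresholds(vertices, edges, min_vertex=2, min_edge=2)
print(sorted(v))  # ['cluster', 'graph']
print(e)          # {('cluster', 'graph'): 2}
```

Raising either threshold shrinks the graph before clustering, which is the usual reason to iterate on these values.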

Visualization settings

  • Layout — choose one of: spring, cose, concentric, circle, grid.
  • Render graph — turn the visualization on or off.
  • Update visualization setting — apply the current settings.

The Construct Graph form has its own Render after submit checkbox that controls whether the visualization is updated as part of that specific action.

Native dependency: oNMI

static/onmi is a precompiled Linux x86-64 build of Aaron McDaid's Overlapping NMI tool, used by the Covers Similarity feature on the Track Evolution page to compare two cluster covers.

  • It is not a Python package and cannot be installed via pip.
  • It only runs on Linux x86-64. On other platforms, recompile the binary from the upstream source and replace the file; otherwise the Covers Similarity button will fail.
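To check whether the bundled binary can run on a given machine, or to try it by hand, something like the following sketch may help. It is not how SciKGraph itself calls the tool. The helper names are hypothetical, and both the cover file format (one cluster per line, members separated by spaces) and the two-file command line are assumptions based on the upstream tool's usual usage.

```python
import platform
import subprocess
from pathlib import Path


def onmi_supported(system=None, machine=None):
    """True when the bundled Linux x86-64 binary can run on this platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Linux" and machine in ("x86_64", "AMD64")


def write_cover(clusters, path):
    """Write one cluster per line, members separated by spaces (assumed format)."""
    lines = [" ".join(str(node) for node in cluster) for cluster in clusters]
    Path(path).write_text("\n".join(lines) + "\n")


def compare_covers(binary, cover_a, cover_b):
    """Run the binary on two cover files and return its raw text output."""
    if not onmi_supported():
        raise RuntimeError("bundled onmi binary requires Linux x86-64; recompile it")
    # Assumed invocation: the binary followed by the two cover files.
    result = subprocess.run(
        [str(binary), str(cover_a), str(cover_b)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, after writing two covers with write_cover, compare_covers("static/onmi", "a.txt", "b.txt") would return whatever the tool prints. Parsing that output is left out because its exact layout is not described here.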

Architecture

License

This project (the SciKGraph code, templates, and documentation) is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0). You may share it with attribution for non-commercial purposes only and may not distribute modified versions. See https://creativecommons.org/licenses/by-nc-nd/4.0/ for the full terms.

Third-Party Notices

This repository bundles the following third-party libraries inside static/vendor.js:

  • Cytoscape.js 2.4.6 — Copyright © The Cytoscape Consortium. Licensed under the MIT License.
  • jQuery — Copyright © JS Foundation and other contributors. Licensed under the MIT License.

The MIT License requires that its copyright and permission notices be included when the software is redistributed. These notices are reproduced via the references above and apply to the bundled minified code in static/vendor.js.

static/onmi is a precompiled binary of Overlapping NMI by Aaron McDaid, distributed under its upstream license. It is invoked as a separate process and not linked into SciKGraph at the source level.

Citing SciKGraph

If you use SciKGraph in academic work, please cite both:

@article{scikgraphApp-Tosi,
  author  = {Mauro Dalle Lucca Tosi and Julio Cesar dos Reis},
  title   = {Understanding the evolution of a scientific field by clustering and visualizing knowledge graphs},
  journal = {Journal of Information Science},
  year    = {2020},
  doi     = {10.1177/0165551520937915},
  url     = {https://doi.org/10.1177/0165551520937915}
}

@article{tosi2021scikgraph,
  title     = {SciKGraph: A knowledge graph approach to structure a scientific field},
  author    = {Tosi, Mauro Dalle Lucca and dos Reis, Julio Cesar},
  journal   = {Journal of Informetrics},
  volume    = {15},
  number    = {1},
  pages     = {101109},
  year      = {2021},
  publisher = {Elsevier},
  doi       = {10.1016/j.joi.2020.101109},
  url       = {https://doi.org/10.1016/j.joi.2020.101109}
}
