Python library for machine learning on graphs

StellarGraph Machine Learning Library

Introduction

StellarGraph is a Python library for machine learning on graph-structured (or equivalently, network-structured) data.

Graph-structured data represent entities, e.g., people, as nodes (or equivalently, vertices), and relationships between entities, e.g., friendship, as links (or equivalently, edges). Nodes and links may have associated attributes, such as age, income, or the time when a friendship was established. StellarGraph supports analysis of both homogeneous networks (with nodes and links of one type) and heterogeneous networks (with more than one type of nodes and/or links).
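To make the terminology concrete, here is a minimal sketch in plain Python of a small graph with typed, attributed nodes and links, plus a check for whether it is homogeneous. The data layout and the `is_homogeneous` helper are illustrative assumptions, not part of the StellarGraph API.

```python
# Hypothetical toy data: the layout below is illustrative, not a StellarGraph format.
nodes = {
    "alice": {"type": "person", "age": 34},
    "bob": {"type": "person", "age": 29},
    "acme": {"type": "company", "employees": 120},
}

# Each link is (source, target, attributes).
links = [
    ("alice", "bob", {"type": "friendship", "since": 2015}),
    ("alice", "acme", {"type": "employment", "since": 2018}),
]


def is_homogeneous(nodes, links):
    """Return True when there is exactly one node type and at most one link type."""
    node_types = {attrs["type"] for attrs in nodes.values()}
    link_types = {attrs["type"] for _, _, attrs in links}
    return len(node_types) == 1 and len(link_types) <= 1


print(is_homogeneous(nodes, links))  # False: two node types and two link types

# Keeping only people and friendships yields a homogeneous network.
people_only = {name: attrs for name, attrs in nodes.items() if attrs["type"] == "person"}
friendships = [link for link in links if link[2]["type"] == "friendship"]
print(is_homogeneous(people_only, friendships))  # True
```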

The StellarGraph library implements several state-of-the-art algorithms for applying machine learning methods to discover patterns and answer questions using graph-structured data.

The StellarGraph library can be used to solve tasks using graph-structured data, such as:

  • Representation learning for nodes and edges, to be used for visualisation and various downstream machine learning tasks;
  • Classification and attribute inference of nodes or edges;
  • Link prediction;
  • Interpretation of node classification through calculated importances of edges and neighbours for selected nodes [7].

We provide examples of using StellarGraph to solve such tasks using several real-world datasets.

Guiding Principles

StellarGraph uses the Keras library and adheres to the same guiding principles as Keras: user-friendliness, modularity, and easy extensibility. Modules and layers of the StellarGraph library are designed so that they can be used together with standard Keras layers and modules, if required. This gives flexibility in using existing models and workflows, or creating new ones, for machine learning on graphs.

Getting Started

To get started with StellarGraph you'll need data structured as a homogeneous or heterogeneous graph, including attributes for the entities represented as graph nodes. NetworkX is used to represent the graph, and Pandas or NumPy are used to store node attributes.
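As a rough illustration of that input format, the sketch below builds a toy homogeneous graph with NetworkX and a matching table of node attributes with Pandas. The node IDs and attribute values are made up, and the step of handing these objects to StellarGraph is deliberately left out, since its exact form depends on the library version.

```python
import networkx as nx
import pandas as pd

# Toy homogeneous graph: three nodes of one type joined by two links.
graph = nx.Graph()
graph.add_nodes_from(["a", "b", "c"])
graph.add_edges_from([("a", "b"), ("b", "c")])

# One row of numeric attributes per node, indexed by node ID (values are made up).
node_features = pd.DataFrame(
    {"age": [34.0, 29.0, 41.0], "income": [52.0, 48.0, 75.0]},
    index=["a", "b", "c"],
)

# Sanity check: every node in the graph has a feature row, and vice versa.
assert set(graph.nodes()) == set(node_features.index)

# Features as a NumPy array, ordered the same way as the graph's nodes.
feature_matrix = node_features.loc[list(graph.nodes())].to_numpy()
print(feature_matrix.shape)  # (3, 2)
```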

Detailed and narrated examples of various machine learning workflows on network data, supported by StellarGraph, from data ingestion into graph structure to inference, are given in the demos directory of this repository.

Installation

StellarGraph is a Python 3 library and we recommend using Python version 3.6.*. The required Python version can be downloaded and installed from python.org. Alternatively, use the Anaconda Python environment, available from anaconda.com.

Note: while the library works on Python 3.7, it is based on Keras, which does not officially support Python 3.7. Therefore, there may be unforeseen bugs, and the Python libraries that StellarGraph depends upon emit many warnings.
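The version guidance above can be checked before installing. The following is a minimal sketch using only the standard library; the `describe_python` helper and its three labels are hypothetical names chosen for illustration.

```python
import sys

RECOMMENDED = (3, 6)  # the version recommended in the text above


def describe_python(version_info=sys.version_info):
    """Classify an interpreter version against the recommendation (illustrative helper)."""
    major, minor = version_info[0], version_info[1]
    if major < 3:
        return "unsupported"  # StellarGraph is a Python 3 library
    if (major, minor) == RECOMMENDED:
        return "recommended"
    return "not recommended"  # may work, but expect warnings or unexpected bugs


print(describe_python())  # result depends on the running interpreter
print(describe_python((3, 6, 9)))  # recommended
print(describe_python((2, 7, 18)))  # unsupported
```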

The StellarGraph library can be installed in one of two ways, described next.

Install StellarGraph using pip:

To install the StellarGraph library from PyPI using pip, execute the following command:

pip install stellargraph

Some of the examples in the demos directory require additional dependencies. To install these together with StellarGraph using pip, execute the following command:

pip install stellargraph[demos]

Install StellarGraph from GitHub source:

First, clone the StellarGraph repository using git:

git clone https://github.com/stellargraph/stellargraph.git

Then, cd to the StellarGraph folder, and install the library by executing the following commands:

cd stellargraph
pip install .

To also install the additional dependencies required by some of the examples in the demos directory, execute the following command from the same folder:

pip install .[demos]

Docker Image

Images can be pulled via docker pull stellargraph/stellargraph

Running the examples

See the README in the demos directory for more information about the examples and how to run them.

Algorithms

The StellarGraph library currently includes the following algorithms for graph machine learning:

  • GraphSAGE [1]

    • Supports supervised as well as unsupervised representation learning, node classification/regression, and link prediction for homogeneous networks. The current implementation supports multiple aggregation methods, including mean, maxpool, meanpool, and attentional aggregators.
  • HinSAGE

    • Extension of GraphSAGE algorithm to heterogeneous networks. Supports representation learning, node classification/regression, and link prediction/regression for heterogeneous graphs. The current implementation supports mean aggregation of neighbour nodes, taking into account their types and the types of links between them.
  • The Graph ATtention Network (GAT) [4]

    • The GAT algorithm supports representation learning and node classification for homogeneous graphs. There are versions of the graph attention layer that support both sparse and dense adjacency matrices.
  • Graph Convolutional Network (GCN) [5]

    • The GCN algorithm supports representation learning and node classification for homogeneous graphs. There are versions of the graph convolutional layer that support both sparse and dense adjacency matrices.
  • Simplified Graph Convolutional network (SGC) [6]

    • The SGC network algorithm supports representation learning and node classification for homogeneous graphs. It is an extension of the GCN algorithm that smooths the graph to bring in more distant neighbours of nodes without using multiple layers.
  • Node2Vec [2]

    • The Node2Vec and DeepWalk algorithms perform unsupervised representation learning for homogeneous networks, taking into account network structure while ignoring node attributes. The node2vec algorithm is implemented by combining StellarGraph's random walk generator with the word2vec algorithm from Gensim. Learned node representations can be used in downstream machine learning models implemented using Scikit-learn, Keras, TensorFlow or any other Python machine learning library.
  • Metapath2Vec [3]

    • The metapath2vec algorithm performs unsupervised, metapath-guided representation learning for heterogeneous networks, taking into account network structure while ignoring node attributes. The implementation combines StellarGraph's metapath-guided random walk generator with Gensim's word2vec algorithm. As with node2vec, the learned node representations (node embeddings) can be used in downstream machine learning models to solve tasks such as node classification and link prediction for heterogeneous networks.
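The random-walk idea behind the last two entries can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not StellarGraph's random walk generator: the function names, the adjacency-dictionary layout, and the toy author/paper graph are all made up for the example. The uniform walk corresponds to the DeepWalk setting (node2vec additionally biases each step), and the metapath-guided walk only steps to neighbours of the node type that comes next in the metapath.

```python
import random


def uniform_random_walk(adjacency, start, length, rng):
    """Walk up to `length` nodes, choosing each next node uniformly at random."""
    walk = [start]
    while len(walk) < length:
        neighbours = adjacency.get(walk[-1], [])
        if not neighbours:
            break  # dead end: stop early
        walk.append(rng.choice(neighbours))
    return walk


def metapath_random_walk(adjacency, node_types, start, metapath, length, rng):
    """Like the uniform walk, but each step must land on the next type in `metapath`.

    `metapath` starts and ends with the same node type, e.g.
    ["author", "paper", "author"], so it can be repeated cyclically.
    """
    if node_types[start] != metapath[0]:
        raise ValueError("start node type must match the first metapath type")
    cycle = metapath[:-1]  # drop the repeated final type
    walk = [start]
    while len(walk) < length:
        wanted = cycle[len(walk) % len(cycle)]
        candidates = [n for n in adjacency.get(walk[-1], []) if node_types[n] == wanted]
        if not candidates:
            break  # no neighbour of the required type
        walk.append(rng.choice(candidates))
    return walk


# Hypothetical toy graph: two authors and two papers.
adjacency = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2"],
    "p2": ["a2"],
}
node_types = {"a1": "author", "a2": "author", "p1": "paper", "p2": "paper"}

rng = random.Random(0)  # fixed seed so repeated runs give the same walks
plain_walk = uniform_random_walk(adjacency, "a1", 4, rng)
guided_walk = metapath_random_walk(
    adjacency, node_types, "a1", ["author", "paper", "author"], 5, rng
)
print(plain_walk)
print([node_types[n] for n in guided_walk])
# ['author', 'paper', 'author', 'paper', 'author']
```

In a full pipeline, many such walks would be collected and passed as "sentences" to a word2vec implementation to obtain node embeddings; that step is omitted here.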

Getting Help

Documentation for StellarGraph can be found here.

Discourse Community

Feel free to ask questions and discuss problems on the StellarGraph Discourse forum.

CI

Buildkite integration

The pipeline is defined in .buildkite/pipeline.yml

Docker images

  • Tests: Uses the official python:3.6 image.
  • Style: Uses black from the stellargraph docker hub organisation.

Citing

StellarGraph is designed, developed and supported by CSIRO's Data61. If you use any part of this library in your research, please cite it using the following BibTeX entry:

@misc{StellarGraph,
  author = {CSIRO's Data61},
  title = {StellarGraph Machine Learning Library},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/stellargraph/stellargraph}},
}

References

  1. Inductive Representation Learning on Large Graphs. W.L. Hamilton, R. Ying, and J. Leskovec. Neural Information Processing Systems (NIPS), 2017.

  2. Node2Vec: Scalable Feature Learning for Networks. A. Grover and J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016.

  3. Metapath2Vec: Scalable Representation Learning for Heterogeneous Networks. Yuxiao Dong, Nitesh V. Chawla, and Ananthram Swami. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 135–144, 2017.

  4. Graph Attention Networks. P. Velickovic et al. International Conference on Learning Representations (ICLR), 2018.

  5. Graph Convolutional Networks (GCN): Semi-Supervised Classification with Graph Convolutional Networks. Thomas N. Kipf and Max Welling. International Conference on Learning Representations (ICLR), 2017.

  6. Simplifying Graph Convolutional Networks. F. Wu, T. Zhang, A. H. de Souza, C. Fifty, T. Yu, and K. Q. Weinberger. International Conference on Machine Learning (ICML), 2019.

  7. Adversarial Examples on Graph Data: Deep Insights into Attack and Defense. H. Wu, C. Wang, Y. Tyshetskiy, A. Docherty, K. Lu, and L. Zhu. IJCAI, 2019.
