
Graph convolutional memory for reinforcement learning


Graph Convolutional Memory using Topological Priors

Description

Graph convolutional memory (GCM) is a graph-structured memory for reinforcement learning. It can replace LSTMs or attention mechanisms when solving partially observable Markov decision processes (POMDPs). GCM lets you embed your domain knowledge as connections in a graph. See the full paper for further details. This repo contains the GCM library implementation for use in your projects. To replicate the experiments from the paper, please see this repository instead.

If you use GCM, please cite the paper!

@article{morad2021graph,
  title={Graph Convolutional Memory using Topological Priors},
  author={Morad, Steven D and Liwicki, Stephan and Kortvelesy, Ryan and Mecca, Roberto and Prorok, Amanda},
  journal={arXiv preprint arXiv:2106.14117},
  year={2021}
}

Installation

GCM is installed using pip. The dependencies must be installed manually, as they target your specific architecture (with or without CUDA).

Conda install

First install torch >= 1.8.0 and the torch-geometric dependencies, then GCM. Note that the conda package for torch is named pytorch:

conda install pytorch -c pytorch
conda install pytorch-geometric -c rusty1s -c conda-forge
pip install graph-conv-memory

Pip install

Please follow the torch-geometric install guide, then

pip install graph-conv-memory
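
Because the dependencies are installed by hand, it can help to confirm that everything is importable before running the quickstart. The following is a minimal sketch using only the standard library. The helper name `check_imports` is hypothetical and not part of GCM; the module names `torch`, `torch_geometric`, and `gcm` are the ones imported in the quickstart.

```python
import importlib.util


def check_imports(modules=("torch", "torch_geometric", "gcm")):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Report any package that still needs to be installed
    for name, found in check_imports().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

A `MISSING` entry for `torch_geometric` usually means the torch-geometric install guide was not followed for your torch and CUDA combination.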

Quickstart

Below is a quick example of how to use GCM in a basic RL problem:

import torch
import torch_geometric
from gcm.gcm import DenseGCM
from gcm.edge_selectors.temporal import TemporalBackedge


# graph_size denotes the maximum number of observations in the graph, after which
# the oldest observations will be overwritten with newer observations. Reduce this number to
# reduce memory usage.
graph_size = 128

# Define the GNN used in GCM. The following is the one used in the paper
# Make sure you define the first layer to match your observation space
class GNN(torch.nn.Module):
    """A simple two-layer graph neural network"""
    def __init__(self, obs_size, hidden_size=32):
        super().__init__()
        self.gc0 = torch_geometric.nn.DenseGraphConv(obs_size, hidden_size)
        self.gc1 = torch_geometric.nn.DenseGraphConv(hidden_size, hidden_size)
        self.act = torch.nn.Tanh()

    def forward(self, x, adj, weights, B, N):
        x = self.act(self.gc0(x, adj))
        return self.act(self.gc1(x, adj))

# Build GNN that GCM uses internally
obs_size = 8
gnn = GNN(obs_size)
# Create the GCM using our GNN and edge selection criteria. TemporalBackedge([1]) will link observation o_t to o_{t-1}.
# See `gcm.edge_selectors` for different kinds of priors suitable for your specific problem. Do not be afraid to implement your own!
gcm = DenseGCM(gnn, edge_selectors=TemporalBackedge([1]), graph_size=graph_size)

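# If the hidden state m_t is None, GCM will initialize one for you.
# Only do this at the beginning, as GCM must track and update the hidden
# state to function correctly.
#
# You can inspect m_t, as it is just a graph of observations:
# the first element is the node feature matrix and the second is the adjacency matrix.
m_t = None

# Placeholder values and networks so the loop below runs on its own. These are
# illustrative and not part of GCM; replace them with your own environment, actor, and critic.
# belief is assumed to have shape (batch_size, hidden_size), with hidden_size=32 as in GNN above.
batch_size, num_actions, train_timesteps = 1, 2, 10
logits_nn = torch.nn.Linear(32, num_actions)
vf_nn = torch.nn.Linear(32, 1)

for t in range(train_timesteps):
    # Obs at timestep t should be a tensor of shape (batch_size, obs_size)
    # In practice: obs = my_env.step()
    obs = torch.rand(batch_size, obs_size)
    belief, m_t = gcm(obs, m_t)
    # GCM provides a belief state: a combination of all past observational data relevant to the problem.
    # What you likely want to do is put this state through actor and critic networks to obtain
    # action and value estimates
    action_logits = logits_nn(belief)
    state_value = vf_nn(belief)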
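
The `graph_size` comment in the quickstart says that once the graph is full, the oldest observations are overwritten by newer ones. The toy sketch below illustrates that behavior with a standard-library `deque`. It is not GCM's implementation; the class `ObservationBuffer` and the string observations are hypothetical stand-ins chosen only to make the overwrite order visible.

```python
from collections import deque


class ObservationBuffer:
    """Toy fixed-size observation memory (a stand-in, not GCM's internals)."""

    def __init__(self, graph_size):
        # deque(maxlen=...) drops the oldest item once the buffer is full
        self.nodes = deque(maxlen=graph_size)

    def add(self, obs):
        """Store a new observation and return the current contents, oldest first."""
        self.nodes.append(obs)
        return list(self.nodes)


buf = ObservationBuffer(graph_size=3)
for t in range(5):
    contents = buf.add(f"o_{t}")

print(contents)  # ['o_2', 'o_3', 'o_4']
```

With `graph_size=3`, observations `o_0` and `o_1` are gone after five steps, so any edge prior can only link to the three most recent observations. A smaller `graph_size` saves memory at the cost of a shorter history.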

We provide a few edge selectors, which we briefly detail here:

gcm.edge_selectors.temporal.TemporalBackedge
# Connections to the past. Give it [1,2,4] to connect each
# observation t to t-1, t-2, and t-4.

gcm.edge_selectors.dense.DenseEdge
# Connections to all past observations
# observation t is connected to t-1, t-2, ... 0

gcm.edge_selectors.distance.EuclideanEdge
# Connections to observations within some max_distance
# e.g. if l2_norm(o_t - o_k) < max_distance, create an edge

gcm.edge_selectors.distance.CosineEdge
# Like EuclideanEdge, but using cosine similarity instead

gcm.edge_selectors.distance.SpatialEdge
# Euclidean distance, but only compares slices from the observation
# this is useful if you have an 'x' and 'y' dimension in your observation
# and only want to connect nearby entries
#
# You can also implement the identity priors using this by setting
# max_distance to something like 1e-6

gcm.edge_selectors.learned.LearnedEdge
# Learn an edge function from the data
# Will randomly sample edges and train through gradient descent
# Call the constructor with the output size of your GNN
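
To make the priors above concrete, here is a minimal pure-Python sketch of the adjacency matrices that the temporal, dense, and distance-based descriptions imply. It is an illustration under stated assumptions, not the library's code: the function names, the `dims` parameter (standing in for the slice that SpatialEdge compares), and the convention that row `t` links back to column `k` are all hypothetical. The sketch also assumes that distance-based edges only point to earlier observations.

```python
import math


def temporal_backedges(num_nodes, hops):
    """adj[t][k] is True when observation t links back to k = t - hop."""
    adj = [[False] * num_nodes for _ in range(num_nodes)]
    for t in range(num_nodes):
        for hop in hops:
            if t - hop >= 0:
                adj[t][t - hop] = True
    return adj


def dense_edges(num_nodes):
    """Every observation t links to all earlier observations t-1, ..., 0."""
    return [[k < t for k in range(num_nodes)] for t in range(num_nodes)]


def distance_edges(observations, max_distance, dims=None):
    """Link t to an earlier k when their L2 distance is below max_distance.

    If dims is given, only those feature indices are compared, for example
    the 'x' and 'y' entries of each observation.
    """
    def pick(obs):
        return list(obs) if dims is None else [obs[i] for i in dims]

    n = len(observations)
    adj = [[False] * n for _ in range(n)]
    for t in range(n):
        for k in range(t):
            if math.dist(pick(observations[t]), pick(observations[k])) < max_distance:
                adj[t][k] = True
    return adj


def edge_list(adj):
    """Flatten a boolean adjacency matrix into (t, k) pairs."""
    return [(t, k) for t, row in enumerate(adj) for k, on in enumerate(row) if on]


print(edge_list(temporal_backedges(5, hops=[1, 2, 4])))
# [(1, 0), (2, 0), (2, 1), (3, 1), (3, 2), (4, 0), (4, 2), (4, 3)]
print(edge_list(dense_edges(3)))
# [(1, 0), (2, 0), (2, 1)]

# Each observation is (x, y, other_feature)
obs = [(0.0, 0.0, 9.0), (5.0, 5.0, 9.0), (0.1, 0.0, -9.0)]
print(edge_list(distance_edges(obs, max_distance=1.0)))               # []
print(edge_list(distance_edges(obs, max_distance=1.0, dims=(0, 1))))  # [(2, 0)]
```

The last two lines show why comparing only a slice can matter: observations 0 and 2 are close in `x` and `y` but far apart in the third feature, so they are linked only when the comparison is restricted to the first two dimensions.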

Ray Quickstart (WIP)

We provide a Ray RLlib wrapper around GCM. See the example below for how to use it:

import torch
import torch_geometric
import ray
from ray import tune

from gcm.ray_gcm import RayDenseGCM
from gcm.edge_selectors.temporal import TemporalBackedge

class GNN(torch.nn.Module):
    """A simple two-layer graph neural network"""
    def __init__(self, obs_size, hidden_size=32):
        super().__init__()
        self.gc0 = torch_geometric.nn.DenseGraphConv(obs_size, hidden_size)
        self.gc1 = torch_geometric.nn.DenseGraphConv(hidden_size, hidden_size)
        self.act = torch.nn.Tanh()

    def forward(self, x, adj, weights, B, N):
        x = self.act(self.gc0(x, adj))
        return self.act(self.gc1(x, adj))


ray.init(
    local_mode=True,
    object_store_memory=3e10,
)
input_size = 16 
hidden_size = 32
cfg = {
    "framework": "torch",
    "num_gpus": 0,
    "env": "CartPole-v0",
    "num_workers": 0,
    "model": {
        "custom_model": RayDenseGCM,
        "custom_model_config": {
            "graph_size": 20,
            # GCM Ray wrapper will automatically convert observation
            # to gnn_input_size using a linear layer
            "gnn_input_size": input_size,
            "gnn_output_size": hidden_size,
            "gnn": GNN(input_size),
            "edge_selectors": TemporalBackedge([1]),
            "edge_weights": False,
        }
    }
}
tune.run(
    "A2C",
    config=cfg,
    stop={"info/num_steps_trained": 100}
)
