Skip to main content

a qgcn model package

Project description

QGCN

QGCN method for graph classification: https://arxiv.org/abs/2104.06750

Installation

required packages:

  • scipy~=1.8.0
  • pandas~=1.4.2
  • networkx~=2.8.3
  • numpy~=1.22.3
  • torch~=1.11.0
  • scikit-learn~=1.1.1
  • bokeh~=2.4.2
  • matplotlib~=3.5.1
  • bitstring~=3.1.9
  • python-louvain~=0.16
  • graph-measures~=0.1.44

You can download the package by the command:

pip install QGCN

How to use

Graph representing

To use this package you will need to provide the following files as input:

  • Graphs csv file - files that contain the graphs for input and their labels. The format of the file is flexible, but it must contain headers for any column, and there must be a column provided for:
    • graph id
    • source node id
    • destination node id
    • label id (every graph id can be attached to only one label)
  • External data file - external data for every node (Optional) The format of this file is also flexible, but it must contain headers for any column, and there must be a column provided for: note!! every node must get a value
    • graph id
    • node id
    • column for every external feature (if the value is not numeric then it can be handled with embeddings)

Example for such files:
graph csv file:

g_id,src,dst,label
6678,_1,_2,i
6678,_1,_3,i
6678,_2,_4,i
6678,_3,_5,i

External data file:

g_id,node,charge,chem,symbol,x,y
6678,_1,0,1,C,4.5981,-0.25
6678,_2,0,1,C,5.4641,0.25
6678,_3,0,1,C,3.7321,0.25
6678,_4,0,1,C,6.3301,-0.25

Parameters passing

After creating these file, you should define the parameters of the model. This can be done with a json file, or with data classes:

  • Example json file:
    • (Notice that if an external file is not provided, you should put the associated parameters as None.)
{
    "dataset_name": "Aids",

    "external": {
        "file_path": "./data/AIDS_external_data_all.csv",
        "graph_col": "g_id",
        "node_col": "node",
        "embeddings": ["chem", "symbol"],
        "continuous": ["charge", "x", "y"]
    },

    "graphs_data": {
        "file_path": "./data/AIDS_all.csv",
        "graph_col": "g_id",
        "src_col": "src",
        "dst_col": "dst",
        "label_col": "label",
        "directed": "False",
        "features": ["DEG", "CENTRALITY", "BFS"],
        "adjacency_norm": "NORM_REDUCED",
        "percentage": 1,
        "standardization": "zscore"
    },

    "model": {
        "label_type": "binary",
        "num_classes": 2,
        "use_embeddings": "True",
        "embeddings_dim": [10, 10],
        "activation": "relu_",
        "dropout": 0,
        "lr": 1e-3,
        "optimizer": "ADAM_",
        "L2_regularization": 0,
        "f": "c_x0",
        "GCN_layers": [
            { "in_dim": "None", "out_dim": 100 },
            { "in_dim": 100, "out_dim": 50 },
            { "in_dim": 50, "out_dim": 25 }
        ]
    },

    "activator": {
        "epochs": 3,
        "batch_size": 128,
        "loss_func": "binary_cross_entropy_with_logits_",
        "train": 0.3467,
        "dev": 0.1153,
        "test": 0.538
    }
}
  • Example dataclass objects:
from QGCN.params import GraphsDataParams, ExternalParams, ModelParams, ActivatorParams 

external_params = ExternalParams(file_path="./data/Mutagenicity_external_data_all.csv",
                          embeddings=["chem"],
                          continuous=[])

graphs_data_params = GraphsDataParams(file_path="../src/QGCN/data/Mutagenicity_all.csv",
                               standardization="min_max")

model_params = ModelParams(label_type="binary",
                    use_embeddings="True",
                    embeddings_dim=[10],
                    activation="srss_",
                    GCN_layers=[
                        {"in_dim": "None", "out_dim": 250},
                        {"in_dim": 250, "out_dim": 100}])

activator_params = ActivatorParams(epochs=100)

Executing the model

Once you have these files, you can use the QGCNModel from QGCN.activator with the path to the parameters file or the dataclass objects:

from torch.utils.data import DataLoader
from QGCN.params import GraphsDataParams, ExternalParams, ModelParams, ActivatorParams 
from QGCN.activator import QGCNModel, QGCNDataSet

# sets the parameters of the dataset:
external = ExternalParams(file_path="./data/Mutagenicity_external_data_all.csv",
                          graph_col="g_id", node_col="node",
                          embeddings=["chem"], continuous=[])
graphs_data = GraphsDataParams(file_path="../src/QGCN/data/Mutagenicity_all.csv",
                               standardization="min_max")

# sets the parameters of the model:
model = ModelParams(label_type="binary", num_classes=2, use_embeddings="True", embeddings_dim=[10],
                    activation="srss_", dropout=0.2, lr=0.005, optimizer="ADAM_", L2_regularization=0.005, f="x1_x0",
                    GCN_layers=[
                        {"in_dim": "None", "out_dim": 250},
                        {"in_dim": 250, "out_dim": 100}])
activator = ActivatorParams(epochs=100)

qgcn_model = QGCNModel("Mutagen", graphs_data, external, model, activator)
qgcn_model.train(should_print=False)

ds = QGCNDataSet("Mutagen", graphs_data, external)
loader = DataLoader(
    ds.get_dataset(),
    shuffle=False
)

for _, (A, x0, embed, label) in enumerate(loader):
    output = qgcn_model.predict(A, x0, embed)
    print(output, label)

Links

The datasets can be download here: https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets . Notice you will have to change their format to ours. You can see an example data here (gitHub link) the conventor in datasets -> change_data_format.py Mail address for more information: 123shovalf@gmail.com

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

QGCN-0.0.12.tar.gz (29.3 kB view hashes)

Uploaded Source

Built Distribution

QGCN-0.0.12-py3-none-any.whl (33.4 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page