Benchmark of Graph Clustering.
EAGLEGraphClustering
A benchmark of graph clustering from EAGLE-Lab, Zhejiang University.
Installation
- python>=3.8
- torch>=1.12
- dgl>=1.1
$ python -m pip install egc
Usage
Pip Package
See egc for docs.
- Import the package and use any graph clustering model supported as:
from torch import nn
from egc.model import DGL_GAEKmeans
from egc.utils import load_data
from egc.utils import get_default_args
from egc.utils import set_seed
from egc.utils import set_device
# set the random seed
set_seed(4096)
# set the gpu id
set_device('0')
# load graph
graph, label, n_clusters = load_data(
dataset_name='Cora',
directory='./data',
)
features = graph.ndata["feat"]
adj_csr = graph.adj_external(scipy_fmt="csr")
# get default args
args = get_default_args('gae_kmeans')
# init the model
model = DGL_GAEKmeans(
epochs=args["epochs"],
n_clusters=n_clusters,
fead_dim=features.shape[1],
n_nodes=features.shape[0],
hidden_dim1=args["hidden1"],
dropout=args["dropout"],
lr=args["lr"],
early_stop=args["early_stopping_epoch"],
activation=args["activation"],
)
# fit the model
model.fit(adj_csr, features)
# get clustering results
res = model.get_memberships()
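Once memberships are available, a benchmark typically scores them against the ground-truth labels. The following is a minimal sketch of normalized mutual information (NMI) in plain Python, assuming `res` can be read as one integer cluster id per node and `label` as one class id per node. The function name `normalized_mutual_info` is hypothetical and is not part of `egc`; the arithmetic-mean normalization used here is only one of several common conventions.

```python
import math
from collections import Counter


def normalized_mutual_info(labels_true, labels_pred):
    """NMI between two labelings, normalized by the mean of both entropies."""
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("labelings must be non-empty and of equal length")

    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    # Mutual information from the joint and marginal counts.
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in count_joint.items()
    )
    denom = (entropy(count_true) + entropy(count_pred)) / 2
    # Both labelings put every node in one cluster: treat as a perfect match.
    return 1.0 if denom == 0 else mutual_info / denom


# Toy labelings (illustrative only): cluster ids are permuted but the
# partition is identical, so the score does not depend on id naming.
toy_true = [0, 0, 1, 1]
toy_pred = [1, 1, 0, 0]
score = normalized_mutual_info(toy_true, toy_pred)
```

NMI is invariant to how cluster ids are numbered, which is why it suits clustering output where ids carry no meaning.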
Command line
- Clone the Repo
- Install the env
# NOTE: python>=3.8 is needed
# Install cuda if necessary. Check Cuda version first.
$ cd EGC
# Leave out `bash .ci/install-dev.sh &&` if no dev env is needed.
$ bash .ci/install-dev.sh && bash .ci/install.sh
# run `source .env/bin/activate` to activate the virtual env
- Run any supported model as:
$ python train.py ${OPTIONAL global args} ${POSITIONAL args (model)} ${optional model args}
- OPTIONAL global args must be placed before ${model}.
- ${model} is the POSITIONAL arg, i.e., one of the supported models.
E.g.,
# check OPTIONAL global args, e.g., all models supported
$ python train.py -h
# check optional args of certain model
$ python train.py gae_kmeans -h
# run any model
$ python train.py --dataset=Cora gae_kmeans --lr 0.001
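The ordering rule above (global args first, then the model name, then model args) matches how `argparse` sub-commands behave. The sketch below illustrates that layout only; it is not the actual `train.py`, and the function `build_parser` and the default values are assumptions made for the example.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="train.py")
    # Global options live on the top-level parser, so they come before the model name.
    parser.add_argument("--dataset", default="Cora")  # default is hypothetical

    # Each supported model is a sub-command with its own optional args.
    subparsers = parser.add_subparsers(dest="model", required=True)
    gae_kmeans = subparsers.add_parser("gae_kmeans")
    gae_kmeans.add_argument("--lr", type=float, default=0.01)  # default is hypothetical
    return parser


# Same argument order as the command-line example above.
args = build_parser().parse_args(["--dataset=Cora", "gae_kmeans", "--lr", "0.001"])
print(args.dataset, args.model, args.lr)
```

With this layout, a global option given after the model name is handed to the sub-command parser, which does not know it, so `argparse` reports it as unrecognized.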
Datasets
- Cora, Citeseer, Pubmed come from the DGL library.
- BlogCatalog, Flickr come from the CoLA GitHub repository.
- ACM comes from the SDCN GitHub repository.
- All above datasets are converted to undirected graphs.
| Dataset | Nodes | Edges | Attributes | Classes |
|---|---|---|---|---|
| Cora | 2,708 | 10,556 | 1,433 | 7 |
| Citeseer | 3,327 | 9,228 | 3,703 | 6 |
| Pubmed | 19,717 | 88,651 | 500 | 3 |
| BlogCatalog | 5,196 | 343,486 | 8,189 | 6 |
| Flickr | 7,575 | 479,476 | 12,047 | 9 |
| ACM | 3,025 | 26,256 | 1,870 | 3 |
| CoraFull | 19,793 | 126,842 | 8,710 | 70 |
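Converting a graph to undirected form, as noted for the datasets above, amounts to making the edge set symmetric. The sketch below shows the idea on a plain edge list; the helper `to_undirected` is hypothetical and is not the benchmark's own preprocessing code. With DGL graphs, `dgl.to_bidirected` serves a similar purpose, though whether this project uses it is not stated here.

```python
def to_undirected(edges):
    """Return a sorted edge list in which every (src, dst) also has (dst, src)."""
    symmetric = set()
    for src, dst in edges:
        symmetric.add((src, dst))
        symmetric.add((dst, src))  # a self-loop adds the same pair only once
    return sorted(symmetric)


# Toy directed edge list (illustrative only); (1, 0) duplicates the reverse of (0, 1).
directed_edges = [(0, 1), (1, 2), (1, 0)]
undirected_edges = to_undirected(directed_edges)
```

Using a set removes duplicate edges, so an edge that already exists in both directions is not counted twice.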
Implemented baseline methods
Disjoint
Unsupervised Graph Neural Networks + Kmeans
| method | Conf/Journal | Original Code | Supported |
|---|---|---|---|
| VGAE | 16nips | TensorFlow | ✅ |
| GraphSAGE | | | |
| DGI | 19iclr | Pytorch | ✅ |
| GMI | 20www | Pytorch | ✅ |
| SENet | 21nn | | ✅ |
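The methods in this group share a two-stage recipe: an unsupervised encoder produces node embeddings, and k-means then groups those embeddings. The sketch below covers only the second stage in plain Python, with hand-written 2-D points standing in for encoder output. The function `kmeans` and its parameters are assumptions for illustration and do not reflect how `egc` runs k-means internally.

```python
import random


def kmeans(points, n_clusters, n_iters=100, seed=0):
    """Lloyd's algorithm: returns one cluster id per input point."""
    rng = random.Random(seed)
    # Initialize centers from distinct randomly chosen points.
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    assignments = [-1] * len(points)

    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest center.
        new_assignments = [
            min(
                range(n_clusters),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])),
            )
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments

        # Update step: move each center to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]

    return assignments


# Stand-in "embeddings" (illustrative only): two well-separated groups.
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
memberships = kmeans(embeddings, n_clusters=2)
```

Because the clustering step is decoupled from representation learning, any encoder in the table can be swapped in without changing this stage.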
End-to-End Graph Clustering
| method | Conf/Journal | Original Code | Supported |
|---|---|---|---|
| SDCN | 20www | Pytorch | ✅ |
| DANMF | 18cikm | code | ✅ |
| M-NMF | 17aaai | Matlab | ✅ |
| DFCN | 21aaai | Pytorch | ✅ |
| VGAECD | 18icdm | -- | ✅ |
| ComE | 17cikm | code | ✅ |
| DAEGC | 19ijcai | code | ✅ |
Overlapping
| method | Conf/Journal | Original Code | Supported |
|---|---|---|---|
| CommunityGAN | 19www | TensorFlow | ✅ |
Requirements
See dependencies, requirements-dev.txt and requirements.txt.
Contributing
See CONTRIBUTING.md.