PyDCD: A Deep Learning-Based Community Detection Software in Python for Large-scale Networks
DCD (Deep learning-based Community Detection) applies state-of-the-art deep learning techniques to identify communities in large-scale networks. Compared with existing community detection methods, DCD offers a unified solution for many variations of the community detection problem.
DCD provides implementations of four community detection algorithms, one evaluation routine, and two types of network data (randomly generated networks and imported real-world networks):
Function | Description | Input | Output |
---|---|---|---|
KMeans | Clustering baseline method (1) | Network node file, network edge file, K | <node id, community id> |
MM | Clustering baseline method (2) | Network node file, network edge file | <node id, community id> |
DCD | DCD | Network node file, network edge file, K | <node id, community id> |
DCD+ | Variant of GCN with node attributes | Network node file with attributes, network edge file, K | <node id, community id> |
Evaluation | Evaluate the performance | Network node file, network edge file, community assignment | Performance value |
Random network | Generate random network datasets | Network size, community size, probability of edges within communities, probability of edges between communities, directed network flag | <node id, community id>, network node file, network edge file |
Facebook network | Import Facebook brand-brand network | None | Facebook dataset |
Citation network | Import citation network | None | Citation dataset |
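The table does not say which metric the Evaluation function reports. As an illustration only, the sketch below computes Newman modularity, one common score that needs just an edge list and a community assignment, matching the inputs listed for Evaluation. The function name `modularity` and its arguments are hypothetical and are not part of the PyDCD API; the package may well use a different measure (for example, one that compares against ground-truth labels).

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node id -> community id.
    NOTE: hypothetical helper for illustration, not PyDCD's evaluation code.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    # Q = sum over communities of (L_c / m) - (d_c / 2m)^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
merged = {node: 0 for node in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(q_split, q_merged)
```

Separating the two triangles scores higher than putting every node in one community, which is the behaviour expected of a community-quality measure.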
Requirements
The library is compatible with Python 3.6 and 3.7.
Installation
From Conda
conda install -c pydcd
From PIP
pip3 install PyDCD
From Source
Before installing from source, make sure you have conda installed.
git clone https://github.com/kpzhang/deepcommunitydetection
cd deepcommunitydetection
conda install -y --file conda/requirements.txt
mkdir build
cd build && cmake .. && make && cd -
cd python && python setup.py install && cd -
Quick Start
Here is a quick-start example.
Python 3.7.3 (default, January 01 2020, 09:00:00)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pydcd import DCD, KM, MM, RandNet  # RandNet is used below; assumed to be exported by pydcd as well
>>> kmeans_detector = KM(10)
>>> kmeans_detector.km_detect_community('fb_nodes.txt','fb_edges.txt','N') # N means no evaluation
>>> mm_detector = MM()
>>> mm_detector.mm_detect_community('fb_nodes.txt','fb_edges.txt','Y') # Y means show the evaluation
>>> dcd_detector = DCD() # use the default settings for initialization, or
>>> dcd_detector = DCD(128,64,128,50) # set the number of neurons in the three hidden layers and the output dimension
>>> dcd_detector.dcd_detect_community('fb_nodes.txt','fb_edges.txt','Y','N') # the first flag, Y, means the nodes have attributes
>>> dcd_detector.dcd_detect_community('fb_nodes.txt','fb_edges.txt','N','N') # the first flag, N, means the nodes have no attributes
>>> rn = RandNet() # to generate random networks
>>> rn.generate_random_networks(1000,100,0.2,0.05) # undirected network with 1000 nodes and 100 communities
>>> rn.generate_random_networks(1000,100,0.2,0.05,directed=True) # directed network with 1000 nodes and 100 communities
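To make the parameters of the random-network generator concrete, here is a minimal planted-partition sketch in plain Python. It is not the `RandNet` implementation. The function name `planted_partition`, the round-robin community assignment, and the `seed` argument are assumptions made for illustration, and the second argument is read as the number of communities, following the comments in the quick-start example.

```python
import random
from itertools import combinations, permutations


def planted_partition(n_nodes, n_communities, p_in, p_out, directed=False, seed=None):
    """Return (community, edges) for a planted-partition random network.

    NOTE: hypothetical sketch, not PyDCD's RandNet.
    community: dict node -> community id (nodes assigned round-robin).
    edges: list of (u, v); ordered pairs if directed, otherwise u < v.
    """
    rng = random.Random(seed)
    community = {node: node % n_communities for node in range(n_nodes)}
    # Directed networks consider both (u, v) and (v, u); undirected only u < v.
    if directed:
        pairs = permutations(range(n_nodes), 2)
    else:
        pairs = combinations(range(n_nodes), 2)
    edges = []
    for u, v in pairs:
        # Within-community pairs connect with p_in, all others with p_out.
        p = p_in if community[u] == community[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return community, edges


community, edges = planted_partition(60, 3, 0.2, 0.05, seed=0)
print(len(community), "nodes,", len(edges), "edges")
```

Because every node pair is tested once, the loop is quadratic in the number of nodes, which is fine for a 1000-node example but not for very large networks.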
Input Examples
node file without attributes:
node_id_1
node_id_2
node_id_3
...
node_id_n
node file with attributes:
node_id_1 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
node_id_2 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
node_id_3 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
...
node_id_n <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
edge file:
node_id_1 node_id_2
...
node_id_i node_id_j
...
node_id_m node_id_k
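The formats above can be read with a few lines of Python. This is a minimal sketch, assuming the layouts shown here: a node id optionally followed by a tab and whitespace-separated attribute values, and two whitespace-separated node ids per edge line. The helper names `read_nodes` and `read_edges` are hypothetical and are not part of PyDCD; attribute values are kept as strings because the expected value types are not specified.

```python
import io


def read_nodes(handle):
    """Return {node_id: [attribute strings]}; the list is empty if a line has no attributes."""
    nodes = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Everything before the first tab is the node id; the rest are attributes.
        node_id, _, attrs = line.partition("\t")
        nodes[node_id.strip()] = attrs.split()
    return nodes


def read_edges(handle):
    """Return a list of (source, target) node-id pairs, one per non-empty line."""
    edges = []
    for line in handle:
        parts = line.split()
        if len(parts) >= 2:
            edges.append((parts[0], parts[1]))
    return edges


# In-memory stand-ins for a node file with attributes and an edge file.
node_text = "a\t1.0 0.5\nb\t0.2 0.9\nc\t0.0 0.1\n"
edge_text = "a b\nb c\n"
nodes = read_nodes(io.StringIO(node_text))
edges = read_edges(io.StringIO(edge_text))
print(nodes)
print(edges)
```

With real files, the same functions accept the object returned by `open('fb_nodes.txt')` or `open('fb_edges.txt')`.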
Development Team
PyDCD is developed by Prof. Kunpeng Zhang, Prof. Shaokun Fan, and Prof. Bruce Golden.
Citation
If you find this useful for your research or development, please cite our work.