PyDCD: A Deep Learning-Based Community Detection Software in Python for Large-scale Networks

DCD (Deep learning-based Community Detection) applies state-of-the-art deep learning techniques to identify communities in large-scale networks. Compared with existing community detection methods, DCD offers a unified solution for many variations of the community detection problem.

DCD provides implementations of four community detection algorithms, one evaluation function, and two types of networked data (generated random networks and imported real-world networks):

| Function | Description | Input | Output |
|---|---|---|---|
| KMeans | Clustering baseline method (1) | Network node file; network edge file; K | <node id, community id> |
| MM | Clustering baseline method (2) | Network node file; network edge file | <node id, community id> |
| DCD | DCD | Network node file; network edge file; K | <node id, community id> |
| DCD+ | Variant of GCN with node attributes | Network node file with attributes; network edge file; K | <node id, community id> |
| Evaluation | Evaluate the performance | Network node file; network edge file; community assignment | Performance value |
| Random network | Generate random network datasets | Network size; community size; probability of edges within communities; probability of edges between communities; directed network flag | <node id, community id>; network node file; network edge file |
| Facebook network | Import Facebook brand-brand network | None | Facebook dataset |
| Citation network | Import citation network | None | Citation dataset |
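
The table does not say which metric the Evaluation function reports. As an illustration of how a performance value can be computed from only an edge list and a community assignment, here is a minimal sketch of Newman modularity for an undirected network. The function name `modularity` and its arguments are hypothetical and are not part of the PyDCD API.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of an undirected network.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping node id -> community id
    NOTE: illustrative only; PyDCD's own evaluation metric is not documented here.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # edges with both endpoints in the community
    degree = defaultdict(int)    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (L_c / m) - (d_c / 2m)^2
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_group = {node: 0 for node in range(6)}
print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

Splitting the two triangles into separate communities scores higher than placing every node in one community, which is the behavior a community-quality metric should show.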

Requirements

The library is generally compatible with Python 3.6 and 3.7.

Installation

From Conda

conda install -c pydcd

From PIP

pip3 install PyDCD

From Source

Before installation, make sure you have conda installed.

git clone https://github.com/kpzhang/deepcommunitydetection
cd deepcommunitydetection
conda install -y --file conda/requirements.txt
mkdir build
cd build && cmake .. && make && cd -
cd python && python setup.py install && cd -

Quick Start

Here is a quick-start example.

Python 3.7.3 (default, January 01 2020, 09:00:00) 
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> from pydcd import DCD, KM, MM, RandNet # RandNet is used below; assumed to be exported by pydcd like the other classes
>>> kmeans_detector = KM(10)
>>> kmeans_detector.km_detect_community('fb_nodes.txt','fb_edges.txt','N') # N means no evaluation

>>> mm_detector = MM()
>>> mm_detector.mm_detect_community('fb_nodes.txt','fb_edges.txt','Y') # Y means showing evaluation

>>> dcd_detector = DCD() # use the default settings for initialization, or
>>> dcd_detector = DCD(128,64,128,50) # set the number of neurons in the three hidden layers and the output dimension
>>> dcd_detector.dcd_detect_community('fb_nodes.txt','fb_edges.txt','Y','N') # the first flag Y means the nodes have attributes
>>> dcd_detector.dcd_detect_community('fb_nodes.txt','fb_edges.txt','N','N') # the first flag N means the nodes have no attributes

>>> rn = RandNet() # to generate random networks
>>> rn.generate_random_networks(1000,100,0.2,0.05) # undirected network with 1000 nodes and 100 communities
>>> rn.generate_random_networks(1000,100,0.2,0.05,directed=True) # directed network with 1000 nodes and 100 communities
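
To make the random-network parameters concrete, the following is a minimal sketch of a planted-partition generator: every pair of nodes is linked with one probability if both nodes share a community and with another probability otherwise. It is not PyDCD's implementation. The function name `planted_partition`, the round-robin community assignment, and the reading of the second argument as the number of communities (following the comments above) are all assumptions.

```python
import random
from itertools import combinations, permutations


def planted_partition(n_nodes, n_communities, p_in, p_out, directed=False, seed=None):
    """Return (community, edges) for a random network with planted communities.

    NOTE: hypothetical helper for illustration, not part of PyDCD.
    """
    rng = random.Random(seed)
    # Assumption: nodes are assigned round-robin, so community sizes are balanced.
    community = {node: node % n_communities for node in range(n_nodes)}
    # Directed networks consider both (u, v) and (v, u); undirected only u < v.
    if directed:
        pairs = permutations(range(n_nodes), 2)
    else:
        pairs = combinations(range(n_nodes), 2)
    edges = []
    for u, v in pairs:
        p = p_in if community[u] == community[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return community, edges


community, edges = planted_partition(60, 3, 0.5, 0.02, seed=0)
within = sum(community[u] == community[v] for u, v in edges)
print(len(edges), "edges,", within, "of them inside a community")
```

Because this sketch checks every node pair, its cost grows quadratically with the number of nodes, so it is suitable only for small illustrative networks.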

Input Examples

node file without attributes:

node_id_1
node_id_2
node_id_3
...
node_id_n

node file with attributes:

node_id_1 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
node_id_2 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
node_id_3 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
...
node_id_n <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m

edge file:

node_id_1 node_id_2
...
node_id_i node_id_j
...
node_id_m node_id_k
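
The formats above can be read with a few lines of Python. This is a minimal sketch, not PyDCD's own loader. The helper names are hypothetical, and it assumes that attribute values are numeric and that every edge endpoint must appear in the node file.

```python
def parse_node_line(line):
    """Split one node-file line into (node_id, [attribute values]).

    The node id is separated from the attributes by a tab; the attributes
    are separated by whitespace. A line without attributes yields [].
    """
    node_id, _, rest = line.rstrip("\n").partition("\t")
    # Assumption: attribute values are numeric.
    return node_id.strip(), [float(value) for value in rest.split()]


def parse_edge_line(line):
    """Split one edge-file line into a (source, target) pair of node ids."""
    source, target = line.split()
    return source, target


def load_network(node_lines, edge_lines):
    """Build (nodes, edges) from iterables of lines, skipping blank lines."""
    nodes = dict(parse_node_line(line) for line in node_lines if line.strip())
    edges = [parse_edge_line(line) for line in edge_lines if line.strip()]
    unknown = {n for edge in edges for n in edge if n not in nodes}
    if unknown:
        raise ValueError(f"edges reference unknown nodes: {sorted(unknown)}")
    return nodes, edges


node_lines = ["a\t0.5 1.0\n", "b\t0.1 0.2\n", "c\t0.9 0.3\n"]
edge_lines = ["a b\n", "b c\n"]
nodes, edges = load_network(node_lines, edge_lines)
print(nodes)
print(edges)
```

With real files, the same function can be called as `load_network(open(node_path), open(edge_path))`, where `node_path` and `edge_path` are placeholders for your own files.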

Development Team

PyDCD is developed by Prof. Kunpeng Zhang, Prof. Shaokun Fan, and Prof. Bruce Golden.

Citation

If you find this useful for your research or development, please cite our work.

Download files

Files for pydcd, version 0.0.15:

PyDCD-0.0.15.tar.gz (7.8 MB), source distribution
