AgglomCluster·PyPI

Performs greedy agglomerative clustering on network-x graphs

These details have not been verified by PyPI

Project links

Project description

Agglomerative clustering tool for network-x graphs

Clustering

Implements the algorithm described by: “Fast algorithm for detecting community structure in networks” M. E. J. Newman. 2004 http://arxiv.org/pdf/cond-mat/0309508v1.pdf This is a greedy agglomerative hierarchical clustering algorithm. The alogorithm efficiently clusters large number of nodes (this is one of the best scaling clustering algorithms) while producing a suggested number of clusters. See papers on scaling and accuracy questions regarding greedy Newman.

This implementation uses a heap to select the best pair to cluster at each iteration - A naive implementation considers all “n” edges in the graph (O(n)) - A heap reduces this search dramatically (O(log(n))

Dependencies

allset – for automatic module importing networkx – supported graphing library

Problems

The actual Modularity score does not exactly match the Modularity score of the example on the wikipedia page
http://en.wikipedia.org/wiki/Modularity_(networks)
Does not work for directed graphs (TODO operate on the undirected graph)
Does not work for negative graphs (TODO add this capability)
Does not handle disconnected components (unless than are components of size 1)
Clustering needs to move to a function call rather than an object holder (return dendrogram object)
Node relabeling is messy
Dendrogram crawling is used for two separate purposes which aren’t clearly defined/called

Attributes

NewmanGreedy objects store the following attributes * Supergraph - Duplicate of the original graph. Gets manipulated during the clustering: edges and nodes are condensed and reweighted - Cluster degree stored in node attribute - Number of connections between clusters stored in edge attribute (weighted edge) - Implicitly keeps track of the existing nodes which is required for the heap * Dendrogram - Stores the clustering history in a tree * Heap - Stores the modularity quality difference for combining each pair of existing nodes - Popping returns the node pair with the largest modulairty/quality difference, then smallest id1, then smallest id2 * Stored as a tuple (value, id1, id2) where the modularity/value is negated, and id1 is the smaller of the two * Processing the smaller IDs first means that smaller clusters will be chosen first during modularity tie breaking - Non-existing nodes are not actively removed from the heap, so pairs with non-existing nodes are ignored when popping * Quality History - Charts the Modularity score for each number of clustering

Author

Author(s): Ethan Lozano & Matthew Seal

Collaborator(s): Zubin Jelveh

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

2.0.7

May 10, 2018

2.0.6

Mar 4, 2018

2.0.5

Mar 4, 2018

2.0.4

Apr 30, 2017

2.0.3

Apr 30, 2017

2.0.2

Apr 1, 2017

2.0.1

May 17, 2016

2.0.0

Jan 31, 2016

2.0.0-pre-1 pre-release

Jan 31, 2016

2.0.0-pre pre-release

Jan 31, 2016

1.0.3

Aug 24, 2015

This version

1.0.2

Jun 10, 2014

1.0.1

Apr 2, 2014

1.0.0

Apr 1, 2014

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

AgglomCluster-1.0.2.zip (20.7 kB view details)

Uploaded Jun 10, 2014 Source

File details

Details for the file AgglomCluster-1.0.2.zip.

File metadata

Download URL: AgglomCluster-1.0.2.zip
Upload date: Jun 10, 2014
Size: 20.7 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for AgglomCluster-1.0.2.zip
Algorithm	Hash digest
SHA256	`93d94a3fd2b7d602ae925dec8ca7f5c838732ea640d94d146738fcf253f63afb`
MD5	`73cf9ad55d3358b09ae64f985115ceaf`
BLAKE2b-256	`e07cc47754f8075537d3995ec33ffab3a41a74edb07a31ad3812b687b5aa4a3d`

See more details on using hashes here.

AgglomCluster 1.0.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Clustering

Dependencies

Problems

Attributes

Author

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

File details

File metadata

File hashes