Skip to main content

Louvain is a general algorithm for methods of community detection in large networks.

Project description

INTRODUCTION
============

This package implements the louvain algorithm in `C++` and exposes it to `python`.
It relies on `(python-)igraph` for it to function. Besides the relative
flexibility of the implementation, it also scales well, and can be run on graphs
of millions of nodes (as long as they can fit in memory). The core function is
``find_partition`` which finds the optimal partition using the louvain algorithm
for a number of different methods. The methods currently implemented are:

* Modularity.
This method compares the actual graph to the expected graph, taking into
account the degree of the nodes [1]. The expected graph is based on a
configuration null-model. Notice that we use the non-normalized version (i.e.
we don't divide by the number of edges), so that this Modularity values
generally does not fall between 0 and 1. The formal definition is

```
H = sum_ij (A_ij - k_i k_j / 2m) d(s_i, s_j),
```

where `A_ij = 1` if there is an edge between node `i` and `j`, `k_i` is the degree of
node `i` and `s_i` is the community of node i.

* RBConfiguration.
This is an extension of modularity which includes a resolution parameter [2].
In general, a higher resolution parameter will lead to smaller communities.
The formal definition is

```
H = sum_ij (A_ij - gamma k_i k_j / 2m) d(s_i, s_j),
```

where `gamma` is the resolution value, and the other variables are the same as
for Modularity.

* RBER.
A variant of the previous method that instead of a configuration null-model
uses a Erdös-Rényi null-model in which each edge has the same probability of
appearing [2]. The formal definition is

```
H = sum_ij (A_ij - gamma p) d(s_i, s_j),
```

where `p` is the density of the graph, and the other variables are the same as
for Modularity, with `gamma` a resolution parameter.


* CPM.
This method compares to a fixed resolution parameter, so that it finds
communities that have an internal density higher than the resolution
parameter, and is separated from other communities with a density lower than
the resolution parameter [3].The formal definition is

```
H = sum_ij (A_ij - gamma ) d(s_i, s_j),
```

with `gamma` a resolution parameter, and the other variables are the same as for
Modularity.

* Significance.
This is a probabilistic method based on the idea of assessing the probability
of finding such dense subgraphs in an (ER) random graph [4]. The formal
definition is

```
H = sum_c M_c D(p_c || p)
```

where `M_c` is the number of possible edges in community `c`, i.e. `n_c (n_c - 1)/2`
for undirected graphs and twice that for directed grahs with `n_c` the size of
community `c`, `p_c` is the density of the community `c`, and `p` the general density
of the graph, and `D(x || y)` is the binary Kullback-Leibler divergence.

* Surprise.
Another probabilistic method, but rather than the probability of finding dense
subgraphs, it focuses on the probability of so many edges within communities
[5, 6]. The formal definition is

```
H = m D(q || <q>)
```

where `m` is the number of edges, `q` is the proportion of edges within
communities (i.e. `sum_c m_c / m`) and `<q>` is the expected proportion of edges
within communities in an Erdős–Rényi graph.

INSTALLATION
============

In short, for Unix: ``sudo pip install louvain``.
For Windows: download the binary installers.

For Unix like systems it is possible to install from source. For Windows this is
overly complicated, and you are recommended to use the binary installation files.
There are two things that are needed by this package: the igraph c core library
and the python-igraph python package. For both, please see http://igraph.org.

There are basically two installation modes, similar to the python-igraph package
itself (from which most of the setup.py comes).

1. No C core library is installed yet. The packages will be compiled and linked
statically to an automatically downloaded version of the C core library of
igraph.
2. A C core library is already installed. In this case, the package will link
dynamically to the already installed version. This is probably also the
version that is used by the igraph package, but you may want to double check
this.

In case the python-igraph package is already installed before, make sure that
both use the **same versions**.

The cleanest setup it to install and compile the C core library yourself (make
sure that the header files are also included, e.g. install also the development
package from igraph). Then both the python-igraph package, as well as this
package are compiled and (dynamically) linked to the same C core library.

TROUBLESHOOTING
===============

In case of any problems, best to start over with a clean environment. Make sure
you remove the python-igraph package completely, remove the C core library and
remove the louvain package. Then, do a complete reinstall starting from ``pip
install louvain``. In case you want a dynamic library be sure to then install
the C core library from source before. Make sure you **install the same
versions**.

USAGE
=====

There is no standalone version of louvain-igraph, and you will always need
python to access it. There are no plans for developing a standalone version or R
support. So, use python. Please refer to the documentation within the python
package for more details on function calls and parameters.

To start, make sure to import the packages:
```python
import louvain
import igraph as ig
```

We'll create a random graph for testing purposes:
```python
G = ig.Graph.Erdos_Renyi(100, 0.1);
```

For simply finding a partition use:
```python
part = louvain.find_partition(G, method='Modularity');
```

In case you want to use a weighted graph, you can store this in an edge
attribute:
```python
G.es['weight'] = 1.0;
part = louvain.find_partition(G, method='Modularity', weight='weight');
```
Please note that not all methods are necessarily capable of handling weighted
graphs.

Notice that ``part`` now contains an additional variable, ``part.quality`` which
stores the quality of the partition as calculated by the used method. You can
always get the quality of the partition using another method by calling
```python
part.significance = louvain.quality(G, partition, method='Significance');
```

You can also find partition for multiplex graphs. For each layer you then
specify the objective function, and the overall objective function is simply the
sum over all layers, weighted by some weight. If we denote by ``q_k`` the quality
of layer ``k`` and the weight by ``w_k``, the overall quality is then ``q = sum_k
w_k q_k``. This can also be useful in case you have negative links. In
principle, this could also be used to detect temporal communities in a dynamic
setting, cf. [8].

For example, assuming you have a graph with positive weights ``G_positive`` and
a graph with negative weights ``G_negative``, and you want to use Modularity for
finding a partition, you can use
```python
membership, quality = louvain.find_partition_multiplex([
louvain.Layer(graph=G_positive, method='Modularity', layer_weight=1.0),
louvain.Layer(graph=G_negative, method='Modularity', layer_weight=-1.0)])
```

Notice the negative layer weight is ``-1.0`` for the negative graph, since we
want those edges to fall between communities rather than within. One particular
problem when using negative links, is that the optimal community is no longer
guaranteed to be connected (it may be a multipartite partition). You may
therefore need the options `consider_comms=ALL_COMMS` to improve the quality of
the partition. Notice that this runs much slower than only considering
neighbouring communities (which is the default).

Various methods (such as Reichardt and Bornholdt's Potts model, or CPM) support
a (linear) resolution parameter, which can be effectively bisected, cf. [5]. You
can do this by calling:
```python
res_parts = louvain.bisect(G, method='CPM', resolution_range=[0,1]);
```
Notice this may take some time to run, as it effectively calls
`louvain.find_partition` for various resolution parameters (depending on the
settings possibly hundreds of times).

Then `res_parts` is a dictionary containing as keys the resolution, and as
values a `NamedTuple` with variables `partition` and `bisect_value`, which
contains the partition and the value at which the resolution was bisected (the
value of the `bisect_func` of the `bisect` function). You could for example plot
the bisection value of all the found partitions by using:
```python
import pandas as pd
import matplotlib.pyplot as plt
res_df = pd.DataFrame({
'resolution': res_parts.keys(),
'bisect_value': [bisect.bisect_value for bisect in res_parts.values()]});
plt.step(res_df['resolution'], res_df['bisect_value']);
plt.xscale('log');
```

REFERENCES
==========

Please cite the references appropriately in case they are used.

1. Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding
of communities in large networks. J. Stat. Mech. 2008, P10008 (2008).
2. Newman, M. & Girvan, M. Finding and evaluating community structure in networks.
Physical Review E 69, 026113 (2004).
3. Reichardt, J. & Bornholdt, S. Partitioning and modularity of graphs with arbitrary
degree distribution. Physical Review E 76, 015102 (2007).
4. Traag, V. A., Van Dooren, P. & Nesterov, Y. Narrow scope for resolution-limit-free
community detection. Physical Review E 84, 016114 (2011).
5. Traag, V. A., Krings, G. & Van Dooren, P. Significant scales in community structure.
Scientific Reports 3, 2930 (2013).
6. Aldecoa, R. & Marín, I. Surprise maximization reveals the community structure
of complex networks. Scientific reports 3, 1060 (2013).
7. Traag, V.A., Aldecoa, R. & Delvenne, J.-C. Detecting communities using Asymptotical
Surprise. Forthcoming (2015).
8. Mucha, P. J., Richardson, T., Macon, K., Porter, M. A. & Onnela, J.-P.
Community structure in time-dependent, multiscale, and multiplex networks.
Science 328, 876–8 (2010).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

louvain-0.5.3.tar.gz (48.1 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

louvain-0.5.3.win-amd64-py3.4.msi (134.1 kB view details)

Uploaded Source

louvain-0.5.3.win-amd64-py2.7.msi (423.9 kB view details)

Uploaded Source

louvain-0.5.3.win32-py3.4.msi (123.4 kB view details)

Uploaded Source

louvain-0.5.3.win32-py2.7.msi (393.2 kB view details)

Uploaded Source

louvain-0.5.3-py3.4-linux-x86_64.egg (749.8 kB view details)

Uploaded Egg

louvain-0.5.3-py2.7-linux-x86_64.egg (748.9 kB view details)

Uploaded Egg

File details

Details for the file louvain-0.5.3.tar.gz.

File metadata

  • Download URL: louvain-0.5.3.tar.gz
  • Upload date:
  • Size: 48.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for louvain-0.5.3.tar.gz
Algorithm Hash digest
SHA256 640a6728066df822f1228cfb028e9648c4d516bbdb07a92d3159a810fb975f95
MD5 ae305e69f19d5ab2ccd11510fec52cb4
BLAKE2b-256 610fec441b9d8ccd951dbd1ec0330cafffade67f4ebef6847ff7e5b4d004c4c9

See more details on using hashes here.

File details

Details for the file louvain-0.5.3.win-amd64-py3.4.msi.

File metadata

File hashes

Hashes for louvain-0.5.3.win-amd64-py3.4.msi
Algorithm Hash digest
SHA256 66659d39b9d150a11e613e190218204d106fb3cbab4c60a90c36a87ed3cae969
MD5 fa15984d67c0524e69da0fc0b8c3043b
BLAKE2b-256 9e5d50d8d11cf0d2dc068c209b3516b511f7595b2a6b4a958a44fb9eca6ed9d6

See more details on using hashes here.

File details

Details for the file louvain-0.5.3.win-amd64-py2.7.msi.

File metadata

File hashes

Hashes for louvain-0.5.3.win-amd64-py2.7.msi
Algorithm Hash digest
SHA256 56de476701e73f5ceffb79bd906b97930c3d977f2f7faf67af7e2a98b5d6ddd5
MD5 30337e0f93089a3901cc83e7d3d15f13
BLAKE2b-256 f36e5c5fbe0f6854bfd7afb8fadf17ace335b341621ed78f7cbcd301767e3de9

See more details on using hashes here.

File details

Details for the file louvain-0.5.3.win32-py3.4.msi.

File metadata

File hashes

Hashes for louvain-0.5.3.win32-py3.4.msi
Algorithm Hash digest
SHA256 c0f1a640edd1eae907cba7b2b6965bf7da4cfed925a031c895dc1f86e3c71e4d
MD5 20e66f62960874061a4ee1ac81277bf8
BLAKE2b-256 6f40e34a85fff039c8f1e3e10f0fc6581dae534b3dcc6affe705ccf06f29d62e

See more details on using hashes here.

File details

Details for the file louvain-0.5.3.win32-py2.7.msi.

File metadata

File hashes

Hashes for louvain-0.5.3.win32-py2.7.msi
Algorithm Hash digest
SHA256 1e11ac4471c1b910eaac01747a09ce960c6df10ddd57b9173365402b744e502b
MD5 c5a95e1999d1c7eb7c81018f32f408d0
BLAKE2b-256 390ad32251ad5c9e1b8b2e96ab557fdb00403a28e3fa7430007c302f732bc609

See more details on using hashes here.

File details

Details for the file louvain-0.5.3-py3.4-linux-x86_64.egg.

File metadata

File hashes

Hashes for louvain-0.5.3-py3.4-linux-x86_64.egg
Algorithm Hash digest
SHA256 ffa4d252f191f3c125a20f7c5e25e858a1feb89d6c7082cad77bd821f8892a95
MD5 40fc6b4b1a24adb04462cfbaa332790e
BLAKE2b-256 2bc7edacd4c7662b748a29ca1f131f7f5e95867eea3eb1d1ece2b240c36c9cd6

See more details on using hashes here.

File details

Details for the file louvain-0.5.3-py2.7-linux-x86_64.egg.

File metadata

File hashes

Hashes for louvain-0.5.3-py2.7-linux-x86_64.egg
Algorithm Hash digest
SHA256 87877f0bdb2e48931f1c6f90de867498c5e6f4049f001e31cf1562a8d9035509
MD5 39fb45a1e5359367af9c0893747fa728
BLAKE2b-256 6804d5e6598e2a91ff3bdbbe733185a2596b54401dd2eba9c8ad50798b6b0967

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page