Community detection via Louvain/Leiden + Genetic Algorithm

TAU Community Detection

tau-community-detection implements TAU, an evolutionary community detection algorithm that couples genetic search with Leiden refinements. It is designed for scalable graph clustering with a simple drop-in run_clustering() API, sensible defaults, and multiprocessing support.


Highlights

  • Evolutionary search: Maintains a population of candidate partitions and applies crossover/mutation tailored for graph clustering.
  • Leiden optimization: Refines every candidate with Leiden to ensure modularity gains.
  • Multiprocessing aware: Uses worker pools to optimize the population in parallel.
  • Deterministic options: Accepts a user-specified random seed for reproducibility.
  • Simple API: Use tau.run_clustering(graph) for the default workflow, or drop down to TauClustering and TauConfig when you need advanced control.

Installation

The project targets Python 3.10 or newer.

pip install tau-community-detection

To work from a clone, install the package in editable mode inside a virtual environment:

git clone https://github.com/HillelCharbit/community_TAU.git
cd community_TAU
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .

Quick Start (Python API)

import networkx as nx
import tau_community_detection as tau

g = nx.erdos_renyi_graph(n=1000, p=0.01, seed=42)

# Zero-friction default usage
clustering = tau.run_clustering(g)
print(f"Modularity: {clustering.modularity:.4f}")
print(f"Communities: {len(clustering)}")

# Only override the knobs you care about
clustering = tau.run_clustering(
  g,
  resolution_parameter=0.8,
  random_seed=42,
  verbose=True,
  population_size=100,
  max_generations=50,
)

run_clustering() returns an igraph.VertexClustering object, so the usual modularity and membership attributes are available immediately.
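
As a point of reference for the modularity value reported above, the following minimal sketch computes Newman modularity from an edge list and a membership list (the same shape as the membership attribute) in plain Python. The function name modularity and the toy graph are illustrative and not part of the package; the resolution argument follows the common generalized form, and whether TAU's resolution_parameter maps onto it exactly is an assumption.

```python
from collections import defaultdict


def modularity(edges, membership, resolution=1.0):
    """Newman modularity of an undirected, unweighted graph.

    edges: iterable of (u, v) vertex-index pairs.
    membership: list where membership[v] is the community id of vertex v.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0

    internal = defaultdict(int)      # edges with both endpoints in the community
    total_degree = defaultdict(int)  # sum of vertex degrees per community
    for u, v in edges:
        total_degree[membership[u]] += 1
        total_degree[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1

    # Q = sum over communities of (L_c / m) - resolution * (d_c / 2m)^2
    return sum(
        internal[c] / m - resolution * (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
membership = [0, 0, 0, 1, 1, 1]
print(f"Modularity: {modularity(edges, membership):.4f}")  # Modularity: 0.3571
```

Placing every vertex in one community gives a modularity of 0, which is why the split into two triangles scores higher.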

For advanced tuning, pass additional TauConfig fields directly as keyword arguments without having to instantiate TauConfig yourself:

clustering = tau.run_clustering(
  g,
  stopping_generations=5,
  stopping_jaccard=0.95,
  elite_fraction=0.2,
)
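
The elite_fraction field is passed above without further explanation. In genetic algorithms, an elite fraction usually denotes the share of top-scoring candidates that are carried into the next generation unchanged. The sketch below illustrates that general idea only; select_elites is a hypothetical helper, and the rounding rule and selection details TAU actually uses are assumptions here.

```python
import math


def select_elites(population, fitness, elite_fraction=0.2):
    """Return the top `elite_fraction` of candidates, best first.

    population: list of candidates (for TAU these would be partitions).
    fitness: callable scoring a candidate (for TAU, presumably modularity).
    Rounding up and keeping at least one elite are assumptions of this sketch.
    """
    if not population:
        return []
    n_elite = max(1, math.ceil(elite_fraction * len(population)))
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[:n_elite]


# Toy population: candidate names mapped to made-up scores.
scores = {"a": 0.41, "b": 0.38, "c": 0.44, "d": 0.29, "e": 0.40}
elites = select_elites(list(scores), fitness=scores.get, elite_fraction=0.4)
print(elites)  # ['c', 'a']
```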

Graph input

For very large graphs, or when using many worker processes, pass a file path (e.g., to a .graph, .ncol, or .edgelist file) directly to run_clustering() or TauClustering rather than a pre-loaded graph object. This allows efficient memory sharing.

Supported input:

  • File path to a graph in a common NetworkX or igraph format (weighting and structure are auto-detected).
  • Already-loaded networkx.Graph or igraph.Graph objects.

By default, the loader auto-detects whether the graph is weighted based on the file or graph structure. You can override this by setting TauConfig(is_weighted=True/False) when constructing TauClustering, or by passing the appropriate weight settings into run_clustering().
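
The file-path workflow described above might look like the following sketch. It writes a small edge list to disk and passes the path to run_clustering(). The helpers write_edgelist and cluster_from_file are hypothetical, and the one-pair-per-line "u v" layout is an assumption about what the loader accepts for .edgelist files; check the package's loader before relying on it.

```python
import tempfile
from pathlib import Path


def write_edgelist(edges, path):
    """Write one whitespace-separated 'u v' pair per line and return the path."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as handle:
        for u, v in edges:
            handle.write(f"{u} {v}\n")
    return path


def cluster_from_file(path, **overrides):
    """Pass the file path, not a loaded graph, to run_clustering().

    The import is done lazily so the file-writing helper above works
    even where tau-community-detection is not installed.
    """
    import tau_community_detection as tau

    return tau.run_clustering(str(path), **overrides)


# Two triangles joined by a bridge, saved to a temporary directory.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
edgelist_path = write_edgelist(edges, Path(tempfile.mkdtemp()) / "toy.edgelist")
print(edgelist_path.read_text(encoding="utf-8").splitlines()[0])  # 0 1
```

With the package installed, cluster_from_file(edgelist_path, random_seed=42) would then run the default workflow on the saved file. Because the library uses worker processes, calling it from under an if __name__ == "__main__": guard is a sensible precaution on platforms that spawn rather than fork.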

See the Quick Start section above for usage examples.


Configuration

All algorithm hyper-parameters live on the TauConfig dataclass. The high-level run_clustering() wrapper accepts the most common ones directly, while TauConfig remains available for advanced workflows. Key fields include:

  • worker_count: number of parallel processes (defaults to CPU count, capped by population size).
  • population_size: number of partitions maintained per generation (default: 60 in run_clustering()).
  • max_generations: upper bound on evolutionary iterations (default: 20 in run_clustering()).
  • verbose: set to True for progress logging (default: False).
  • stopping_generations / stopping_jaccard: convergence checks based on membership stability.
  • random_seed: makes runs reproducible across processes.
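
To make the stopping_generations / stopping_jaccard pair more concrete, here is a minimal sketch of one way a membership-stability check could work: compare the best partition of consecutive generations with a pair-counting Jaccard index and stop once it stays above the threshold for several generations in a row. The helper names and the exact similarity definition are assumptions, not the package's implementation, and the quadratic pair enumeration is only suitable for small illustrations.

```python
from itertools import combinations


def co_clustered_pairs(membership):
    """Set of vertex pairs (i, j), i < j, that share a community."""
    return {
        (i, j)
        for i, j in combinations(range(len(membership)), 2)
        if membership[i] == membership[j]
    }


def partition_jaccard(a, b):
    """Pair-counting Jaccard index between two membership lists."""
    pairs_a, pairs_b = co_clustered_pairs(a), co_clustered_pairs(b)
    union = pairs_a | pairs_b
    if not union:
        return 1.0  # both partitions consist of singletons only
    return len(pairs_a & pairs_b) / len(union)


def has_converged(history, stopping_generations=5, stopping_jaccard=0.95):
    """True when the last `stopping_generations` transitions are all stable.

    history: best membership list of each generation, oldest first.
    """
    if len(history) <= stopping_generations:
        return False
    recent = history[-(stopping_generations + 1):]
    return all(
        partition_jaccard(prev, curr) >= stopping_jaccard
        for prev, curr in zip(recent, recent[1:])
    )


# Community labels differ, but the grouping is identical.
print(partition_jaccard([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

The index depends only on which vertices are grouped together, so relabelling communities between generations does not register as a change.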

See src/tau_community_detection/config.py for the complete list.


Development

pip install -r requirements-dev.txt
make lint
make test

To build local distributions:

make build

Continuous Integration

  • GitHub Actions runs lint, tests, and package builds on pushes and pull requests.
  • Set the CODECOV_TOKEN secret to upload coverage reports.

Publishing

  1. Bump the version in setup.cfg/pyproject.toml and commit.
  2. Tag the release with git tag vX.Y.Z && git push --tags.
  3. Run the Publish Package workflow (defaults to TestPyPI). For PyPI, supply the pypi input and ensure PYPI_API_TOKEN is set. Use TEST_PYPI_API_TOKEN for dry runs.

Reference & Citation

If you use TAU in your research, please cite the original algorithm paper:

Gal Gilad and Roded Sharan. "From Leiden to Tel-Aviv University (TAU): exploring clustering solutions via a genetic algorithm." PNAS Nexus, Volume 2, Issue 6, June 2023. DOI: 10.1093/pnasnexus/pgad180

BibTeX:

@article{gilad2023tau,
  title={From Leiden to Tel-Aviv University (TAU): exploring clustering solutions via a genetic algorithm},
  author={Gilad, Gal and Sharan, Roded},
  journal={PNAS Nexus},
  volume={2},
  number={6},
  pages={pgad180},
  year={2023},
  publisher={Oxford University Press}
}

License & Versioning

Current Version: 1.3.1

License: This project is licensed under the MIT License.

See the Changelog for a detailed history of changes and updates.
